
Tamás Szirányi
Institute for Computer Science and Control of the Hungarian Academy of Sciences (MTA SZTAKI) · Machine Perception Research Laboratory (earlier Distributed Events Analysis Research Lab)
MSc ElEng. 1980, dr.Techn 1983, PhD 1991, DSc 2001
About
218
Publications
22,683
Reads
2,832
Citations
Introduction
Grants and funding - Current projects:
• National co-leader of ProActive (FP7, 2012-14, 340 kEu / 36 months). It integrates a host of novel technologies enabling the fusion of multi-sensor data with contextual information.
• PI: “Finding Focus of Interest in freely configured sensor networks”, Hungarian Research Fund (2013-2015, ca. 110 kEu)
Selected past projects:
• EDA project “MEDUSA” (“Multi sEnsor Data fusion grid for Urban Situational Awareness”, 2009-2011, 250 kEu / 30 months)
Additional affiliations
September 2004 - present
September 1992 - September 2010
September 1991 - January 2016
Education
September 1976 - August 1980
Technical University of Budapest
Field of study
- Electrical Engineering
Publications (218)
Most of the 3D LIDAR sensors used in autonomous driving have significantly lower frame rates than modern cameras mounted on the same vehicle. This paper proposes a solution to virtually increase the frame rate of the LIDARs utilizing a mono camera, making possible the monitoring of dynamic objects with fast movement in the environment. First, dyna...
In the last decade, Light Detection and Ranging (LIDAR) became a leading technology of detailed and reliable 3D environment perception. This paper gives an overview of the wide applicability of LIDAR sensors from the perspective of signal processing for autonomous driving, including dynamic and static scene analysis, mapping, situation awareness wh...
Simultaneous Localization and Mapping is widespread in both robotics and autonomous driving. This paper proposes a novel method to identify changes in maps constructed by SLAM algorithms without feature-to-feature comparison. We use ICP-like algorithms to match frames and pose graph optimization to solve the SLAM problem. Finally, we analyze the re...
Deep learning methods for image quality assessment (IQA) are limited due to the small size of existing datasets. Extensive datasets require substantial resources both for generating publishable content and annotating it accurately. We present a systematic and scalable approach to creating KonIQ-10k, the largest IQA dataset to date, consisting of 10...
A general-purpose no-reference video quality assessment algorithm based on a long short-term memory (LSTM) network and a pretrained convolutional neural network (CNN) is introduced. Considering video sequences as a time series of deep features extracted with the help of a CNN, an LSTM network is trained to predict subjective quality scores. In cont...
In this paper, we propose a novel algorithm to compute the initial structure of pose-graph based Simultaneous Localization and Mapping (SLAM) systems. We perform a Breadth-First Search (BFS) on the graph in order to obtain multiple votes regarding the location of a certain robot position from all of its previously processed neighbors. Next, we defi...
Deep learning methods for image quality assessment (IQA) are limited due to the small size of existing datasets. Extensive datasets require substantial resources both for generating publishable content, and annotating it accurately. We present a systematic and scalable approach to create KonIQ-10k, the largest IQA dataset to date consisting of 10,0...
LIDAR sensors enable object and free-space detection for intelligent transportation systems and vehicles. This paper proposes a recognition method for LIDARs based on only a few detection planes. This method is useful especially in the case when the angular resolution of the scan is sufficient, but in the vertical direction, the planes are far from...
Environment analysis of automatic vehicles needs the detection from 3-D point cloud information. This paper addresses this task when only partial scanning data are available. Our method develops the detection capabilities of autonomous vehicles equipped with 3-D range sensors for navigation purposes. In industrial practice, the safety scanners of a...
Wireless sensor networks and ad-hoc networks are gaining popularity rapidly due to their ability to solve challenging problems and the fact that thanks to recent technological advancements it is now possible to build smarter and denser networks. For example, they serve as the basis of the Internet of Things. Naturally, it is in the users' best inter...
Wetlands play a major role in Europe’s biodiversity. Despite their importance, wetlands are suffering from constant degradation and loss, therefore, they require constant monitoring. This article presents an automatic method for the mapping and monitoring of wetlands based on the fused processing of laser scans and multispectral satellite imagery,...
This paper deals with the colorization of grayscale images. Recent papers have shown remarkable results on image colorization utilizing various deep architectures. Unlike previous methods, we perform colorization using a deep architecture and a reference image. Our architecture utilizes two parallel Convolutional Neural Networks which have the same...
This paper deals with automatic cartoon colorization. This is a hard issue, since it is an ill-posed problem that usually requires user intervention to achieve high quality. Motivated by the recent successes in natural image colorization based on deep learning techniques, we investigate the colorization problem at the cartoon domain using Convoluti...
Detecting different categories of objects in an image and video content is one of the fundamental tasks in computer vision research. Pedestrian detection is a hot research topic, with several applications including robotics, surveillance and automotive safety. We address the problem of detecting pedestrians in surveillance videos. In this paper, we...
Shadow detection is an important preprocessing task and a hot topic in computer vision. There exist numerous applications which vary in their motivations to address shadows in acquired digital images and video. For example, in video surveillance [1], [2], aerial exploitation [3], and traffic monitoring [4] shadows are usually mentioned as harmful e...
Pedestrian detection is a fundamental computer vision task with many practical applications in robotics, video surveillance, autonomous driving, and automotive safety. However, it is still a challenging problem due to the tremendous variations in illumination, clothing, color, scale, and pose. The aim of this paper is to present our dynamic pedestrian...
This paper introduces a novel aerial building detection method based on region orientation as a new feature, which is used in various steps throughout the presented framework. As building objects are expected to be connected with each other on a regional level, exploiting the main orientation obtained from the local gradient analysis provides furth...
Registration of multi-modal remote sensing images is an essential and challenging task in different remote sensing applications such as image fusion and multi-temporal change detection. Mutual Information (MI) has been shown to be a successful similarity measure for multi-modal image registration applications; however, it has some drawbacks. 1. MI surface...
In this paper we present a new method for fault extraction in seismic blocks, using marked point processes. Our goal is to increase the detection accuracy of the state of the art fault attributes by computing them on a system of objects based on an a priori knowledge about the faults. An original curved support has been developed to describe the fa...
In this paper, we give a comparative study on three Multilayer Markov Random Field (MRF) based solutions proposed for change detection in optical remote sensing images, called Multicue MRF, Conditional Mixed Markov model, and Fusion MRF. Our purposes are twofold. On one hand, we highlight the significance of the focused model family and we set them...
Multimedia indexing systems aim at providing easy, fast and accurate access to large multimedia repositories. Research in Content-Based Multimedia Indexing covers a wide spectrum of topics in content analysis, content description, content adaptation and content retrieval. Various tools and techniques from different fields such as data indexing, mac...
Detecting changes in remote sensing images taken at different times is challenging when the image data come from different sensors. The performance of change detection algorithms based on radiometric values alone is not satisfactory and needs the fusion of other features. Local similarity measures such as Mutual Information, Kullback-Leibler Divergenc...
Querying of nearest neighbour (NN) elements on large data collections is an important task for several information or content retrieval tasks. In this paper, the Local Hash-indexing tree (LHI-tree) is introduced, which is a disk-based index scheme that uses RAM for quick space partition localization and hard disks for the hash indexing. When large collec...
We present a new technique for detecting objects thrown over a critical area of interest in a video sequence made by a monocular camera. Our method was developed to run in real time in an outdoor surveillance system. Unlike others, we use an optical flow based motion detection and tracking system to detect the object's trajectories and for paraboli...
Classifying segments and detecting changes in terrestrial areas are important and time-consuming efforts for remote sensing image analysis tasks, including comparison and retrieval in repositories containing multitemporal remote image samples for the same area in very different quality and details. We propose a multilayer fusion model for adaptive...
This paper presents a method based on graph behaviour analysis for the evaluation of descriptor graphs (applied to image/video datasets) for descriptor performance analysis and ranking. Starting from the Erdős-Rényi model on uniform random graphs, the paper presents results of investigating random geometric graph behaviour in relation with the appe...
Recently, the observation of surveilled areas scanned by multi-camera systems has become increasingly popular. The newly developed sensors give new opportunities for exploiting novel features. Using the information gained from a conventional camera we have data about the colours, the shape of objects and the micro-structures; and we ha...
In this paper, we introduce a complex approach on 4D reconstruction of dynamic scenarios containing multiple walking pedestrians. The input of the process is a point cloud sequence recorded by a rotating multi-beam Lidar sensor, which monitors the scene from a fixed position. The output is a geometrically reconstructed and textured scene containing...
In this demo we present a system for creation and visualization of mixed reality by combining the spatio-temporal model of a real outdoor environment with the models of people acting in a studio. We use a LIDAR sensor to measure a scene with walking pedestrians, detect and track them, then reconstruct the static scene part. The scene is then modifi...
In this article we address the problem of visual people localization, based on the detection of their feet. Localization is based on searching cone intersections. The altitude of location is also retrieved, which eliminates the need of planar ground - which is a common restriction in the related literature. We found that positions can be computed a...
In this paper we introduce a graph clustering method based on dense bipartite subgraph mining. The method applies a mixed graph model (both standard and bipartite) in a three-phase algorithm. First a seed mining method is applied to find seeds of clusters, the second phase consists of refining the seeds, and in the third phase vertices outside the...
This paper reports on a pilot system for reconstruction and visualisation of complex spatio-temporal scenes by integrating two different types of data: outdoor 4D data measured by a rotating multi-beam LIDAR sensor, and 4D models of moving actors obtained in a 4D studio. A typical scenario is an outdoor scene with multiple walking pedestrians. The...
In the field of multi-view people localization, only a few works consider a non-planar ground surface. In this article we introduce a framework for collecting ground truth data in such case, we show characterization of specific errors and introduce a method to automatically merge multiple ground truth data generated by different users to form a mor...
This letter addresses the automatic detection of urban area in remotely sensed images. As manual administration is time consuming and unfeasible, researchers have to focus on automated processing techniques, which can handle various image characteristics and huge amount of data. The applied method extracts feature points in the first step, which is...
Classifying segments and detection of changes in terrestrial areas are important and time-consuming efforts for remote-sensing image repositories. Some country areas are scanned frequently (e.g. year-by-year) to spot relevant changes, and several repositories contain multi-temporal image samples for the same area in very different quality and detai...
Demonstration will focus on the content based retrieval of Wikipedia images (Hungarian version). A mobile application for iOS will be used to gather images and send directly to the crossmodal processing framework. Searching is implemented in a high performance hybrid index tree with total ~500k entries. The hit list is converted to wikipages and or...
The aim of this paper is to exploit orientation information of an urban area for extracting building contours without shape templates. Unlike using shape templates, these given contours describe more variability and reveal the fine details of the building outlines, resulting in a more accurate detection process, which is beneficial for many tasks,...
The goal of our new e-science platform is to support collaborative research communities by providing a simple solution to jointly develop semantic- and media search algorithms on common and challenging datasets processed by novel feature extractors. Querying of nearest neighbor (NN) elements on large data collections is an important task for severa...
We show a method to detect accurate 3D position of people from multiple views, regardless of the geometry of the ground. In our new method we search for intersections of 3D primitives (cones) to find positions of feet. The cones are computed by back-projecting ellipses covering feet in input images. Instead of computing complex intersection body, w...
In this paper, we propose a probabilistic approach for foreground segmentation in 360°-view-angle range data sequences, recorded by a rotating multi-beam Lidar sensor, which monitors the scene from a fixed position. To ensure real-time operation, we project the irregular point cloud obtained by the Lidar, to a cylinder surface yielding a depth imag...
We introduce a novel approach for saliency detection where we fuse perceptional saliency with machine saliency in a statistical approach. The improvement of our fused algorithm against other methods is presented. Human saliency is recorded from human eye movement during free view training. The transition movements caused by the saccades are evaluat...
We present a near-real-time visual-processing approach for automatic airborne target detection and classification. Detection is based on fast and robust background modeling and shape extraction, while recognition of target classes is based on shape and texture-fused querying on a-priori built real datasets. The presented approach can be used in def...
The paper presents a random graph based analysis approach for evaluating descriptors based on pairwise distance distributions on real data. Starting from the Erdős-Rényi model the paper presents results of investigating random geometric graph behaviour in relation with the appearance of the giant component as a basis for choosing descriptors based...
Current development of security sensor networks and their processing algorithms use pre-recorded or abstract data streams for testing, often missing important ground truth for validation. This paper proposes a simulation-based test bed, presenting an approach to use commercial off the shelf virtual reality environments to create adaptive simulation...
This paper presents a distributed multi sensor data processing and fusion system providing sophisticated surveillance capabilities in the urban environment. The system enables visual/non-visual event detection, situation assessment, and semantic event-based reasoning for force protection and civil surveillance applications. The novelties lie in the...
A mixed graph theoretic model is proposed for finding communities in a social network. Information on the habits (shopping habits, free time activities) is considered to be known at least for part of the society. The presented model is based on applying a standard and a bipartite graph in parallel. Compared to previous methods, the introduced algori...
This paper introduces a novel method to detect structural changes between MRI scans, without using prior knowledge. After a simple registration step, the method calculates a difference image, based on modified Harris saliency function, which is then used to define change candidates. Localization step filters out false hits with local contour descri...
The goal of this paper is to extract automatically the building contours regardless of shape. By extracting these contours, detection results will be more accurate, giving useful information about urban area, which is important for many tasks, like map updating and disaster management. First, we extract local feature points from the image, based on...
Deformable active contour (snake) models are efficient tools for object boundary detection. Existing alterations of the traditional gradient vector flow (GVF) model have reduced sensitivity to noise, parameters and initial location, but high curvatures and noisy, weakly contrasted boundaries cause difficulties for them. This paper introduces two Har...
In this paper we address the problem of estimating the horizontal vanishing line, making use of motion statistics derived from a video sequence. The computation requires a number of corresponding object height measurements, and in our approach these are extracted using motion statistics. These easy-to-compute statistics enable ac...
In this paper a new decomposition method is introduced that splits the image into geometric (or cartoon) and texture parts. Following a total variation based preprocessing, the core of the proposed method is an anisotropic diffusion with an orthogonality based parameter estimation and stopping condition. The quality criterion is defined by the the...
This paper presents the first steps towards an automated image and video feature descriptor evaluation framework, based on several points of view. First, evaluation of distance distributions of images and videos for several descriptors is performed, then a graph-based representation of database contents and evaluation of the appearance of the gian...
We address the statistical inference of saliency features in the images based on human eye-tracking measurements. Training videos were recorded by a head-mounted wearable eye-tracker device, where the position of the eye fixation relative to the recorded image was annotated. From the same video records, artificial saliency points (SIFT) were measur...
The present paper addresses the cartoon/texture decomposition task, offering theoretically clear solutions for the main issues of adaptivity, structure enhancement and the quality criterion of the goal function. We apply Anisotropic Diffusion with a Total Variation based adaptive parameter estimation and automatic stopping condition. Our quality me...
In this paper we introduce a novel surveillance system, which uses 3D information extracted from multiple cameras to detect, track and re-identify people. The detection method is based on a 3D Marked Point Process model using two pixel-level features extracted from multi-plane projections of binary foreground masks, and uses a stochastic optimizati...
Parametric active contours are efficient tools for boundary detection. However, existing external-energy-inspired methods have difficulties when detecting high curvature, noisy or low contrasted contours and they often suffer from initialization sensitivity. To address these issues, this paper introduces Harris-based Vector Field Convolution (HVFC)...
In this paper we present a method for reconstructing a static scene viewed through thick smoke using multiple images. Based on a spatiotemporal statistical approach, our method works well on noisy videos containing swirling smoke. We apply statistical analysis on regions of color input images, and show the way to reconstruct the scene by transforming images...
This paper presents visual detection and recognition of flying targets (e.g. planes, missiles) based on automatically extracted shape and object texture information, for application areas like alerting, recognition and tracking. Targets are extracted based on robust background modeling and a novel contour extraction approach, and object recognition...
The stopping condition is a common problem for non-regularised deconvolution methods. Introduced is an automatic procedure for estimating the ideal stopping point based on a new measure of independence, checking an orthogonality criterion of the estimated signal and its gradient at a given iteration. An effective lower bound estimate than the conve...
In this paper we introduce a novel method for building localization and 2D outline extraction in remotely sensed images. A robust Marked Point Process (MPP) model attempts to detect and separate the individual building segments and gives a rough rectangular estimation about the geometry of each entity. The refinement of the detection is achieved by...
We introduce a new method for image segmentation tasks by using dense subgraph mining algorithms. The main advantage of the present solution is to treat the out-of-focus, noise and corruption problems in one unified framework, by introducing a theoretically new image segmentation method based on graph manipulation. This demonstrated development is...
In this paper, we present an algorithm for estimating the eye movement of a person looking at a picture. Our solution is based on human data measured by a wearable eye tracker device which is able to record the user's eye movement during recording. From the same video streams, measuring the artificial salient points on the image by machine vision algorithms,...
The paper introduces a novel methodology to find changes in remote sensing image series. Some remotely sensed areas are scanned frequently to spot relevant changes, and several repositories contain multi-temporal image samples for the same area. The proposed method finds changes in images scanned by a long time-interval difference in very different...