Pavel Zezula

Masaryk University | MUNI · Faculty of Informatics

Professor

About

Publications: 256
Reads: 27,328
Citations: 5,283

Publications (256)
Chapter
In the context of contemporary data, the processing of information is crucial. This paper proposes an extension to the traditional database relational algebra, which enriches the data model and provides additional complex-data operations. Specifically, we focus on analytical operators from the areas of data mining and similarity search, such as fre...
Chapter
Recent pose-estimation methods enable digitization of human motion by extracting 3D skeleton sequences from ordinary video recordings. Such spatio-temporal skeleton representation offers attractive possibilities for a wide range of applications but, at the same time, requires effective and efficient content-based access to make the extracted data r...
Article
Full-text available
Multi-subject tracking in crowded videos is an established yet challenging research direction in computer vision and information processing. High applicability of multi-subject tracking is demonstrated in smart cities (e.g., public safety, crowd management, urban planning), autonomous driving vehicles, robotic vision, or psychology (e.g., social in...
Article
Filtering is a fundamental strategy of metric similarity indexes to minimise the number of computed distances. Given a triplet of objects for which distances of two pairs are known, the lower and upper bounds on the third distance can be determined using the triangle inequality property. Obviously, tightness of the bounds is crucial for efficiency...
Article
Full-text available
The popular task of 3D human action recognition is almost exclusively solved by training deep-learning classifiers. To achieve high recognition accuracy, input 3D actions are often pre-processed by various normalization or augmentation techniques. However, it is not computationally feasible to train a classifier for each possible variant of trainin...
Chapter
With the growth of structured graph data, the analysis of networks is an important topic. Community mining is one of the main analytical tasks of network analysis. Communities are dense clusters of nodes, possibly containing additional information about a network. In this paper, we present a community-detection approach, called FIMSIM, which is bas...
Chapter
Contemporary challenges for efficient similarity search include complex similarity functions, the curse of dimensionality, and large sizes of descriptive features of data objects. This article reports our experience with a database of protein chains that form an (almost) metric space and demonstrate the following extreme properties. Evaluation of the...
Article
Full-text available
With the development of motion capture technologies, 3D action recognition has become a popular task that finds great applicability in many areas, such as augmented reality, human–computer interaction, sports, or healthcare. On the other hand, the acquisition of 3D human skeleton data is an expensive and time-consuming process, mainly due to the hi...
Article
With the increasing availability of human motion data captured in the form of 2D or 3D skeleton sequences, more complex motion recordings need to be processed. In this paper, we focus on similarity-based indexing and efficient retrieval of motion episodes — medium-sized skeleton sequences that consist of multiple semantic actions and correspond to...
Chapter
This chapter focuses on data searching, which is nowadays mostly based on similarity. The similarity search is challenging due to its computational complexity, and also the fact that similarity is subjective and context dependent. The authors assume the metric space model of similarity, defined by the domain of objects and the metric function that...
Article
Full-text available
Digitization of human motion using skeleton representations offers exciting possibilities for a large number of applications but, at the same time, requires innovative techniques for their effective and efficient processing. Content-based processing of skeleton data has developed rapidly in recent years, focusing mainly on specialized prototypes wi...
Chapter
Filtering is a fundamental strategy of metric similarity indexes to minimise the number of computed distances. Given a triple of objects for which distances of two pairs are known, the lower and upper bounds on the third distance can be set as the difference and the sum of these two already known distances, due to the triangle inequality rule of th...
Chapter
In data science, the process of development focuses on the improvement of methods for individual data analytical tasks. However, their combination is not properly researched. We believe that this situation is caused by a missing framework that would focus solely on data analytical tasks, instead of complicated transformation between individual met...
Article
Scalable similarity search in metric spaces relies on using the mathematical properties of the space in order to allow efficient querying. Most important in this context is the triangle inequality property, which can allow the majority of individual similarity comparisons to be avoided for a given query. However, many important metric spaces, typica...
Preprint
The popular task of 3D human action recognition is almost exclusively solved by training deep-learning classifiers. To achieve a high recognition accuracy, the input 3D actions are often pre-processed by various normalization or augmentation techniques. However, it is not computationally feasible to train a classifier for each possible variant of t...
Chapter
There is a growing amount of human motion data captured as a continuous 3D skeleton sequence without any information about its semantic partitioning. To make such unsegmented and unlabeled data efficiently accessible, we propose to transform them into a text-like representation and employ well-known text retrieval models. Specifically, we partition...
Article
Full-text available
Motion capture data digitally represent human movements by sequences of 3D skeleton configurations. Such spatio-temporal data, often recorded in a stream-based manner, need to be efficiently processed to detect high-interest actions, for example, in human-computer interaction to understand hand gestures in real time. Alternatively, automatically...
Chapter
The complexity of contemporary data warrants a need for better analysing tools in investigative areas. Human processing of data is no longer a viable option. We present an architecture of a novel universal system for analysis of graph-structured data, where data-mining and similarity-search operators can be used to discover or search for unknown in...
Chapter
Full-text available
Transformations of data objects into the Hamming space are often exploited to speed up the similarity search in metric spaces. Techniques applicable in generic metric spaces require expensive learning, e.g., selection of pivoting objects. However, when searching in common Euclidean space, the best performance is usually achieved by transformations...
Conference Paper
Motion capture technologies can digitize human movements into a discrete sequence of 3D skeletons. Such spatio-temporal data have a great application potential in many fields, ranging from computer animation, through security and sports to medicine, but their computerized processing is a difficult problem. The objective of this tutorial is to expla...
Conference Paper
Motion capture data are digital representations of human movements in form of 3D trajectories of multiple body joints. To understand the captured motions, similarity-based processing and deep learning have already proved to be effective, especially in classifying pre-segmented actions. However, in real-world scenarios motion data are typically capt...
Conference Paper
Motion capture technologies digitize human movements by tracking 3D positions of specific skeleton joints in time. Such spatio-temporal multimedia data have an enormous application potential in many fields, ranging from computer animation, through security and sports to medicine, but their computerized processing is a difficult problem. In this pap...
Article
This article addresses the problem of matching the most similar data objects to a given query object. We adopt a generic model of similarity that involves the domain of objects and metric distance functions only. We examine the case of a large dataset in a complex data space, which makes this problem inherently difficult. Many indexing and searchin...
Conference Paper
Motion capture technologies digitize human movements by tracking 3D positions of specific skeleton joints in time. Such spatio-temporal data have an enormous application potential in many fields, ranging from computer animation, through security and sports to medicine, but their computerized processing is a difficult problem. The recorded data can...
Chapter
Techniques of the Hamming embedding, producing bit string sketches, have been recently successfully applied to speed up similarity search. Sketches are usually compared by the Hamming distance, and applied to filter out non-relevant objects during the query evaluation. As several sketching techniques exist and each can produce sketches with differe...
Chapter
Real-time recommendation is a necessary component of current social applications. It is responsible for suggesting relevant newly published data to the users based on their preferences. By representing the users and the published data in a metric space, each user can be recommended with their k nearest neighbors among the published data, i.e., the...
Conference Paper
An important functionality of current social applications is real-time recommendation, which is responsible for suggesting relevant published data to the users based on their preferences. By representing the users and the published data in a metric space, each user can be recommended with their k nearest neighbors among the published data. We consi...
Article
Full-text available
Motion capture data describe human movements in the form of spatio-temporal trajectories of skeleton joints. Intelligent management of such complex data is a challenging task for computers which requires an effective concept of motion similarity. However, evaluating the pair-wise similarity is a difficult problem as a single action can be performed...
Article
Full-text available
Multimedia information is becoming a ubiquitous part of our lives, which brings an equally ubiquitous need for efficient multimedia retrieval. One of the possible solutions to this problem is to attach text descriptions to multimedia data objects, thus allowing users to utilize traditional text search mechanisms. Search-based annotation techniques...
Article
Motion capture data digitally represent human movements by sequences of body configurations in time. Subsequence searching in long sequences of such spatio-temporal data is difficult as query-relevant motions can vary in execution speeds and styles and can occur anywhere in a very long data sequence. To deal with these problems, we employ a fast an...
Chapter
The current era of digital data explosion calls for the employment of content-based similarity search techniques, since traditional searchable metadata like annotations are not always available. In our work, we focus on a scenario where the similarity search is used in the context of stream processing, which is one of the suitable approaches to deal with h...
Conference Paper
In order to improve the efficiency of similarity search, compact bit strings compared by the Hamming distance, so-called sketches, have been proposed as a form of dimensionality reduction. To maximize the data compression and, at the same time, minimize the loss of information, sketches typically have the following two properties: (1) each bit divid...
Conference Paper
Content-based similarity search techniques have been employed in a variety of today's applications. In our work, we aim at the scenario where the similarity search is applied in the context of stream processing. In particular, there is a stream of query objects which need to be evaluated. Our goal is to be able to cope with the rate of incoming query...
Conference Paper
The current era of digital data explosion calls for the employment of content-based similarity search techniques since traditional searchable metadata like annotations are not always available. In our work, we focus on a scenario where the similarity search is used in the context of stream processing, which is one of the suitable approaches to deal with hu...
Conference Paper
Motion capture data digitally represent human movements by sequences of body configurations in time. Subsequence matching in such spatio-temporal data is difficult as query-relevant motions can vary in lengths and occur arbitrarily in a very long motion. To deal with these problems, we propose a new subsequence matching approach which (1) partition...
Chapter
Large-scale data management and retrieval in complex domains such as images, videos, or biometrical data remains one of the most important and challenging information processing tasks. Even after two decades of intensive research, many questions still remain to be answered before working tools become available for everyday use. In this work, we foc...
Article
Full-text available
The current explosion of multimedia data is significantly increasing the amount of potential knowledge. However, getting to the actual information requires applying novel content-based techniques, which in turn require time-consuming extraction of indexable features from the raw data. In order to deal with large datasets, this task needs to be parall...
Conference Paper
Motion capture data digitally represent human movements by sequences of body configurations in time. Searching in such spatio-temporal data is difficult as query-relevant motions can vary in lengths and occur arbitrarily in the very long data sequence. There is also a strong requirement on effective similarity comparison as the specific motion can...
Conference Paper
Efficient object retrieval based on a generic similarity is one of the fundamental tasks in the area of information retrieval. We propose an enhancement for techniques that use the distance-based model of similarity. This enhancement is based on sketches, compact bit strings compared by the Hamming distance, which represent data objects from the orig...
Article
The rapid growth of unstructured data, commonly denoted as the Big Data challenge, requires new technologies that are capable of dealing with complex data objects such as multimedia. In this work, the authors focus on the content-based retrieval approach, which is able to organize such data by exploiting the similarity of data content. In particula...
Conference Paper
A lot of multimedia data are being created nowadays, which can only be searched by content since no searching metadata are available for them. To make the content search efficient, similarity indexing structures based on the metric-space model can be used. In our work, we focus on a scenario where the similarity search is used in the context of str...
Conference Paper
Though searching is already the most frequently used application of information technology today, the similarity approach to searching is playing an increasingly important role in the construction of new search engines. In the last twenty years, the technology has matured and many centralized, distributed, and even peer-to-peer architectures hav...
Chapter
Many current applications need to organize data with respect to mutual similarity between data objects. A typical general strategy to retrieve objects similar to a given sample is to access and then refine a candidate set of objects. We propose an indexing and search technique that can significantly reduce the candidate set size by combination of s...
Conference Paper
Nowadays, a lot of data are produced every second and they need to be processed immediately. Processing such unbounded streams of data is often run in a distributed environment in order to achieve high throughput. The challenge is the ability to predict the performance-related characteristics of such applications. Knowledge of these properties is e...
Conference Paper
Capturing human movement activities through various sensor technologies is becoming more and more important in entertainment, the film industry, the military, healthcare, or sports. The Microsoft Kinect is an example of a low-cost capturing technology that makes it possible to digitize human movement into a 3D motion representation. However, the accuracy of this repres...
Conference Paper
The objective of face retrieval is to efficiently search an image database with detected faces and identify such faces that belong to the same person as a query face. Unlike most related papers, we concentrate on both retrieval effectiveness and efficiency. High retrieval effectiveness is achieved by proposing a new fusion approach which integrates...
Conference Paper
The rapid development of motion capturing technologies has caused a massive usage of human motion data in a variety of fields, such as computer animation, gaming industry, medicine, sports and security. These technologies produce large volumes of complex spatio-temporal data which need to be effectively compared on the basis of similarity. In contr...
Conference Paper
Full-text available
The importance of automatic image annotation as a tool for handling large amounts of image data has been recognized for several decades. However, working tools have long been limited to narrow-domain problems with a few target classes for which precise models could be trained. With the advance of similarity searching, it now becomes possible to emp...
Conference Paper
Nowadays, a lot of data is produced every second and it needs to be processed immediately. Processing such unbounded streams of data is often applied in a distributed environment in order to achieve high throughput. There is a challenge to predict the performance-related characteristics of such applications. Knowledge of these properties is essenti...
Conference Paper
Full-text available
To be presented at SISAP '15. We present an efficiency evaluation of similarity search techniques applied to visual features from deep neural networks. Our test collection consists of 20 million 4096-dimensional descriptors (320 GB of data). We test approximate k-NN search using several techniques, specifically the FLANN library (a popular in-memory...
Article
The development of motion capturing devices poses new challenges in the exploitation of human-motion data for various application fields, such as computer animation, visual surveillance, sports, or physical medicine. Recently, a number of approaches dealing with motion data have been proposed, suggesting characteristic motion features to be extract...
Conference Paper
Analysis of contemporary Big Data collections requires effective and efficient content-based access to data, which is usually unstructured. This first implies a necessity to uncover descriptive knowledge of complex and heterogeneous objects to make them findable. Second, multimodal search structures are needed to efficiently execute complex simila...
Article
Full-text available
Sub-image content-based similarity search forms an important operation in current image archives since it provides users with images that contain a query image as their part. Such a search can conveniently be implemented using the bag-of-features model. Its integral part is a construction of visual vocabulary. Most existing algorithms to create a v...
Conference Paper
Full-text available
The current explosion of data has accelerated the evolution of various content-based indexing techniques that allow efficient searching in multimedia data such as images. However, indexable features must first be extracted from the raw images before the indexing. This necessary step can be very time-consuming for large datasets, thus parallelization is d...
Article
Analysis of contemporary Big Data collections requires effective and efficient content-based access to data, which is usually unstructured. This first implies a necessity to uncover descriptive knowledge of complex and heterogeneous objects to make them findable. Second, multimodal search structures are needed to efficiently execute complex simila...
Article
Full-text available
In spite of the development of content-based data management, text-based searching remains the primary means of multimedia retrieval in many areas. Automatic creation of text metadata is thus a crucial tool for increasing the findability of multimedia objects. Search-based annotation tools try to provide content-descriptive keywords by exploiting w...
Article
Full-text available
This paper constitutes an extension to the report on DISA-MU team participation in the ImageCLEF 2014 Scalable Concept Image Annotation Task as published in [3]. Specifically, we introduce a new similarity search component that was implemented into the system, report on the results achieved by utilizing this component, and analyze the influence of...
Conference Paper
Full-text available
This paper presents an annotation tool developed by the DISA Laboratory for the ImageCLEF 2014 Scalable Concept Image Annotation challenge. Our solution exploits the search-based annotation paradigm and utilizes several sources of semantic information to determine the relevance of candidate concepts. Rather than relying on the quality of training...
Conference Paper
Full-text available
Many current applications need to organize data with respect to mutual similarity between data objects. Generic similarity retrieval in large data collections is a tough task that has been drawing researchers' attention for two decades. A typical general strategy to retrieve the most similar objects to a given example is to access and then refine a...
Conference Paper
Full-text available
The development of motion capturing devices like Microsoft Kinect poses new challenges in the exploitation of human-motion data for various application fields, such as computer animation, visual surveillance, sports or physical medicine. In such applications, motion segmentation is recognized as one of the most fundamental steps. Existing methods u...
Article
The general trend in data management is to outsource data to 3rd party systems that would provide data retrieval as a service. This approach naturally brings privacy concerns about the (potentially sensitive) data. Recently, quite extensive research has been done on privacy-preserving outsourcing of traditional exact-match and keyword search. Howev...
Conference Paper
Full-text available
Effective management of multimedia data is becoming vital for success in the modern era of omnipresent data. Summarization tools, which allow users to quickly get the gist of a given data collection and have proven their usefulness in text domain, are now gaining popularity also in multimedia processing. However, existing algorithms provide visual-...
Chapter
The Encyclopedia of Databases, a comprehensive work, provides easy access to relevant information on all aspects of very large databases. This encyclopedia features alphabetical organization of concepts covering main areas of very large databases. These 1000 entries offer convenient access to information in the field of databases with definitions a...
Conference Paper
Analysis of human motion data is an important task in many research fields such as sports, medicine, security, and computer animation. In order to fully exploit motion databases for further processing, effective and efficient retrieval methods are needed. However, such a task is difficult primarily due to complex spatio-temporal variances of individu...
Conference Paper
Full-text available
Unprecedented amounts of digital data are becoming available nowadays, but frequently the data lack some semantic information necessary to effectively organize these resources. For images in particular, textual annotations that represent the semantics are highly desirable. Only a small percentage of images is created with reliable annotations, ther...