A. Cavallaro

Queen Mary, University of London, London, England, United Kingdom

Publications (115) · 110.69 Total impact

  • Source
    E. Sariyanidi, H. Gunes, A. Cavallaro
    ABSTRACT: Face images in a video sequence should be registered accurately before being analysed, otherwise registration errors may be interpreted as facial activity. Subpixel accuracy is crucial for the analysis of subtle actions. In this paper we present PSTR (Probabilistic Subpixel Temporal Registration), a framework that achieves high registration accuracy. […]
    Asian Conference on Computer Vision (ACCV'14), Singapore; 11/2014
  • Source
    ABSTRACT: The choice of the most suitable fusion scheme for smart camera networks depends on the application as well as on the available computational and communication resources. In this paper we discuss and compare the resource requirements of five fusion schemes, namely centralised fusion, flooding, consensus, token passing and dynamic clustering. The Extended Information Filter is applied to each fusion scheme to perform target tracking. Token passing and dynamic clustering involve negotiation among viewing nodes (cameras observing the same target) to decide which node should perform the fusion process, whereas flooding and consensus do not include this negotiation. Negotiation helps limit the number of participating cameras and reduces the resources required for the fusion process itself, but requires additional communication. Consensus has the highest communication and computation costs, but it is the only scheme that can be applied when not all viewing nodes are connected directly and routing tables are not available.
    8th ACM / IEEE International Conference on Distributed Smart Cameras (ICDSC 2014); 11/2014
  • Source
    Evangelos Sariyanidi, Hatice Gunes, Andrea Cavallaro
    IEEE Transactions on Pattern Analysis and Machine Intelligence 10/2014; · 4.80 Impact Factor
  • Source
    Sophia Bano, Andrea Cavallaro
    ABSTRACT: We propose a framework for the automatic grouping and alignment of unedited multi-camera User-Generated Videos (UGVs) within a database. The proposed framework analyzes the sound in order to match and cluster UGVs that capture the same spatio-temporal event, and estimates their relative time-shift to temporally align them. We design a descriptor derived from the pairwise matching of audio chroma features of UGVs. The descriptor facilitates the definition of a classification threshold for automatic query-by-example event identification. We evaluate the proposed identification and synchronization framework on a database of 263 multi-camera recordings of 48 real-world events and compare it with state-of-the-art methods. Experimental results show the effectiveness of the proposed approach in the presence of various audio degradations.
    Information Sciences 08/2014; · 3.64 Impact Factor
  • S.F. Tahir, Andrea Cavallaro
    ABSTRACT: Networks of smart cameras share large amounts of data to accomplish tasks such as reidentification. We propose a feature-selection method that minimizes the data needed to represent the appearance of objects by learning the most appropriate feature set for the task at hand (person reidentification). The computational cost for feature extraction and the cost for storing the feature descriptor are considered jointly with feature performance to select cost-effective good features. This selection allows us to improve intercamera reidentification while reducing the bandwidth that is necessary to share data across the camera network. We also rank the selected features in the order of effectiveness for the task to enable a further reduction of the feature set by dropping the least effective features when application constraints require this adaptation. We compare the proposed approach with state-of-the-art methods on the iLIDS and VIPeR datasets and show that the proposed approach considerably reduces network traffic due to intercamera feature sharing while keeping the reidentification performance at an equivalent or better level compared with the state of the art.
    IEEE Transactions on Circuits and Systems for Video Technology 08/2014; 24(8):1362-1374. · 1.82 Impact Factor
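The cost-aware selection idea above can be sketched as a greedy trade-off between feature performance and feature cost. This is an illustrative sketch only, not the paper's actual algorithm; all feature names, scores, costs and the budget are invented:

```python
# Hypothetical sketch of cost-effective feature selection: each candidate
# feature has a task-performance score and a cost (extraction time plus
# descriptor size); features with the best performance-per-cost ratio are
# kept until a budget is exhausted, then ranked by raw effectiveness so
# the least effective can be dropped first under tighter constraints.

def select_features(features, budget):
    """features: list of (name, performance, cost); returns ranked picks."""
    ranked = sorted(features, key=lambda f: f[1] / f[2], reverse=True)
    selected, spent = [], 0.0
    for name, perf, cost in ranked:
        if spent + cost <= budget:
            selected.append((name, perf, cost))
            spent += cost
    # order by effectiveness so the tail can be dropped when needed
    selected.sort(key=lambda f: f[1], reverse=True)
    return [name for name, _, _ in selected]

candidates = [
    ("colour_hist", 0.60, 2.0),
    ("hog",         0.75, 5.0),
    ("lbp",         0.55, 1.5),
    ("sift_bag",    0.80, 9.0),
]
print(select_features(candidates, budget=9.0))  # → ['hog', 'colour_hist', 'lbp']
```

Under this toy budget the expensive `sift_bag` descriptor is excluded despite being the single best performer, which mirrors the paper's point that sharing cheaper descriptors can preserve re-identification accuracy while reducing network traffic.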
  • Source
    ABSTRACT: Consensus-based target tracking in camera networks faces three major problems: non-linearity in the measurement model, temporary lack of measurements (naivety) due to the limited field of view (FOV) and redundancy in the iterative exchange of information. In this paper we propose two consensus-based distributed algorithms for non-linear systems using the Extended Information Filter as the underlying filter to handle the non-linearity in the camera measurement model. The first algorithm is an Extended Information Consensus Filter (EICF) that overcomes the effect of naivety and non-linearity without requiring knowledge of other nodes in the network. The second algorithm is an Extended Information Weighted Consensus Filter (EIWCF) that overcomes all three major problems (naivety, redundancy and non-linearity) but requires knowledge of the number of cameras (Nc) in the network. The basic principle of these algorithms is weighting node estimates based on their covariance information. When Nc is not available, EICF can be used at the cost of not handling the redundancy problem. Simulations with highly maneuvering targets show that the two proposed distributed non-linear consensus filters outperform the related state of the art by achieving higher accuracy and faster convergence to the centralised estimates computed by simultaneously considering the information from all the nodes.
    17th International Conference on Information Fusion (FUSION), 2014, Salamanca, Spain; 07/2014
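The covariance-weighting principle behind these filters can be illustrated with a minimal information-form fusion of Gaussian estimates. This is a generic sketch, not the paper's EICF/EIWCF; the camera states and covariances are invented:

```python
# Minimal sketch of covariance-weighted fusion in information form: each
# per-camera Gaussian estimate (x, P) contributes its inverse covariance,
# so a confident camera dominates while a "naive" camera that has lost
# the target (huge covariance) contributes almost nothing.
import numpy as np

def information_fusion(estimates):
    """estimates: list of (x, P) with state x and covariance P."""
    Y = sum(np.linalg.inv(P) for _, P in estimates)      # information matrix
    y = sum(np.linalg.inv(P) @ x for x, P in estimates)  # information vector
    P_fused = np.linalg.inv(Y)
    return P_fused @ y, P_fused

cam_a = (np.array([1.0, 2.0]), np.eye(2) * 0.1)    # confident observer
cam_b = (np.array([5.0, 6.0]), np.eye(2) * 100.0)  # target outside its FOV
x, P = information_fusion([cam_a, cam_b])
print(np.round(x, 2))  # fused state stays close to camera A's estimate
```

In the distributed setting, the same weighting is applied iteratively between neighbouring nodes rather than in one central step.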
  • Juan C. SanMiguel, Andrea Cavallaro
    ABSTRACT: We present an approach for determining the temporal consistency of particle filters in video tracking based on model validation of their uncertainty over sliding windows. The filter uncertainty is related to the consistency of the dispersion of the filter hypotheses in the state space. We learn an uncertainty model via a mixture of Gamma distributions whose optimum number is selected by modified information-based criteria. The time-accumulated model is estimated as the sequential convolution of the uncertainty model. Model validation is performed by verifying whether the output of the filter belongs to the convolution model through its approximated cumulative density function. Experimental results and comparisons show that the proposed approach improves both precision and recall of competitive approaches such as Gaussian-based online model extraction, bank of Kalman filters and empirical thresholding. We combine the proposed approach with a state-of-the-art online performance estimator for video tracking and show that it improves accuracy compared to the same estimator with manually tuned thresholds while reducing the overall computational cost.
    Computer Vision and Image Understanding 07/2014; · 1.23 Impact Factor
  • Syed Fahad Tahir, Andrea Cavallaro
    ABSTRACT: We propose an object matching approach aimed at smartphone cameras that exploits the well-known concept of local sets of features for object representation. We also enable the temporal alignment of cameras by exploiting the frames of detected objects to group objects that appeared in the same time interval for the assignment within each camera. The proposed approach does not need training, which makes it suitable for matching during short temporal intervals. We use both outdoor and indoor datasets for the evaluation, and show that the proposed method reduces the amount of information to be stored and communicated by up to 95%.
    ICASSP 2014 - 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 05/2014
  • Source
    ABSTRACT: We propose a hybrid personalized summarization framework that combines adaptive fast-forwarding and content truncation to generate comfortable and compact video summaries. We formulate video summarization as a discrete optimization problem, where the optimal summary is determined by adopting Lagrangian relaxation and convex-hull approximation to solve a resource allocation problem. To trade-off playback speed and perceptual comfort we consider information associated to the still content of the scene, which is essential to evaluate the relevance of a video, and information associated to the scene activity, which is more relevant for visual comfort. We perform clip-level fast-forwarding by selecting the playback speeds from discrete options, which naturally include content truncation as special case with infinite playback speed. We demonstrate the proposed summarization framework in two use cases, namely summarization of broadcasted soccer videos and surveillance videos. Objective and subjective experiments are performed to demonstrate the relevance and efficiency of the proposed method.
    IEEE Transactions on Multimedia 02/2014; 16(2):455-469. · 1.75 Impact Factor
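The Lagrangian-relaxation idea behind the summariser can be sketched as a per-clip choice of playback speed that trades perceptual discomfort against summary length. This is a hedged toy sketch, not the paper's formulation; the speed set, discomfort model and clip values are invented:

```python
# Illustrative sketch: for each clip, pick a playback speed (infinite
# speed = the clip is truncated) that minimises
#   discomfort + lambda * played duration.
# Sweeping lambda trades summary compactness against viewing comfort.

SPEEDS = [1, 2, 4, float("inf")]  # inf means the clip is cut entirely

def discomfort(relevance, speed):
    # skipping relevant content at high speed is uncomfortable
    return relevance * (0.0 if speed == 1 else
                        1.0 if speed == float("inf") else 1 - 1 / speed)

def allocate(clips, lam):
    """clips: list of (duration, relevance); returns a speed per clip."""
    choice = []
    for duration, relevance in clips:
        best = min(SPEEDS,
                   key=lambda s: discomfort(relevance, s)
                   + lam * (0 if s == float("inf") else duration / s))
        choice.append(best)
    return choice

clips = [(10.0, 0.9), (10.0, 0.2)]  # one relevant clip, one dull clip
print(allocate(clips, lam=0.05))    # relevant clip kept, dull clip cut
```

The per-clip independence of this cost is what makes the relaxed problem separable, which is the usual motivation for Lagrangian relaxation in resource allocation.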
  •
    ABSTRACT: Camera networks that reconfigure while performing multiple tasks have unique requirements, such as concurrent task allocation with limited resources, the sharing of data among fields of view across the network, and coordination among heterogeneous devices.
    Computer 01/2014; 47(5):67-73. · 1.68 Impact Factor
  • Tahir Nawaz, Fabio Poiesi, Andrea Cavallaro
    ABSTRACT: To evaluate multi-target video tracking results, one needs to quantify the accuracy of the estimated target-size and the cardinality error as well as measure the frequency of occurrence of ID changes. In this paper we survey existing multi-target tracking performance scores and, after discussing their limitations, we propose three parameter-independent measures for evaluating multi-target video tracking. The measures take into account target-size variations, combine accuracy and cardinality errors, quantify long-term tracking accuracy at different accuracy levels, and evaluate ID changes relative to the duration of the track in which they occur. We conduct an extensive experimental validation of the proposed measures by comparing them with existing ones and by evaluating four state-of-the-art trackers on challenging real-world publicly-available datasets. The software implementing the proposed measures is made available online to facilitate their use by the research community.
    IEEE Transactions on Image Processing 11/2013; · 3.20 Impact Factor
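One ingredient of such measures, frame-level accuracy that accounts for target-size variations, can be illustrated with bounding-box overlap. This is a generic sketch, not the paper's measures; the box values are invented:

```python
# Illustrative sketch: intersection-over-union (IoU) between an estimated
# and a ground-truth bounding box penalises size errors as well as
# position errors. Boxes are (x, y, w, h).

def iou(box_a, box_b):
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

gt  = (10, 10, 20, 20)   # ground-truth box
est = (15, 10, 20, 20)   # estimate shifted 5 px to the right
print(round(iou(gt, est), 3))  # → 0.6
```

A multi-target score then aggregates such per-frame overlaps across targets and frames, alongside cardinality errors and ID changes.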
  • Fabio Poiesi, Riccardo Mazzon, Andrea Cavallaro
    ABSTRACT: We propose a generic online multi-target track-before-detect (MT-TBD) that is applicable to confidence maps used as observations. The proposed tracker is based on particle filtering and automatically initializes tracks. The main novelty is the inclusion of the target ID in the particle state, enabling the algorithm to deal with an unknown and large number of targets. To overcome the problem of mixing IDs of targets close to each other, we propose a probabilistic model of target birth and death based on a Markov Random Field (MRF) applied to the particle IDs. Each particle ID is managed using the information carried by neighboring particles. The assignment of the IDs to the targets is performed using Mean-Shift clustering and supported by a Gaussian Mixture Model. We also show that the computational complexity of MT-TBD is proportional only to the number of particles. To compare our method with recent state-of-the-art works, we include a postprocessing stage suited for multi-person tracking. We validate the method on real-world and crowded scenarios, and demonstrate its robustness in scenes presenting different perspective views and targets very close to each other.
    Computer Vision and Image Understanding 10/2013; 117(10):1257-1272. · 1.23 Impact Factor
  • Source
    ABSTRACT: In this paper, we propose to use local Zernike Moments (ZMs) for facial affect recognition and introduce a representation scheme based on performing non-linear encoding on ZMs via quantization. Local ZMs provide a useful and compact description of image discontinuities and texture. We demonstrate the use of this ZM-based representation for posed and discrete as well as naturalistic and continuous affect recognition on standard datasets, and show that ZM-based representations outperform well-established alternative approaches for both tasks. To the best of our knowledge, the performance we achieved on the CK+ dataset is superior to all results reported to date.
    British Machine Vision Conference (BMVC), Bristol, UK; 09/2013
  • Source
    Fan Chen, Andrea Cavallaro
    ABSTRACT: We propose a method for detecting group interactions in groups with a varying number of objects. We model each object as a moving agent with a direction-aware interest map, and group interactions as mutual interests between objects. After grouping objects into unit interactions individually in each frame, we solve the temporal association problem by tracking group interactions over consecutive frames. Optimal grouping is obtained by finding the maximum weight spanning tree of a directed graph formed by objects and their potential interactions. Experimental results show that our method obtains recall rates of around 80% on two publicly available datasets.
    International Conference on Acoustics, Speech, and Signal Processing; 05/2013
  • Source
    Riccardo Mazzon, Andrea Cavallaro
    ABSTRACT: Tracking across non-overlapping cameras is a challenging open problem in video surveillance. In this paper, we propose a novel target re-identification method that models movements in non-observed areas with a modified Social Force Model (SFM) by exploiting the map of the site under surveillance. The SFM is developed with a goal-driven approach that models the desire of people to reach specific interest points (goals) of the site such as exits, shops, seats and meeting points. These interest points work as attractors for people movements and guide the path predictions in the non-observed areas. We also model key regions that are potential intersections of different paths where people can change the direction of motion. Finally, the predictions are linked to the trajectories observed in the next camera view where people reappear. We validate our multi-camera tracking method on the challenging i-LIDS dataset from the London Gatwick airport and show the benefits of the Multi-Goal Social Force Model.
    Neurocomputing 01/2013; 100:41–50.
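The goal-driven prediction in non-observed areas can be illustrated with a toy relaxation step in the spirit of a Social Force Model. This is a generic sketch, not the paper's Multi-Goal SFM; the parameters (tau, speed, dt) and positions are invented:

```python
# Toy sketch of goal-driven motion prediction: an interest point (goal)
# attracts the agent via a force that relaxes the current velocity toward
# the desired direction at a preferred walking speed; the position is
# integrated with simple Euler steps.
import math

def predict_path(pos, vel, goal, speed=1.4, tau=0.5, dt=0.1, steps=50):
    path = [pos]
    for _ in range(steps):
        dx, dy = goal[0] - pos[0], goal[1] - pos[1]
        dist = math.hypot(dx, dy)
        if dist < 1e-6:
            break
        desired = (speed * dx / dist, speed * dy / dist)
        # relaxation force pulls the current velocity toward the desired one
        ax = (desired[0] - vel[0]) / tau
        ay = (desired[1] - vel[1]) / tau
        vel = (vel[0] + ax * dt, vel[1] + ay * dt)
        pos = (pos[0] + vel[0] * dt, pos[1] + vel[1] * dt)
        path.append(pos)
    return path

path = predict_path(pos=(0.0, 0.0), vel=(0.0, 0.0), goal=(5.0, 0.0))
print(round(path[-1][0], 2))  # the x coordinate approaches the goal at (5, 0)
```

In the paper's setting, several such goals act as attractors and the predicted paths are linked to the trajectories observed when people reappear in the next camera view.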
  • Source
    Andrea Cavallaro, Andres Kwasinski
    ABSTRACT: Presents an editorial for this issue of IEEE Signal Processing Magazine.
    IEEE Signal Processing Magazine 01/2013; 30(1):4-4. · 3.37 Impact Factor
  • Source
    ABSTRACT: This Special Issue offers an overview of ongoing research on intelligent video surveillance (IVS) techniques, and brings together cutting-edge research work on security and privacy problems with respect to technological, behavioral, legal, and cultural aspects. We received 34 submissions and each submission was rigorously reviewed by at least two experts in the related fields based on the criteria of originality, significance, quality, and clarity. Eventually, 12 papers were accepted for the Special Issue, spanning a variety of topics including privacy protection, background modeling, tracking, action/activity analysis, and crowd behavior perception. The papers constituting this issue are then briefly summarized.
    IEEE Transactions on Information Forensics and Security 01/2013; 8(10):1559-1561. · 1.90 Impact Factor
  • R. Mazzon, F. Poiesi, A. Cavallaro
    ABSTRACT: We propose a method to detect and track interacting people by employing a framework based on a Social Force Model (SFM). The method embeds plausible human behaviors to predict interactions in a crowd by iteratively minimizing the error between predictions and measurements. We model people approaching a group and restrict the group formation based on the relative velocity of candidate group members. The detected groups are then tracked by linking their interaction centers over time using a buffered graph-based tracker. We show how the proposed framework outperforms existing group localization techniques on three publicly available datasets, with improvements of up to 13% on group detection.
    Advanced Video and Signal Based Surveillance (AVSS), 2013 10th IEEE International Conference on; 01/2013
  • T Nawaz, A Cavallaro
    ABSTRACT: The absence of a commonly adopted performance evaluation framework is hampering advances in the design of effective video trackers. In this paper, we present a single-score evaluation measure and a protocol to objectively compare trackers. The proposed measure evaluates tracking accuracy and failure, and combines them for both summative and formative performance assessment. The proposed protocol is composed of a set of trials that evaluate the robustness of trackers on a range of test scenarios representing several real-world conditions. The protocol is validated on a set of sequences with a diversity of targets (head, vehicle, person) and challenges (occlusions, background clutter, pose changes, scale changes) using six state-of-the-art trackers, highlighting their strengths and weaknesses on more than 187000 frames. The software implementing the protocol and the evaluation results are made available online and new results can be included, thus facilitating the comparison of trackers.
    IEEE Transactions on Image Processing 11/2012; · 3.20 Impact Factor

Publication Stats

850 Citations
110.69 Total Impact Points

Institutions

  • 2004–2014
    • Queen Mary, University of London
      • Centre for Cell Signalling
      • School of Electronic Engineering and Computer Science
      London, England, United Kingdom
  • 2010
    • Universidad Autónoma de Madrid
      • High Technical College
      Madrid, Madrid, Spain
    • Stanford University
      Palo Alto, California, United States
  • 2005–2009
    • University of London
      London, England, United Kingdom
    • WWF United Kingdom
      London, England, United Kingdom
  • 1999–2004
    • École Polytechnique Fédérale de Lausanne
      • Laboratoire de traitement des signaux
      Lausanne, VD, Switzerland
  • 2000
    • Eawag: Swiss Federal Institute of Aquatic Science and Technology
      Duebendorf, Zurich, Switzerland