An Unsupervised Approach to Anonymous
Crowd Monitoring
Ian Hales, Roger Boyle, Kia Ng
School of Computing,
University of Leeds, LS2 1HE
{i.j.hales06, r.d.boyle, k.c.ng}@leeds.ac.uk
March 29, 2010
1 Abstract
With over 4.2 million CCTV cameras in the UK alone [2], it would be useful to have an
automated system to monitor wide, open spaces. In such areas, we can observe emergent
behaviour in crowd movements, often changing dynamically over time as crowds form and
disperse. Trained operators can often notice trouble the moment it happens, if not
before. Unfortunately, the high number of cameras watching over the public has
generated a feeling of unease within the populace, as people increasingly feel that
their privacy is being invaded.
We propose a system that, using an offline, unsupervised learning process, will
anonymously detect patterns of motion within a scene and classify them as usual or
unusual. The system is trained on footage of the scene, recorded using a single camera.
Flow is detected using the KLT tracker [3] to accumulate, at a chosen granularity,
‘track-lets’ [1] of elemental motion.
These tracks are quantised, and their distributions in spatial and temporal windows
around each pixel are clustered to generate acceptable patterns. In scenes with
changing behaviour, there may be several candidate patterns at each position. During
testing, similarly generated patterns are tested for plausibility by their proximity
to the acceptable clusters.
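As a minimal sketch of the quantise–cluster–test idea, the NumPy code below bins track-let directions into an 8-bin histogram per window, clusters the training histograms with a tiny k-means, and scores a test window by its distance to the nearest cluster centre. The bin count, cluster count, and thresholds are assumptions for illustration, not the paper's parameters:

```python
import numpy as np

N_BINS = 8  # direction quantisation granularity (assumed, not the paper's value)

def pattern(flows):
    """Quantise track-let direction vectors into a normalised N_BINS histogram."""
    angles = np.arctan2(flows[:, 1], flows[:, 0]) % (2 * np.pi)
    bins = (angles / (2 * np.pi / N_BINS)).astype(int) % N_BINS
    hist = np.bincount(bins, minlength=N_BINS).astype(float)
    return hist / max(hist.sum(), 1.0)

def kmeans(patterns, k=2, iters=20):
    """Tiny deterministic k-means with farthest-point initialisation."""
    centres = [patterns[0]]
    while len(centres) < k:
        d = np.min([np.linalg.norm(patterns - c, axis=1) for c in centres], axis=0)
        centres.append(patterns[int(d.argmax())])
    centres = np.array(centres)
    for _ in range(iters):
        labels = np.linalg.norm(patterns[:, None] - centres[None], axis=2).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centres[j] = patterns[labels == j].mean(axis=0)
    return centres

def implausibility(flows, centres):
    """Distance from a window's pattern to the nearest acceptable cluster."""
    return np.linalg.norm(centres - pattern(flows), axis=1).min()

# Training: two behaviours at one position (eastward and northward crowd flow).
rng = np.random.default_rng(0)
right = [rng.normal([1, 0], 0.1, (50, 2)) for _ in range(20)]
up = [rng.normal([0, 1], 0.1, (50, 2)) for _ in range(20)]
centres = kmeans(np.array([pattern(f) for f in right + up]))

usual = implausibility(rng.normal([1, 0], 0.1, (50, 2)), centres)    # seen behaviour
unusual = implausibility(rng.normal([-1, -1], 0.1, (50, 2)), centres)  # reversed flow
print(usual < unusual)  # the reversed flow lies far from both learned clusters
```

A full system would compute one such score per spatio-temporal window and flag windows whose distance exceeds a tuned threshold.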
Early results show promise, and the behaviour of the system may be tuned via its
various parameters.
References
[1] Hannah M. Dee, David C. Hogg, and Anthony G. Cohn. Scene modelling and
classification using learned spatial relations. In Kathleen S. Hornsby, Christophe
Claramunt, Michel Denis, and Gérard Ligozat, editors, COSIT, volume 5756 of Lecture
Notes in Computer Science, pages 295–311. Springer, 2009.
[2] Michael McCahill and Clive Norris. CCTV in London. Report to the European
Commission Fifth Framework RTD as part of UrbanEye: on the threshold of the
urban panopticon, 2002.
[3] J. Shi and C. Tomasi. Good features to track. Proceedings of the Conference on
Computer Vision and Pattern Recognition, pages 593–600, June 1994.