This research is funded by International University, VNU-HCM under grant number T2014-05-IT
DISORDER DETECTION APPROACH TO BACKGROUND
MODELING IN TRAFFIC SURVEILLANCE SYSTEM
Tien Phuoc Nguyen*, Duong Nguyen-Ngoc Tran*, Tu Kha Huynh, Synh Viet-Uyen Ha
School of Computer Science Engineering, International University -
Vietnam National University
Ho Chi Minh, Vietnam
Corresponding author: hvusynh@hcmiu.edu.vn
Received September 10, 2014
ABSTRACT
This paper proposes a new background subtraction technique based on entropy to detect disordered frames (DFs). By recognizing and removing all DFs from the background modeling algorithm, the proposed method is intended for traffic surveillance systems in crowded urban scenes. In the experiment section, comparative and quantitative tests on a typical dataset of traffic congestion corroborate the advantages of the new method.
Keywords. Background subtraction, background modeling, entropy, disorder detection, Gaussian
mixture model, traffic congestion, surveillance system.
1. INTRODUCTION
Video monitoring, optical motion capture, and multimedia applications form an active field of research. In this area, background subtraction is an important technique that first models the background and then detects the moving objects in a scene captured by a static camera. Although the concept is simple, handling outdoor conditions and illumination changes in monitoring systems remains a challenge for current methods. Since background modeling is considered a low-level process, requiring few resources and little computation compared with advanced tasks such as vehicle tracking, classification, and number-plate recognition, the resulting background is expected to be accurate and adaptive.
In crowded urban areas, traffic congestion is a severe challenge to any current background modeling algorithm. In this context, slowly moving vehicles or dense crowds of pedestrians can make the scene chaotic and ruin the background. Furthermore, outdoor illumination also affects the background result. Generally speaking, input frames suffering from these effects are called disordered frames (or images) and must be eliminated.
To deal with this problem, frame difference is known as the first strategy: it subtracts the background from each input frame, where the background, maintained throughout the process, represents the stable scene after all non-stationary elements are removed. Since then, a considerable amount of work has been devoted to more advanced background modeling algorithms. Among these attempts, the Mixture of Gaussians (MOG) [1] is the most widely discussed method. Where a single Gaussian density function [24] is not capable of dealing with dynamic scenes, MOG models several features for each pixel and also handles environmental variation. Replacements of the probability density function (PDF) were proposed [4, 8] using a Student's t mixture model and Dirichlet distributions to better adapt to complicated distributions.
To speed up the MOG algorithm, CUDA (Nvidia's parallel computing architecture) was applied with support from dedicated GPU hardware [3]. A hierarchical quad-tree structure was also introduced [5] to sample a portion of the pixels in each image. A very fast and simple recursive approximation based on a sigma-delta filter was proposed in [26, 27]. A recent improvement of this technique was published in [28], with a confidence measurement that controls the updating period of the background model; this idea prevents the background from degrading under slow or congested traffic conditions.
Recent techniques, however, show trade-offs between performance and efficiency when coping with traffic congestion and outdoor effects. It is therefore difficult for a surveillance system implementation to find a background modeling algorithm that runs well under harsh traffic conditions.
For this purpose, we propose a disorder detection approach that does not require the system to process every image of the input sequence. For the sake of efficiency, the method applies an entropy function (EF) to MOG within a disorder removal framework that neglects the DFs, so the needless computation on those frames is dramatically reduced.
2. MATERIALS AND METHODS
The previous work, MOG [1], has some egregious shortcomings when facing the challenges of traffic contexts (parked vehicles, traffic chaos, etc.). Because a moving-average filter is used for the maintenance process, the learning rate α, which controls the adaptation of the model, sensitively affects the stability of the background result. If the learning rate is too high, stationary objects turn into background quickly; if it is too low, the background recovers slowly from variation. Therefore, as long as traffic chaos occurs in the scene, the foreground unintentionally becomes part of the background, and the background quality keeps degrading if the situation continues without cease.
In this section, we aim to recognize the DFs and to remove them effectively, thereby improving the performance and accuracy of the MOG algorithm.
2.1. Entropy function determining the disordered frame
Under the high variation of each pixel in a scene impacted by the environment and traffic conditions, the weight of each component in the mixture must be taken into account. Following Stauffer and Grimson [1], the background and foreground are split by

$$B = \operatorname{argmin}_{b} \left( \sum_{k=1}^{b} \omega_{k} > T \right)$$

where $k = 1 \ldots K$, $T$ is the background threshold (0.6 by default) [2], and $\omega_k$ is the weight associated with a component, interpreted as its probability. Because the scene consists of not only background and foreground but also noise and illumination, the number of components in each mixture can reach K (often 3 to 5). The highly weighted components are either background or foreground, and the remaining components are considered noise. However, MOG becomes confused in separating the background and foreground components when several nearly equal distributions appear in the mixture. In that case the weights of the components vary uncertainly; if this happens for most pixels in the frame, traffic congestion is likely occurring. One piece of evidence worth noting is that if a mixture contains a few low-weighted components and one high-weighted component, it is positively classified as background and we can infer that the mixture is stable. Conversely, a mixture containing several medium-weighted components is ambiguous as to whether it is background or foreground.
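The cumulative-weight split above can be sketched as follows; `split_background` is a hypothetical helper name, not from the paper's implementation:

```python
import numpy as np

def split_background(weights, T=0.6):
    """Return the number B of mixture components labeled background:
    the smallest b whose cumulative weight exceeds the threshold T."""
    order = np.argsort(weights)[::-1]          # sort components by weight, descending
    cumulative = np.cumsum(np.asarray(weights)[order])
    return int(np.argmax(cumulative > T)) + 1  # argmin_b (sum_{k<=b} w_k > T)

# A stable pixel: one dominant component -> a single background component.
print(split_background([0.7, 0.2, 0.1]))   # -> 1
# A confusing pixel: several medium weights -> more components needed.
print(split_background([0.35, 0.35, 0.3])) # -> 2
```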
To find a proper measurement for DFs, we apply the Entropy Function (EF) at each pixel, considering the weight of each component as its probability. Since the number of distributions in the mixture is discrete, we evaluate the complexity of a pixel by

$$\varepsilon_t(X_t) = - \sum_{i=1}^{K_{X_t}} \omega_{i,t} \ln \omega_{i,t}$$

where

$$\sum_{i=1}^{K_{X_t}} \omega_{i,t} = 1.$$
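A minimal sketch of this per-pixel entropy, treating the mixture weights as a discrete probability distribution (`pixel_entropy` is an illustrative name, not from the paper):

```python
import math

def pixel_entropy(weights):
    """Shannon entropy of a pixel's mixture weights; each component
    weight is treated as a probability (weights are assumed to sum to 1)."""
    return -sum(w * math.log(w) for w in weights if w > 0)

# A stable mixture (one dominant component) has low entropy...
print(pixel_entropy([0.9, 0.05, 0.05]))
# ...while K equal medium weights give the maximum entropy ln(K).
print(pixel_entropy([1/3, 1/3, 1/3]))  # ln(3) ≈ 1.099
```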
The higher the entropy value of the mixture, the more confusing the pixel is. Nonetheless, one pixel does not give enough evidence to conclude whether its image $I_t$ is disordered or not. Only if a large number of pixels become chaotic simultaneously is the image likely to be disordered. Therefore, to obtain an accurate measure that ensures spatial consistency, we compute the aggregate entropy value $E(t)$ of all pixels at time t in the frame (n is the number of pixels in the scene):

$$E(t) = \frac{1}{n} \sum_{i=1}^{n} \varepsilon_t(X_i)$$
The DF is recognized by

$$E(t) \geq D_I$$

where $D_I$ is a threshold (0.3 – 0.5) measuring the proportion of disordered versus stable frames.
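The frame-level decision can be sketched as below. The weight layout `(K, H, W)` and the name `is_disordered` are illustrative assumptions; the threshold 0.45 is the value used in Figure 2:

```python
import numpy as np

def is_disordered(weight_maps, d_threshold=0.45):
    """Flag a frame as disordered when the mean per-pixel entropy of the
    mixture weights reaches the threshold D_I (the paper suggests 0.3-0.5)."""
    w = np.clip(np.asarray(weight_maps), 1e-12, 1.0)  # (K, H, W) weights per pixel
    per_pixel = -(w * np.log(w)).sum(axis=0)          # entropy at each pixel
    return per_pixel.mean() >= d_threshold

# Hypothetical 2x2 frame with K=3 components per pixel:
stable = np.stack([np.full((2, 2), 0.9),
                   np.full((2, 2), 0.05),
                   np.full((2, 2), 0.05)])
chaotic = np.full((3, 2, 2), 1/3)                     # equal weights everywhere
print(is_disordered(stable))   # low entropy -> False
print(is_disordered(chaotic))  # ln(3) ≈ 1.1 -> True
```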
2.2. Disorder removal approach for the surveillance system
In a traffic context, once the system detects a disordered frame, congestion is likely to persist for a period of time, which implies that the next frames captured from the camera must also be regarded as disordered. Figure 2.a shows the entropy values calculated from a dataset in which the frames of high entropy indicate the DFs. In this case, DFs are still processed by the background model. If the DFs were simply excluded one by one from the updating process, essential information would be lost, since the mixture of each pixel in the frame would remain unchanged; consequently, it would be impossible to estimate the period in which the congestion begins and ends.
As the second contribution, the proposed framework (Figure 1) introduces a 3-state process to remove the disordered frames efficiently. The main process follows three states:
- Updating state allows the MOG algorithm to model and compute the entropy value for each input image. If $E(t)$ reaches the threshold $D_I$, the system changes to the suspending state.
- Suspending state ignores S frames (set manually, or 100 by default) from the image sequence and returns to the training state when the suspending period ends.
- Training state simplifies the mixture model by cutting down the low-weighted components and retaining only the 2 highest-weighted components. The new PDF is

$$P(X_{t+1}) = \sum_{i=1}^{2} \omega'_{i,t+1}\, \eta(X_{t+1}, \mu_{i,t+1}, \Sigma_{i,t+1})$$

where

$$\omega'_{1,t+1} = \frac{\omega_{1,t}}{\omega_{1,t} + \omega_{2,t}} \quad \text{and} \quad \omega'_{2,t+1} = \frac{\omega_{2,t}}{\omega_{1,t} + \omega_{2,t}}$$
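The training-state simplification above (keep the two strongest components, renormalize their weights) can be sketched as follows; `simplify_mixture` is an illustrative name:

```python
import numpy as np

def simplify_mixture(weights, means, variances):
    """Keep only the two highest-weighted components of a pixel's mixture
    and renormalize their weights so they sum to 1, as in the training
    state's new PDF."""
    top2 = np.argsort(weights)[::-1][:2]  # indices of the two largest weights
    w = np.asarray(weights)[top2]
    w = w / w.sum()                       # w'_i = w_i / (w_1 + w_2)
    return w, np.asarray(means)[top2], np.asarray(variances)[top2]

w, mu, var = simplify_mixture([0.5, 0.3, 0.15, 0.05],
                              [120, 80, 200, 30],
                              [25, 30, 40, 50])
print(w)    # [0.625 0.375]  (0.5/0.8 and 0.3/0.8)
print(mu)   # [120  80]
```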
Then, the background is remodeled with a sample of M frames (set manually, or 25 by default) from the input sequence. The entropy value of the M-th frame is recomputed; if it is greater than $D_I$, the system returns to the suspending state, otherwise to the updating state. In the new mixture, the two highest distributions, interpreted as the former background and the potential background, are retained to create the new model in the training state; the low-weighted components, likely noise or unessential components, are cut down by the system.
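The 3-state loop can be sketched as below. The callbacks `update_mog`, `frame_entropy`, and `rebuild_model` are hypothetical stand-ins for the actual MOG routines; the parameters S, M, and the threshold follow the text:

```python
# A minimal sketch of the 3-state disorder removal loop described above.
def disorder_removal(frames, update_mog, frame_entropy, rebuild_model,
                     d_threshold=0.45, S=100, M=25):
    state, skip, sample = "updating", 0, 0
    for frame in frames:
        if state == "updating":
            update_mog(frame)                         # normal MOG maintenance
            if frame_entropy(frame) >= d_threshold:   # DF detected
                state, skip = "suspending", S
        elif state == "suspending":
            skip -= 1                                 # ignore S frames outright
            if skip == 0:
                state, sample = "training", M
        elif state == "training":
            rebuild_model(frame)                      # remodel from M sample frames
            sample -= 1
            if sample == 0:                           # recheck the M-th frame
                if frame_entropy(frame) >= d_threshold:
                    state, skip = "suspending", S
                else:
                    state = "updating"
```

For example, with S = 2 and M = 2 and a single high-entropy frame, the loop skips two frames, retrains on two, and then resumes normal updating.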
Figure 2.b demonstrates the disorder removal framework dismissing all disordered frames in the MOG algorithm. The suspending state occurred when congestion was detected; the updating state maintained the background model while the frames were stable. In this example, traffic congestion happened 3 times, and the approach successfully removed all of them.
Figure 2. Application to a dataset of 1700 frames. (a) Entropy calculated without the disorder removal framework. (b) Entropy calculated with the new approach, using $D_I$ = 0.45 and the default MOG parameters.
Figure 1. Flowchart of the disorder removal method.
3. SIMULATION RESULTS AND ANALYSIS
3.1. Comparative Experiments
This section provides comparative experiments of the proposed method against several current excellent methods. The algorithms chosen for the simulation are: Fuzzy Adaptive SOM (FAS) [7], Advanced MOG [2], Fuzzy Choquet Integral (F-C) [22], the Pixel-Based Adaptive Segmenter (PBAS) [11], and Independent Multi-modal Background Subtraction (IMBS) [30]. The parameters of each algorithm are set up according to the authors' recommendations.
Datasets such as CDnet [26] and BMC 2012 [27] are often used in recent publications. However, in order to evaluate performance in the traffic context of a crowded urban area, all experiments are demonstrated on the dataset collected by IU DIP [28]. Most of the videos were filmed in Ho Chi Minh City and include different scenarios such as light switches, bootstrapping, traffic congestion, slowly moving motorbikes, buses occupying the scene, etc.
Figure 3. Comparative results of the detection masks with respect to the ground-truth segmentation (rows: input, ground truth, FAS, F-Choquet, MOG, IMBS, new method).
Figure 3 shows a sample of the background models of 4 methods at the moment when congestion occurs. The results are indicated by binary masks in which the moving objects are white pixels. Since none of the algorithms implements a shadow removal mechanism, the ground truth intentionally includes the cast shadow in the foreground of each object. As can be seen in the figure, the best background result is from the proposed method. When chaos happens, especially when the bus occupies the whole scene in the second column, the other methods reveal their drawback: the foreground becomes part of the background. FAS produced ghosts of moving vehicles in its results. The third column shows that MOG and Fuzzy Choquet were strongly affected by the traffic congestion, producing an unexpected background when a huge object crosses the scene.
3.2. Quantitative Performance
A more technically precise evaluation protocol was provided in [26]. Not all methods perform well in the presence of every challenge; researchers therefore use precision (PR), recall (RC), false negative rate (FN-r), false positive rate (FP-r), accuracy (AC), percentage of wrong classifications (PWC), and fitness (Fm) to estimate the accuracy of their experimental results. The measurements are obtained by comparing the binary image with the ground truth. In this experiment, we apply the frame difference algorithm with default parameters to produce the binary mask.
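These metrics follow directly from the confusion counts (TP, FP, TN, FN); a sketch, with `segmentation_metrics` as an illustrative name, is:

```python
def segmentation_metrics(tp, fp, tn, fn):
    """Standard foreground-segmentation metrics from confusion counts."""
    total = tp + fp + tn + fn
    pr = tp / (tp + fp)                 # precision
    rc = tp / (tp + fn)                 # recall
    return {
        "PR": pr,
        "RC": rc,
        "FN-r": fn / (tp + fn),         # false negative rate
        "FP-r": fp / (fp + tn),         # false positive rate
        "AC": (tp + tn) / total,        # accuracy
        "PWC": (fp + fn) / total,       # proportion of wrong classifications
        "Fm": 2 * pr * rc / (pr + rc),  # F-measure (fitness)
    }

# Reproducing the "New" row of Table 1 from its raw counts:
m = segmentation_metrics(tp=47378, fp=24846, tn=227651, fn=7325)
print({k: round(v, 3) for k, v in m.items()})
```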
Table 1 compares the results of the quantitative test on the dataset in order to evaluate the performance of each algorithm adequately. FP-rate, AC, and Fm are the most important measures and are depicted in Figure 4. The lower the FP-rate the better; the opposite holds for AC and Fm. As can be seen in the bar chart, the gap in FP-rate between the algorithms is small, but the AC and Fm of the new method are higher than those of the others. These evaluations confirm that the proposed method gives better results than the other algorithms, consistent with the comparative test.
        TP      FP      TN      FN     PR     RC     FN-r   FP-r   AC     PWC    Fm
FAS     56841   12814   185985  51560  0.816  0.524  0.476  0.060  0.790  0.210  0.638
F-C     47262   24337   185919  49682  0.660  0.487  0.512  0.116  0.760  0.240  0.560
MOG     52317   18877   180054  55952  0.735  0.483  0.516  0.095  0.756  0.244  0.583
IMBS    45329   25865   151911  84095  0.640  0.350  0.650  0.145  0.642  0.358  0.452
New     47378   24846   227651  7325   0.656  0.866  0.134  0.098  0.895  0.105  0.746
Table 1. Quantitative evaluation of the foreground segmentation.
Figure 4. Bar chart of the quantitative evaluation of the foreground segmentation on FP-rate, AC, and Fm.
3.3. Speed Performance
In this section, the chosen algorithms were implemented in the same hardware system with 2.5
Ghz Intel Core i5 CPU and DDR3 4GB. The library we used is openCV [29] with dataset of
800x480 pixels. FPS is computed by dividing the total frame of image sequence by the amount of
time processed.
In figure 5, the measure is an average result from our dataset in crowded urban. It can be seen
from the data that the new approach stood out on performance by about a third over its prior work –
MOG. Other algorithms tend to process all frames in image sequences and obtain result slowly
because of the lack of ability to reduce worn out frames as our method does.
Figure 5. Speed performance (frames/s) on the dataset.
4. DISCUSSION AND CONCLUSION
This paper has introduced a new method with a different view in the background subtraction
process. To come up with this issue, we have explored more aspects of information in statistical
model with entropy function. By determining and eliminating the worn out images, the disorder
detection approach wisely selects proper images to model the background rather than all image
captured from the camera. The outcome acquires both precision and speed performance which are
critical in this research area. The experiment tests proved the advantages of our method over the
current algorithms that can be applied in real practice of traffic congestion in large cities. For
surveillance system, this mechanism can also be implemented in either real-time process or parallel
programming using GPU.
REFERENCES
1. Stauffer C, Grimson W. Adaptive background mixture models for real-time tracking. Proc IEEE Conf on Computer Vision and Pattern Recognition (CVPR 1999), 1999; 246-252.
2. Zivkovic Z. Improved adaptive Gaussian mixture model for background subtraction. Int Conf Pattern Recognition (ICPR 2004), 2004, 2: 28-31.
3. V. Pham, P. Vo, H. Vu Thanh, B. Le Hoai, “GPU Implementation of Extended Gaussian
Mixture Model for Background Subtraction”, IEEE International Conference on Computing
and Telecommunication Technologies, RIVF 2010, Vietnam National University, November
2010.
4. D. Mukherjee, J.Wu, “Real-time Video Segmentation using Student’s t Mixture Model”,
International Conference on Ambient Systems, Networks and Technologies, ANT 2012;
pages 153-160, 2012.
5. J. Park, A. Tabb, and A. C. Kak. Hierarchical data structure for real-time background
subtraction. In Proceedings of IEEE ICIP, 2006.
6. Shannon, Claude E. "A Mathematical Theory of Communication". Bell System Technical Journal 27, July/October 1948.
7. Lucia Maddalena, Alfredo Petrosino, "A fuzzy spatial coherence-based approach to background/foreground separation for moving object detection", Neural Computing and Applications, Volume 19, Issue 2, pp. 179-186, March 2010.
8. N. Bouguila and D. Ziou. A Dirichlet Process Mixture of Generalized Dirichlet Distributions
for Proportional Data Modeling. IEEE Transactions on Neural Networks, 2010.
9. W. Fan, N. Bouguila, “Online variational learning of finite Dirichlet mixture models”,
Evolving Systems, January 2012.
10. Y. He, D. Wang, M. Zhu, “Background subtraction based on nonparametric Bayesian
estimation”, International Conference Digital Image Processing, July 2011.
11. M. Hofmann, P.Tiefenbacher, G. Rigoll, "Background Segmentation with Feedback: The
Pixel-Based Adaptive Segmenter", IEEE Workshop on Change Detection, CVPR 2012, June
2012.
12. O. Barnich, M. Van Droogenbroeck, "ViBe: a powerful random technique to estimate the
background in video sequences", International Conference on Acoustics, Speech, and Signal
Processing, ICASSP 2009, pages 945-948, April 2009.
13. C. Wang, J. W. Paisley, and D. M. Blei. Online Variational Inference for the Hierarchical
Dirichlet Process. Journal of Machine Learning Research - Proceedings Track, 15:752{760,
2011.
14. C. Guyon, T. Bouwmans, E. Zahzah, “Robust Principal Component Analysis for Background
Subtraction: Systematic Evaluation and Comparative Analysis”, INTECH, Principal
Component Analysis, Book 1, Chapter 12, page 223-238, March 2012.
15. F. Seidel, C. Hage, and M. Kleinsteuber, “pROST - A Smoothed Lp-norm Robust Online
Subspace Tracking Method for Realtime Background Subtraction in Video”, Machine Vision
and Applications, Special Issue on Background Modeling for Foreground Detection in Real-
World Dynamic Scenes, December 2013.
16. N. Wang, D. Yeung, “Bayesian Robust Matrix Factorization for Image and Video
Processing”, International Conference on Computer Vision, ICCV 2013, 2013.
17. H. Wang and D. Suter, “A consensus-based method for tracking: Modelling background
scenario and foreground appearance,” Pattern Recognition, vol. 40, no. 3, pp. 1091–1105,
2007.
18. R. Davies, L. Mihaylova, N. Pavlidis, I. Eckley, “The effect of recovery algorithms on
compressive sensing background subtraction”, Workshop Sensor Data Fusion: Trends,
Solutions, and Applications, 2013.
19. J. Milla, S. Toral, M Vargas, F. Barrero, “Dual-rate background subtraction approach for
estimating traffic queue parameters in urban scenes”, Intelligent Transport Systems, IET ,
Volume 7, No.1, pages 122,130, March 2013.
20. W. Chiu, D. Tsai, “Moving/motionless foreground object detection using fast statistical
background updating”, Imaging Science Journal, Volume 61, Issue 2, pages 252-267, 2013.
21. F. El Baf, T. Bouwmans, B. Vachon, “Type-2 Fuzzy Mixture of Gaussians Model:
Application to Background Modeling”, International Symposium on Visual Computing, ISVC
2008, pages 772-781, Las Vegas, USA, December 2008.
22. F. El Baf, T. Bouwmans, B. Vachon, “Foreground Detection using the Choquet Integral”,
International Workshop on Image Analysis for Multimedia Interactive Services, WIAMIS
2008, pages 187-190, Klagenfurt, Austria, May 2008.
23. P. Gorur and B. Amrutur. Speeded up Gaussian mixture model algorithm for background subtraction. In IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS), pages 386-391, Sept. 2011.
24. C. Wren, A. Azarbayejani, T. Darrell, A. Pentland, “Pfinder : Real-Time Tracking of the
Human Body”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 19,
No. 7, pages 780-785, July 1997.
25. A. Mittal and N. Paragios, "Motion-based background subtraction using adaptive kernel density estimation," in Proc. IEEE Conf. Comput. Vis. Pattern Recog., 2004, pp. 302-309.
26. (2014). 1st IEEE Change Detection Workshop, in conjunction with CVPR [Online]. Available: http://www.changedetection.net
27. BMC 2012 Background Models Challenge Dataset. http://bmc.univ-bpclermont.fr
28. Traffic Dataset in HCM City by IU DIP:
https://drive.google.com/folderview?id=0Bz7M5VkTUJxCZ2NJSlRTR1RzQk0&usp=sharing
29. Open Source Computer Vision. Available: http://www.opencv.org
30. N. Goyette, P.-M. Jodoin, F. Porikli, J. Konrad, and P. Ishwar, changedetection.net: A new
change detection benchmark dataset, in Proc. IEEE Workshop on Change Detection
(CDW’12) at CVPR’12, Providence, RI, 16-21 Jun., 2012.