ABSTRACT: Automated multimedia surveillance systems capture, process and analyze multimedia data coming from heterogeneous sensors. These systems are often designed to support (semi-)automatic decision making, such as generating an alarm in response to a surveillance event, as well as providing useful information to human decision makers to ensure public safety. Various tools and techniques from different fields such as Computer Vision, Pattern Recognition, and Multimedia Computing have contributed to the success of such systems. Although there has been significant progress in the field of multimedia surveillance research, we still face situations when the system is unable to detect critical events, wrongly identifies individuals or generates false alarms leading to undesired consequences. Hence, the goal of this special issue is to bring forward recent advancements in automated multimedia surveillance for improved public safety. More specifically, it reports the state-of-the-art techniques, met
Multimedia Tools and Applications 11/2014; · 1.01 Impact Factor
ABSTRACT: Design and implementation of an effective surveillance system is a challenging task. In practice, a large number of CCTV cameras are installed to prevent illegal and unacceptable activities, and a human operator observes the different camera views and identifies alarming cases. But reliance on the human operator for real-time response can be expensive, as the operator may be unable to pay full attention to all camera views at the same time. Moreover, the complexity of a situation may not be easily perceivable by the operator, who might then require additional support in responding to an adverse situation. In this paper, we present a Decision Support Engine (DSE) to select and schedule the most appropriate camera views that can help the operator make an informed decision. For this purpose, we devise a utility-based approach where the utility value changes based on automatically detected events in different surveillance zones, event correlation, and the operator’s feedback. In addition to the selected camera views, we propose to synthetically embed extra information around the camera views, such as an event summary and a suggested action plan, to give a global view of the current situation. The experimental results show the usefulness of the proposed decision support system.
Multimedia Tools and Applications 11/2014; · 1.01 Impact Factor
ABSTRACT: In this paper, a novel logo watermarking technique with key concept is proposed using fractional wavelet packet transform (FrWPT), a non-linear chaotic map and singular value decomposition (SVD). The core idea is to use biometrically generated keys in the embedding process of a gray-scale watermark. Therefore, this paper first proposes a method for generating keys from biometrics efficiently. Then the host image is randomized with the help of a non-linear chaotic map, followed by embedding in the FrWPT domain by modifying the singular values of the randomized image. Further, in order to enhance the security, an authentication key is formed to authenticate the watermarked image. Finally, a reliable extraction process is proposed to extract the watermark from the possibly attacked authenticated watermarked image. The security, attack and comparative analysis confirm high security, efficiency and robustness of the proposed watermarking technique. Further, an efficient solution is also proposed to deal with the ambiguous situations created by SVD in watermarking.
Expert Systems with Applications 08/2014; 41(10):4563–4578. · 1.85 Impact Factor
ABSTRACT: Huge amounts of video are being recorded every day by surveillance systems. Since video is capable of recording and preserving an enormous amount of information which can be used in many applications, it is worth examining the degree of privacy loss that might occur due to public access to the recorded video. A fundamental requirement of privacy solutions is an understanding and analysis of the inference channels that can lead to a breach of privacy. Though inference channels and privacy risks are well studied in traditional data sharing applications (e.g., hospitals sharing patient records for data analysis), privacy assessments of video data have been limited to the direct identifiers such as people’s faces in the video. Other important inference channels such as location (Where), time (When), and activities (What) are generally overlooked. In this paper we propose a privacy loss model that highlights and incorporates identity leakage through multiple inference channels that exist in a video due to what, when, and where information. We model the identity leakage and the sensitive information separately and combine them to calculate the privacy loss. The proposed identity leakage model is able to consolidate the identity leakage through multiple events and multiple cameras. Experimental results are provided to demonstrate the proposed privacy analysis framework.
Multimedia Tools and Applications 01/2014; · 1.01 Impact Factor
ABSTRACT: Advances in cloud computing have allowed volume rendering tasks, typically done by volume ray-casting, to be outsourced to cloud data centers. The availability of volume data and rendered images (which can contain important information such as the disease information of a patient) to a third-party cloud provider, however, presents security and privacy challenges. This paper addresses these challenges by proposing a secure cloud-based volume ray-casting framework that distributes the rendering tasks among the data centers and hides the information that is exchanged between the server and a data center, between two data centers, and between a data center and the client by using Shamir's secret sharing, such that none of the data centers has enough information to know the secret data and/or rendered image. Experiments and analyses show that our framework is highly secure and requires low computation cost.
Proceedings of the 2013 IEEE International Conference on Cloud Computing Technology and Science - Volume 01; 12/2013
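The abstract above builds on Shamir's secret sharing. As a rough illustration of that primitive only (not of the paper's rendering pipeline), the following minimal sketch splits one 8-bit value into n shares so that any k of them recover it. The field prime 257 and the function names `make_shares` and `recover` are assumptions chosen for this example.

```python
import secrets

# Assumption for this sketch: work in GF(257), the smallest prime field
# that can hold every 8-bit value (0..255).
PRIME = 257


def make_shares(secret, k, n, prime=PRIME):
    """Split `secret` into n shares; any k of them are enough to recover it."""
    if not (1 <= k <= n < prime):
        raise ValueError("need 1 <= k <= n < prime")
    if not (0 <= secret < prime):
        raise ValueError("secret must lie in the field")
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(prime) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % prime
        shares.append((x, y))
    return shares


def recover(shares, prime=PRIME):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        # Modular inverse via Fermat's little theorem (prime modulus).
        secret = (secret + yi * num * pow(den, prime - 2, prime)) % prime
    return secret


shares = make_shares(200, k=3, n=5)
print(recover(shares[:3]))  # any three shares reconstruct the value 200
```

Fewer than k shares leave the constant term undetermined, which is the property the framework relies on when no single data center may learn the secret data.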
ABSTRACT: In this article, a novel logo watermarking scheme is proposed based on wavelet frame transform, singular value decomposition and automatic thresholding. The proposed scheme essentially rectifies the ambiguity problem in SVD-based watermarking. The core idea is to randomly upscale the size of the host image using a reversible random extension transform, followed by the embedding of the logo watermark in the wavelet frame domain. After embedding, a verification phase is cast with the help of a binary watermark and toral automorphism. At the extraction end, the binary watermark is first extracted, followed by the verification of the watermarked image. The logo watermark is extracted if and only if the watermarked image is verified. The security, attack and comparative analysis confirm high security, efficiency and robustness of the proposed watermarking system.
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP). 12/2013; 10(1).
ABSTRACT: In this work, we analyze some of the existing secret image sharing methods and show that they do not possess semantic security, a property of many secure systems. We propose a new method based on the threshold secret sharing scheme for images in the compressed and uncompressed domains. Our method generates minimal share sizes with similar computational cost to previous methods, yet it is computationally secure and satisfies the semantic security property.
IEEE International Conference on Semantic Computing; 09/2013
ABSTRACT: Although performance improvements have been reported in recent years, semantic concept detection in video remains a challenging problem. Existing concept detection techniques, with ontology rules, exploit the static correlations among primitive concepts but not the dynamic spatiotemporal correlations. The proposed method rewards (or punishes) detected primitive concepts using dynamic spatiotemporal correlations of the given ontology rules and updates these ontology rules based on the accuracy of detection. Adaptively learned ontology rules significantly help in improving the overall accuracy of concept detection, as shown in the experimental results.
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP). 05/2013; 9(2).
ABSTRACT: Outsourcing the storage and processing of multimedia data to cloud data centers (CDCs) is becoming increasingly common. However, the use of third-party CDCs raises security and privacy concerns. In this paper, we present a Shamir's Secret Sharing (SSS) based method to obtain enhanced digital images by performing low pass filtering (LPF) on the obfuscated (or encrypted) images over the cloud. The proposed scheme is novel in performing the averaging (division) operations involved in the spatial convolution of a mask over an image in the encrypted domain (ED) to get the same filtered resultant image as when the filtering operation is performed in the plain text domain (PD). Security and performance analyses corroborate that our scheme is information theoretically secure and transmission efficient (as compared to other schemes), respectively.
Semantic Computing (ICSC), 2013 IEEE Seventh International Conference on; 01/2013
ABSTRACT: Secret image sharing is a method for distributing a secret image amongst n data stores, each storing a shadow image of the secret, such that the original secret image can be recovered only if any k out of the n shares are available. Existing secret image sharing schemes, however, do not support scaling and cropping operations on the shadow image, which are useful for zooming on large images. In this paper, we propose an image sharing scheme that allows the user to retrieve a scaled or cropped version of the secret image by operating directly on the shadow images, therefore reducing the amount of data sent from the data stores to the user. Results and analyses show that our scheme is highly secure, requires low computational cost, and supports a large number of scale factors with arbitrary crop.
Multimedia and Expo (ICME), 2013 IEEE International Conference on; 01/2013
ABSTRACT: Current surveillance systems record an enormous amount of video footage every day. This video contains events and activities of real life which are useful in many applications. In this paper, we explore privacy-preserving publication of surveillance video footage, which requires robust privacy modelling and selection of an appropriate data transformation function. Traditional privacy protection methods only consider explicit channels of privacy loss (such as facial information), ignoring implicit channels. The proposed privacy model consolidates the identity leakage through both implicit and explicit channels. To choose the data transformation function, we propose computational models for privacy loss and utility loss and study the tradeoff between the two. Experiments show that the hybrid data transformation method (using a combination of quantisation and blurring) provides the best tradeoff between privacy and utility. Furthermore, applying blurring first and then quantising gives better results.
Int. J. of Trust Management in Computing and Communications. 01/2013; 1(1):23 - 51.
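The hybrid transformation named in the abstract above (blurring first, then quantising) can be pictured with a minimal sketch on a grayscale image stored as a list of rows. The 3x3 box blur, the uniform quantiser and the function names are assumptions made for illustration; the abstract does not specify which blur or quantisation functions the authors used.

```python
def box_blur(img, radius=1):
    """Mean filter over a (2*radius+1)^2 window, clipped at the image border."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            rows = range(max(0, y - radius), min(h, y + radius + 1))
            cols = range(max(0, x - radius), min(w, x + radius + 1))
            window = [img[j][i] for j in rows for i in cols]
            out[y][x] = sum(window) // len(window)
    return out


def quantise(img, levels=4):
    """Map 8-bit pixels onto `levels` evenly spaced gray values."""
    step = 256 // levels
    return [
        # Clamp the bin index so that a `levels` value not dividing 256
        # still yields at most `levels` distinct outputs.
        [min(p // step, levels - 1) * step + step // 2 for p in row]
        for row in img
    ]


def blur_then_quantise(img, radius=1, levels=4):
    """Apply the two transformations in the order the abstract reports as better."""
    return quantise(box_blur(img, radius), levels)


image = [[0, 255], [255, 0]]
print(blur_then_quantise(image))
```

Swapping the two calls inside `blur_then_quantise` gives the opposite ordering, which makes it easy to compare both variants on the same input.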
ABSTRACT: Outsourcing the tasks of medical data visualization to cloud centers presents new security challenges. In this paper, we propose a framework for cloud-based remote medical data visualization that protects the security of data at the cloud centers. To achieve this, we integrate the cryptographic secret sharing with pre-classification volume ray-casting and propose a secure volume ray-casting pipeline that hides the color-coded information of the secret medical data during rendering at the data centers. Results and analysis show the utility of the proposed framework.
Proceedings of the 20th ACM international conference on Multimedia; 10/2012
ABSTRACT: Privacy is a big concern in current video surveillance systems. Due to privacy issues, many strategic places remain unmonitored, leading to security threats. The main problem with existing privacy protection methods is that they assume the availability of accurate region of interest (RoI) detectors that can detect and hide privacy-sensitive regions such as faces. However, the current detectors are not fully reliable, leading to breaches in privacy protection. In this paper, we propose a privacy protection method that adopts adaptive data transformation involving the use of selective obfuscation and global operations to provide robust privacy even with unreliable detectors. Further, there are many implicit privacy leakage channels that have not been considered by researchers for privacy protection. We block both implicit and explicit channels of privacy leakage. Experimental results show that the proposed method incurs 38% less distortion of the information needed for surveillance in comparison to earlier methods of global transformation, while still providing near-zero privacy loss.
ABSTRACT: Surveillance and monitoring systems generally employ a large number of cameras to capture people's activities in the environment. These activities are analyzed by hosts (human operators and/or computers) for threat detection. Threat detection is a target centric task in which the behavior of each target is analyzed separately, which requires a significant amount of human attention and is a computationally intensive task for automatic analysis. In order to meet the real-time requirements of surveillance, it is necessary to distribute the video processing load over multiple hosts. In general, cameras are statically assigned to the hosts; we show that this is not a desirable solution as the workload for a particular camera may vary over time depending on the number of targets in its view. In the future, this uneven distribution of workload will become more critical as the sensing infrastructures are being deployed on the cloud. In this paper, we model the camera workload as a function of the number of targets, and use that to dynamically assign video feeds to the hosts. Experimental results show that the proposed model successfully captures the variability of the workload, and that the dynamic workload assignment provides better results than a static assignment.
IEEE Transactions on Multimedia 01/2012; 14(3):555-562. · 1.75 Impact Factor
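The idea in the abstract above, modelling camera workload as a function of the number of targets and reassigning feeds accordingly, can be sketched as follows. The linear cost model, its coefficients and the greedy "heaviest feed to least-loaded host" rule are assumptions for this example, not the model or scheduler from the paper.

```python
import heapq


def camera_workload(num_targets, base_cost=1.0, per_target_cost=0.5):
    """Hypothetical linear model: fixed decoding cost plus a cost per target."""
    return base_cost + per_target_cost * num_targets


def assign_cameras(targets_per_camera, num_hosts):
    """Greedy balancing: heaviest camera first, onto the least-loaded host."""
    if num_hosts < 1:
        raise ValueError("need at least one host")
    heap = [(0.0, host) for host in range(num_hosts)]  # (current load, host id)
    heapq.heapify(heap)
    assignment = {host: [] for host in range(num_hosts)}
    ranked = sorted(
        targets_per_camera.items(),
        key=lambda item: camera_workload(item[1]),
        reverse=True,
    )
    for camera, num_targets in ranked:
        load, host = heapq.heappop(heap)
        assignment[host].append(camera)
        heapq.heappush(heap, (load + camera_workload(num_targets), host))
    loads = {host: load for load, host in heap}
    return assignment, loads


# Target counts as they might be reported by a tracker at one time instant.
targets = {"cam1": 8, "cam2": 0, "cam3": 1, "cam4": 7}
assignment, loads = assign_cameras(targets, num_hosts=2)
print(assignment, loads)
```

Re-running `assign_cameras` whenever the target counts change gives a simple dynamic assignment. Under the same hypothetical cost model, a static split that happened to place cam1 and cam4 on one host would load the two hosts at 9.5 and 2.5 instead of 6.0 each.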
ABSTRACT: Current surveillance systems consist of large numbers of cameras. The video feeds from cameras are automatically processed for threat detection, which is a computationally intensive task. In order to meet the real-time requirements of surveillance, we need to distribute the video processing over multiple computers. Generally the cameras are statically assigned to the processors; we show that this is not a desirable solution as the workload for a particular camera may vary over time depending on the number of targets in its view. In the future, this uneven distribution of workload will become more critical as the sensing infrastructures are being deployed on the cloud. In this work, we model the camera workload as a function of the number of targets, and use that to dynamically assign video feeds to the processors. Experimental results show that the proposed model successfully captures the variability of the workload, and that dynamic workload assignment provides better results than a static assignment.
Multimedia and Expo (ICME), 2011 IEEE International Conference on; 08/2011
ABSTRACT: Current sensor-based monitoring systems use multiple sensors in order to identify high-level information based on the events that take place in the monitored environment. This information is obtained through low-level processing of sensory media streams, which are usually noisy and imprecise, leading to many undesired consequences such as false alarms, service interruptions, and often violation of privacy. Therefore, we need a mechanism to compute the quality of sensor-driven information that would help a user or a system in making an informed decision and improve the automated monitoring process. In this article, we propose a model to characterize such quality of information in a multisensor multimedia monitoring system in terms of certainty, accuracy/confidence and timeliness. Our model adopts a multimodal fusion approach to obtain the target information and dynamically compute these attributes based on the observations of the participating sensors. We consider the environment context, the agreement/disagreement among the sensors, and their prior confidence in the fusion process in determining the information of interest. The proposed method is demonstrated by developing and deploying a real-time monitoring system in a simulated smart environment. The effectiveness and suitability of the method have been demonstrated by dynamically assessing the value of the three quality attributes with respect to the detection and identification of human presence in the environment.
ABSTRACT: Significant growth of multimedia content on the World Wide Web (or simply ‘Web’) has made it an essential part of people's lives. The web provides an enormous amount of information; however, it is very important for users to be able to gauge the trustworthiness of web information. Users normally access content from the first few links provided to them by search engines such as Google or Yahoo!. This assumes that these search engines provide factual information, which may be popular due to criteria such as page rank but may not always be trustworthy from the factual aspects. This paper presents a mechanism to determine trust of websites based on the semantic similarity of their multimedia content with already established and trusted websites. The proposed method allows for dynamic computation of the trust level of websites of different domains and hence overcomes the dependency on traditional user feedback methods for determining trust. In fact, our method attempts to emulate the evolving process of trust that takes place in a user’s mind. Experimental results are provided to demonstrate the utility and practicality of the proposed method.
Multimedia Tools and Applications 01/2011; · 1.01 Impact Factor
ABSTRACT: Large-scale multimedia surveillance installations usually consist of a number of spatially distributed video cameras that are installed in a premise and are connected to a central control station, where human operators (e.g., security personnel) remotely monitor the scene images captured by the cameras. In the majority of these systems the ratio of human operators to the number of camera views is very low. This potentially raises the problem that some important events may be missed. Studies have shown that a human operator can effectively monitor only four camera views. Moreover, the visual attention of a human operator drops below the acceptable level while performing the task of visual monitoring. Therefore, there is a need for the selection of the four most relevant camera views at a given time instant. This paper proposes a human-centric approach to solve the problem of dynamically selecting and scheduling the four best camera views. In the proposed approach we use a feedback camera to observe the human monitoring the surveillance camera feeds. Using this information, the system computes the operator’s attention to the camera views to automatically determine the importance of events being captured by the respective cameras. This real-time non-invasive relevance feedback is then augmented with the automatic detection of events to compute the four best feeds. The experiments show the effectiveness of the proposed approach by improving the identification of important events occurring in the environment.
Multimedia Tools and Applications 01/2011; 51:697-721. · 1.01 Impact Factor
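Selecting the four best feeds from operator attention and automatically detected events, as described in the abstract above, amounts to ranking cameras by a combined relevance score. The sketch below assumes both signals are already normalised to [0, 1] and combines them with a weighted sum; that combination rule, the weight and the function name are assumptions, since the abstract does not state how the two signals are fused.

```python
import heapq


def select_views(event_scores, attention_scores, k=4, weight=0.5):
    """Return the k cameras with the highest combined relevance, best first.

    `weight` balances the automatic event score against the operator
    attention score (hypothetical fusion rule for this sketch).
    """
    combined = {
        cam: weight * event_scores[cam]
        + (1.0 - weight) * attention_scores.get(cam, 0.0)
        for cam in event_scores
    }
    return heapq.nlargest(k, combined, key=combined.get)


events = {"c1": 0.9, "c2": 0.1, "c3": 0.5, "c4": 0.2, "c5": 0.7, "c6": 0.0}
attention = {"c1": 0.2, "c2": 0.8, "c3": 0.5, "c4": 0.1, "c5": 0.9}
print(select_views(events, attention))  # ['c5', 'c1', 'c3', 'c2']
```

Cameras without an attention measurement (c6 here) default to zero attention, so they are ranked on their event score alone.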
ABSTRACT: Public transport safety is an important issue that has recently gained considerable attention, especially with the rise of violence happening abroad. To avoid such incidents and to perform post-incident investigations, many buses today are equipped with surveillance cameras. These cameras are usually installed at important places such as the doors, the front and the middle of the bus. This camera placement is often performed manually based on human intuition and knowledge; however, there is no scientific basis to justify: 1) how many cameras would be sufficient, and 2) where they should be placed, to increase the area of coverage at a minimum cost. In this paper we present this as an optimization problem and propose a method to compute the approximate coverage of a camera inside the 3D bus model. The utility of the proposed method is demonstrated for a single camera setup.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, ICME 2011, 11-15 July, 2011, Barcelona, Catalonia, Spain; 01/2011
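One simple way to picture "approximate coverage of a camera inside a 3D model", as mentioned in the abstract above, is to sample points in the volume and count those inside the camera's viewing cone. The sketch below makes several assumptions that are not from the paper: the bus interior is an empty axis-aligned box, the field of view is a circular cone, and occlusion by seats or passengers is ignored.

```python
import math


def coverage_fraction(cam_pos, cam_dir, half_fov_deg, max_range, box, step=0.5):
    """Fraction of grid-cell centres in `box` that lie inside the viewing cone.

    `box` is ((x0, y0, z0), (x1, y1, z1)); occlusion is not modelled.
    """
    norm = math.sqrt(sum(c * c for c in cam_dir))
    if norm == 0:
        raise ValueError("cam_dir must be non-zero")
    axis = [c / norm for c in cam_dir]
    cos_limit = math.cos(math.radians(half_fov_deg))
    lo, hi = box
    counts = [max(1, int((hi[d] - lo[d]) / step)) for d in range(3)]
    covered = total = 0
    for i in range(counts[0]):
        for j in range(counts[1]):
            for k in range(counts[2]):
                point = [lo[d] + (idx + 0.5) * step
                         for d, idx in enumerate((i, j, k))]
                offset = [p - c for p, c in zip(point, cam_pos)]
                dist = math.sqrt(sum(o * o for o in offset))
                total += 1
                if dist == 0:
                    covered += 1  # the camera's own position counts as seen
                    continue
                cos_angle = sum(o * a for o, a in zip(offset, axis)) / dist
                if dist <= max_range and cos_angle >= cos_limit:
                    covered += 1
    return covered / total


# Hypothetical bus interior of 10 m x 2.5 m x 2 m with one ceiling camera
# at the front, tilted slightly downwards.
bus = ((0.0, 0.0, 0.0), (10.0, 2.5, 2.0))
print(coverage_fraction((0.0, 1.25, 2.0), (1.0, 0.0, -0.3), 45.0, 6.0, bus))
```

Evaluating this function for a set of candidate positions and orientations would turn the placement question into a search for the pose with the highest covered fraction.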