COMPRESSION NOISE BASED VIDEO FORGERY DETECTION
Hareesh Ravi, A.V. Subramanyam, Gaurav Gupta, B. Avinash Kumar
Electronics and Communication Engineering
Indraprastha Institute of Information Technology
Department of Electronics and Information Technology
New Delhi, India
ABSTRACT
Intelligent video editing techniques can be used to tamper
with videos such as surveillance camera videos, defeating their
potential to be used as evidence in a court of law. In this paper,
we propose a technique to detect forgery in MPEG videos
by analyzing a frame's compression noise characteristics.
The compression noise is extracted from the spatial domain by
using a modified Huber Markov Random Field (HMRF) as
a prior for the image. The transition probability matrices of the
extracted noise are used as features to classify a given video
as single compressed or double compressed. The experiment
is conducted on different YUV sequences with different scale
factors. The classification accuracy is observed to be higher
than that of state-of-the-art detection algorithms.
Index Terms: Video Forgery Detection, Double Quantization Noise, Markov Process
1. INTRODUCTION
Video cameras and surveillance systems are being increas-
ingly used in today’s world and many of these systems uti-
lize MPEG (MPEG-2 and MPEG-4) encoding for compress-
ing the captured video. In order to forge a video captured
using these systems, an adversary has to first decompress it,
forge the video, and re-compress the forged video while
saving. As this is often the case, the presence of double
compression can indicate a video forgery. The detection of forgery is also of
paramount importance for Law Enforcement Agencies dur-
ing forensic investigation as they need to verify the integrity
of a video in question, which could be a potential evidence.
Although video forgery requires high levels of sophistication,
some convincing forgeries have been pointed out in the literature
[1]. Figure 1 shows one kind of forgery, in which certain frames
are deleted from a video so that only one person is shown walking
along the corridor when there were actually two. Several forgery
detection techniques have been proposed to date [2–10]. The basic
idea in [2] is that, in a recompressed video, the statistics of the
quantized or inverse-quantized coefficients deviate from those of
the original video. In [3, 4], noise characteristics are used to
detect forgery. In [5], the authors detect double compression by
capturing the empty bins exhibited in the distribution of quantized
coefficients in a recompressed video. Wang et al. [6] also proposed
detection of MPEG-4 video double compression by Markov modeling
of differences of DCT coefficients. The techniques in [7, 8] are
based on similar principles, while [9, 10] propose forgery
localization techniques. However, the techniques in [2, 5, 6] have
limitations on the relationship between the scaling factors used
for the first and second compression. In order to overcome the
limitations of the aforementioned references, we use compression
noise for the detection of double compression, thereby detecting
forgery.

The compression noise present in the spatial domain of a video has
been shown to be correlated [11]. When a single-compressed video
is re-compressed, the correlation of the spatial-domain noise is
disturbed. This phenomenon can be effectively captured using a
Markov process and can be used for forgery detection by detecting
double compression.

Fig. 1. Top two rows (in clockwise direction): frames of an original
video clearly showing two men in the hall. Bottom two rows: frames
of the forged video where only one man is seen walking in the hall.
In this paper, we propose a video forgery detection scheme based
on detecting double compression. A block diagram of the scheme is
given in Figure 2. In order to extract compression noise from a
given video frame, we use a modified HMRF prior model [11]. The
prior model is modified to incorporate the effect of compression.
Since Markov statistics have been proven to be a distinguishing
feature for single and double compression in JPEG images [12] and
MPEG videos [6], and also for steganographic images [13], we model
the extracted noise as a first-order Markov process. The noise
from each frame of a video is divided into 16×16 blocks, and a
transition probability matrix (TPM) is obtained for each block in
8 directions. The 8 TPMs are linearly combined into a single TPM,
and the resulting 18-D feature is used for training and testing
with a Support Vector Machine (SVM). The detection unit is a
single clip out of a sequence of 10 clips.

Fig. 2. Block diagram of authentication using the proposed scheme
The rest of the paper is organized as follows. Section 2 reviews
the quantization process and related work, while Section 3 presents
the proposed forgery detection scheme. Section 4 describes the
experimental setup, the obtained results, and a comparison with
other methods, and discusses the advantage of the proposed method
over the others. Section 5 contains the conclusion and future work.
2. RELATED WORKS
Let Z be a frame of a video in the spatial domain and Y (in the
frequency domain) be the transformed coefficients obtained after
applying the block-based DCT matrix H to compress frame Z. Then,

    Y = HZ,    Z = H^T Y    (1)

If the quantization operator on the DCT coefficients is represented
as Q[·], then the quantized DCT coefficients are given by
Y_q = Q[Y]. The quantized or compressed frame in the spatial domain
can be obtained by the inverse DCT of the quantized DCT coefficients
as Z_q = H^T Y_q. The quantization errors in the spatial domain and
the frequency domain can generally be represented as

    e_Z = Z − Z_q   and   e_Y = Y − Y_q    (2)
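As a minimal numerical illustration of eq. (2), the sketch below quantizes the DCT coefficients of one 8×8 block with a uniform scalar quantizer. The Gaussian coefficients and the step q are hypothetical stand-ins for real MPEG data, not values from the paper.

```python
import numpy as np

# Illustrative sketch of eq. (2): quantization error of DCT coefficients.
rng = np.random.default_rng(0)
Y = rng.normal(scale=10.0, size=(8, 8))   # DCT coefficients of one 8x8 block
q = 4.0                                   # hypothetical quantization step
Yq = q * np.round(Y / q)                  # quantized coefficients, Yq = Q[Y]
eY = Y - Yq                               # frequency-domain quantization error
assert np.all(np.abs(eY) <= q / 2)        # error is bounded by half the step
```

The spatial-domain error e_Z then follows by applying the inverse DCT to e_Y, as in eq. (3).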
respectively. The 2D representation of the error in the spatial
domain is given as

    e_Z = Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} H_{i,j} (Y_q[i, j] − Y[i, j])    (3)

The main parameters needed to model this error term, or noise, are
the variances of the individual frequency coefficients and the
covariance matrix. Let the covariance matrix in the frequency domain
be represented as K_ey, a diagonal matrix whose diagonal elements are
the individual frequency-domain coefficient variances σ²_ey(i, j).
The covariance matrix in the spatial domain will then be [11]

    K_ez = E[(Z_q − Z)(Z_q − Z)^T] = H^T K_ey H    (4)
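Equation (4) can be sketched numerically. The snippet below builds an orthonormal 8-point DCT matrix H and maps an illustrative diagonal frequency-domain covariance K_ey to the spatial domain; the chosen per-coefficient variances are assumptions for demonstration, not values from a real encoder.

```python
import numpy as np

N = 8
# Orthonormal 1-D DCT-II matrix H (rows are basis vectors), so H @ H.T = I.
n = np.arange(N)
H = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
H[0, :] /= np.sqrt(2.0)

Key = np.diag(np.linspace(1.0, 0.1, N))   # diagonal frequency-domain covariance
Kez = H.T @ Key @ H                       # spatial-domain covariance, eq. (4)

assert np.allclose(H @ H.T, np.eye(N))    # H is orthonormal
assert np.allclose(Kez, Kez.T)            # a covariance matrix is symmetric
```

Because H is orthonormal, the transform preserves total variance: trace(K_ez) = trace(K_ey).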
3. PROPOSED SCHEME
3.1. Noise Extraction
The noise extraction process is as follows. The parameters of the
quantization error [11], as derived in Section 2, can be used
probabilistically to remove compression artifacts. In this removal
technique, the quantization error becomes a likelihood term that
ensures that the final frame estimate agrees well with the observed
data. A maximum a posteriori (MAP) criterion is used for estimating
the denoised image as

    Ẑ = arg max_Z p(Z | Z_q)    (5)
      = arg max_Z p(Z) p(Z_q | Z)    (6)

where Ẑ is the final frame estimate after removing the compression
noise. Equation (6) consists of a prior term and a maximum
likelihood term. The likelihood can be determined from eq. (2):
since e_Z = Z − Z_q, Z_q | Z is a Gaussian random variable with mean
Z and auto-covariance K_ez. The uniform frequency-domain model [11]
is used for the likelihood term. The prior model is based on a
Huber Markov Random Field (HMRF), wherein the Huber function is as
follows:
    p(Z) = (1/G) exp(−λ Σ_{c∈C} ρ_T(d^t_c Z))    (7)

where G is a normalizing constant, λ is a regularization parameter,
c is a local group of pixels called a clique, and C is the set of
all such cliques, which depends on the neighbourhood structure of
the Markov random field. The Huber function ρ_T(·) is defined as

    ρ_T(u) = { l u²,                      |u| ≤ T
             { l (T² + 2T(|u| − T)),     |u| > T    (8)

where

    l = { 1,    Z(m, n) : m, n ∉ S
        { 1.5,  otherwise    (9)

Here we introduce l as a weight in order to incorporate the effect
of single compression on a frame; m, n are the indices of the frame
Z of dimension M×N, and S is the set of border pixels of each 8×8
block. The operator d^t_c extracts the differences between a pixel
and its neighbors, so that

    p(Z) = (1/G) exp(−λ Σ_{n=0}^{M−1} Σ_{m∈N_n} ρ_T(Z[n] − Z[m]))    (10)
where N_n is the index set of neighbors of the n-th pixel, and M is
the number of pixels in the frame. Now eq. (6) can be written as

    Ẑ = arg max_Z { (1/G) exp(−λ Σ_{n=0}^{M−1} Σ_{m∈N_n} ρ_T(Z[n] − Z[m]))
                    · (1 / ((2π)^{M/2} |K_ez|^{1/2})) exp(−(1/2) e_Z^T K_ez^{−1} e_Z) }    (11)

In order to maximize eq. (6), the negated argument of exp(·) in
eq. (11), i.e. λ Σ_n Σ_{m∈N_n} ρ_T(Z[n] − Z[m]) + (1/2) e_Z^T K_ez^{−1} e_Z,
should be minimized. This is performed using the method given in
[11], and subsequently the noise is extracted. Let the resulting
denoised frame be Z_n and the input frame be Z; then the compression
noise C_n present in the compressed frame is the difference between
Z_n and Z.
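The optimization above can be illustrated on a toy example. The sketch below implements the Huber function of eqs. (8)-(9) and minimizes a simplified 1-D analogue of eq. (11) by plain gradient descent, rather than the method of [11]. The signal, threshold T, weight λ, and step size are all hypothetical, and the border weight l = 1.5 is left at 1 since a 1-D signal has no 8×8 block borders.

```python
import numpy as np

T, lam = 2.0, 0.1   # illustrative Huber threshold and regularization weight

def rho(u, l=1.0):
    """Huber function of eq. (8), with weight l of eq. (9)."""
    u = np.abs(u)
    return np.where(u <= T, l * u**2, l * (T**2 + 2 * T * (u - T)))

def drho(u, l=1.0):
    """Derivative of the Huber function with respect to u."""
    return np.where(np.abs(u) <= T, 2 * l * u, 2 * l * T * np.sign(u))

# Toy 1-D analogue of eq. (11): Huber prior on neighbor differences plus a
# Gaussian likelihood with scalar variance standing in for K_ez. A real
# implementation works on 2-D frames; this sketch only shows the structure.
rng = np.random.default_rng(1)
Zq = np.repeat([10.0, 20.0], 16) + rng.normal(scale=1.0, size=32)
var = 1.0
Z = Zq.copy()
for _ in range(200):                      # plain gradient descent
    d = np.diff(Z)                        # neighbor differences Z[n+1] - Z[n]
    grad = (Z - Zq) / var                 # gradient of the likelihood term
    grad[:-1] -= lam * drho(d)            # each difference affects two pixels
    grad[1:] += lam * drho(d)
    Z -= 0.1 * grad
Cn = Zq - Z                               # extracted compression noise
```

With the minus signs of eqs. (7)-(11) in place, maximizing the posterior is exactly minimizing this likelihood-plus-prior objective, so each descent step can only decrease it.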
3.2. Markov Feature Extraction
The noise C_n can be modeled as a first-order Markov process such
that Pr(X_{t+1}) = Pr(X_{t+1} | X_t), where X_{t+1} is the present
state and X_t is the previous state. The feature we use to represent
this noise is the Transition Probability Matrix (TPM). C_n is
divided into non-overlapping blocks of 16×16 elements, and each
block is used separately to extract a TPM. The sign of each value in
a block is obtained as

    C_n(i, j) = { 0,  C_n(i, j) negative
                { 1,  C_n(i, j) zero
                { 2,  C_n(i, j) positive    (12)
This provides us with three different states with which to model a
Markov chain. The transition probability between each pair of the
three states is calculated in each of the eight directions of the
8-connected neighbourhood. The probability along the right
direction, for example, is obtained for each element as

    P_{u,v} = Pr(C_n(i, j+1) = u | C_n(i, j) = v)    (13)

where u, v ∈ [0, 2] and u, v ∈ Z. The probabilities along the other
directions are obtained similarly. The size of each TPM is 3×3,
since there are only three states and 9 possible transitions. In
total, there are 8 TPMs, each with 3×3 transition probabilities,
for every 16×16 block of a frame, which is a large amount of data.
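A minimal sketch of the state mapping of eq. (12) and the right-direction TPM of eq. (13), using a random block as a stand-in for real compression noise:

```python
import numpy as np

def noise_states(Cn):
    """Map a noise block to the three states of eq. (12):
    0 = negative, 1 = zero, 2 = positive."""
    return (np.sign(Cn) + 1).astype(int)

def tpm_right(S):
    """3x3 TPM along the right direction, eq. (13):
    P[u, v] = Pr(S[i, j+1] = u | S[i, j] = v)."""
    P = np.zeros((3, 3))
    for v in range(3):
        for u in range(3):
            P[u, v] = np.sum((S[:, :-1] == v) & (S[:, 1:] == u))
    col = P.sum(axis=0, keepdims=True)
    return P / np.where(col == 0, 1, col)  # normalize per conditioning state v

rng = np.random.default_rng(2)
block = rng.normal(size=(16, 16))          # stand-in for a 16x16 noise block
S = noise_states(block)
P = tpm_right(S)
colsums = P.sum(axis=0)
assert np.all(np.isclose(colsums, 1.0) | np.isclose(colsums, 0.0))
```

A column sums to zero only when its conditioning state never occurs in the block (here, exact zeros are absent from continuous noise); otherwise each column is a valid conditional distribution.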
In order to reduce the dimensionality of the obtained features, the
TPMs along the top, bottom, left and right directions are averaged
to get F1, as shown in eq. (14). Similarly, the TPMs along the four
diagonal directions are averaged to get F2, eq. (15), resulting in
only 2 TPMs per block per frame of a video. The two TPMs are
concatenated to get the final feature, which is 18-D; for an M×N
frame size, the dimension of the feature vector for the frame is
M/16 × N/16 × 18. The arrow subscripts below indicate the direction
along which each TPM is calculated.

    F1 = (1/4)(F_↑ + F_↓ + F_← + F_→)    (14)
    F2 = (1/4)(F_↗ + F_↘ + F_↙ + F_↖)    (15)
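The dimensionality reduction of eqs. (14)-(15) can be sketched as follows, with random column-stochastic matrices standing in for the 8 directional TPMs:

```python
import numpy as np

rng = np.random.default_rng(3)
tpms = rng.random((8, 3, 3))               # stand-ins for the 8 directional TPMs
tpms /= tpms.sum(axis=1, keepdims=True)    # make each TPM column-stochastic

F1 = tpms[:4].mean(axis=0)                 # up, down, left, right: eq. (14)
F2 = tpms[4:].mean(axis=0)                 # four diagonals: eq. (15)
feature = np.concatenate([F1.ravel(), F2.ravel()])
assert feature.shape == (18,)              # the 18-D per-block feature
```

Averaging column-stochastic matrices keeps them column-stochastic, so F1 and F2 remain valid transition probability matrices; stacking the per-block features over an M×N frame yields the (M/16)(N/16)×18-dimensional frame feature.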
4. EXPERIMENTAL RESULTS
The video files are obtained from various open sources [14] in
4:2:0 Common Intermediate Format (CIF) at a resolution of 352×288.
Sixteen different sequences of 300 frames each are taken and
encoded using the 'ffmpeg' MPEG encoder. Details of MPEG-2 video
detection are given here, and those of MPEG-4 are discussed in
Section 4.3. The encoding sequence is "IPPPP", and these 5 frames
constitute a single Group of Pictures (GOP). All the clips were
first encoded in Variable Bit Rate mode with a quality scale factor
(QF) ranging from 2 to 15. In order to simulate forgery, 28 frames
(frames 221 to 248) are deleted from the middle of the
single-compressed videos. These videos are then compressed again
with different scaling factors. Each YUV sequence is divided into
10 clips of 30 frames (6 GOPs) each. In total, 160 clips are
considered for each scale factor pair (single compression scale
factor QF1 and double compression scale factor QF2), resulting in
160 × 162 (the total number of pairs) = 25920 clips. In Table 1,
values for scale factors such as 5, 7, 8, 11 and 12 are omitted due
to space constraints and to cover a broader range of values;
however, these scale factors also give results similar to those in
the table.
4.1. Classification
For each compression pair in Table 1, the total number of samples
available is 320 (160 each for single and double compression). 50%
of the samples were used to train an SVM with a linear kernel, the
other parameters being set to their defaults [15]; the remaining
50% were used for testing. It was ensured that a sequence present
in the training samples was not part of the testing samples. The
experiment was repeated 10 times, changing the training and testing
samples each time while maintaining the 50-50 ratio. Each frame is
classified individually, and a voting mechanism is applied: when
the fraction of frames in a given clip classified as
authentic/single compressed is above a threshold th = 0.5 ([6]),
the clip is classified as single compressed. Similarly, the clip is
classified as forged when the fraction of frames classified as
forged/double compressed is above th.
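The voting rule can be sketched as below; the per-frame labels are hypothetical SVM outputs, with 1 denoting double compressed/forged:

```python
import numpy as np

def classify_clip(frame_labels, th=0.5):
    """Clip-level decision by voting over per-frame SVM outputs.
    frame_labels: array of 0 (single compressed) / 1 (double compressed).
    Returns 1 (forged) if the forged fraction exceeds th, else 0 (authentic)."""
    forged_fraction = np.mean(frame_labels)
    return 1 if forged_fraction > th else 0

# Hypothetical per-frame decisions for two 30-frame clips.
assert classify_clip(np.array([1] * 20 + [0] * 10)) == 1   # mostly forged
assert classify_clip(np.array([1] * 10 + [0] * 20)) == 0   # mostly authentic
```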
4.2. Performance Comparison
Classification accuracy for each compression pair is given in
Table 1. Here, the accuracy is given as (TPR + TNR)/2, where TPR
is the ratio of correctly classified forged clips to the total
number of forged clips, and TNR is the ratio of correctly
classified authentic clips to the total number of authentic clips.
It is observed that the accuracy is more than 95% except for a very
few pairs such as 9-10, 13-14, 14-15 and 14-13, which are still
considerably high. Further, it is also observed that the accuracy
is 100% for most of the pairs in the lower left of Table 1, as well
as for a few in the upper right.
QF1\QF2     2      3      4      6      9     10     13     14     15
   2        x     94     95     97    98.9   99.4   100    100    100
   3       97      x    95.8   96.3   98.5   99.7   100    100    100
   4       96    95.3     x    93     96.8   97.3   100    100    100
   6       99.4  96.5   94      x     97.6   97.4   100    100    100
   9      100    99.2   98.6   95.4     x    83.4   90.2   97.9   100
  10      100   100     97.5   97.5   94.6     x    88     93.6    95
  13      100   100    100    100     95.2   93.8     x    82.8   82.4
  14      100   100    100    100    100     94.2   85.3     x    75.6
  15      100   100    100    100    100    100     84.5   85.8     x

Table 1. Accuracy rate (%) for various compression pairs
In [2], [6], [7] and [8], the authors point out that detection
becomes harder when the double compression scale factor QF2 is a
multiple of the single compression scale factor QF1. The proposed
method is able to classify a given video sequence as authentic or
forged irrespective of whether it was compressed with a QF2 that is
an odd or even multiple of QF1. A comparison of classification
accuracy between the proposed method, for both MPEG-4 and MPEG-2
videos, and previous methods such as [2] for MPEG-2 and [6] for
MPEG-4 is given in Table 2. It is evident from Table 2 that the
proposed scheme gives a significant improvement in the odd-multiple
case. In the even-multiple case, the performance is also better
than that of [2, 6].
The ROC curve for the proposed method of classification is given
in Figure 3. Here, FPR is the ratio of authentic clips classified
as forged to the total number of authentic clips. The plot shows
that a high TPR can be achieved while maintaining a very low FPR.
The better performance of our proposed scheme arises because the
characteristics of the noise extracted from a single-compressed
frame differ from those of a double-compressed frame. This
difference in spatial noise characteristics is effectively captured
by the Markov modeling of the noise.
4.3. Discussion
The proposed scheme is also tested on YUV sequences encoded with
the standard sequence, i.e. 'IBBPBBPBBPBB'. Also, apart from
deleting a certain number of frames from a video sequence,
forgeries such as 'copy-paste', 'scaling' and 'interchanging GOPs
or certain frames randomly' are also considered. Since it is the
double compression that is being detected, changing the forgery
type would theoretically give similar detection accuracy. In
addition, we tested our algorithm on videos compressed with an
MPEG-4 Part 2 encoder and found that the detection accuracy is
similar to that of Table 1. Further, as the proposed technique
detects double compression based on compression noise, it can
detect forgery in videos encoded using any of the MPEG coding
techniques, such as MPEG-4 Part 10.
Method          Proposed      Markov for DCT       First digit
                (MPEG-2/4)    coefficients [6]     statistics [6, 2]
                              (MPEG-4)             (MPEG-2)
Odd Multiple    98.98%        51.53%               50.32%
Even Multiple   98.62%        96.28%               59.46%

Table 2. Detection accuracy comparison
Fig. 3. ROC curve showing the true positive and false positive
rates for three different scaling factors and the average.
5. CONCLUSION
An efficient method to detect forgeries in video by detecting
double compression is proposed. The effectiveness of this method is
threefold. First, the detection accuracy is above 95% for most
scale factor pairs and, in many cases, as high as 100%. Second,
modeling the compression noise as a Markov process clearly
characterizes the form of compression, i.e., whether it is single
or double. Third, the method detects double compression in both
MPEG-2 and MPEG-4 videos, as validated through the experimental
results. Further, the proposed algorithm performs better than most
existing techniques. In future work, we intend to perform
localization of tampering in a video, in terms of GOPs, frames, or
regions as small as a macroblock.
6. ACKNOWLEDGEMENT
The authors thank the ‘Cybersecurity Education and Research
Centre (CERC)’, IIITD for partially funding this work.
7. REFERENCES
[1] http://mine.csie.ncu.edu.tw/core/en/group video, "Multimedia
Information Networking Laboratory," online.
[2] W. Chen and Y. Shi, "Detection of double MPEG compression
based on first digit statistics," LNCS, Digital Watermarking,
vol. 5450, pp. 16–30, 2009.
[3] C.C. Hsu, T.Y. Hung, C.W. Lin, and C.T. Hsu, "Video forgery
detection using correlation of noise residue," in Proc. 10th IEEE
Workshop on Multimedia Signal Processing, 2008, pp. 170–174.
[4] M. Kobayashi, T. Okabe, and Y. Sato, "Detecting video
forgeries based on noise characteristics," LNCS, Advances in Image
and Video Technology, vol. 5414, pp. 306–317, 2009.
[5] W. Wang and H. Farid, "Exposing digital forgeries in video by
detecting double quantization," in Proc. 11th ACM Workshop on
Multimedia and Security, 2009, pp. 39–48.
[6] Xinghao Jiang, Wan Wang, Tanfeng Sun, and Yun Q. Shi,
"Detection of double compression in MPEG-4 videos based on Markov
statistics," IEEE Signal Processing Letters, vol. 20, pp. 447–450,
May 2013.
[7] T.-F. Sun, W. Wang, and X.-H. Jiang, "Exposing video forgeries
by detecting MPEG double compression," in Proc. IEEE International
Conference on Acoustics, Speech and Signal Processing, 2012,
pp. 1389–1392.
[8] B. Li, Y.-Q. Shi, and J.-W. Huang, "Detecting doubly
compressed JPEG images by using mode based first digit features,"
in Proc. IEEE International Workshop on Multimedia Signal
Processing (MMSP), October 2008, pp. 730–735.
[9] Paolo Bestagini, Simone Milani, Marco Tagliasacchi, and
Stefano Tubaro, "Local tampering detection in video sequences," in
Proc. IEEE 15th International Workshop on Multimedia Signal
Processing (MMSP), 2013, pp. 488–493.
[10] A.V. Subramanyam and Sabu Emmanuel, "Video forgery detection
using HOG features and compression properties," in Proc. IEEE 14th
International Workshop on Multimedia Signal Processing (MMSP),
2012, pp. 89–94.
[11] M.A. Robertson and R.L. Stevenson, "DCT quantization noise in
compressed images," IEEE Transactions on Circuits and Systems for
Video Technology, vol. 15, pp. 27–38, January 2005.
[12] C.-H. Chen, Y.-Q. Shi, and W. Su, "A machine learning based
scheme for double JPEG compression detection," in Proc. IEEE
International Conference on Pattern Recognition, 2008,
pp. 1814–1817.
[13] T. Pevny, P. Bas, and J. Fridrich, "Steganalysis by
subtractive pixel adjacency matrix," IEEE Transactions on
Information Forensics and Security, vol. 5, pp. 215–224, June 2010.
[14] www.xiph.org and www.trace.eas.asu.edu/yuv/, "Video test
media, Derf's collection and YUV sequences," online.
[15] Chih-Chung Chang and Chih-Jen Lin, "LIBSVM: A library for
support vector machines," ACM Transactions on Intelligent Systems
and Technology, vol. 2, pp. 27:1–27:27, 2011.
... InDong et al. (2012) the authors used a motion-compensated edge artifact (MCEA) difference between adjoining frames and inspect whether there are any spikes in the Fourier transform domain for forgery detection. The authors inRavi et al. (2014) used a modified Huber Markov random field (HMRF) to extract compression noise from the spatial domain. The transition probability matrices of the noise were used as features to classify a video. ...
Article
Full-text available
With the explosive advancements in smartphone technology, video uploading/downloading has become a routine part of digital social networking. Video contents contain valuable information as more incidents are being recorded now than ever before. In this paper, we present a comprehensive survey on information extraction from video contents and forgery detection. In this context, we review various modern techniques such as computer vision and different machine learning (ML) algorithms including deep learning (DL) proposed for video forgery detection. Furthermore, we discuss the persistent general, resource, legal, and technical challenges, as well as challenges in using DL for the problem at hand, such as the theory behind DL, CV, limited datasets, real-time processing, and the challenges with the emergence of ML techniques used with the Internet of Things (IoT)-based heterogeneous devices. Moreover, this survey presents prominent video analysis products used for video forensics investigation and analysis. In summary, this survey provides a detailed and broader investigation about information extraction and forgery detection in video contents under one umbrella, which was not presented yet to the best of our knowledge.
... Furthermore, whenever a video sequence is compressed twice, it is possible to observe some peculiar noise patterns. In [42] the authors propose a firstorder Markov statistics for the differences between quantized discrete cosine transform (DCT) coefficients along different directions, while the solution in [43] employs a modified Huber Markov random field (HMRF) model; these methods permit assessing whether a whole sequence of frames is authentic or tampered (e.g., compressed twice) but, differently from the proposed one, do not allow to precisely localize a forgery operation. ...
Article
Full-text available
Forgery operations on video contents are nowadays within the reach of anyone, thanks to the availability of powerful and user-friendly editing software. Integrity verification and authentication of videos represent a major interest in both journalism (e.g., fake news debunking) and legal environments dealing with digital evidence (e.g., courts of law). While several strategies and different forensics traces have been proposed in recent years, latest solutions aim at increasing the accuracy by combining multiple detectors and features. This paper presents a video forgery localization framework that verifies the self-consistency of coding traces between and within video frames by fusing the information derived from a set of independent feature descriptors. The feature extraction step is carried out by means of an explainable convolutional neural network architecture, specifically designed to look for and classify coding artifacts. The overall framework was validated in two typical forgery scenarios: temporal and spatial splicing. Experimental results show an improvement to the state of the art on temporal splicing localization as well as promising performance in the newly tackled case of spatial splicing, on both synthetic and real-world videos.
... Then in the second stage, several forgeries such as partial manipulation, video alternation and upscale-crop are identified by computing the scalar factor and correlation coefficient. Video forgery detection technique is proposed by Ravi et al. [87] for frame deletion and copy-move forgery by identifying double compression. The compression noise is used as a feature which is extracted from the video frames by the modified Huber Markov Random Field (HMRF) prior model. ...
Article
Full-text available
Digital videos are one of the most widespread forms of multimedia in day to day life. These are widely transferred over social networking websites such as Facebook, Instagram, WhatsApp, YouTube, etc. through the Internet. Availability of modern and easy to use editing tools have facilitated the modification of the contents of the digital videos. Therefore, it has become an essential concern for the legitimacy, trustworthiness, and authenticity of these digital videos. Digital video forgery detection aims to identify the manipulations in the video and to check its authenticity. These techniques can be divided into active and passive techniques. In this paper, a comprehensive survey on video forgery detection using passive techniques have been presented. The primary goal of this survey is to study and analyze the existing passive video forgery detection techniques. Firstly, the preliminary information required for understanding video forgery detection is presented. Later, a brief survey of existing passive video forgery detection techniques based on the features, forgery identified, datasets used, and performance parameters detail along with their limitations are reviewed. Then, anti-forensics strategy and deepfake detection in the video are discussed. After that, standard benchmark video forgery datasets and the generalized architecture for passive video forgery detection techniques are discussed. Finally, few open challenges in the field of passive video forgery detection are also described.
... Similarly, the simultaneous presence of traces related to incompatible coding parameters or formats is investigated in several papers [8,40,41]. Furthermore, whenever a video sequence is compressed twice, it is possible to observe some peculiar noise patterns: in [42] the authors propose a first-order Markov statistics for the differences between quantized discrete cosine transform (DCT) coefficients along different directions, while the solution in [43] employs a modified Huber Markov random field (HMRF) model. These methods enables to assess whether a whole sequence of frames is authentic or tampered (e.g. ...
Preprint
Forgery operations on video contents are nowadays within the reach of anyone, thanks to the availability of powerful and user-friendly editing software. Integrity verification and authentication of videos represent a major interest in both journalism (e.g., fake news debunking) and legal environments dealing with digital evidence (e.g., a court of law). While several strategies and different forensics traces have been proposed in recent years, latest solutions aim at increasing the accuracy by combining multiple detectors and features. This paper presents a video forgery localization framework that verifies the self-consistency of coding traces between and within video frames, by fusing the information derived from a set of independent feature descriptors. The feature extraction step is carried out by means of an explainable convolutional neural network architecture, specifically designed to look for and classify coding artifacts. The overall framework was validated in two typical forgery scenarios: temporal and spatial splicing. Experimental results show an improvement to the state-of-the-art on temporal splicing localization and also promising performance in the newly tackled case of spatial splicing, on both synthetic and real-world videos.
... They proposed that MPEG compression introduces various block artifacts into unlike frames. Therefore, when a number of frames are removed from an MPEG video file and the file is re-compressed, the block artifacts informed by the previous compression rest and influence the average of block artifact intensity of the re-compressed one, which provides evidence of tampering.Ravi et al[64] proposed a technique to detect forgery in MPEG videos by using the Huber Markov Random Field Model. Their method analyzes the frame's compression noise properties which are extracted from spatial domain. ...
Thesis
In recent years due to advancement in video and image editing tools, it has become increasingly easy to modify multimedia content. Doctored videos are very difficult to identify through visual examination as artifacts left behind by processing steps are subtle and cannot be easily captured visually. Therefore, the integrity of digital videos can no longer be taken for granted and these are not readily acceptable as a proof-of-evidence in a court-of-law. Hence, identifying the authenticity of videos has become an important field of information security. This thesis presents an approach to detect and temporally localize video forgery based on the correlation of noise residue using the Discrete Wavelet Transformation (DWT) algorithm for de-noising. The proposed algorithm is tested on public datasets, such as SULFA, which are used for performance evaluation. The results show that the approach is effective against manipulation techniques. In addition, it detects and localizes tampered frames in a video with high accuracy. Keywords: Discrete Wavelet Transformation, Video Forensic, Video Forgery, correlation of noise residue.
Preprint
In this paper, we review recent work in media forensics for digital images, video, audio (specifically speech), and documents. For each data modality, we discuss synthesis and manipulation techniques that can be used to create and modify digital media. We then review technological advancements for detecting and quantifying such manipulations. Finally, we consider open issues and suggest directions for future research.
Article
Full-text available
With the advent of Internet, images and videos are the most vulnerable media that can be exploited by criminals to manipulate for hiding the evidence of the crime. This is now easier with the advent of powerful and easily available manipulation tools over the Internet and thus poses a huge threat to the authenticity of images and videos. There is no guarantee that the evidences in the form of images and videos are from an authentic source and also without manipulation and hence cannot be considered as strong evidence in the court of law. Also, it is difficult to detect such forgeries with the conventional forgery detection tools. Although many researchers have proposed advance forensic tools, to detect forgeries done using various manipulation tools, there has always been a race between researchers to develop more efficient forgery detection tools and the forgers to come up with more powerful manipulation techniques. Thus, it is a challenging task for researchers to develop h a generic tool to detect different types of forgeries efficiently. This paper provides the detailed, comprehensive and systematic survey of current trends in the field of image and video forensics, the applications of image/video forensics and the existing datasets. With an in-depth literature review and comparative study, the survey also provides the future directions for researchers, pointing out the challenges in the field of image and video forensics, which are the focus of attention in the future, thus providing ideas for researchers to conduct future research.
Article
Video copy-move forgery detection (VCMFD) is a significant and greatly challenging task owing to a variety of difficulties, including the huge amount of video information, diverse forgery types, rich forgery objects, and homogeneous forgery sources. These difficulties raise four unresolved key challenges in VCMFD: i) ineffective detection in some popular forgery cases; ii) inefficient matching when processing numerous video pixels with hundred-dimensional features over dozens of matching iterations; iii) high false positive (FP) rates in detecting forgery videos; iv) a poor trade-off between efficiency and effectiveness in filling the forgery region, and even failure to indicate forgeries at the pixel level. In this paper, a novel VCMFD method is proposed to address these issues: i) an innovatively improved SIFT structure that supports thorough feature extraction in all video copy-move forgery cases; ii) a novel fast keypoint-label matching (FKLM) algorithm that creates keypoint-label groups so that every high-dimensional feature is assigned to one of these groups; matching of video pixels can then be performed on a small number of keypoint-label groups only, leading to a nearly 500% increase in matching efficiency; iii) a new coarse-to-fine filtering scheme relying on intrinsic attributes of exact keypoint matches, designed to more effectively reduce false keypoint matches; iv) adaptive block filling relying on true keypoint matches, which contributes to accurate and efficient suspicious-region filling, even at the pixel level. Finally, the suspicious region locations, together with the forgery vision persistence concept, indicate forgery videos. Experiments show that, compared to state-of-the-art methods, the proposed method achieves the best detection accuracy and lowest FP rate, improving F1 scores by at least 16% on the GRIP 2.0 dataset and 8% on a combination of the SULFA 2.0 and REWIND datasets.
Furthermore, the proposed method has a low computational time (4.45 s/Mpixel), roughly one-half to one-third that of the recent DFMI-BM (8.02 s/Mpixel) and PM-2D (13.1 s/Mpixel) methods.
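The keypoint-label grouping idea behind FKLM can be sketched in a few lines: assign each high-dimensional descriptor a coarse label, then match only within the group sharing that label instead of against every other descriptor. The labeling rule below (quantizing the first three descriptor dimensions) is a hypothetical stand-in for the paper's actual construction, shown only to illustrate why group-wise matching is cheap.

```python
# Hypothetical sketch of keypoint-label matching (FKLM-style bucketing).
# The labeling rule (quantizing the first three descriptor values) is an
# illustrative assumption, not the paper's exact construction.
from collections import defaultdict

def label_of(desc, step=50):
    # Coarse label: quantize the first three descriptor dimensions.
    return tuple(int(v // step) for v in desc[:3])

def fklm_match(descs, threshold=10.0):
    """Match descriptors only inside shared label groups."""
    groups = defaultdict(list)
    for idx, d in enumerate(descs):
        groups[label_of(d)].append(idx)
    matches = []
    for members in groups.values():
        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                a, b = descs[members[i]], descs[members[j]]
                dist = sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
                if dist < threshold:
                    matches.append((members[i], members[j]))
    return matches

descs = [[10, 20, 30, 5], [12, 21, 29, 6], [200, 90, 40, 1]]
print(fklm_match(descs))  # → [(0, 1)]: only the first two share a label
```

Because the expensive pairwise distance is computed only within each bucket, the cost drops from quadratic in the total keypoint count to quadratic in the (much smaller) bucket sizes.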
Article
This paper proposes a video copy-move forgery detection method that effectively addresses inter/intra-frame forgeries at both the frame and pixel level. First, a unified moment framework is proposed to extract multi-dimensional dense moment features from the video. Second, a novel feature representation method takes the sub-map index of each feature dimension and concatenates them into a 9-digit dense moment feature index. Third, an inter-frame best-match algorithm searches the 9-digit dense moment feature index of each pixel to find its best matches; all the best matches construct the best match map. Fourth, an inter-frame post-processing algorithm first identifies inter-frame forgery videos from the best match map and then indicates the corresponding inter-frame forgery regions. Otherwise, an intra-frame post-processing algorithm re-searches the best match of every pixel in each independent frame and then indicates the intra-frame forgery regions. If the video does not contain intra-frame forgeries either, it is determined to be genuine. Experimental results show that the proposed method is effective at distinguishing genuine from forged video and at locating inter/intra-frame copy-move forgeries at both the frame and pixel level.
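The 9-digit feature-index idea can be illustrated with a small sketch: if each of nine moment values is reduced to a single decimal digit and the digits are concatenated, candidate matches can be found by exact string lookup rather than high-dimensional distance search. The quantization rule below is an assumption for demonstration only.

```python
# Illustrative sketch of a 9-digit dense feature index: each of nine
# moment values is quantized to one decimal digit and concatenated, so
# exact-match lookup replaces high-dimensional distance search.
# The quantization rule is an assumption for demonstration.

def feature_index(moments, lo=0.0, hi=1.0):
    """Map nine moment values in [lo, hi) to a 9-digit string index."""
    assert len(moments) == 9
    digits = []
    for m in moments:
        d = int((m - lo) / (hi - lo) * 10)
        digits.append(str(min(max(d, 0), 9)))
    return "".join(digits)

def best_match_map(pixel_moments):
    """Group pixel positions that share the same 9-digit index."""
    table = {}
    for pos, moms in pixel_moments.items():
        table.setdefault(feature_index(moms), []).append(pos)
    return {k: v for k, v in table.items() if len(v) > 1}

pm = {(0, 0): [0.11] * 9, (5, 7): [0.12] * 9, (3, 3): [0.95] * 9}
print(best_match_map(pm))  # the first two pixels collide on one index
```

The design choice here is the classic speed/precision trade: coarser digits give more collisions (more candidates, fewer misses), finer digits give fewer.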
Conference Paper
In this paper, we propose a novel video forgery detection technique to detect spatial and temporal copy-paste tampering. Detecting copy-paste tampering in videos is challenging because the forged patch may vary drastically in size, compression rate, and compression type (I, B, or P frames), or undergo other changes such as scaling and filtering. In our proposed algorithm, copy-paste forgery detection is based on Histogram of Oriented Gradients (HOG) feature matching and video compression properties. The benefit of using HOG features is that they are robust against various signal processing manipulations. Experimental results show that the forgery detection performance is highly effective, and we compare our results against a popular copy-paste forgery detection algorithm. In addition, we analyze the experimental results for different forged patch sizes under varying degrees of modification such as compression, scaling, and filtering.
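The core of a HOG descriptor is an orientation histogram of image gradients. Below is a toy single-cell computation, assuming an unsigned 9-bin histogram over 0–180°; a real HOG pipeline (as used in methods like the one above) additionally normalizes over overlapping blocks of cells.

```python
import math

# Minimal sketch of a HOG-style descriptor for one cell, assuming an
# unsigned 9-bin orientation histogram; real HOG adds block
# normalization and overlapping cells.

def hog_cell(patch, bins=9):
    """Orientation histogram of gradients over a 2D grayscale patch."""
    h = [0.0] * bins
    rows, cols = len(patch), len(patch[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]  # central differences
            gy = patch[y + 1][x] - patch[y - 1][x]
            mag = math.hypot(gx, gy)
            ang = math.degrees(math.atan2(gy, gx)) % 180  # unsigned
            h[int(ang / 180 * bins) % bins] += mag
    return h

# A vertical edge produces horizontal gradients, so all mass lands in bin 0.
patch = [[0, 0, 9, 9]] * 4
print(hog_cell(patch))  # → [36.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

Because the histogram depends only on gradient orientation and relative magnitude, it is largely insensitive to smooth intensity changes, which is the robustness property the abstract relies on.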
Article
With the spread of powerful and easy-to-use video editing software, digital videos are exposed to various forms of tampering. Nowadays, a considerable proportion of surveillance systems and video cameras have a built-in MPEG-4 codec. Therefore, the detection of double compression in MPEG-4 videos, as a first step in video forensics research, is of significance. In this paper, Markov-based features are adopted to detect double compression artifacts, which imply that the original video may have been manipulated. The advantages and limitations of double MPEG-4 compression detection are analyzed. Experimental results demonstrate that our scheme outperforms most existing methods.
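Markov-feature schemes of this kind typically start by forming difference arrays of coefficient magnitudes, which enhances the small perturbations double compression leaves behind. A minimal sketch of the horizontal difference step (other directions work the same way):

```python
# Sketch of the difference-array step: subtract a shifted copy of the
# coefficient-magnitude array to enhance double-compression artifacts.
# The direction (dy, dx) selects horizontal, vertical, or diagonal shifts.

def diff_array(coefs, dy=0, dx=1):
    """D[i][j] = |c[i][j]| - |c[i+dy][j+dx]| over the valid region."""
    rows, cols = len(coefs), len(coefs[0])
    out = []
    for i in range(rows - dy):
        row = []
        for j in range(cols - dx):
            row.append(abs(coefs[i][j]) - abs(coefs[i + dy][j + dx]))
        out.append(row)
    return out

c = [[4, -2, 1], [0, 3, -5]]
print(diff_array(c))  # → [[2, 1], [-3, -2]]
```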
Conference Paper
Double JPEG compression detection is of significance in digital forensics. We propose an effective machine-learning-based scheme to distinguish between double and single JPEG compressed images. First, difference JPEG 2D arrays, i.e., the differences between the magnitude of the JPEG coefficient 2D array of a given JPEG image and its shifted versions along various directions, are used to enhance double JPEG compression artifacts. A Markov random process is then applied to model the difference 2D arrays so as to utilize second-order statistics. In addition, a thresholding technique is used to reduce the size of the transition probability matrices, which characterize the Markov random processes. All elements of these matrices are collected as features for double JPEG compression detection. A support vector machine is employed as the classifier. Experiments demonstrate that our proposed scheme outperforms the prior art.
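The thresholding and transition-probability-matrix steps can be sketched as follows. Difference values are clipped to [-T, T] so the matrix stays (2T+1)×(2T+1), and each entry estimates P(next = n | current = m) along one scan direction. T = 4 is an assumed threshold here, not necessarily the value used in the paper.

```python
from collections import Counter

# Sketch of a thresholded Markov transition probability matrix: clip
# difference values to [-T, T], then estimate P(next = b | current = a)
# along the horizontal direction. T = 4 is an assumption.

def transition_matrix(diff, T=4):
    size = 2 * T + 1
    counts = Counter()
    totals = Counter()
    for row in diff:
        clipped = [max(-T, min(T, v)) for v in row]
        for a, b in zip(clipped, clipped[1:]):
            counts[(a, b)] += 1
            totals[a] += 1
    tpm = [[0.0] * size for _ in range(size)]
    for (a, b), n in counts.items():
        tpm[a + T][b + T] = n / totals[a]  # row-normalized probabilities
    return tpm

diff = [[0, 0, 5, -7, 0]]       # 5 and -7 get clipped to 4 and -4
tpm = transition_matrix(diff)
print(tpm[0 + 4][0 + 4])        # → 0.5, i.e. P(0 -> 0)
```

Flattening the matrix gives a fixed-length feature vector (81 values for T = 4 per direction) suitable for an SVM.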
Conference Paper
It is a challenge to prove whether or not a digital video has been tampered with. In this paper, we propose a novel approach to detecting double MPEG compression, which often occurs in digital video tampering. A doubly MPEG compressed video will exhibit different intrinsic characteristics from an MPEG video that is compressed only once. Specifically, the probability distribution of the first digits of the non-zero MPEG quantized AC coefficients will be disturbed. This statistical disturbance is a good indication of the occurrence of double video compression, and may be used as tampering evidence. Since MPEG video consists of I, P, and B frames and double compression may occur in any or all of these frame types, the first digit probabilities in frames of these three types are chosen as the first part of the distinguishing features to capture the changes caused by double compression. In addition, the fit of the first digit distribution to a parametric logarithmic law is tested, and the goodness-of-fit statistics are selected as the second part of the distinguishing features. We propose a decision rule using the group of pictures (GOP) as the detection unit. The proposed detection scheme can effectively detect doubly MPEG compressed videos in both variable bit rate (VBR) and constant bit rate (CBR) modes.
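The first-digit statistic underlying this family of methods is easy to sketch: collect the leading digit of each non-zero quantized AC coefficient and compare the empirical probabilities with the parametric logarithmic (Benford-like) law p(d) = log10(1 + 1/d). The coefficient list below is illustrative.

```python
import math

# Sketch of the first-digit statistic: empirical probabilities of the
# leading digit of non-zero quantized AC coefficients, compared with the
# parametric logarithmic law p(d) = log10(1 + 1/d).

def first_digit_probs(ac_coefs):
    counts = [0] * 9               # digits 1..9
    total = 0
    for c in ac_coefs:
        if c == 0:
            continue
        d = int(str(abs(c))[0])    # leading digit of |c|
        counts[d - 1] += 1
        total += 1
    return [n / total for n in counts]

benford = [math.log10(1 + 1 / d) for d in range(1, 10)]
probs = first_digit_probs([12, -3, 0, 145, 7, -19, 2])
print(probs[0])    # → 0.5: half the non-zero coefficients start with 1
print(benford[0])  # ≈ 0.301 under the logarithmic law
```

In a singly compressed video the empirical curve tracks the logarithmic law closely; double compression disturbs it, and the deviation (goodness of fit) becomes a feature.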
Conference Paper
The recent development of video editing techniques enables us to create realistic synthesized videos. Therefore using video data as evidence in places such as a court of law requires a method to detect forged videos. In this paper we propose an approach to detect suspicious regions in video recorded from a static scene by using noise characteristics. The image signal contains irradiance-dependent noise where the relation between irradiance and noise depends on some parameters; they include inherent parameters of a camera such as quantum efficiency and a response function, and recording parameters such as exposure and electric gain. Forged regions from another video camera taken under different conditions can be differentiated when the noise characteristics of the regions are inconsistent with the rest of the video.
Conference Paper
Video sequences are often believed to provide stronger forensic evidence than still images, e.g., when used in lawsuits. However, a wide set of powerful and easy-to-use video authoring tools is today available to anyone. Therefore, it is possible for an attacker to maliciously forge a video sequence, e.g., by removing or inserting an object in a scene. These forms of manipulation can be performed with different techniques. For example, a portion of the original video may be replaced by either a still image repeated in time or, in more complex cases, by a video sequence. Moreover, the attacker might use as source data either a spatio-temporal region of the same video, or a region taken from an external sequence. In this paper we present the analysis of the footprints left when tampering with a video sequence, and propose a detection algorithm that allows a forensic analyst to reveal video forgeries and localize them in the spatio-temporal domain. With respect to the state-of-the-art, the proposed method is completely unsupervised and proves to be robust to compression. The algorithm is validated against a dataset of forged videos available online.
Conference Paper
In this paper, an improved video tampering detection model based on MPEG double compression is proposed. Double compression introduces disturbance into the Discrete Cosine Transform (DCT) coefficients, reflected in violations of the parametric logarithmic law for the first digit distribution of quantized Alternating Current (AC) coefficients. A 12-D feature is extracted from each group of pictures (GOP), and a machine learning framework is adopted to enhance detection accuracy. Furthermore, a novel approach with a serial Support Vector Machine (SVM) architecture is proposed to estimate the original bit rate scale of a doubly compressed video. Experiments demonstrate higher accuracy and effectiveness.
Article
LIBSVM is a library for support vector machines (SVM). Its goal is to help users to easily use SVM as a tool. In this document, we present all its implementation details. For the use of LIBSVM, the README file included in the package and the LIBSVM FAQ provide the information.
Article
We describe a technique for detecting double quantization in digital video that results from double MPEG compression or from combining two videos of different qualities (e.g., green-screening). We describe how double quantization can introduce statistical artifacts that, while not visible, can be quantified, measured, and used to detect tampering. This technique can detect highly localized tampering in regions as small as 16 × 16 pixels.
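A toy illustration of why double quantization leaves measurable statistical artifacts: quantizing with step q1 and then re-quantizing with step q2 produces periodic gaps and peaks in the histogram of the final coefficients that single quantization never produces. The steps q1 = 5 and q2 = 3 are arbitrary choices for demonstration.

```python
from collections import Counter

# Toy illustration of the double-quantization artifact: quantizing with
# step q1 = 5 and then q2 = 3 leaves periodic gaps in the histogram of
# the final coefficients that single quantization does not produce.

def quantize(values, q):
    return [round(v / q) for v in values]

values = list(range(0, 100))
single = quantize(values, 3)
double = quantize([v * 5 for v in quantize(values, 5)], 3)  # q1=5, then q2=3

h_single = Counter(single)
h_double = Counter(double)
# Bins 1, 4, 6, ... are empty after double quantization but populated
# after single quantization; uneven counts reveal requantization.
print(sorted(h_double.items())[:8])
```

A detector only needs to check whether the coefficient histogram shows this characteristic comb pattern, which is why the technique works on regions as small as a single 16 × 16 block.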
Conference Paper
In this paper, we utilize the probabilities of the first digits of quantized DCT (discrete cosine transform) coefficients from individual AC (alternating current) modes to detect doubly compressed JPEG images. Our proposed features, named mode-based first digit features (MBFDF), are shown to outperform all previous methods at discriminating doubly compressed JPEG images from singly compressed ones. Furthermore, combining the MBFDF with a multi-class classification strategy can be exploited to identify the quality factor of the primary JPEG compression, thus successfully revealing the double JPEG compression history of a given JPEG image.