Source Camera Attribution Using Stabilized Video
Samet Taspinar
Center for Cyber Security
New York University Abu Dhabi
Abu Dhabi, UAE
Email: st89@nyu.edu
Manoranjan Mohanty
Center for Cyber Security
New York University Abu Dhabi
Abu Dhabi, UAE
Email: mm8372@nyu.edu
Nasir Memon
Department of Computer Science and Engineering
New York University
New York, USA
Email: memon@nyu.edu
Abstract—Although PRNU (Photo Response Non-Uniformity)-
based methods have been proposed to verify the source camera
of a non-stabilized video, these methods may not be adequate
for stabilized videos. The use of video stabilization has been
increasing in recent years with the development of novel stabilization software and the availability of stabilization in smart-
phone cameras. This paper presents a PRNU-based source
camera attribution method for out-of-camera stabilized video
(i.e., stabilization applied after the video is captured). The scheme
can (i) automatically determine if a given video is stabilized,
(ii) calculate the fingerprint from a stabilized video, and (iii)
effectively correlate the fingerprint computed from a stabilized
video (i.e., anonymous video) with a fingerprint computed from
another stabilized or non-stabilized video (i.e., a known video).
Furthermore, experimental results show that the source camera
of an anonymous non-stabilized video can be verified using a
fingerprint computed from a set of images.
I. INTRODUCTION
PRNU (Photo Response Non-Uniformity) based source
camera attribution is an effective method to verify the camera
used to capture an image [1] [2]. This technique is based on
the PRNU noise pattern of the camera, which results from the
non-uniform response of the individual pixels of the camera
sensor to light intensity. Using this technique, a fingerprint of
the camera is first computed from the estimated PRNU noise
of a set of images taken by the camera. Then this fingerprint is
correlated with the estimated PRNU noise of the query image
to determine if the camera has taken the image.
Extending the PRNU-based method to video (by considering the video frames as individual images) is not straightforward and can present multiple challenges. One of the main
challenges is video stabilization techniques that employ an
affine transformation [3] [4] on individual frames and result
in misalignment of individual pixels across frames. That is, the (i, j)th pixel in two different frames may not correspond to the same sensor element. PRNU-based camera attribution cannot be made using misaligned frames, as neither a camera fingerprint can be effectively computed from misaligned frames nor can the PRNU noise of a misaligned frame be effectively correlated with a fingerprint [5]–[7].
Stabilized videos are becoming increasingly common given the growing adoption of video stabilization in smart-phone cameras. Major smart-phone manufacturers, such as Samsung, Apple, and LG, provide video stabilization in their latest smart-phone models. Similarly, video stabilization software, such as the Adobe Warp Stabilizer VFX, the YouTube stabilizer, and the FFMPEG deshaker, is also available to stabilize a video outside the camera (i.e., to stabilize an already captured video).
This paper presents a PRNU-based source camera attribution method for stabilized videos that can (i) automatically determine if a video is stabilized, (ii) compute the camera fingerprint from a stabilized video, and (iii) correlate the fingerprint of a stabilized video (i.e., an anonymous video) with the fingerprint of another non-stabilized/stabilized video (i.e., a known video). We consider software-based out-of-camera
stabilization, e.g., FFMPEG deshaker. Given a video, we
automatically check its stability by correlating the fingerprint
computed from a set of frames of the video with the fingerprint
computed from another disjoint set of frames of the same
video. The video is considered as stabilized if the correlation
result is below a threshold. We compute the fingerprint of
a stabilized video by aligning the misaligned frames using
an inverse affine transformation (which is a combination of
inverse rotation and inverse translation). An inverse affine transformation is also used to correlate the fingerprint computed
from a stabilized video with the fingerprint computed from
another non-stabilized/stabilized video.
The rest of this paper is organized as follows. Section II
provides an overview of existing video-centric PRNU-based source camera attribution methods. In Section III, we discuss
our solution, and in Section IV, we present experimental
results. Section V concludes with a discussion on future work.
II. RELATED WORK AND BACKGROUND
In this section, we first discuss PRNU-based source camera
attribution, and then discuss video stabilization.
A. PRNU-Based Camera Attribution
PRNU-based source camera attribution is a well-studied
problem [1] [2]. After the seminal work in [1], significant
research has been carried out to improve this scheme [8]–
[11], and use it for different purposes [12]–[14]. PRNU-based
source attribution is based on the fact that the sensor output I of a camera can be modeled as

$$I = I^{(0)} + I^{(0)}X + \psi,$$

where I^(0) is the noise-free output, X is the PRNU noise that represents the camera fingerprint, and ψ is a combination of additional noise sources, such as readout noise, shot noise, dark current, and quantization noise. Using a denoising filter
F (such as a Wiener filter) and a set of images (or video frames) of a camera, we can estimate the camera fingerprint by first computing the noise residual (i.e., the estimated PRNU) of the ith image, W_i = I_i − F(I_i), and then combining the noise residuals of all the images. To determine if a specific camera has taken an anonymous image, we can first obtain its noise residual using F, and then correlate the noise residual with the camera fingerprint.
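As a concrete illustration, the following minimal Python sketch estimates a fingerprint from a set of grayscale frames and correlates a query residual against it. SciPy's generic Wiener filter stands in for the denoiser F of [1], the weighting follows the standard maximum-likelihood form from the PRNU literature, and all function names are illustrative rather than the authors' implementation:

# Minimal sketch of PRNU fingerprint estimation and matching (illustrative).
import numpy as np
from scipy.signal import wiener

def noise_residual(img):
    # W = I - F(I): estimated PRNU noise of one grayscale image.
    img = img.astype(np.float64)
    return img - wiener(img, mysize=5)

def estimate_fingerprint(images):
    # Maximum-likelihood style aggregation: K = sum(W_i * I_i) / sum(I_i^2).
    num = np.zeros(images[0].shape)
    den = np.zeros(images[0].shape)
    for img in images:
        img = img.astype(np.float64)
        num += noise_residual(img) * img
        den += img ** 2
    return num / (den + 1e-8)

def corr(a, b):
    # Zero-mean normalized correlation between two aligned noise patterns.
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Usage: fp = estimate_fingerprint(frames)
#        score = corr(noise_residual(query), query * fp)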
There has been some work dedicated to PRNU-based cam-
era attribution from a video [15]. Chen et al. [16] first extended the PRNU-based approach to camcorder videos. They used
Normalized Cross Correlation (NCC) to correlate fingerprints
calculated from two videos, as the videos may be subject to
translation shift, e.g., due to letterboxing. To compensate for
the blockiness artifacts introduced by heavy video compression
(such as MPEG-x and H.26x compression), they discard the
boundary pixels of a block (e.g., a JPEG block). In [17],
McCloskey proposed a confidence weighting scheme that can
improve PRNU estimation from a video by minimizing the
contribution from regions of the scene that are likely to distort
PRNU noise (e.g., excluding high frequency content). Chuang
et al. [18] studied PRNU-based source camera identification
problem with a focus on smart-phone cameras. Since smart-
phone videos are subject to strong compression, they considered only I-frames for fingerprint calculation and correlation. Chen et
al. [19] proposed a method to find PRNU noise from wirelessly
streamed videos, which are subject to blocking and blurring. In
their method, they divided a video frame into multiple blocks,
and did not consider the blocks having significant blocking
or blurring artifacts. Van Houten et al. [20] focused on finding PRNU noise from YouTube videos. Hyun et al. [21] proposed using a Minimum Average Correlation Energy (MACE) filter in place of NCC to correlate noisy frames of low-quality videos.
Compared to NCC-based methods, their approach achieved
up to 10% improvement in true positive rates. However, these
schemes did not consider stabilized videos.
Using NCC, Höglund et al. [22] proposed a method that
can correlate a frame of a stabilized video with the fingerprint
calculated from a non-stabilized video. Their method assumed
that stabilization is performed only using translation (i.e.,
pixel shifts), although stabilization can require more complex
operations, such as an affine transformation [3] (which can
be a combination of translation, rotation, and scaling). In this
paper, we consider that stabilization is performed using a combination of translation and rotation, as these two operations are used by a typical stabilizer, e.g., the FFMPEG deshaker.
B. Video Stabilization
Video stabilization removes the effects of shakes, jitters, etc. from a video that result when the camera is subject to unintentional motion (e.g., due to a shaking hand). A
video can be stabilized either using an optical stabilizer or
a digital stabilizer. An optical stabilizer uses a moveable lens or a moveable camera sensor to compensate for unintentional camera motion; either of these moveable parts can adjust itself to counter the unintentional motion.

Fig. 1: Video stabilization pipeline. This figure is a modified version of a figure that appeared in [23].

It should be noted that optical stabilization has limited
effect on PRNU-based source camera attribution [22]. On
the other hand, digital stabilization (also called electronic
stabilization) geometrically transforms (e.g., rotates and/or
translates) a frame with respect to a reference frame such that
the unintentional motion between the frames is removed. In
practice, digital stabilization is used more often, as optical
stabilization is expensive. As opposed to optical stabilization,
digital stabilization can have significant impact on PRNU-
based source camera attribution. Hence in this paper we focus
on digital stabilization and in the rest of the paper, we will
mean digital stabilization when we use the term stabilization.
A video can be digitally stabilized either internally when
it is being captured, or externally after it has been captured.
In the case of the former, stabilization software provided by
the camera manufacturer is integrated into the video capturing
software. In the case of the latter, stabilization is done by a
standard third-party software (e.g., FFMPEG deshaker) that
is not part of the video capturing pipeline. In this paper, we
focus on the latter.
An out-of-camera video stabilization process contains three
major stages: camera motion estimation, motion smoothing,
and motion correction (Figure 1) [3] [23]. In the motion estimation step, the global inter-frame motion between adjacent
frames of a non-stabilized video is modeled from the optical
flow vectors of the frames using an affine transformation. In
the motion smoothing step, unintentional translations and/or
rotations are filtered out from the global motion vectors using a
low-pass filter. Finally, in the motion correction step, the stabilized video is created by shifting and/or rotating frames according to the parameters of the filtered motion vectors. Since different
frames can shift and/or rotate using different parameters, pixels
across frames can be misaligned with each other with respect
to the underlying sensor array.
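To make the smoothing and correction stages concrete, the sketch below assumes the global inter-frame motion has already been estimated as one (dx, dy, dθ) row per frame; it is a schematic illustration of the pipeline, not the FFMPEG deshaker's actual implementation:

# Schematic sketch of motion smoothing and correction (illustrative).
import numpy as np

def low_pass(signal, window=15):
    # Moving-average filter applied to each motion-parameter column.
    kernel = np.ones(window) / window
    return np.column_stack([np.convolve(signal[:, c], kernel, mode="same")
                            for c in range(signal.shape[1])])

def correction_per_frame(inter_frame_motion):
    # Correction each frame must undergo so only intended motion remains.
    trajectory = np.cumsum(inter_frame_motion, axis=0)  # raw camera path
    intended = low_pass(trajectory)                     # smoothed camera path
    return intended - trajectory                        # unintentional residual

Because the correction differs per frame, the resulting per-frame shifts and rotations are exactly the misalignments that Section III must invert.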
III. OUR APPROACH
In this section, we discuss a source camera attribution
scheme for out-of-camera stabilized video.
Fig. 2: Overview of the scheme. (a) Fingerprint computation from a video. (b) Correlation.
Figure 2 shows the overall scheme which involves both (i)
computing the camera fingerprint from a known stabilized
video, and/or (ii) correlating the fingerprint computed from
an anonymous video (which can also be stabilized) with the
fingerprint computed from a known video. Given a video,
we first determine if the video is stabilized or not. If it is
stabilized, then the fingerprint is computed by aligning the
frames of the video. To align frames, we perform an inverse
affine transformation that inverts the rotation and translation
of a frame. The fingerprint computed from a stabilized video
can also be misaligned with the fingerprint calculated from
another stabilized/non-stabilized video. Thus, when correlating
the fingerprint from a stabilized video with a fingerprint from
another stabilized/non-stabilized video, we perform an inverse
affine transformation on the stabilized fingerprint.
We consider only the I-frames of a video and discard P-
frames and B-frames, as they are highly compressed and can
contain significant artifacts [24]. The I-frames are extracted
in uncompressed form as further compression can degrade
their quality. For example, when extracting I-frames using
FFMPEG, we extract the frames in BMP form rather than
JPEG form.
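For illustration, I-frames can be dumped as BMP with FFMPEG's standard select filter; the helper below is a hypothetical wrapper (exact flags can vary across FFMPEG versions), not the authors' tooling:

# Hypothetical helper: keep only I-frames and write them as uncompressed BMPs.
import subprocess

def extract_iframes(video_path, out_pattern="iframe_%03d.bmp"):
    subprocess.run(
        ["ffmpeg", "-i", video_path,
         "-vf", r"select=eq(pict_type\,I)",  # pass only I-frames
         "-vsync", "vfr",                    # one output per selected frame
         out_pattern],
        check=True)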
A. Determining if a video is stabilized
To determine if a given video is stabilized or not, we utilize
the fact that the frames of a stabilized video are misaligned
whereas the frames of a non-stabilized video are aligned. In
our method, we select the first n I-frames of the given video, and then divide the frames into two independent sets of n/2 frames each (e.g., the first set contains the first n/2 frames and the second set contains the remaining frames). We compute
two fingerprints from two independent sets, and correlate the
fingerprints using PCE (Peak-To-Correlation Energy). If the
correlation result is below a threshold, we conclude that the
given video is stabilized. Otherwise, the video is considered
as non-stabilized.
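A minimal sketch of this check follows, reusing estimate_fingerprint() from the earlier sketch. The circular FFT correlation and the fixed PCE exclusion window are simplifications of the implementation in [1]; the threshold of 60 is the PCE threshold used in Section IV:

# Sketch of the stability check: fingerprints from two disjoint halves of the
# first n I-frames correlate strongly only if the frames are sensor-aligned.
import numpy as np

def circular_ncc(a, b):
    # Circular normalized cross-correlation surface computed via the FFT.
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    return np.real(np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b)))) / a.size

def pce(surface, peak, exclude=5):
    # Peak-to-Correlation Energy: squared peak over mean energy elsewhere.
    mask = np.ones(surface.shape, dtype=bool)
    py, px = peak
    mask[max(0, py - exclude):py + exclude + 1,
         max(0, px - exclude):px + exclude + 1] = False
    return surface[py, px] ** 2 / np.mean(surface[mask] ** 2)

def is_stabilized(iframes, pce_threshold=60.0):
    half = len(iframes) // 2
    fp1 = estimate_fingerprint(iframes[:half])   # from the earlier sketch
    fp2 = estimate_fingerprint(iframes[half:])
    surface = circular_ncc(fp1, fp2)
    peak = np.unravel_index(np.argmax(surface), surface.shape)
    return pce(surface, peak) < pce_threshold    # low PCE => stabilized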
ALGORITHM 1: Computing fingerprint from a stabilized video
Data: n I-frames {F_1, F_2, ..., F_n} of a stabilized video, maximum angle maxθ, minimum angle minθ, angle increment δθ, and correlation threshold T
Result: Fingerprint of the camera that had taken the video
refPRNU = PRNU noise of F_1;
for i = 2 to n do
    curPRNU = PRNU noise of F_i;
    maxCR = 0;
    for φ = minθ; φ ≤ maxθ; φ = φ + δθ do
        rtPRNU = rotation of curPRNU by φ;
        curCR = NCC of refPRNU and rtPRNU;
        if curCR > maxCR then
            maxCR = curCR; bestPRNU = rtPRNU;
        end
    end
    if maxCR ≥ T then
        crpPRNU = crop of bestPRNU to the size of refPRNU;
        refPRNU = refPRNU + crpPRNU;
    else
        discard curPRNU;
    end
end
refPRNU is the camera fingerprint;
Fig. 3: Rotation and translation of the ith frame w.r.t. the first frame. The angle and shift can be different for different i.
B. Computing fingerprint from a stabilized video
We compute the camera fingerprint from n I-frames of a stabilized video by first realigning the frames using an inverse affine transformation, and then computing the fingerprint from the realigned frames. Our approach selects one of the frames as the reference frame, and iteratively aligns all the other frames with this reference frame by performing inverse affine transformations on them. This iteration is run n − 1 times. Initially, the PRNU noise of the reference frame is taken as the reference PRNU noise. In the ith iteration, the PRNU noise of the ith aligned frame is added to the reference PRNU noise if the correlation of the two noise patterns exceeds a threshold; otherwise the frame is discarded. At the end, the reference PRNU noise is taken as the camera fingerprint. An outline of our approach is provided in Algorithm 1.
With an affine transformation, the coordinate (x, y) of a pixel of frame F of a video could have moved from the coordinate (x', y') of the same frame in the pre-stabilized version of that video such that

$$\begin{bmatrix} x & y & 1 \end{bmatrix} = \begin{bmatrix} x' & y' & 1 \end{bmatrix} \times \begin{bmatrix} A & Z \\ T & 1 \end{bmatrix},$$

where $A = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$ represents rotation, $T = \begin{bmatrix} t_1 & t_2 \end{bmatrix}$ represents translation, and $Z = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$. The rotation angle θ and the translation shift T can be different for different frames.
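Since A is a pure rotation, its inverse is its transpose, so the realignment applied in Section III-B amounts to applying the closed-form inverse

$$\begin{bmatrix} x' & y' & 1 \end{bmatrix} = \begin{bmatrix} x & y & 1 \end{bmatrix} \times \begin{bmatrix} A & Z \\ T & 1 \end{bmatrix}^{-1}, \qquad \begin{bmatrix} A & Z \\ T & 1 \end{bmatrix}^{-1} = \begin{bmatrix} A^{T} & Z \\ -TA^{T} & 1 \end{bmatrix},$$

which is why inverting stabilization reduces to recovering the unknown rotation angle and translation shift for each frame.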
Fig. 4: Increase of an image dimension due to rotation.
Thus, a frame F_i can be rotated by angle θ_i and translated by shift T_i with respect to the reference frame F_1 (where θ_i and T_i can be different for different frames) (Figure 3), leading to the misalignment of the frames. To align these two frames, we must rotate F_i by angle −θ_i and translate F_i by −T_i. Note that the angle θ_i and the shift T_i cannot be known from a stabilized video as this information is not retained after stabilization.
To perform the inverse affine transformation, we employ a brute-force method that exhaustively searches the rotation angle and translation shift between the reference frame F_1 and the ith frame F_i. Our approach rotates the PRNU noise of F_i using a number of different angles, and for each particular angle, translates the rotated PRNU for multiple shift parameters. For each angle, the PRNU of F_i is correlated with the PRNU of F_1 using NCC. Note that NCC performs correlations for multiple translations. If the maximum correlation result among all NCC correlations is larger than a threshold, then the angle and shift giving the maximum correlation are considered as the inverse rotation angle and inverse translation shift between F_1 and F_i. In the case of a typical stabilizer, the angle of rotation φ between two frames does not exceed a threshold maxθ and is not less than a threshold minθ. For example, the FFMPEG deshaker rotates a frame by a minimum of −2° and a maximum of 2°. Thus, we begin our search from φ = minθ, and increase the angle by δθ till we reach φ = maxθ.
Note that our process does not guarantee that the exact inverse angle θ_i can be found. Rather, it finds an angle close to θ_i. For this angle, the PRNU of F_i is roughly aligned with the PRNU of F_1.
However, a rotated PRNU noise can have a larger dimension than the PRNU noise of the reference frame (i.e., the non-rotated PRNU noise). For example, if a 1024 × 1024 noise pattern is rotated by 2°, the rotated PRNU has dimension 1060 × 1060. For an image, Figure 4 shows how a rotation can increase the image size. The rotated PRNU matrix therefore cannot be used in computing the fingerprint as this matrix cannot be added to the PRNU matrix of the reference frame. We overcome this issue by cropping the rotated PRNU to the size of the reference PRNU (Figure 4 shows how a rotated image is cropped to the size of the reference image). The crop position is found by correlating the rotated PRNU with the reference PRNU using NCC (the crop position with the highest correlation value is selected).
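The sketch below puts the rotation search and the NCC-based crop together. The angle bounds follow the FFMPEG deshaker example above; the step size and all names are illustrative assumptions, not the exact parameters of our implementation:

# Sketch of the brute-force inverse-rotation search with NCC-based cropping.
import numpy as np
from scipy.ndimage import rotate
from scipy.signal import fftconvolve

def ncc_surface(ref, big):
    # Correlate ref against every position where it fully fits inside big.
    ref0 = (ref - ref.mean()) / (ref.std() + 1e-12)
    big0 = (big - big.mean()) / (big.std() + 1e-12)
    return fftconvolve(big0, ref0[::-1, ::-1], mode="valid") / ref.size

def best_alignment(ref_prnu, cur_prnu, min_deg=-2.0, max_deg=2.0, step=0.1):
    # Returns the best NCC peak and the correspondingly aligned, cropped noise.
    best_cr, best_crop = -np.inf, None
    h, w = ref_prnu.shape
    for phi in np.arange(min_deg, max_deg + step, step):
        rt = rotate(cur_prnu, phi, reshape=True, order=1)  # grows the array
        surface = ncc_surface(ref_prnu, rt)                # all crop offsets
        peak = np.unravel_index(np.argmax(surface), surface.shape)
        if surface[peak] > best_cr:
            dy, dx = peak                    # top-left corner of the best crop
            best_cr, best_crop = surface[peak], rt[dy:dy + h, dx:dx + w]
    return best_cr, best_crop  # compare best_cr against the threshold T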
Note that we compute the fingerprint from a non-stabilized video using n I-frames of the video. The fingerprint is calculated using standard techniques (as discussed in Section II). Furthermore, we do not perform an inverse affine transformation on the frames of a non-stabilized video as they are already aligned.
C. Correlating fingerprints
We correlate the fingerprint computed from a stabilized
video (anonymous video) with the fingerprint computed from
another non-stabilized/stabilized video (known video) using
inverse affine transformation. Since the fingerprint from the
stabilized video is misaligned with the fingerprint from the
non-stabilized/stabilized video (due to affine transformation
on the stabilized video), we rotate the fingerprint from the
stabilized video using a number of different angles. For
each angle, we perform NCC between the rotated fingerprint
from the stabilized video and the fingerprint from the non-
stabilized/stabilized video (similar to the approach discussed in
Section III-B). If the maximum NCC result (i.e., the maximum
of the NCC results performed for different rotation angles)
exceeds a threshold, we conclude that the given stabilized
video and the known non-stabilized/stabilized video are taken
by the same camera.
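In code, this decision reduces to running the same search at the fingerprint level; the snippet below is a hypothetical usage of best_alignment() from the sketch in Section III-B, with the threshold scale depending on the NCC statistic used:

# Hypothetical fingerprint-to-fingerprint decision using the earlier sketch.
peak_cr, _ = best_alignment(known_fp, stabilized_fp)
same_camera = peak_cr > NCC_THRESHOLD   # threshold as discussed in Section IV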
Note that for correlating the fingerprint computed from
a non-stabilized video with the fingerprint computed from
another non-stabilized video, we do not need inverse affine
transformation. For this correlation, we use PCE.
IV. EXPERIMENTS
In this section, we first provide experimental results of the
proposed method that can attribute a camera using a stabilized
video. Then, we show how the camera of a non-stabilized
video can be attributed by correlating the fingerprint of the
video with the scaled and/or cropped fingerprint computed
from a set of images.
Experiments were performed on a PC powered by an Intel Xeon E5-2637 v2 3.50 GHz processor with 32 GB of RAM. We implemented our stability-check and inverse affine transformation schemes using MATLAB R2013a on the Windows 7 (64-bit) platform. In our implementation, the image noise extraction, PCE computation, and NCC computation were based on previous work [1]. We set a threshold of 60 for PCE
and a threshold of 100 for NCC. The PCE was used to find
out if a video is stabilized and to correlate fingerprints from
two non-stabilized videos, and the NCC was used to correlate
the fingerprint from a stabilized video with the fingerprint
from another stabilized/non-stabilized video. We set a higher
threshold for NCC as inverse affine transformation can add
extra noise to the PRNU noise extracted from a frame of a
stabilized video.
In our experiments, we used 13 smart-phone cameras of five
brands. From these cameras, we obtained 50 non-stabilized
low-textured videos of one minute each. The dimension of each video was 1920 × 1080. We then stabilized each video using the FFMPEG deshaker. Thus, at the end, we had a total of 100 (50 non-stabilized and 50 stabilized) videos from 13
cameras. Using FFMPEG, we extracted I-frames of each
video, and considered at most 50 I-frames for our experiments.
TABLE I: TPR of Stability Check

Type of Video    #Videos   #Correctly Classified   TPR
Non-stabilized   50        50                      1.00
Stabilized       50        44                      0.88
TABLE II: TPR and FPR of Video-to-Video Correlation

Pair   #Same-Camera Correlations   #Match   TPR    #Cross-Camera Correlations   FPR
NN     172                         170      0.99   242                          0.0
NS     113                         103      0.83   137                          0.0
SS     48                          31       0.65   98                           0.0
For correlating a fingerprint computed from a non-stabilized
video with a fingerprint computed from a set of images, we
considered 25 images from each camera.
A. Camera Attribution Using Stabilized Video
1) Stability check: Table I shows the true positive rate
(TPR) of the stability check method. As shown in the table,
our scheme correctly classified a non-stabilized video as non-
stabilized and a stabilized video as stabilized in 100% and
88% of the cases, respectively. Thus, our scheme wrongly classified a stabilized video as non-stabilized in 12% of the cases. For a stabilized video, this error arises when some frames in the first set of frames (which are used to obtain the first fingerprint in our stability check method) have been transformed with the same parameters that were used to affine transform some frames of the second, disjoint set of frames (from which the second fingerprint is generated). For example, the first frame of the first set and the last frame of the second set could have been rotated by the same angle of 2° and translated by the same shift of (5, 5). Thus, when the first fingerprint calculated from
the first set of frames is correlated with the second fingerprint
calculated from the second set of frames, the PCE result can be
higher than the threshold (that is used to check the stability).
As a result, the stabilized video can be wrongly classified as
non-stabilized.
2) Correlation of two videos: For this experiment, we
created fingerprints from both stabilized and non-stabilized
videos, and correlated them using three possible ways: NN
(the fingerprint from a non-stabilized video is correlated with
the fingerprint from a non-stabilized video), NS (the fingerprint
from a non-stabilized video is correlated with the fingerprint
from a stabilized video), and SS (the fingerprint from a
Fig. 5: Correlation of two noisy frames.

Fig. 6: Line skipping vs. binning. (a) Line skipping (highlighted pixels removed). (b) Binning (grouped pixels combined).
stabilized video is correlated with the fingerprint from a
stabilized video). This experiment did not consider stabilized
videos that were wrongly classified as non-stabilized in the
stability check experiment. Table II shows the TPR and FPR
(false positive rate) for each of the three cases.
Note that the significant error rate when correlating a fingerprint from a stabilized video with the fingerprint from another
non-stabilized/stabilized video is due to the fact that the NCC
between the extracted noise (i.e., estimated PRNU noise) of
two frames of a noisy video (e.g., textured) may not actually
align the frames. The extracted noise of a heavily compressed
noisy frame can contain significant random noise components other than the PRNU noise. Thus, when two such extracted
noise patterns are correlated, these random noise factors can
significantly contribute to the correlation, and a false high
correlation results at a position where the frames are not
actually aligned. For example, Figure 5 shows how false
alignment can be found due to the random noise factor for
two frames of a noisy non-stabilized video. In the ideal case,
the correlation between the extracted noise of two frames of
the non-stabilized video must produce the highest peak at (0,0)
position (as both the frames are aligned). However, in Figure 5,
none of the 20 highest correlation results appear at the desired
(0,0) position.
B. Camera Attribution of a Video Using Fingerprint from a
Set of Images
In this section, we show how the fingerprint computed from
a non-stabilized video can be effectively correlated with the
scaled and/or cropped fingerprint from a set of images.
We know that the resolution of a video taken by a camera
is typically lower than the resolution of the images taken by
the same camera. The resolution of the video is typically
lowered by using a sensor resizing trick, such as cropping,
line-skipping (which removes a column/row of pixels), and
binning (which combines neighboring pixels) [25] (Figure 6).
Line skipping is similar to nearest neighbor scaling, and
binning is similar to bilinear scaling.
To correlate the fingerprint calculated from the video with the fingerprint calculated from the images, we scale and/or crop the fingerprint from the images. Our brute-force implementation correlates the fingerprints by first scaling (for different scale factors), then cropping (using NCC), and finally scaling and cropping (scaling with multiple scale factors, and using NCC for each scale factor) the fingerprint from the images until the correlation result is above the NCC threshold or the search space is exhausted. As a bilinearly scaled fingerprint can be used in the correlation where a nearest-neighbor scaled fingerprint is required [26], we scale the fingerprint using bilinear scaling. If the correlation result is above a threshold, we consider that the video and the images are taken by the same camera. Otherwise, we consider that the video and the images are taken by different cameras.

TABLE III: Operation required on the fingerprint

Camera Name          Crop   Scale   Scale & Crop
Nexus 5              No     No      Yes
Samsung S6 Edge      No     Yes     Yes
OnePlus One          No     Yes     Yes
Apple iPhone 6       No     No      Yes
Samsung GT-N7000     No     Yes     Yes
Sony Xperia E2333    No     No      No
Lenovo P1a42         No     No      No
Samsung Galaxy E7    No     Yes     Yes
Samsung GT-S7562     No     Yes     Yes
Samsung GT-S7270     No     Yes     Yes
Nexus 6              No     No      No
Table III shows the operation performed on the fingerprint
calculated from a set of images of a camera to effectively
correlate the fingerprint calculated from the images with the
fingerprint calculated from a video of the same camera. Out
of the 13 cameras, our approach successfully attributed 10 cameras.
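A sketch of this scale-and-crop search is given below, reusing ncc_surface() and pce() from the earlier sketches; the scale range, step, and decision statistic are assumptions for illustration, with 100 being the NCC threshold reported above:

# Sketch of the brute-force scale-and-crop search between a video fingerprint
# and an image fingerprint (illustrative parameters).
import numpy as np
from scipy.ndimage import zoom

def match_video_to_images(video_fp, image_fp, threshold=100.0):
    h, w = video_fp.shape
    for s in np.arange(0.4, 1.01, 0.01):        # candidate downscale factors
        scaled = zoom(image_fp, s, order=1)     # bilinear scaling
        if scaled.shape[0] < h or scaled.shape[1] < w:
            continue                            # cannot contain the video crop
        surface = ncc_surface(video_fp, scaled) # search crop positions via NCC
        peak = np.unravel_index(np.argmax(surface), surface.shape)
        if pce(surface, peak) > threshold:
            return True, s                      # matched at scale factor s
    return False, None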
V. CONCLUSION
In this paper, we propose a PRNU-based source camera
attribution method that can attribute the camera of an out-
of-camera digitally stabilized video. Our method can auto-
matically determine if a given video is stabilized, compute
the fingerprint of a stabilized video, and effectively correlate
the fingerprint calculated from a stabilized video with the
fingerprint computed from another non-stabilized/stabilized
video. Experimental results from 100 videos taken by 13
cameras showed that our approach has a true positive rate
of 83% and 65% when a stabilized video is correlated with
a non-stabilized video and a stabilized video, respectively. In the future, we plan to extend our work both to lower the error rate
in attribution and to attribute the camera of a noisy stabilized
video.
REFERENCES
[1] J. Lukáš, J. Fridrich, and M. Goljan, “Digital camera identification from
sensor pattern noise,” IEEE Transactions on Information Forensics and
Security, vol. 1, no. 2, pp. 205–214, 2006.
[2] H. T. Sencar and N. Memon, Digital image forensics: There is more to
a picture than meets the eye. New York, USA: Springer, 2013.
[3] Y. Matsushita, E. Ofek, W. Ge, X. Tang, and H.-Y. Shum, “Full-
frame video stabilization with motion inpainting,” IEEE Transactions
on Pattern Analysis and Machine Intelligence, vol. 28, pp. 1150–1163,
July 2006.
[4] P. Rawat and J. Singhai, “Review of motion estimation and video
stabilization techniques for hand held mobile video,” Signal & Image
Processing : An International Journal, vol. 2, June 2011.
[5] S. Bayram, H. T. Sencar, and N. Memon, “Seam-carving based
anonymization against image & video source attribution,” in IEEE
Workshop on Multimedia Signal Processing, 2013, pp. 272–277.
[6] A. Karaküçük, A. E. Dirik, H. T. Sencar, and N. Memon, “Recent
advances in counter PRNU based source attribution and beyond,” in IS&T
Media Watermarking, Security, and Forensics, vol. 9409, April 2015.
[7] J. Entrieri and M. Kirchner, “Patch-based desynchronization of digital
camera sensor fingerprints,” in IS&T Media Watermarking, Security, and
Forensics, 2016.
[8] Y. Sutcu, S. Bayram, H. T. Sencar, and N. Memon, “Improvements on
sensor noise based source camera identification,” in IEEE International
Conference on Multimedia and Expo, 2007, pp. 24–27.
[9] C. T. Li and Y. Li, “Color-decoupled photo response non-uniformity for
digital image forensics,” IEEE Transactions on Circuits and Systems for
Video Technology, vol. 22, no. 2, pp. 260–271, 2012.
[10] G. Chierchia, S. Parrilli, G. Poggi, C. Sansone, and L. Verdoliva, “On
the influence of denoising in PRNU based forgery detection,” in ACM
Multimedia in Forensics, Security and Intelligence, 2010, pp. 117–122.
[11] C. T. Li, “Source camera identification using enhanced sensor pattern
noise,” IEEE Transactions on Information Forensics and Security, vol. 5,
no. 2, pp. 280–287, 2010.
[12] R. Caldelli, I. Amerini, F. Picchioni, and M. Innocenti, “Fast image
clustering of unknown source images,” in IEEE International Workshop
on Information Forensics and Security, 2010, pp. 1–5.
[13] J. Lukáš, J. Fridrich, and M. Goljan, “Detecting digital image forgeries
using sensor pattern noise,” in Proc. SPIE, vol. 6072, 2006, pp. 60720Y:1–60720Y:11.
[14] G. Chierchia, G. Poggi, C. Sansone, and L. Verdoliva, “A Bayesian-MRF
approach for PRNU-based image forgery detection,” IEEE Transactions
on Information Forensics and Security, vol. 9, no. 4, pp. 554–567, 2014.
[15] S. Milani, M. Fontani, P. B. et al., “An overview on video forensics,”
Signal Processing Systems, vol. 1, pp. 1–18, June 2012.
[16] M. Chen, J. Fridrich, M. Goljan, and J. Lukas, “Source digital camcorder
identification using sensor photo response non-uniformity,” in SPIE
Electronic Imaging, 2007, pp. 1G–1H.
[17] S. McCloskey, “Confidence weighting for sensor fingerprinting,” in IEEE
CVPR Workshops, 2008, pp. 1–6.
[18] W.-H. Chuang, H. Su, and M. Wu, “Exploring compression effects for
improved source camera identification using strongly compressed video,”
in IEEE International Conference on Image Processing, 2011, pp. 1953–
1956.
[19] S. Chen, A. Pande, K. Zeng, and P. Mohapatra, “Video source iden-
tification in lossy wireless networks,” in IEEE INFOCOM, 2013, pp.
215–219.
[20] W. van Houten and Z. Geradts, “Using sensor noise to identify low
resolution compressed videos from Youtube,” in IAPR International
Workshop on Computational Forensics, 2009, pp. 104–115.
[21] D.-K. Hyun, C.-H. Choi, and H.-K. Lee, “Camcorder identification for
heavily compressed low resolution videos,” Dordrecht, 2012, pp. 695–701.
[22] T. Höglund, P. Brolund, and K. Norell, “Identifying camcorders using
noise patterns from video clips recorded with image stabilisation,” in
International Symposium on Image and Signal Processing and Analysis,
2011, pp. 668–671.
[23] N. Ejaz, W. Kim, S. I. Kwon, and S. W. Baik, “Video stabilization by
detecting intentional and unintentional camera motions,” in IEEE Inter-
national Conference on Intelligent Systems, Modelling and Simulation,
2012, pp. 312–316.
[24] M. Goljan, M. Chen, P. Comesaña, and J. J. Fridrich, “Effect of
compression on sensor-fingerprint based camera identification,” in Media
Watermarking, Security, and Forensics, 2016, pp. 1–10.
[25] www.eoshd.com, “Nikon D800's 36MP sensor line skips,” http://www.eoshd.com/2012/04/classified-no-longer-how-the-nikon-d800s-36mp-sensor-line-skips-for-1080p/, online; accessed 6 June 2016.
[26] S. Taspinar, M. Mohanty, and N. Memon, “PRNU based source attri-
bution with a collection of seam-carved images,” in IEEE International
Conference on Image Processing, Phoenix, USA, 2016.