Detecting image splicing in the wild (Web) Markos Zampoglou, Symeon Papadopoulos, Yiannis Kompatsiaris 1 Centre for Research and Technology Hellas (CERTH) – Information Technologies Institute (ITI) WeMuV2015 workshop, ICME, June 29, 2015, Turin, Italy
A new journalistic paradigm
…and its pitfalls
Blind image splicing detection
• Assume the splice differs in some aspect from the rest of the image
3. Keep the best-fitting result (bias towards success)
• For non-spliced images (true-negative/false-positive evaluation), apply the same methodology and declare a success when the algorithm returns a blank binary map
– Main disadvantage: the outcome is binary, leaving no parameters to tweak for ROC curve generation
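The success criterion described above (keep the best-fitting binary map; a blank map counts as success on authentic images) can be sketched as follows. The IoU matching and the 0.5 cut-off are illustrative assumptions for the sketch, not the exact fit criterion used in the evaluation:

```python
import numpy as np

def detection_success(output_map, gt_mask, iou_thresh=0.5):
    """Per-image success test for a splicing localization map (sketch).

    output_map: binary map produced by a detection algorithm
    gt_mask:    ground-truth splice mask (all zeros for authentic images)
    The IoU threshold is a stand-in for the actual fit criterion.
    """
    out = np.asarray(output_map, dtype=bool)
    gt = np.asarray(gt_mask, dtype=bool)
    if not gt.any():
        # authentic image: success iff the algorithm returns a blank map
        return not out.any()
    inter = np.logical_and(out, gt).sum()
    union = np.logical_or(out, gt).sum()
    return inter / union >= iou_thresh

def best_fitting_result(candidate_maps, gt_mask):
    # "Keep the best-fitting result": of all maps an algorithm produced
    # (e.g. one per parameter setting), keep the one closest to ground truth
    gt = np.asarray(gt_mask, dtype=bool)
    def iou(m):
        m = np.asarray(m, dtype=bool)
        union = np.logical_or(m, gt).sum()
        return np.logical_and(m, gt).sum() / union if union else 1.0
    return max(candidate_maps, key=iou)
```

Note that the "keep the best" step is exactly where the bias towards success enters: an algorithm is credited if any of its outputs fits the mask.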
Evaluations
• Evaluated seven algorithms:
– Double JPEG quantization (Lin et al, 2009), (Bianchi et al, 2011), (Bianchi et al, 2012a)
– Non-aligned double JPEG quantization (Bianchi et al, 2012b)
– CFA artifacts (Ferrara et al, 2012)
– High-frequency DWT noise (Mahdian et al, 2009)
– JPEG ghosts (Farid, 2009)
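The JPEG-ghost idea can be illustrated with a toy quantization model. This is not Farid's full method, which operates on the 8x8 block-DCT coefficients of recompressed JPEGs; the scalar quantizer below only demonstrates the "dip" the method looks for:

```python
import numpy as np

# Model the DCT coefficients of a once-compressed image: values already
# quantized with an (unknown) step q0. Requantizing with step q and
# measuring the error produces a dip at q == q0 -- the "ghost".
rng = np.random.default_rng(0)
coeffs = rng.uniform(-100.0, 100.0, size=10_000)

def quantize(x, step):
    return np.round(x / step) * step

q0 = 7                              # toy stand-in for the first JPEG quality
stored = quantize(coeffs, q0)       # what actually survives in the file

ghost = {q: float(np.mean((stored - quantize(stored, q)) ** 2))
         for q in range(2, 15)}
# ghost[7] is exactly zero (requantizing with the original step is
# idempotent), while neighbouring steps give clearly larger error. A spliced
# region first compressed with a different step would dip at a different q,
# which is the inconsistency the detector localizes.
```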
Evaluation results: Emulated datasets (1/2)
• Comparing median values:
| Dataset | | (Lin et al, 2009) | (Bianchi et al, 2011) | (Ferrara et al, 2012) | (Bianchi et al, 2012a) | (Bianchi et al, 2012b) | (Mahdian et al, 2009) |
|---|---|---|---|---|---|---|---|
| Columbia Uncomp. | Orig. | - | - | 0.89 (0.05) | - | - | 0.39 (0.04) |
| | JPEG | - | - | 0.05 (0.05) | - | - | 0.09 (0.05) |
| | Resized | - | - | 0.03 (0.04) | - | - | 0.11 (0.05) |
| VIPP Synthetic | Orig. | 0.47 (0.05) | 0.51 (0.05) | 0.15 (0.05) | 0.57 (0.01) | 0.28 (0.05) | 0.13 (0.05) |
| | JPEG | 0.30 (0.04) | 0.43 (0.04) | 0.16 (0.05) | 0.39 (0.05) | 0.16 (0.05) | 0.10 (0.05) |
| | Resized | 0.05 (0.05) | 0.05 (0.05) | 0.05 (0.04) | 0.05 (0.05) | 0.05 (0.05) | 0.06 (0.05) |
| VIPP Realistic | Orig. | 0.54 (0.04) | 0.58 (0.04) | 0.04 (0.04) | 0.70 (0.04) | 0.28 (0.04) | 0.20 (0.04) |
| | JPEG | 0.32 (0.04) | 0.36 (0.04) | 0.04 (0.04) | 0.51 (0.04) | 0.17 (0.04) | 0.20 (0.04) |
| | Resized | 0.13 (0.04) | 0.12 (0.06) | 0.03 (0.04) | 0.23 (0.04) | 0.17 (0.04) | 0.18 (0.04) |
Evaluation results: Emulated datasets (2/2)
• Proposed evaluation framework:
| Dataset | | (Lin et al, 2009) | (Bianchi et al, 2011) | (Ferrara et al, 2012) | (Bianchi et al, 2012a) | (Bianchi et al, 2012b) | (Mahdian et al, 2009) |
|---|---|---|---|---|---|---|---|
| Columbia Uncomp. | Orig. | - | - | 0.66 (0.16) | - | - | 0.12 (0.57) |
| | JPEG | - | - | 0.00 (0.20) | - | - | 0.02 (0.86) |
| | Resized | - | - | 0.00 (0.24) | - | - | 0.04 (0.79) |
| VIPP Synthetic | Orig. | 0.44 (0.27) | 0.52 (0.00) | 0.01 (0.23) | 0.58 (0.09) | 0.04 (0.25) | 0.04 (0.74) |
| | JPEG | 0.26 (0.30) | 0.30 (0.10) | 0.01 (0.28) | 0.23 (0.27) | 0.01 (0.29) | 0.04 (0.74) |
| | Resized | 0.00 (0.23) | 0.00 (0.00) | 0.00 (0.23) | 0.00 (0.15) | 0.00 (0.29) | 0.00 (0.84) |
| VIPP Realistic | Orig. | 0.41 (0.46) | 0.38 (0.09) | 0.09 (0.22) | 0.23 (0.30) | 0.03 (0.39) | 0.04 (0.90) |
| | JPEG | 0.13 (0.44) | 0.17 (0.29) | 0.00 (0.25) | 0.14 (0.46) | 0.01 (0.43) | 0.02 (0.90) |
| | Resized | 0.00 (0.47) | 0.00 (0.00) | 0.00 (0.28) | 0.03 (0.25) | 0.01 (0.47) | 0.01 (0.47) |
Evaluation results: Emulated datasets (4/4)
• Methods generally behave as expected
– CFA patterns are destroyed by the first JPEG compression
• (Mahdian et al, 2009) is not particularly effective, but shows little vulnerability to alterations
• DQ methods show some degree of robustness, but only to recompression
• Rescaling is extremely disruptive, as expected
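The high-frequency-noise approach of (Mahdian et al, 2009) rests on estimating local noise strength and flagging regions whose estimate disagrees with the rest of the image. A minimal sketch of the estimation step, using a one-level Haar HH subband and the robust MAD estimator; the block size and exact subband choice here are illustrative, not the paper's exact configuration:

```python
import numpy as np

def local_noise_map(img, block=32):
    """Estimate a per-block noise standard deviation map (sketch).

    img: 2-D float array (grayscale image). Cells of the returned map that
    disagree with their surroundings hint at a region with a different noise
    level, e.g. a patch spliced in from another photo.
    """
    img = np.asarray(img, dtype=float)
    # One-level Haar HH (diagonal detail) subband via array slicing:
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    hh = (a - b - c + d) / 2.0   # for i.i.d. noise, std(hh) == noise std
    h, w = hh.shape
    out = np.empty((h // block, w // block))
    for i in range(h // block):
        for j in range(w // block):
            blk = hh[i * block:(i + 1) * block, j * block:(j + 1) * block]
            out[i, j] = np.median(np.abs(blk)) / 0.6745  # MAD -> sigma
    return out
```

On a flat image whose left half carries sigma = 2 noise and right half sigma = 10, the map separates the halves cleanly; on textured content the HH subband also picks up image detail, which is one source of the false positives noted later.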
Evaluation results: Wild Web dataset (1/2)
• 36 of the 82 cases were successfully detected by at least one method
– For the remaining 46 cases, no algorithm gave good results on any image
| | (Lin et al, 2009) | (Bianchi et al, 2011) | (Ferrara et al, 2012) | (Bianchi et al, 2012a) | (Bianchi et al, 2012b) | (Mahdian et al, 2009) | (Farid, 2009) |
|---|---|---|---|---|---|---|---|
| Detections | 13 | 12 | 1 | 8 | 5 | 15 | 29 |
| Unique | 4 | 1 | 0 | 1 | 2 | 6 | 10 |
Evaluation results: Wild Web dataset (2/2)
• The noise-based method of (Mahdian et al, 2009) proved disproportionately successful
– We should not forget how prone it is to false positives
• JPEG ghosts are very robust, if we can manage the amount of output they produce
• Even in the cases where detection succeeded, only a few images were correctly detected
– 1386 images were detected in the entire dataset (~14.3%)
– Excluding the three easiest classes, only 333 out of 8580 images were detected (~3.9%)
Forgery detection in the Wild (1/4)–(4/4)
[four image-only slides showing example detections]
Conclusions
• On the web, very few images retain traces that are detectable with today's state-of-the-art forensic approaches
• It is difficult to estimate the relative age of each instance of a viral image
• DQ-based methods give results with the highest confidence, but are not particularly robust
• JPEG ghosts demonstrate significantly higher robustness than other methods, but produce large amounts of noisy output
• DWT high-frequency noise also appears to give good results, but seems extremely prone to false positives
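The high confidence of DQ-based results comes from a very distinctive artifact: coefficients quantized with one step and later requantized with another leave periodic gaps and peaks in their histogram. A toy illustration on scalar quantization, where steps 5 and 3 stand in for two JPEG quality factors; real DQ methods analyze per-frequency block-DCT histograms:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.laplace(scale=20.0, size=200_000)  # model of DCT coefficients

def quantize(v, step):
    return np.round(v / step) * step

single = quantize(x, 3)                # single compression, step 3
double = quantize(quantize(x, 5), 3)   # step 5 first, then step 3

def hist20(vals):
    # occupancy of the first 20 histogram bins (|value| / 3 = bin index)
    idx = (np.abs(vals) / 3).round().astype(int)
    return np.bincount(idx, minlength=20)[:20]

hs, hd = hist20(single), hist20(double)
# hs: every bin is populated. hd: bins 1, 4, 6, 9, ... (indices equal to
# 1 or 4 mod 5) are empty, because no multiple of 5 lands within 1.5 of
# those multiples of 3. This periodic gap pattern is what DQ detectors
# measure per region -- and it vanishes after rescaling, hence the
# limited robustness.
```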
Future steps
• For the web journalism case, robustness ought to be a central consideration for future algorithm evaluations
• The Wild Web dataset is freely distributed for research purposes
– Due to copyright considerations, this is currently only feasible through direct contact
– The dataset should be maintained to incorporate new cases of forgeries as they come out
• Advance the state-of-the-art by focusing on more robust traces of splicing
• Following the life-cycle of images on the web can help locate their earliest versions and build an account of the alterations that have taken place (Kennedy & Chang, 2008)
• The question remains: to what extent is the task feasible? When can we be certain that all traces have been lost?
References
• Bianchi, T., De Rosa, A., and Piva, A. "Improved DCT coefficient analysis for forgery localization in JPEG images." In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2444–2447, 2011.
• Bianchi, T. and Piva, A. "Image forgery localization via block-grained analysis of JPEG artifacts." IEEE Transactions on Information Forensics and Security, 7(3):1003–1017, 2012.
• Ferrara, P., Bianchi, T., De Rosa, A., and Piva, A. "Image forgery localization via fine-grained analysis of CFA artifacts." IEEE Transactions on Information Forensics and Security, 7(5):1566–1577, 2012.
• Farid, H. "Exposing digital forgeries from JPEG ghosts." IEEE Transactions on Information Forensics and Security, 4(1):154–160, 2009.
• Fontani, M., Bianchi, T., De Rosa, A., Piva, A., and Barni, M. "A framework for decision fusion in image forensics based on Dempster–Shafer theory of evidence." IEEE Transactions on Information Forensics and Security, 8(4):593–607, 2013.
• Kennedy, L. and Chang, S.-F. "Internet image archaeology: automatically tracing the manipulation history of photographs on the web." In Proc. 16th ACM International Conference on Multimedia, pp. 349–358, 2008.
• Lin, Z., He, J., Tang, X., and Tang, C.-K. "Fast, automatic and fine-grained tampered JPEG image detection via DCT coefficient analysis." Pattern Recognition, 42(11):2492–2501, 2009.
• Mahdian, B. and Saic, S. "Using noise inconsistencies for blind image forensics." Image and Vision Computing, 27(10):1497–1503, 2009.