Post on 14-Jul-2018

Watermarking, steganography and content forensics

Ingemar J. Cox

Introduction

Watermarking

Steganography

Content forensics

Watermarking

Watermarking is the practice of imperceptibly altering a Work (image, song, etc.) to embed a message about that Work.

The primary motivation for watermarking has been to protect content

Muzak: the first commercial watermarking

The first skyscraper was built in Chicago in 1885

The elevator was an essential element

In the 1930s passenger elevators were new and frightening

Music in elevators was introduced to calm passengers

Muzak was the dominant supplier

Nirvana - On a plain

Rockabye Baby - On a plain

Emil Hembrooke, “Identification of sound and like signals”, US Patent 3,004,104 (filed 1954, issued 1961)

“The present invention makes possible the positive identification of the origin of a musical presentation and thereby constitutes an effective means of preventing such piracy, i.e. it can be likened to a watermark in paper.”

In use until the mid-1980s

UCL Adastral Park Postgraduate Campus

Applications of digital watermarking

Broadcast Monitoring: Nielsen/Digimarc; Teletrax/Philips

Owner Identification: Verimatrix (IPTV); Widevine Technologies

Proof of Ownership

Transaction Tracking: Thomson/Technicolor (Philips) for Oscar screeners; Cinea/Dolby for digital cinema

Content Authentication: Signum Technologies

Copy Control: Verance (HD-DVD, DVD-Audio)

Legacy systems: Tektronix, syncing sound and video (lip sync); MarkAny, syncing lyrics with music (MP3 players)

Watermarking

Why not use cryptography?

Cryptography assumes:

1. Alice and Bob trust one another

2. Communication between Alice and Bob succeeds

However, Alice (Hollywood) cannot trust Bob (consumer)

And if communication fails, watermark protection fails

Watermarking is NOT cryptography

Watermarking IS communications

The content is more important than the message

So the watermark/message must be imperceptible

And often, the message payload is small

But, to be practical, a watermark must also be robust

Spread spectrum communications: content modeled as noise (high-noise regime)

Communications with side information: content modeled as side information

Watermarking as communications

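[Figure: a standard communications channel. The transmitter maps the message m to a signal x, noise is added to x to give y, and the receiver decodes y to the message m’.]

x is limited by a power constraint: $\sum_i x^2[i] \le p$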

[Figure: the watermarking channel. The embedder maps the message m to a signal x, two additive noise sources act on x to give y, and the detector decodes y to the message m’. x is subject to the same power constraint.]

Spread spectrum communications

Requirements: unobtrusive; survive common distortions (e.g. lossy compression)

Spread spectrum communications was originally developed for military communications

Difficult for the enemy to detect

Difficult for the enemy to jam

Let’s consider embedding an 8-bit message in an image: 01100101

Since we have an 8-bit message, spread each bit over all pixels

Spread spectrum watermarking

Each bit is represented by a “chip” sequence: a pseudo-random number sequence

Detect each bit using linear correlation
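
The spread spectrum steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the scheme shown on the slides: the cover is a list of random values standing in for pixels, the chip sequences are ±1 values drawn from a seeded generator, and the names `chip_patterns`, `embed`, `detect` and the `strength` parameter are hypothetical. The detector removes the mean of the received signal before correlating, which is a choice made here to keep the cover from swamping the correlation.

```python
import random

def chip_patterns(num_bits, num_pixels, seed):
    """One pseudo-random +/-1 chip sequence per message bit, reproducible from `seed`."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(num_pixels)] for _ in range(num_bits)]

def embed(cover, bits, patterns, strength=3.0):
    """Add +pattern for a 1 bit and -pattern for a 0 bit; every bit touches every pixel."""
    marked = list(cover)
    for bit, pattern in zip(bits, patterns):
        sign = strength if bit else -strength
        for i, chip in enumerate(pattern):
            marked[i] += sign * chip
    return marked

def detect(received, patterns):
    """Decide each bit from the sign of the linear correlation with its chip sequence."""
    mean = sum(received) / len(received)
    centered = [r - mean for r in received]
    bits = []
    for pattern in patterns:
        correlation = sum(r * c for r, c in zip(centered, pattern)) / len(centered)
        bits.append(1 if correlation > 0 else 0)
    return bits

message = [0, 1, 1, 0, 0, 1, 0, 1]                    # the 8-bit message 01100101
rng = random.Random(1)
cover = [rng.uniform(0, 255) for _ in range(40000)]   # stand-in for image pixels
patterns = chip_patterns(len(message), len(cover), seed=42)
marked = embed(cover, message, patterns)
noisy = [p + rng.gauss(0, 5) for p in marked]         # mild distortion of the marked Work
recovered = detect(noisy, patterns)
```

The cover acts as the dominant noise term here, so a larger `strength` or more pixels makes detection more reliable, at the cost of a more visible mark.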

Perceptual modelling

In the previous example, the random pattern was added equally to all parts of the image

But some areas are more (less) sensitive than others

We can identify these areas using perceptual models

Same models used for lossy compression

Must embed in perceptually SIGNIFICANT regions to be robust to lossy compression

[Figures: original image; watermarked with no perceptual modeling; watermarked with perceptual modeling; original image]

Communications with side information

Spread spectrum watermarking models the cover Work as noise

However, the cover Work is:

Not random

Completely known at the time of embedding

[Figure: the watermarking channel, repeated: embedder, two additive noise sources, detector, with x limited by the power constraint.]

Communications with side information

Host signal need not interfere with watermark message

Potential for much greater payloads

Dirty Paper Coding

[Figure: Writing on dirty paper]

Watermarking with side information

Communications: one-to-one mapping between message and code

Communications with side information: one-to-many mapping between message and codes

Implementations:

Quantization index modulation (Chen and Wornell)

Dirty paper trellis coding
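
The one-to-many mapping can be illustrated with scalar quantization index modulation. The sketch below is a simplified, assumed form of the idea (one bit per sample, two interleaved uniform quantizers, a hypothetical step size `delta`); it is not taken from the slides. Each bit owns a whole lattice of codes, and the embedder picks the lattice point nearest to the cover sample, so the cover acts as side information and does not interfere with the message.

```python
def qim_embed(sample, bit, delta=8.0):
    """Quantize `sample` to the nearest point of the lattice assigned to `bit`."""
    offset = bit * delta / 2.0   # bit 0: multiples of delta; bit 1: shifted by delta/2
    return round((sample - offset) / delta) * delta + offset

def qim_decode(received, delta=8.0):
    """Return the bit whose lattice has the point closest to `received`."""
    distance = [abs(received - qim_embed(received, bit, delta)) for bit in (0, 1)]
    return 0 if distance[0] <= distance[1] else 1

cover = [12.3, 77.0, 130.6, 201.9]   # hypothetical cover samples
bits = [1, 0, 0, 1]
marked = [qim_embed(s, b) for s, b in zip(cover, bits)]
decoded = [qim_decode(m + 1.5) for m in marked]   # noise smaller than delta/4
```

With this layout the embedding distortion is at most `delta / 2` per sample, and decoding survives any perturbation smaller than `delta / 4`.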

END OF PART ONE

Steganography

Steganography is the practice of undetectably altering a Work to embed a message.

Motivation:

Spies

Dissidents

Terrorism

Organized crime (little or no evidence to support this motivation)

Child pornography (little or no evidence to support this motivation)

The Technical Mujahid

TABLE OF CONTENTS

Section 1: Covert Communications and Hiding Secrets Inside Images

Section 2: Designing Jihadi Websites from A-Z

Section 3: Smart Weapons, Short Range Shoulder-Fired Missiles

Section 4: The Secrets of the Mujahideen, an Inside Perspective

Section 5-6: Video Technology and Subtitling Video Clips

send technical articles to

http://www.teqanymag.arabform.com

History of steganography

Herodotus: tattooing a slave’s shaved head

Aeneas the Tactician: modifying the height of letters, marking letters with holes

Cardan’s Grille

Francis Bacon: italic and normal fonts

SALT-II: signed June 18, 1979 by Jimmy Carter and Leonid Brezhnev

The Prisoners’ Problem (G. J. Simmons)

[Figure: Alice’s embedder hides the message m in a cover Work to produce y, which passes the Warden on its way to Bob’s extractor, which recovers the message m’. y is limited by a statistical constraint.]

The science behind steganography

Anderson and Petitcolas “Thought experiment”

Imagine a perfect compressor for music

Compressor: music in → random string out

Decompressor: random string in → music out

Then take message and encrypt it

Input the encrypted message into the decompressor

Outputs music!

Alice sends the music to Bob

Bob now compresses the music

Output is the encrypted message

Decrypts to obtain the message

Statistical Steganography

Cachin provided the first information-theoretic definition of steganographic security

Perfectly secure: $D_{KL}(P_c \,\|\, P_s) = 0$

ε-secure: $D_{KL}(P_c \,\|\, P_s) \le \varepsilon$
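
As a small worked example of the definition, the Kullback-Leibler divergence between a cover distribution and a stego distribution can be computed directly. The two distributions below are hypothetical (a cover whose LSB is 0 with probability 0.6, and a stego Work whose LSB is uniform), and the function name `kl_divergence` is an assumption of this sketch. The result is in bits because the logarithm is base 2.

```python
import math

def kl_divergence(p_cover, p_stego):
    """D_KL(P_c || P_s) in bits for distributions given as {value: probability}."""
    total = 0.0
    for value, p in p_cover.items():
        if p == 0.0:
            continue                      # terms with P_c = 0 contribute nothing
        q = p_stego.get(value, 0.0)
        if q == 0.0:
            return math.inf               # cover value that stego Works never produce
        total += p * math.log2(p / q)
    return total

p_cover = {0: 0.6, 1: 0.4}   # hypothetical LSB distribution of cover Works
p_stego = {0: 0.5, 1: 0.5}   # LSBs after embedding uniformly random bits
epsilon = kl_divergence(p_cover, p_stego)    # about 0.029 bits
perfect = kl_divergence(p_cover, p_cover)    # identical distributions give 0
```

In this toy case the scheme would be ε-secure only for ε of roughly 0.029 bits or more per sample, and perfectly secure only if embedding left the distribution unchanged.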

Assumes the Warden knows the distribution of cover Works, i.e. $P_c$

LSB embedding

One of the earliest forms of digital steganography

Simply overwrite the least significant bit (LSB) of each sample with a bit of the hidden message

Assumes that the LSBs are random

They’re not!
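
A minimal sketch of LSB embedding, assuming 8-bit pixel values held in a plain list and one message bit per pixel; the function names `lsb_embed` and `lsb_extract` are hypothetical.

```python
def lsb_embed(pixels, bits):
    """Overwrite the least significant bit of the first len(bits) pixels."""
    if len(bits) > len(pixels):
        raise ValueError("message longer than cover")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit   # clear the LSB, then set it to the message bit
    return stego

def lsb_extract(pixels, count):
    """Read back the LSBs of the first `count` pixels."""
    return [p & 1 for p in pixels[:count]]

cover = [200, 13, 14, 255, 0]
message = [1, 1, 0, 0, 1]
stego = lsb_embed(cover, message)      # [201, 13, 14, 254, 1]
recovered = lsb_extract(stego, len(message))
```

Each pixel changes by at most one grey level, which is why the change is imperceptible even though it is statistically detectable.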

LSB steganalysis

Histogram attack: H(i) is the frequency of intensity i

Assume we embed a bit in every LSB

Then half the time we change an odd number to an even number, e.g. 1→0, 3→2, …

And half the time we change an even number to an odd number, e.g. 0→1, 2→3, …

Then, in expectation, $H_s(2i) = H_s(2i+1)$ for $i = 0, \dots, 127$
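
The histogram attack can be sketched as a comparison of each pair of bins (2i, 2i+1). The statistic below is a chi-square-style sum chosen for this illustration, and the synthetic cover (rounded Gaussian intensities) is an assumption; neither is taken from the slides. A cover with a sloping histogram has unequal pairs, while full LSB replacement with random bits equalizes them.

```python
import random

def histogram(pixels):
    """H(i): the number of pixels with intensity i, for i = 0..255."""
    counts = [0] * 256
    for p in pixels:
        counts[p] += 1
    return counts

def pair_statistic(pixels):
    """Sum over pairs of (H(2i) - H(2i+1))^2 / (H(2i) + H(2i+1)); small when pairs are equalized."""
    h = histogram(pixels)
    total = 0.0
    for i in range(128):
        even, odd = h[2 * i], h[2 * i + 1]
        if even + odd > 0:
            total += (even - odd) ** 2 / (even + odd)
    return total

rng = random.Random(7)
# Synthetic cover: intensities clustered around 100, so adjacent bins differ.
cover = [min(255, max(0, round(rng.gauss(100, 5)))) for _ in range(50000)]
# Embed a random bit in every LSB.
stego = [(p & ~1) | rng.getrandbits(1) for p in cover]
cover_stat = pair_statistic(cover)
stego_stat = pair_statistic(stego)
```

A Warden would flag a Work whose statistic is suspiciously small; in this synthetic setup `stego_stat` comes out well below `cover_stat`.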

First-order statistics

Stochastic modulation

Maintains first-order statistics

But not higher-order statistics

Various algorithms are available, e.g. OutGuess

Model-based steganography

Sallee

Split cover Work, c, into two parts: $c_a$ (unaltered) and $c_b$ (altered)

Model the conditional distribution $P(c_b \mid c_a)$

Generate distribution using an arithmetic entropy encoder/decoder

Choosing the cover text

Message more important than the cover Work

Given a message, we can choose which cover Work to hide it in

Not possible for watermarking

Correlated steganography

If our hidden message is an image, X, choose a cover image, Y, that is similar

The number of bits needed to encode X is H(X)

The number of bits needed to encode X given Y is H(X|Y)

Thus the number of bits needed to encode the hidden image may be much less
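
A toy calculation shows how much can be saved. It assumes X and Y are single binary symbols with a hypothetical joint distribution in which they agree 90% of the time; real images would need a far richer model. The identity used is H(X|Y) = H(X,Y) − H(Y).

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits of a collection of probabilities."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def marginal(joint, axis):
    """Marginal of X (axis=0) or Y (axis=1) from a joint distribution {(x, y): p}."""
    totals = {}
    for pair, p in joint.items():
        totals[pair[axis]] = totals.get(pair[axis], 0.0) + p
    return totals

# Hypothetical joint distribution: X equals Y with probability 0.9.
joint = {(0, 0): 0.45, (1, 1): 0.45, (0, 1): 0.05, (1, 0): 0.05}

h_x = entropy(marginal(joint, 0).values())                                   # H(X), about 1 bit
h_x_given_y = entropy(joint.values()) - entropy(marginal(joint, 1).values())  # H(X|Y), about 0.469 bits
```

Under these assumptions, hiding X in a correlated cover Y needs less than half the bits per symbol that hiding X in an unrelated cover would.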

Minimizing the embedding distortion

Matrix embedding

“Wet paper” codes
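
To make matrix embedding concrete, here is a minimal sketch of a common textbook construction based on the parity-check matrix of a small binary Hamming code. It is assumed here for illustration and is not taken from the slides: two message bits are carried by three cover LSBs while changing at most one of them. The function names are hypothetical.

```python
from itertools import product

def syndrome(bits):
    """XOR of the 1-based positions holding a 1; the columns of H are 1, 2, 3 in binary."""
    s = 0
    for position, bit in enumerate(bits, start=1):
        if bit:
            s ^= position
    return s

def matrix_embed(cover_bits, message):
    """Embed a 2-bit message (an integer 0..3) in 3 cover bits, flipping at most one bit."""
    stego = list(cover_bits)
    difference = syndrome(stego) ^ message
    if difference:                      # zero means the cover already carries the message
        stego[difference - 1] ^= 1
    return stego

def matrix_extract(stego_bits):
    """The receiver just recomputes the syndrome."""
    return syndrome(stego_bits)

# Exhaustive check over every cover and every message.
results = []
for cover in product((0, 1), repeat=3):
    for message in range(4):
        stego = matrix_embed(cover, message)
        changes = sum(c != s for c, s in zip(cover, stego))
        results.append((matrix_extract(stego) == message, changes))
```

Plain LSB replacement would change one bit per message bit half the time on average, so carrying two bits with at most one change reduces the embedding distortion.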

Communications with side information

Coding for defective memory

Imagine I have a USB memory stick

It can store 3 bits of information

Worse still, it’s faulty

One of the bits is stuck at “1”

I can therefore send you 2 bits of information

But which 2 bits did I send?
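
One way to answer that question is a code in which every 2-bit message has two 3-bit codewords that are complements of each other, so at least one of them agrees with the stuck cell. The specific mapping below (message = XOR of adjacent cells) is an assumption chosen for this sketch; the slides do not spell out their mapping. The writer knows which cell is stuck, the reader does not need to.

```python
from itertools import product

def decode(cells):
    """Reader side: no knowledge of the defect is needed."""
    return (cells[0] ^ cells[1], cells[1] ^ cells[2])

def encode(message, stuck_index, stuck_value=1):
    """Writer side: pick the codeword for `message` that agrees with the stuck cell."""
    for cells in product((0, 1), repeat=3):
        if decode(cells) == tuple(message) and cells[stuck_index] == stuck_value:
            return cells
    raise ValueError("no compatible codeword")   # cannot happen for this code

# Every message can be written whichever cell is stuck at 1.
stored = {}
for message in product((0, 1), repeat=2):
    for stuck_index in range(3):
        stored[(message, stuck_index)] = encode(message, stuck_index)
```

For example, with the middle cell stuck at 1, the message (1, 1) is stored as 010 and the message (0, 0) as 111.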

Matrix embedding/Wet paper coding

[Figure: the eight 3-bit codewords (000, 001, 010, 011, 100, 101, 110, 111) are mapped to the four 2-bit messages (00, 01, 10, 11); the final build adds the codeword 010.]

Wet paper codes

Several differences between coding for defective memory and steganography:

The number of “stuck at” elements is much greater

The number of “stuck at” elements varies

Real-time performance not required

Syndrome codes

LT codes

Blind steganalysis

How does the Warden know the distribution of cover Works, $P_c$?

Analytic models

Machine learning: neural networks, support vector machines (SVM), etc.

END OF PART TWO

Content forensics

How can we be certain that an image, audio conversation or video is authentic?

Active technology: insert an authentication signature (cryptography, watermarking)

Passive (non-intrusive) technology: content analysis

Recent history

2003: LA Times

2004: US Democratic Presidential Nomination

2005: The Star, May 2005

2006: Reuters, August 2006

Digital forensics

On August 7th, 2006, Reuters withdrew all 920 photographs by Adnan Hajj, a freelance Lebanese photographer, from its database after a review showed that he had altered two images.

Content forensics

Like steganalysis, look for statistical anomalies in the content.

Content forensics: source identification

Identify which camera took an image

CCD cameras exhibit several sources of noise: dark current, shot noise, photoresponse non-uniformity noise (PRNU)

Estimate the PRNU (a naturally occurring watermark) and detect it using correlation

Lukáš, Fridrich and Goljan; Bayram, Sencar and Memon
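
The correlation step can be sketched as follows. This is a heavily simplified illustration, not the published method: the camera fingerprints are simulated as random vectors, the noise residual of the test image is simulated directly as fingerprint plus noise (a real system would obtain it with a denoising filter and treat the PRNU as multiplicative), and all names are hypothetical.

```python
import math
import random

def normalized_correlation(a, b):
    """Correlation coefficient between two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    numerator = sum(x * y for x, y in zip(da, db))
    denominator = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return numerator / denominator

rng = random.Random(9)
num_pixels = 5000
fingerprint_a = [rng.gauss(0, 1) for _ in range(num_pixels)]   # simulated PRNU of camera A
fingerprint_b = [rng.gauss(0, 1) for _ in range(num_pixels)]   # simulated PRNU of camera B
# Simulated noise residual of an image taken with camera A.
residual = [k + rng.gauss(0, 3) for k in fingerprint_a]

score_a = normalized_correlation(residual, fingerprint_a)   # clearly positive
score_b = normalized_correlation(residual, fingerprint_b)   # close to zero
```

The image is attributed to the camera whose fingerprint gives a correlation above a chosen threshold, exactly as a watermark detector would decide.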

Content forensics: detecting re-sampling

Re-sampling: linear, bi-cubic, etc.

Introduces correlation between neighboring pixels

Use the EM algorithm to estimate both the re-sampling amount and the correlation (Popescu and Farid)
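
The correlation that re-sampling introduces is easy to see in one dimension. The sketch below assumes the simplest case, upsampling by a factor of two with linear interpolation, where the re-sampling pattern is known in advance; it does not implement the EM estimation, which is needed when the pattern is unknown. The function names are hypothetical.

```python
import random

def upsample_linear(signal):
    """Double the sampling rate, filling each new sample by linear interpolation."""
    out = []
    for left, right in zip(signal, signal[1:]):
        out.extend([left, (left + right) / 2.0])
    out.append(signal[-1])
    return out

def neighbour_residual(signal):
    """Mean absolute error when predicting odd-indexed samples from their two neighbours."""
    errors = [abs(signal[i] - (signal[i - 1] + signal[i + 1]) / 2.0)
              for i in range(1, len(signal) - 1, 2)]
    return sum(errors) / len(errors)

rng = random.Random(11)
original = [float(rng.randrange(256)) for _ in range(500)]   # stand-in for a row of pixels
resampled = upsample_linear(original)

original_error = neighbour_residual(original)     # large: neighbours do not predict the sample
resampled_error = neighbour_residual(resampled)   # zero: every other sample is a neighbour average
```

In a re-sampled image a periodic subset of pixels is an exact linear combination of its neighbours, and that periodicity is what the detector looks for.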

Content forensics: double JPEG compression

Introduces artifacts in the histogram of the DCT coefficients
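
The histogram artifacts can be reproduced with a few lines of arithmetic. The sketch assumes synthetic DCT coefficients drawn from a Gaussian and hypothetical quantization steps of 7 (first compression) and 3 (second compression); actual JPEG files use per-frequency quantization tables.

```python
import random

def quantize(values, step):
    """Map each coefficient to its integer quantization level."""
    return [round(v / step) for v in values]

def dequantize(levels, step):
    """Reconstruct coefficient values from quantization levels."""
    return [level * step for level in levels]

def empty_bins(levels, low=-10, high=10):
    """Histogram bins in [low, high] that no coefficient falls into."""
    present = set(levels)
    return [b for b in range(low, high + 1) if b not in present]

rng = random.Random(3)
coefficients = [rng.gauss(0, 20) for _ in range(20000)]   # synthetic DCT coefficients

single = quantize(coefficients, 3)                                    # compressed once
double = quantize(dequantize(quantize(coefficients, 7), 7), 3)        # compressed twice

gaps_single = empty_bins(single)
gaps_double = empty_bins(double)
```

After a single compression every bin near zero is populated, whereas after double compression only levels reachable from multiples of 7 survive, leaving a periodic pattern of empty bins.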

Content forensics

Detecting lighting inconsistencies: estimate light source direction

Content forensics: copy-move forgery

Copy-move forgery: copy a portion of the image and paste it in another location; this introduces correlation!
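
A minimal sketch of how that correlation can be found, assuming an uncompressed image held as a list of rows and looking only for exactly identical blocks. Practical detectors match robust block features instead, so that the copy is still found after compression or retouching; the function name and the forged test image are hypothetical.

```python
import random

def find_duplicate_blocks(image, size):
    """Return pairs of top-left corners whose size x size blocks are identical."""
    seen = {}
    matches = []
    height, width = len(image), len(image[0])
    for row in range(height - size + 1):
        for col in range(width - size + 1):
            block = tuple(tuple(image[row + r][col:col + size]) for r in range(size))
            if block in seen:
                matches.append((seen[block], (row, col)))
            else:
                seen[block] = (row, col)
    return matches

rng = random.Random(5)
image = [[rng.randrange(256) for _ in range(32)] for _ in range(32)]
# Forge the image: copy the 8x8 region at (2, 3) to (20, 18).
for r in range(8):
    for c in range(8):
        image[20 + r][18 + c] = image[2 + r][3 + c]

matches = find_duplicate_blocks(image, 4)
offsets = {(dst[0] - src[0], dst[1] - src[1]) for src, dst in matches}
```

Many matching blocks that all share one displacement, here (18, 15), are the signature of a copied region.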

THE END
