
Data Hiding

Watermarking and Steganography

Outline

• Introduction to Data Hiding
• Watermarking
  – Definition and History
  – Applications
  – Basic Principles
  – Requirements
  – Attacks
  – Evaluation and Benchmarking
  – Examples
• Steganography
  – Definition and History
  – Applications
  – Basic Principles
  – Examples of Techniques
  – Demos

Data Hiding

• Information Hiding is a general term encompassing many sub-disciplines

• Two important sub-disciplines are steganography and watermarking
  – Steganography: hiding means keeping the existence of the information secret
  – Watermarking: hiding means making the information imperceptible

• Information hiding is different from cryptography (cryptography is about protecting the content of messages)

[Figure: a secret message and a carrier document enter the embedding algorithm (with a key); the result is transmitted via a network to a detector (with a key), which outputs the secret message]

The Need for Data Hiding

• Covert communication using images (secret message is hidden in a carrier image)
• Ownership of digital images, authentication, copyright
• Data integrity, fraud detection, self-correcting images
• Traitor-tracing (fingerprinting video-tapes)
• Adding captions to images, additional information such as subtitles to video, embedding subtitles or audio tracks in video (video-in-video)
• Intelligent browsers, automatic copyright information, viewing a movie in a given rated version
• Copy control (secondary protection for DVD)

Issues in Data Hiding

• Perceptibility: does embedding information “distort” cover medium to a visually unacceptable level (subjective)

• Capacity: how much information can be hidden relative to its perceptibility (information theory)

• Robustness to attacks: can embedded data survive manipulation of the stego medium in an effort to destroy, remove, or change the embedded data

• Trade-offs between the three:
  – More robust => lower capacity
  – Lower perceptibility => lower capacity, etc.

[Figure: example applications: covert communication; copyright protection of images (authentication); fingerprinting (traitor-tracing); adding captions to images and additional information, such as subtitles, to videos; image integrity protection (fraud detection); copy control in DVD; intelligent browsers, automatic copyright information, viewing movies in a given rated version]

Requirements

[Figure: requirements rated from low to high for each application: capacity, robustness, invisibility, security, embedding complexity, detection complexity]

The “Magic” Triangle

There is a trade-off between capacity, invisibility, and robustness.

[Figure: triangle with corners Security, Robustness, and Capacity, locating naïve steganography, secure steganographic techniques, and digital watermarking]

Additional factors:
• Complexity of embedding / extraction
• Undetectability

Two kinds of redundancy make data hiding possible:
• Information-theoretic: removed by lossless compression
• Perceptual: removed by lossy compression

[Figure: the original image combined (+) with patterns of 2, 5, and 31 gray levels, and the resulting images (=)]

Watermarking

• Intent: data embedding conveys some information about the cover medium such as owner, copyright, or other information

• Watermark can be considered to be an extended attribute of the data

• Robustness of watermark is a main issue

• Observers know that a watermark may be there

• Can be visible or invisible

Steganography

• Intent: transmit secret message hidden in innocuous-looking cover medium so that its existence is undetectable

• Robustness not typically an issue

• Capacity desired for message is large

• Always invisible

• Typically dependent on file format

Watermarking

Watermarking: Definition

• Watermarking is the practice of imperceptibly altering a cover to embed a message about that cover

• Watermarking is closely related to steganography, but there are differences between the two

– In watermarking, the message is related to the cover

– Steganography typically relates to covert point-to-point communication between two parties. Therefore, steganography requires only limited robustness

– Watermarking is often used whenever the cover is available to parties who know of the existence of the hidden data and may have an interest in removing it

– Therefore, watermarking has the additional notion of resilience against attempts to remove the hidden data

• Watermarks are inseparable from the cover in which they are embedded. Unlike cryptography, watermarks can protect content even after it has been decoded.

Watermarking: History

• More than 700 years ago, watermarks were used in Italy to indicate the paper brand and the mill that produced it

• By the 18th century, watermarks began to be used as anti-counterfeiting measures on money and other documents

• The term watermark was introduced near the end of the 18th century. It was probably given because the marks resemble the effects of water on paper

• The first example of a technology similar to digital watermarking is a patent filed in 1954 by Emil Hembrooke for identifying works

• In 1988, Komatsu and Tominaga appear to be the first to use the term "digital watermarking"

• About 1995, interest in digital watermarking began to mushroom

Motivation

• The rapid revolution in digital multimedia and the ease of generating identical, unauthorized copies of digital data.
  – USA Today, Jan. 2000: estimated lost revenue from digital audio piracy US $8,500,000,000.00

• The need to establish reliable methods for copyright protection and authentication.

• The need to establish secure invisible channels for covert communications.

• Adding captions and other additional information.

Watermarking: Applications

• Copyright protection
  – Most prominent application
  – Embed information about the owner to prevent others from claiming copyright
  – Requires a very high level of robustness

• Copy protection
  – Embed a watermark to disallow unauthorized copying of the cover
  – For example, a compliant DVD player will not play back data that carry a "copy never" watermark

• Content authentication
  – Embed a watermark to detect modifications to the cover
  – The watermark in this case has low robustness ("fragile")

Watermarking: Basic principles

Watermarking: Requirements

• Imperceptibility
  – The modifications caused by watermark embedding should be below the perceptible threshold

• Robustness
  – The ability of the watermark to resist distortion introduced by standard or malicious data processing

• Security
  – A watermark is secure if knowing the algorithms for embedding and extracting does not help an unauthorized party to detect or remove the watermark

Digital Watermarking - Examples

• Text – varying spaces after punctuation, spaces in between lines of text, spaces at the end of sentences, etc.

• Audio – low bit coding, random imperceptible noise, fragile & robust, etc.

• Images – least-significant bit, random noise, masking and filtering, etc.

Digital Watermarking – Qualities/Types

Effect on quality of original content – how does watermarking technique impact level of degradation and what is the level of acceptability with the degradation

Visible vs. invisible – visible such as a company logo stamped on an image or movie or invisible and imperceptible

Fragile vs. robust – fragile watermarks break down easily whereas robust survive manipulations of content (in some watermarking of audio files, both are used)


Public vs. private – private watermarking techniques require the original in order to extract the watermark, whereas public techniques do not

Public-key vs. secret-key – secret-key watermarking uses the same key to embed the watermark in the image and to read it; public-key watermarking uses different keys for watermarking the image and for reading the watermark

Digital watermarks categories

Robust watermark: used for copyright protection.

Requirements: the watermark should be permanently attached to the host signal; removing the watermark should destroy the perceptual quality of the signal.

Fragile watermark: used for tamper detection or as a digital signature.

Requirements: breaks very easily under any modification of the host signal.

Semi-fragile watermark: used for data authentication.

Requirements: robust to some benign modifications, but breaks very easily under other attacks.

Provides information about the location and nature of the attack.

Copyright protection of digital images (authentication)

[Figure: original + watermark = watermarked image]

• Robustness against all kinds of image distortion
• Robustness to intentional removal even when all details about the watermarking scheme are known (Kerckhoffs's principle)
• Watermark pattern must be perceptually transparent
• Watermark depends on a secret key
• Robustness to over-watermarking, collusion, and other attacks

• Ownership is proved by showing that an image in question contains a watermark that depends on the owner’s secret key
• If a pirate embeds his own watermark, the ownership can be resolved by producing the original image or the watermarked image (neither contains the pirate’s watermark)

Detectable watermark: a pseudo-random sequence is either present or not present (1 bit embedded)

Readable watermark: one can recover a short message, e.g. info about the owner (100 bits)

Proving ownership using a digital watermark

Robust, secure, invisible watermark, resistant to the collusion attack (averaging copies of documents with different marks).

Fingerprinting or traitor tracing

Marking copies of one document with a customer signature.

[Figure: the original is combined with watermarks W1, W2, …, WN, one for each of N customers]

Typical applications:
• Adding subtitles in multiple languages
• Additional audio tracks to video
• Tracking the use of the data (history file)
• Adding comments, captions to images

Watermark requirements:
• Moderately robust scheme
• Robustness with respect to lossy compression, noise adding, and A/D D/A conversion
• Original images (frames) not available for message extraction
• Security requirement not so strong
• Fast detection; watermark embedding can be more time consuming

Adding captions to images, additional information to videos

In the spatial domain, the watermark is embedded by directly modifying the pixel values.

Watermarking for color images:
• One or more selected color channels
• Luminance

Oblivious vs. non-oblivious watermarking:
• Non-oblivious: the original image is needed for extraction
• Oblivious: the original image is not necessary

In the transform domain, the watermark is embedded in the transform space by modifying coefficients.

[Figure: DCT, modify DCT coefficients, inverse DCT]

Watermarking principles

NEC Scheme

Watermark embedding: the 1000 highest-energy DCT coefficients $v_k$ are modulated with a Gaussian random sequence $w_k \sim N(0,1)$:

$$v_k' = v_k (1 + a w_k),$$

where $v_k'$ are the modified DCT coefficients and $a$ is the watermark strength, which also directly influences watermark visibility.

Watermark detection:
• Subtract the original image from the watermarked (attacked) image, and extract the watermark sequence $w'$ (it may be corrupted due to image distortion)
• Correlate $w'$ with the original watermark sequence $w$:

$$\mathrm{sim}(w, w') = \frac{w \cdot w'}{\sqrt{w' \cdot w'}}$$

$\mathrm{sim}(w, w')$ is called the similarity:
• $\mathrm{sim}(w, w') > Th$ => watermark is present
• $\mathrm{sim}(w, w') < Th$ => watermark is not present
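The embedding rule and the similarity test can be sketched in a few lines of Python. This is a minimal illustration, not the NEC implementation: the DCT step is omitted, the coefficients are synthetic Gaussian values, and the seed, the strength `a = 0.1`, and the threshold `TH = 6.0` are assumed values chosen only for the demonstration.

```python
import math
import random

def embed(coeffs, watermark, a=0.1):
    """Multiplicative rule v_k' = v_k * (1 + a * w_k)."""
    return [v * (1 + a * w) for v, w in zip(coeffs, watermark)]

def extract(attacked, original, a=0.1):
    """Non-oblivious extraction: invert the embedding rule using the original."""
    return [(va - v) / (a * v) if v != 0 else 0.0
            for va, v in zip(attacked, original)]

def sim(w, w_extracted):
    """Similarity sim(w, w') = (w . w') / sqrt(w' . w')."""
    dot = sum(x * y for x, y in zip(w, w_extracted))
    norm = math.sqrt(sum(y * y for y in w_extracted))
    return dot / norm if norm else 0.0

rng = random.Random(1)                              # assumed seed
v = [rng.gauss(0, 50) for _ in range(1000)]         # stand-in for DCT coefficients
w = [rng.gauss(0, 1) for _ in range(1000)]          # embedded watermark
w_other = [rng.gauss(0, 1) for _ in range(1000)]    # unrelated sequence

v_marked = embed(v, w)
w_extracted = extract(v_marked, v)

TH = 6.0                                            # assumed threshold
present = sim(w, w_extracted) > TH                  # correct watermark
absent = sim(w_other, w_extracted) > TH             # wrong watermark
```

Without distortion the extracted sequence equals $w$, so the similarity is about $\sqrt{w \cdot w}$ (roughly $\sqrt{1000}$ here), while an unrelated sequence gives a value that behaves like a standard normal variable.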

Watermark detection

Patchwork (Bender, Gruhl, and Morimoto)

• Initialize a PRNG with a secret key
• Randomly select $n$ pixel pairs with grayscales $a_i$ and $b_i$
• Set $a_i \leftarrow a_i + 1$ and $b_i \leftarrow b_i - 1$
• Use $S$ to verify watermark presence:

$$S = \sum_{i=1}^{n} (a_i - b_i)$$

Hypothesis testing is used to confirm the presence of the watermark at a certain confidence level:
• $S \approx 0$ with $\sigma \approx 104.5\sqrt{n}$ if no watermark is present
• $S \approx 2n$ if the watermark is present

Set the threshold Th to adjust the probabilities of false alarms and missed detections.

Using patches of pixels rather than single pixels improves robustness.
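A small Python sketch of the Patchwork steps follows. It assumes a flat list of grayscale values as the image, disjoint pixel pairs, and synthetic data restricted to 1..254 so that no value is clipped; the key string and sizes are illustrative only.

```python
import random

def select_pairs(key, num_pixels, n):
    """Key-seeded choice of n disjoint pixel pairs (indices into a flat image)."""
    idx = random.Random(key).sample(range(num_pixels), 2 * n)
    return list(zip(idx[:n], idx[n:]))

def embed(pixels, key, n):
    """Raise each a_i by 1 and lower each b_i by 1."""
    marked = list(pixels)
    for i, j in select_pairs(key, len(pixels), n):
        marked[i] = min(255, marked[i] + 1)   # a_i <- a_i + 1
        marked[j] = max(0, marked[j] - 1)     # b_i <- b_i - 1
    return marked

def statistic(pixels, key, n):
    """S = sum over the selected pairs of (a_i - b_i)."""
    return sum(pixels[i] - pixels[j]
               for i, j in select_pairs(key, len(pixels), n))

rng = random.Random(7)                                # assumed seed
image = [rng.randint(1, 254) for _ in range(20000)]   # synthetic grayscale data
KEY, N = "secret", 5000                               # assumed key and pair count

marked = embed(image, KEY, N)
shift = statistic(marked, KEY, N) - statistic(image, KEY, N)
```

Embedding shifts $S$ by exactly $2n$ when nothing is clipped. Whether that shift is detectable depends on how $2n$ compares with the spread $104.5\sqrt{n}$ of $S$ for an unmarked image, which is why the threshold has to be chosen with the false-alarm rate in mind.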

Direct Spread Spectrum in Spatial Domain

Frequency Based Spread Spectrum Watermarking

Watermark embedding:

• Transform the image using DCT, DFT, Hadamard, wavelet, or key-dependent random transformations
• Select $n$ coefficients to be modified:
  – the most perceptually important coefficients
  – a fixed band depending on image size
  – key-dependent selection (frequency hopping)
• Generate a pseudo-random watermark sequence $w_1, \ldots, w_n$
• Modulate the selected coefficients $v_k$, $k = 1, \ldots, n$:
  – $v_k' = v_k + a w_k$ (Ruanaidh et al.)
  – $v_k' = v_k + a v_k w_k$ (Cox et al.)
  – $v_k' = v_k + a |v_k| w_k$ (Piva et al.)
• Use the inverse transform to get the watermarked image

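Watermark detection using correlation

[Figure: transform coefficients of the original image $v_k$, the watermarked image $v_k'$, and the attacked watermarked image $v_k''$]

Non-oblivious schemes: watermark approximation $u_k$ for each embedding rule
• $v_k' = v_k + a w_k$: $u_k = (v_k'' - v_k)/a$
• $v_k' = v_k + a v_k w_k$: $u_k = (v_k'' - v_k)/(a v_k)$
• $v_k' = v_k + a |v_k| w_k$: $u_k = (v_k'' - v_k)/(a |v_k|)$

Then:
• Correlate $u_k$ with $w_k$
• Threshold the result
• Make a decision about watermark presence

Oblivious schemes, for $v_k' = v_k + a w_k$ and $v_k' = v_k + a |v_k| w_k$:
• Correlate $v_k''$ with $w_k$
• If no distortion is present, with $\sigma^2$ the variance of $w_k$ and $\overline{|v|}$ the mean absolute coefficient value:

$$corr = \sum_k v_k'' w_k = \sum_k (v_k + a w_k) w_k \approx a n \sigma^2$$

$$corr = \sum_k v_k'' w_k = \sum_k (v_k + a |v_k| w_k) w_k \approx a n \overline{|v|} \sigma^2$$

• If an incorrect noise sequence is used, $corr \approx 0$ with $\sigma_{corr}^2 \propto n$, which enables us to set a decision threshold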
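The oblivious case with the additive rule can be illustrated with a short Python sketch. The coefficients are synthetic, the watermark is a key-seeded sequence of +1/-1 values (variance 1), and the key strings, the strength `a`, and the threshold at half the expected correlation are assumptions made for this example.

```python
import random

def embed(coeffs, w, a):
    """Additive rule v_k' = v_k + a * w_k."""
    return [v + a * x for v, x in zip(coeffs, w)]

def correlate(coeffs, w):
    """corr = sum_k v_k'' * w_k, computed without the original image."""
    return sum(v * x for v, x in zip(coeffs, w))

def pn_sequence(key, n):
    """Key-seeded sequence of +1/-1 values."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]

n, a = 10000, 2.0                              # assumed size and strength
host = random.Random(3)                        # assumed seed
v = [host.gauss(0, 10) for _ in range(n)]      # stand-in transform coefficients
w = pn_sequence("owner-key", n)
marked = embed(v, w, a)

threshold = a * n / 2                          # half of the expected a * n * sigma^2
corr_right = correlate(marked, w)
corr_wrong = correlate(marked, pn_sequence("other-key", n))
```

With the correct sequence the correlation concentrates near $a n \sigma^2 = 20000$; with a wrong sequence it concentrates near zero, with a spread that grows only like $\sqrt{n}$.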

Frequency masking: the presence of a signal of one frequency can raise the perceptual threshold of signals with frequencies close to the masking frequency.

[Figure: masking signal, masked signal, and masking threshold plotted against frequency]

Spatial masking: image discontinuities also have the ability to mask small image distortions.

[Figure: luminance profile across an edge, with the masking threshold]

Perceptual Watermarking (Tewfik et al.)

(1) Image is divided into 8x8 blocks
(2) Each block is DCT transformed
(3) Frequency masking*) determines the JND for each frequency bin
(4) $v_k' = v_k + w_k \, \mathrm{JND}(b, k)$
(5) Block is inverse DCT transformed
(6) Spatial masking**) model verifies invisibility
    - If the changes are visible, the JND is rescaled; go to (4)

*) Foley, Legge frequency masking model
**) Girod’s spatial masking model

• Invisibility of the watermark guaranteed
• Increased watermark energy leads to higher robustness

Data Embedding in Video (Tewfik et al.)

• Very high capacity with medium robustness
• Useful for embedding video-in-video or audio-in-video without increasing the bandwidth or requiring two separate information streams
• Watermarked block: B’ = B + (p’ – p) DCT(S)

[Figure: an 8 x 8 block B and an 8 x 8 signature S are each DCT transformed and combined into the projection p; perceptual mask M, with T = min M; on the grid (k-1)T, kT, (k+1)T, p’ = kT – T/4 encodes 0 and p’ = kT + T/4 encodes 1]

Block Diagram of Video Watermarking

Robustness to geometric transformations

Easy if the original image is available (non-oblivious schemes)

Very challenging for oblivious schemes, especially for a combination of cropping, scaling, rotation, and shift

Approaches:
• Watermarking by small blocks (good for cropping)
• Embedding patterns with known geometry
• Watermarking using the Fourier-Mellin transform (scaling and rotation converted to shift)
• Embedding watermarks into image features or salient points

Weak points:
• Computational complexity
• More powerful geometric attacks (StirMark)

Forensic analysis

• Analysis of lighting and shadows
• Localized analysis of noise, histogram, and colors
• Looking for discontinuities

Fragile watermarks

Properties:
• Break easily
• Computationally cheap
• Good localization properties
• Too sensitive for redundant data

Examples:
• Embedding check-sums in the LSBs
• Adding m-sequences to image blocks

Steve Walton, “Information authentication for a slippery new age”, Dr. Dobb's Journal, vol. 20, no. 4, pp. 18–26, April 1995.

Fragile Watermarks for Tamper Detection

• A set of key-dependent random walks covering the image
• Choose a large integer N
• For each walk, add the gray values determined by the 7 most significant bits; denote the sum by S
• Embed the remainder S mod N into the LSBs of the walk
• Probability of making a compliant change is 1/N
• S could be made walk-dependent to prevent exchanging groups of pixels with the same check-sum

[Figure: a random walk through pixels 1 to 7; the 7 most significant bits of the pixel values p1, p2, p3, … are summed to S, and the check-sum S mod N is embedded in the LSBs]
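The check-sum scheme above can be sketched in Python as follows. The walk length of 16 pixels, the modulus N = 1021, the 16-bit binary layout of the check-sum, and the use of a shuffled index list as the set of key-dependent walks are all assumptions made for this illustration; the source does not fix these details.

```python
import random

WALK_LEN = 16      # pixels per walk (assumed)
N = 1021           # check-sum modulus (assumed); must fit in WALK_LEN bits

def walks(key, num_pixels):
    """Key-dependent partition of the pixel indices into walks."""
    idx = list(range(num_pixels))
    random.Random(key).shuffle(idx)
    return [idx[i:i + WALK_LEN]
            for i in range(0, num_pixels - WALK_LEN + 1, WALK_LEN)]

def checksum(pixels, walk):
    """S mod N, where S adds the values given by the 7 most significant bits."""
    return sum(pixels[i] & 0xFE for i in walk) % N

def embed(pixels, key):
    """Write each walk's check-sum, bit by bit, into the LSBs of that walk."""
    marked = list(pixels)
    for walk in walks(key, len(pixels)):
        c = checksum(marked, walk)
        for bit_pos, i in enumerate(walk):
            marked[i] = (marked[i] & 0xFE) | ((c >> bit_pos) & 1)
    return marked

def tampered_walks(pixels, key):
    """Return the numbers of the walks whose stored check-sum does not match."""
    bad = []
    for number, walk in enumerate(walks(key, len(pixels))):
        stored = sum((pixels[i] & 1) << bit_pos
                     for bit_pos, i in enumerate(walk))
        if stored != checksum(pixels, walk):
            bad.append(number)
    return bad

rng = random.Random(0)                          # assumed seed
image = [rng.randrange(256) for _ in range(64)]
marked = embed(image, "k")

tampered = list(marked)
tampered[5] ^= 0x80                             # flip the top bit of one pixel
```

Because the check-sum ignores the LSBs, writing it into them does not invalidate it. Any change to the upper bits of a pixel alters S for the walk containing that pixel, and the change goes unnoticed only if the new S happens to leave the same remainder modulo N.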

1. Overlay the fragile watermark

Three key-dependent binary-valued functions $f_R$, $f_G$, $f_B$,

$$f_{R,G,B} : \{0, 1, \ldots, 255\} \to \{0, 1\},$$

are used to encode a binary logo $B$. The gray scales are perturbed in such a manner that

$$B(i,j) = f_R(R(i,j)) \oplus f_G(G(i,j)) \oplus f_B(B(i,j)) \quad \text{for all } (i,j).$$

The image authenticity is verified by checking the same relationship for each pixel $(i,j)$.
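A minimal Python sketch of this logo-based fragile watermark is given below. It assumes that the three binary functions are combined by XOR, models them as key-seeded lookup tables, and perturbs only the red channel when a pixel does not yet encode its logo bit; these are simplifications chosen for the example, not details stated in the source.

```python
import random

def make_luts(key):
    """Three key-dependent binary functions f_R, f_G, f_B as lookup tables."""
    rng = random.Random(key)
    return [[rng.randint(0, 1) for _ in range(256)] for _ in range(3)]

def logo_bit(pixel, luts):
    """f_R(R) xor f_G(G) xor f_B(B) for one RGB pixel."""
    r, g, b = pixel
    return luts[0][r] ^ luts[1][g] ^ luts[2][b]

def embed(pixels, logo, luts):
    """Perturb the red value as little as possible until the pixel encodes its bit."""
    out = []
    for (r, g, b), bit in zip(pixels, logo):
        if logo_bit((r, g, b), luts) != bit:
            wanted = 1 - luts[0][r]             # flipping f_R flips the XOR
            r = min((x for x in range(256) if luts[0][x] == wanted),
                    key=lambda x: abs(x - r))
        out.append((r, g, b))
    return out

def verify(pixels, logo, luts):
    """Per-pixel check of the logo relationship."""
    return [logo_bit(p, luts) == bit for p, bit in zip(pixels, logo)]

luts = make_luts("key")                         # assumed key
rng = random.Random(0)                          # assumed seed
pixels = [tuple(rng.randrange(256) for _ in range(3)) for _ in range(16)]
logo = [i % 2 for i in range(16)]               # toy binary logo
marked = embed(pixels, logo, luts)
```

Verification needs only the key-dependent tables and the logo, and a mismatch at a pixel localizes the tampering to that pixel.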

[Figure: corresponding pixels of the original image are perturbed until f( ) = 1 matches the binary logo, giving the authenticated image]

Robust watermarks on small blocks

Properties:
• Medium robustness
• Insensitive to small changes
• Not as good localization properties
• Can distinguish malicious and non-malicious modifications

Examples:
• Spread spectrum watermarks on medium size blocks
• Wavelet domain watermarks

J. Fridrich, “Image Watermarking for Tamper Detection”, Proc. ICIP ’98, Chicago, Oct 1998.

2. Insert robust watermark into every block

[Figure: a robust bit extractor maps block B (64 pixels) to 50 bits; these bits, the secret key K, and the block number are used to synthesize a Gaussian sequence W(K, B), which is added to B to give the watermarked block]

Hybrid watermark

Properties:
• Fragile, sensitive, and robust
• Good localization properties
• Can distinguish malicious and non-malicious modifications

Examples:
• Robust watermarks on medium blocks combined with a fragile watermark

[Figure: the watermarked image "Lena" with outlined blocks and block numbers]

[Figure: retouched eyes. Presence of the robust watermark (above); the fragile watermark indicates tampered areas with black dots (below)]

[Figure: after brightness adjustment and JPG compression. Presence of the robust watermark (above); the fragile watermark indicates tampered areas with black dots (below)]

[Figure: replaced face and softened. Presence of the robust watermark (above); the fragile watermark indicates tampered areas with black dots (below)]

Self-embedding

Properties:
• Fragile
• Security problems
• Good localization properties
• Tampered areas can be fixed
• Easy to remove

Examples:
• Coding quantized DCT transformed blocks in distant blocks

J. Fridrich and M. Goljan, “Protection of Digital Images Using Self Embedding”, Symposium on Content Security and Data Hiding in Digital Media, New Jersey Institute of Technology, May 14, 1999.

• Content of block B1 is compressed and encoded in the LSBs of B2

• B1 and B2 are separated by a random vector p

Images with Self-correcting Capabilities

[Figure: self-embedding algorithm #1: quantization, then binary encoding of 11 coefficients (CODE1: 64 bits per block); self-embedding algorithm #2: quantization, then binary encoding of 21 coefficients plus up to 2 next nonzero coefficients (CODE2: 128 bits per block)]

For binary encoding, the bit lengths provided for encoding the 64 coefficients (with sign) are

L = [7 7 7 5 4 3 2 1
     7 6 5 5 4 2 1 0
     6 5 5 4 3 1 0 0
     5 5 4 3 1 0 0 0
     4 4 3 1 0 0 0 0
     3 2 1 0 0 0 0 0
     2 1 0 0 0 0 0 0
     1 0 0 0 0 0 0 0].

With these lengths, 11 coefficients take 64 bits.
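The 64-bit figure can be checked with a short Python calculation. The sketch assumes that the 11 coefficients are the first 11 in the usual JPEG-style zigzag scan of the 8 x 8 block; the source does not state the scan order, so this is an assumption.

```python
# Bit lengths per coefficient position, as listed above.
L = [
    [7, 7, 7, 5, 4, 3, 2, 1],
    [7, 6, 5, 5, 4, 2, 1, 0],
    [6, 5, 5, 4, 3, 1, 0, 0],
    [5, 5, 4, 3, 1, 0, 0, 0],
    [4, 4, 3, 1, 0, 0, 0, 0],
    [3, 2, 1, 0, 0, 0, 0, 0],
    [2, 1, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
]

def zigzag(n=8):
    """(row, col) positions of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)      # even diagonals run bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order

def bits_for_first(k):
    """Total code length of the first k coefficients in zigzag order."""
    return sum(L[r][c] for r, c in zigzag()[:k])

print(bits_for_first(11))  # 64
```

Under this assumption the first 11 positions use 7+7+7+6+6+7+5+5+5+5+4 = 64 bits, matching the stated code length per block.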

[Figure: original image; original image embedded in itself]

[Figure: embedded image (1 LSB encoding); embedded image (2 LSB encoding)]

Reconstruction of a license plate

• Tampered image: the license plate has been replaced with a different one
• The original license plate after reconstruction
• 2 LSBs have been used for self-embedding

Reconstruction after mosaic filtering

[Figure: manipulated image; secret key; reconstructed image]

Attacks

• Attacks are carried out with the intention of destroying the watermark so that the content can be used without paying royalties to its originator.

• A watermark must withstand various signal processing attacks:
  – Compression
  – Cropping, editing, composing
  – Printing
  – Adding small amounts of noise

Attack: Example

• Alice puts an image on her web page.
• Eve and Mallet copy the image and claim it as their own.
• All three appear before the Judge.
• Alice, using her image and Eve’s, extracts the watermark.
• Alice, using her image and Mallet’s altered one, extracts a noisy version of her watermark.
• Alice must convince the Judge that the noisy watermark is indeed hers and not a false alarm.

Watermark attacks

• Robustness attacks: intended to remove the watermark. JPEG compression, filtering, cropping, histogram equalization, additive noise, etc.

• Presentation attacks: cause watermark detection failure. Geometric transformation, rotation, scaling, translation, change of aspect ratio, line/frame dropping, affine transformation, etc.

• Counterfeiting attacks: render the original image useless, generate a fake original, deadlock problem.

• Court of law attacks: take advantage of legal issues.

Typical Attacks and Distortions used on Watermarks

• Enhancement: sharpening, contrast, color correction

• Additive and multiplicative noise: Gaussian, uniform, speckle

• Linear filtering: lowpass, highpass, bandpass

• Nonlinear filtering: median filters, rank filters, morphological filters

Typical Attacks and Distortions used to design Watermarks

• Lossy compression: JPEG, MPEG2, MPEG4, audio as well

• Geometric transformations: shifts, rotations, scaling, shearing (affine)

• Data reduction: cropping, clipping, histogram modification

• D/A and A/D conversion: print-scan, analog TV transmission

The IBM Attack (ownership deadlock)

[Figure: original X (belongs to Alice) + Alice’s watermark W1 = watermarked image Y (distributed image); false original X’ (belongs to Bob) + Bob’s watermark W2 = watermarked image Y (distributed image); the two distributed images are identical]

Bob generates a random watermark W2, subtracts it from Y (Y – W2 = X’), and so creates a false original X’:

X’ + W2 = Y = X + W1

X’ = X + W1 – W2, so X’ contains W1

X = X’ + W2 – W1, so X contains W2

The IBM Attack - solution

• Make the watermark dependent on the original image in a non-invertible way

X + W1(X) = Y

For example, W1(X) is a watermark generated from a PRNG seeded with a hash of X.

Creating a forgery amounts to solving the equation

Y – W1(Z) = Z

for the unknown Z.

• Another possibility is timestamping.
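The deadlock and the hash-based countermeasure can be illustrated with a small Python sketch. Images are modelled as short integer lists and watermarks as sequences of +1/-1 values; the helper names, the seeds, and the choice of SHA-256 are assumptions made only for this example.

```python
import hashlib
import random

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def random_mark(seed, n):
    """Seeded sequence of +1/-1 values."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(n)]

def mark_from_image(image):
    """W(X): PRNG seeded with a hash of X, so W cannot be chosen freely."""
    digest = hashlib.sha256(repr(image).encode()).digest()
    return random_mark(int.from_bytes(digest, "big"), len(image))

rng = random.Random(5)                        # assumed seed
x = [rng.randrange(256) for _ in range(64)]   # Alice's original X

# Invertible scheme: Bob fabricates X' = Y - W2.
w1 = random_mark("alice", 64)
y = add(x, w1)
w2 = random_mark("bob", 64)
fake = sub(y, w2)
deadlock = add(fake, w2) == y                 # Bob's claim is as consistent as Alice's

# Non-invertible scheme: the mark must equal W(original).
y_secure = add(x, mark_from_image(x))
fake2 = sub(y_secure, w2)
forgery_consistent = add(fake2, mark_from_image(fake2)) == y_secure
```

In the first case both parties can present an "original" plus a watermark that add up to Y. In the second case Bob's subtraction trick fails, because the watermark implied by his false original is not the one he subtracted; he would have to solve Y – W1(Z) = Z instead.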

Secure public watermark detector

The detector is implemented as a tamper-proof black box that takes integer matrices on its input and outputs one bit (watermark present or not).

Application: copy control in DVD players.

Assumptions: the attacker knows the watermarking algorithm and the detection algorithm, and has one watermarked image available, but does not have the secret built-in key.

Task: to obtain some knowledge about the secret key or to remove the watermark.

Attack (Cox, Linnartz, Kalker, Dijk, ...):
(1) Find a critical image by progressively deteriorating the image (for example, by replacing the pixel values one-by-one by the average gray level)
(2) Feed the detector with special images to reconstruct wk or to learn the sensitivity of the detection function to various pixels.

Many watermark detectors D correlate some quantities $x_k$ derived from the watermarked image $I$ with a secret sequence $w_k$:

$$D(I) = H\left(\sum_{k=1}^{N} x_k w_k - Th\right)$$

Th … threshold
H … Heaviside step function, H(x) = 1 for x > 0, H = 0 otherwise


Statistical attacks (Kalker)
The culprit: linearity of the watermark detector, and the ability to purposely modify the derived quantities through pixel modifications.

Sensitivity attacks (Cox, Linnartz et al.)
Determine the set of pixels with the largest influence on the watermark detector; attempt to remove the watermark by subtracting the set of sensitive pixels; iterate.

The culprit: the sensitivity of the watermark detector at the critical image is the same as, or at least positively correlated with, that for the watermarked image.

Observation:
In order to design a watermarking method with a detector that would not be vulnerable to those attacks, we need to mask the quantities that are being correlated so that they cannot be purposely changed through pixel values, and we must introduce nonlinearity into the scheme to prevent the sensitivity attack.

Key-dependent basis functions and a special nonlinear detection function may solve the problem.


Limitations of digital watermarking

• Digital watermarking does not prevent copying or distribution.

• Digital watermarking alone is not a complete solution for access/copy control or copyright protection.

• Digital watermarks cannot survive every possible attack.

Challenges in Watermarking research

• Lack of protocols, standards and benchmarking.

• Lack of comprehensive mathematical theory.

• Watermark survival for all attacks.

• Relating robustness, capacity, perceptual quality and security.

• Will it be used, and how will the legal system adopt it?

Trends in watermarking research

• Color image watermarking, and other multimedia signals.

• 2nd generation watermarking.

• Watermarking of maps, graphics, and cartoons.

• Information theoretic issues.

• Applications beyond copyright protection.

• Protocols and standardization.

Steganography

• Embed information in such a way that its very existence is concealed.

• Goal
  – Hide information in an undetectable way, both perceptually and statistically.
  – Security: prevent extraction of the hidden information.

• A different concept from cryptography, but it uses some of its basic principles.

HISTORY

• 440 B.C.
  – Histiaeus shaved the head of his most trusted slave and tattooed it with a message, which disappeared after the hair had regrown. The purpose was to instigate a revolt against the Persians.

• 1st and 2nd World Wars
  – German spies used invisible ink to print very small dots on letters.
  – Microdots: blocks of text or images scaled down to the size of a regular dot.

Early steganography

• Pictographs: e.g., Sherlock Holmes’s Dancing Men.

“Come Here At Once”

An Example: Null-Cipher

• Message sent by a German spy during World War I:

PRESIDENT’S EMBARGO RULING SHOULD HAVE IMMEDIATE NOTICE. GRAVE SITUATION AFFECTING INTERNATIONAL LAW. STATEMENT FORESHADOWS RUIN OF MANY NEUTRALS. YELLOW JOURNALS UNIFYING NATIONAL EXCITEMENT IMMENSELY.

Null Cipher: Solved!

• Taking the first letter of each word of the message gives:

PERSHING SAILS FROM NY JUNE I

Pershing sails from NY June I.
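The hidden text can be recovered mechanically by keeping the first letter of every word, as the following short Python sketch shows (the function name is illustrative).

```python
MESSAGE = (
    "PRESIDENT'S EMBARGO RULING SHOULD HAVE IMMEDIATE NOTICE. "
    "GRAVE SITUATION AFFECTING INTERNATIONAL LAW. "
    "STATEMENT FORESHADOWS RUIN OF MANY NEUTRALS. "
    "YELLOW JOURNALS UNIFYING NATIONAL EXCITEMENT IMMENSELY."
)

def first_letters(text):
    """Null-cipher decoding: keep only the first letter of each word."""
    return "".join(word[0] for word in text.split())

hidden = first_letters(MESSAGE)
print(hidden)  # PERSHINGSAILSFROMNYJUNEI
```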

Other Old Ideas

• Pinpricks in maps.
• Tattoos on scalp.
• Dotted I’s and crossed T’s.
• Hidden meanings: “Is father dead or deceased?”
• Deliberate misspellings or errors, e.g., errors in trivia books, log tables, etc.
• Unusual languages, e.g., Navajo; peculiar sounds used especially in guerrilla warfare (Genghis Khan)

The prisoners' problem

• Alice and Bob are in jail and wish to hatch an escape plan.

• All communication between Alice and Bob passes through the warden, Willy.

• Alice's and Bob's goal is to hide their ciphertext in an innocuous-looking way so that Willy will not become suspicious.

• If Willy is a passive warden, he will not do anything to Alice's and Bob's communication.

• If Willy is an active warden, he will alter the data being sent between Alice and Bob.

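Problem Formulation

[Figure: Alice sends “Hello” to Bob through the warden Wendy]

Terminology

[Figure: Alice feeds the cover message, the secret message, and the secret key into the embedding algorithm, which produces the stego message; Wendy asks "Is stego message?" and suppresses the message if yes; if no, it reaches Bob, whose message retrieval algorithm uses the secret key to recover the secret message]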

Steganography Techniques

• Substitution methods
  – Bit plane methods
  – Palette-based methods

• Signal processing methods
  – Transform methods
  – Spread spectrum techniques

• Coding methods
  – Quantizing, dithering
  – Error correcting codes

• Statistical methods: use hypothesis testing

• Cover generation methods: fractals

Stego-system Criteria

• Cover data should not be significantly modified, i.e., the changes should not be perceptible to the human perceptual system

• The embedded data should be directly encoded in the cover & not in wrapper or header

• Embedded data should be immune to modifications to cover

• Distortion cannot be eliminated so error-correcting codes need to be included whenever required

Places to Hide Information: Steganography

• Images

• Audio files

• Text

• Video

We focus on images as cover media, though most ideas apply to video and audio as well.

Steganography in Text

• Soft Copy Text
  – Encode data by varying the number of spaces after punctuation
  – Slight modifications of formatted text will be immediately apparent to anyone reading the text
  – Use of white space (tabs and spaces) is much more effective and less noticeable
  – This is the most common method for hiding data in text
  – Encode data in additional spaces placed at the end of a line:

      Four score and
      seven years ago
      our forefathers
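End-of-line white space can be sketched in Python as follows. The convention used here, one trailing space for a 1 bit and none for a 0 bit, is an assumption for illustration; practical tools may encode several bits per line.

```python
def hide(lines, bits):
    """Append one trailing space to a line for a 1 bit, none for a 0 bit."""
    if len(bits) > len(lines):
        raise ValueError("not enough lines to carry the message")
    padded = list(bits) + [0] * (len(lines) - len(bits))
    return [line.rstrip(" ") + " " * bit for line, bit in zip(lines, padded)]

def reveal(lines):
    """Read the bits back by counting trailing spaces on each line."""
    return [len(line) - len(line.rstrip(" ")) for line in lines]

cover = ["Four score and", "seven years ago", "our forefathers"]
stego = hide(cover, [1, 0, 1])
print(reveal(stego))  # [1, 0, 1]
```

The text looks unchanged when displayed, but any editor or mail system that strips trailing white space destroys the message.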


• Hard Copy Text
  – Line Shift Coding: shifts every other line up or down slightly in order to encode data
  – Word Shift Coding: shifts some words slightly left or right in order to encode data

• Some methods that can be used with either hard or soft copy text
  – Feature Coding
  – Syntactic
  – Semantic

Steganography in Audio

• Low Bit Coding

• Phase Coding

• Spread Spectrum

• Echo Data Hiding


• Low Bit Coding
  – Most digital audio is created by sampling the signal and quantizing the sample with a 16-bit quantizer.
  – The rightmost bit, or low order bit, of each sample can be changed from 0 to 1 or 1 to 0
  – This modification from one sample value to another is not perceptible by most people and the audio signal still sounds the same

• Phase Coding
  – Relies on the relative insensitivity of the human auditory system to phase changes
  – Substitutes the initial phase of an audio signal with a reference phase that represents the data
  – More complex than low bit encoding, but it is much more robust and less likely to distort the signal that is carrying the hidden data.

• Direct Sequence Spread Spectrum
  – Spreads the signal by multiplying it by a chip, which is a maximal length pseudorandom sequence
  – DSSS introduces additive random noise to the sound file
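A toy Python sketch of direct sequence spreading is shown below. It uses a key-seeded sequence of +1/-1 chips from Python's PRNG as a stand-in for a maximal length sequence, and the number of chips per bit, the amplitude `alpha`, and the synthetic audio are assumed values for the demonstration.

```python
import random

CHIPS_PER_BIT = 2048        # assumed spreading factor

def chip_sequence(key, length):
    """Key-seeded +1/-1 chips (stand-in for a maximal length sequence)."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(length)]

def embed(samples, bits, key, alpha=200):
    """Add alpha * (+1 or -1 per bit) * chip to every sample."""
    chips = chip_sequence(key, len(bits) * CHIPS_PER_BIT)
    out = list(samples)
    for i, c in enumerate(chips):
        symbol = 1 if bits[i // CHIPS_PER_BIT] else -1
        out[i] += alpha * symbol * c
    return out

def recover(samples, num_bits, key):
    """Despread: correlate each bit interval with the same chips."""
    chips = chip_sequence(key, num_bits * CHIPS_PER_BIT)
    bits = []
    for b in range(num_bits):
        span = range(b * CHIPS_PER_BIT, (b + 1) * CHIPS_PER_BIT)
        corr = sum(samples[i] * chips[i] for i in span)
        bits.append(1 if corr > 0 else 0)
    return bits

rng = random.Random(11)                                   # assumed seed
audio = [rng.randint(-1000, 1000) for _ in range(8 * CHIPS_PER_BIT)]
message = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed(audio, message, "chip-key")
decoded = recover(stego, len(message), "chip-key")
```

The added signal looks like low-level random noise to a listener, while the receiver, who knows the chip sequence, accumulates it coherently over each bit interval.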


• Echo Data Hiding
  – Discrete copies of the original signal are mixed in with the original signal, creating echoes of each sound.
  – By using two different time values between an echo and the original sound, a binary 1 or binary 0 can be encoded.

Steganography in MP3

• A music company publishes albums as MP3 files over the internet.

• Some people take these MP3 files and publish them under their own name.

• The case goes to court.
• The music company needs to prove that the material exhibited is indeed what it published.
• It needs a hidden copyright.

• Principle: audio signals contain a significant portion of information that can be discarded without the average listener noticing the change.

Steganography in MP3 (contd.)

• MP3Stego: tool developed by Fabien A.P. Petitcolas
• The tool operates within the MP3 encoding process
• The data to be hidden is first compressed, encrypted, and then hidden in the MP3 bit stream.
• Quantization of the original audio signal takes place.
• At the same time, for some selected points, data is introduced into the quantized output.
• Distortions introduced by these changes are constantly checked so that they satisfy the psychoacoustic model.
• A variable records the number of bits that are used for the actual audio data, for Huffman coding, and for hidden data.
• The key is selected using a pseudo-random bit generator based on SHA-1 and dictates the values that are modified to hold the hidden data.

Steganography in Images

Way images are stored:
• Array of numbers representing RGB values for each pixel
• Common images are in 8-bit/pixel and 24-bit/pixel formats.
• 24-bit images have a lot of space for storage but are huge and invite compression
• 8-bit images are a good option.
• Proper selection of the cover image is important.
• Best candidates: grayscale images
• Cashing in on the limitations of human visual perception

Steganography: Bit plane Methods

• Image: replace least significant bit (LSB) of image intensity with message bit

• Replace lowest 3 or 4 LSB with message bits or image data (assume 8 bit values)

• Data is hidden in the “noise” of the image
• Can hide surprisingly large amounts of data this way
• Very fragile to any image manipulation

Bit plane Methods

• Variations include:
  – Using a permutation of pixel locations at which to hide the bits.
  – Putting bits only at certain locations in the image where there is “significant” variation, so that a change in gray value would not be visually perceptible

Least Significant Bit method

• Consider a 24-bit picture

• Data to be inserted: the 8-bit pattern 10000011, used here as the code for the character ‘A’ (note that the standard ASCII code for ‘A’ is 01000001)

• Host pixels: 3 pixels (9 color bytes) are used to store one 8-bit character; the ninth byte is left untouched

• The pixels selected to hold the data are chosen on the basis of the key, which can be a random number.

• Ex: cover bytes, one pixel per row

00100111 11101001 11001000
00100111 11001000 11101001
11001000 00100111 11101001

Embedding ‘A’:

00100111 11101000 11001000
00100110 11001000 11101000
11001001 00100111 11101001

• On average only about 50% of the LSBs actually change (0 to 1 or 1 to 0), because a message bit already matches the existing LSB half of the time. In this example 4 of the 8 used bytes change.
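The LSB replacement above can be reproduced with a few lines of Python. This is a minimal sketch, not a full tool: the function names `embed_lsb` and `extract_lsb` are made up for illustration, and the bytes are used in order rather than being selected by a key as the slide describes.

```python
def embed_lsb(cover_bytes, bits):
    """Return a copy of cover_bytes whose first len(bits) LSBs carry the message bits."""
    if len(bits) > len(cover_bytes):
        raise ValueError("message is longer than the cover")
    stego = list(cover_bytes)
    for i, bit in enumerate(bits):
        # Clear the least significant bit, then set it to the message bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return stego


def extract_lsb(stego_bytes, n_bits):
    """Read back the first n_bits least significant bits."""
    return [b & 1 for b in stego_bytes[:n_bits]]


# The nine cover bytes (three 24-bit pixels) and the bit pattern from the slide.
cover = [0b00100111, 0b11101001, 0b11001000,
         0b00100111, 0b11001000, 0b11101001,
         0b11001000, 0b00100111, 0b11101001]
message = [1, 0, 0, 0, 0, 0, 1, 1]

stego = embed_lsb(cover, message)
changed = sum(c != s for c, s in zip(cover, stego))
print([format(b, "08b") for b in stego], "bytes changed:", changed)
```

A keyed variant would apply the same bit operation to byte positions drawn from a pseudo-random permutation shared by sender and receiver.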

[Figure: cover image + secret image (“TOP SECRET”) = stego image; 8-bit (256-level grayscale) images.]

Example: Copyright Fabien A.P. Petitcolas, Computer Laboratory, University of Cambridge http://www.cl.cam.ac.uk/~fapp2/steganography/image_downgrading/

Sacrificing 2 bits of cover to carry 2 bits of secret image

Original Image Extracted Image

Sacrificing 5 bits of cover to carry 5 bits of secret image

Original Image Extracted Image
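The trade-off in these figures (more sacrificed bits give a better extracted image and a worse cover) can be sketched as follows. This is an illustrative reading of "sacrificing n bits of cover to carry n bits of secret image": the n most significant bits of the secret image replace the n least significant bits of the cover. The function names are hypothetical and NumPy is assumed.

```python
import numpy as np


def hide_image(cover, secret, n_bits):
    """Put the top n_bits of each secret pixel into the low n_bits of the cover pixel."""
    cover = np.asarray(cover, dtype=np.uint8)
    secret = np.asarray(secret, dtype=np.uint8)
    keep_mask = (0xFF << n_bits) & 0xFF      # keeps the high (8 - n_bits) cover bits
    return (cover & keep_mask) | (secret >> (8 - n_bits))


def recover_image(stego, n_bits):
    """Rebuild a coarse version of the secret image from the low n_bits of the stego image."""
    stego = np.asarray(stego, dtype=np.uint8)
    low = stego & ((1 << n_bits) - 1)
    return (low << (8 - n_bits)).astype(np.uint8)


cover = np.array([[200, 17]], dtype=np.uint8)
secret = np.array([[0b10110101, 0b01000000]], dtype=np.uint8)
stego = hide_image(cover, secret, n_bits=2)
recovered = recover_image(stego, n_bits=2)
```

With `n_bits=2` the cover changes by at most 3 gray levels but the recovered image has only 4 gray levels; with `n_bits=5` the recovered image has 32 levels while the cover can change by up to 31.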

Palette-based Methods

• Palette manipulation means changing the way the color or grayscale palette represents the image colors

• Bit methods are used in palette manipulation schemes

• Data is hidden in the “noise” of the image

• Often radical color shifts occur, which can tip off that data is hidden

• Use grayscale to overcome the color shift problem

Sample palettes

[Figure: three sample palettes showing red shade variations, drastic and subtle shade variations, and grayscale shade variations]

Palette-based Methods

• Pseudo color 8-bit image: 256 different colors that are indexed by the numbers 0,…,255

• To insert information, S-Tools, for example, reduces the number of colors from 256 to 32 and uses the freed low-order bits of the color values to hide data

• In this case, each group of 8 palette entries holds the same color before data embedding; after data embedding the 8 colors are very close visually but differ in their bit representation

Steganography for palette images

LSB encoding cannot be directly applied to palette-based images because new colors, not present in the palette, would be created.

Two sources of palette images:
1. Color truncation + dithering of photographs
2. Computer-generated images (fractals, cartoons, animations)

A secure steganographic method will produce modified carriers compatible with the source

Possibilities

• Hiding in the palette
• Hiding in the image data
– Non-adaptive techniques
– Adaptive techniques

Artifacts

• Palette artifacts
• Image data artifacts

Possible approaches

Message hiding in the image data: greedy techniques

Decrease color depth and expand (capacity: 1 bpp)
1. Collapse 256 colors → 128 colors
2. Expand 128 colors → 256 colors by including a close color (e.g., flip the LSB of the blue channel)
3. Embed a binary message into the LSB of the blue channel of randomly selected pixels
4. Read the message from the LSB of the blue channel

Alternatively (capacity: 3 bpp)
1. Decrease color depth to 32 colors and include all colors obtained from LSB shuffling of all 32 colors (one color produces 2^3 = 8 variants)
2. Encode messages into the LSBs of pixel colors

Possible approaches

Message hiding in the image data

Parity embedding
1. Assign parity to palette colors
2. Embed message bits as the parity of colors

Message: 0 1 1 0 0 1 0 1 0 1 1 1 0 1 0 1 0 1 0 0 0 1 1 1 1

1. Randomly choose a pixel, with color C1.

2. Find the color C1 in the sorted palette, e.g. index = 30 = 00011110.

3. Replace the LSB of the index to color C1 with the message bit, e.g. 00011110 → 00011111 for message bit 1.

4. The new index now points to a neighboring color C2 in the sorted palette.

5. Replace the index of the pixel in the original image to point to the new color C2.

Critical assumption: Colors close in the luminance-sorted palette are also close in the color space.
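The sorted-palette procedure can be sketched in Python as below. This is a hedged illustration, not the code of any particular tool: the function names are invented, the palette is assumed to have an even number of entries, the luminance weights 0.299/0.587/0.114 are the usual ones, and `positions` stands in for the key-driven random choice of pixels.

```python
def luminance(rgb):
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def _ranking(palette):
    # order[pos] = palette index at sorted position pos; rank is the inverse map.
    order = sorted(range(len(palette)), key=lambda i: luminance(palette[i]))
    rank = {idx: pos for pos, idx in enumerate(order)}
    return order, rank


def embed_in_indices(pixels, palette, bits, positions):
    """pixels: list of palette indices; one message bit is written per listed position."""
    order, rank = _ranking(palette)
    out = list(pixels)
    for bit, p in zip(bits, positions):
        # Replace the LSB of the sorted-palette position with the message bit.
        new_pos = (rank[out[p]] & ~1) | bit
        out[p] = order[new_pos]          # point the pixel at the neighboring color
    return out


def extract_from_indices(pixels, palette, positions):
    _, rank = _ranking(palette)
    return [rank[pixels[p]] & 1 for p in positions]


palette = [(0, 0, 0), (255, 255, 255), (10, 10, 10), (250, 250, 250)]
pixels = [0, 1, 2, 3]
stego_pixels = embed_in_indices(pixels, palette, bits=[1, 0, 1], positions=[0, 1, 3])
```

In this toy palette the luminance neighbors really are close colors, so the critical assumption holds; a palette with equal-luminance but very different hues would produce visible color jumps.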

Airfield embedded using White Noise Storm

Airfield embedded using S-Tools with 8-bit Renoir

Original 24-bit Renoir converted to 248 color GIF; Airfield inserted using S-Tools; Final stego image has 256 colors in GIF format.

White Noise Storm inserts data by using spread spectrum technology and frequency hopping, but severely changes the palette.

Palette Methods

• Color c_i: 10110010

• Color c_{i+1}: 10110011

• When the palette is ordered by luminance, there are groups of palette colors that look identical to the eye; L = 0.299R + 0.587G + 0.114B

• Airfield is a 3-bit image placed in the last 3 bits of the Renoir image

• Very fragile – destroyed by image manipulation

Original Image Text Hidden using StegoDos

Text Hidden using White Noise Storm

Text Hidden using S-Tools

Example: Insertion of a Paragraph of Text

Transform Domain Techniques

• Discrete Cosine Transform

• Discrete Wavelet Transform

• Discrete Fourier Transform

• Mellin-Fourier Transform

• Related:
– Singular Value Decomposition
– Minimax Eigenvalue Decomposition

Discrete Cosine Transform

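The forward equation, for image A, is

$$b(u,v) = \frac{2}{N}\, C(u)\, C(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} a(x,y) \cos\!\left[\frac{(2x+1)u\pi}{2N}\right] \cos\!\left[\frac{(2y+1)v\pi}{2N}\right]$$

The inverse equation, for image B, is

$$a(x,y) = \frac{2}{N} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} C(u)\, C(v)\, b(u,v) \cos\!\left[\frac{(2x+1)u\pi}{2N}\right] \cos\!\left[\frac{(2y+1)v\pi}{2N}\right]$$

where $C(0) = 1/\sqrt{2}$ and $C(k) = 1$ for $k > 0$.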


• JPEG uses the DCT to compress an image

• Many different approaches use the DCT to hide information

• The message is embedded in the signal, not the noise

• Studies on visual distortions conducted by the source coding community can be used to predict the visible impact of the hidden data in the cover image

• Can be implemented in the compressed domain, saving time


Basic idea of JPEG:

1. Convert image to a luminance/chrominance color space (YCbCr in standard JPEG files)

2. Each color plane is partitioned into 8x8 blocks

3. Apply DCT to each block

4. Values are quantized by dividing with preset quantization values (in a table)

5. Values are then rounded to nearest integer

Steganography: One Approach using DCT

• The sender and receiver agree ahead of time on location for two DCT coefficients in the 8 x 8 block

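• Middle frequencies with the same quantization value: Location 1 is (4,1) and Location 2 is (3,2), as zero-based (row, column) positions; both have the value 22 in the quantization table below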

16 11 10 16 24 40 51 61

12 12 14 19 26 58 60 55

14 13 16 24 40 57 69 56

14 17 22 29 51 87 80 62

18 22 37 56 68 109 103 77

24 35 55 64 81 104 113 92

49 64 78 87 103 121 120 101

72 92 95 98 112 100 103 99

Steganography: DCT

• The DCT is applied to each 8 x 8 block in the image, producing a block Bi

• Each block will encode a single bit, 0 or 1

• If the message bit is a 1, then the larger of the two values Bi(4,1) and Bi(3,2) is put in location (4,1); otherwise, if the message bit is a 0, the smaller of the two values is put in location (4,1)

DCT Steganography

• If the difference |Bi(4,1) - Bi(3,2)| < µ, then the values Bi(4,1) and Bi(3,2) are adjusted so that |Bi(4,1) - Bi(3,2)| > µ

• This ensures that the relative order of the two values is not lost when the compression is done

• This last step can introduce distortion into the image

• The JPEG compression is performed (if desired) and then the resulting image is inverse transformed

• Other modifications to this algorithm have been researched that overcome some of these limitations

DCT Steganography

• To extract the data, the DCT is performed on each block, and the coefficient values at locations (4,1) and (3,2) are compared

• If Bi(4,1) > Bi(3,2) then the message bit is a 1, otherwise it is a 0
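The embedding and extraction rules for one 8 x 8 block can be sketched with NumPy. This is a minimal illustration under stated assumptions: the orthonormal DCT matrix `D` is built by hand, the function names and the default margin `mu=10.0` are made up, the gap is widened symmetrically around the midpoint (one of several possible adjustments), and JPEG quantization itself is not simulated.

```python
import numpy as np

N = 8
LOC1, LOC2 = (4, 1), (3, 2)   # agreed coefficient locations (zero-based row, column)


def dct_matrix(n=N):
    """Orthonormal DCT-II matrix: D @ block @ D.T is the 2-D DCT of the block."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    m = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * k * np.pi / (2 * n))
    m[0, :] /= np.sqrt(2.0)
    return m


D = dct_matrix()


def embed_bit(block, bit, mu=10.0):
    """Hide one bit in a block by ordering the two agreed DCT coefficients."""
    b = D @ np.asarray(block, dtype=float) @ D.T
    lo, hi = sorted((b[LOC1], b[LOC2]))
    if hi - lo < mu:
        # Widen the gap around the midpoint so the ordering survives later processing.
        mid = (hi + lo) / 2.0
        lo, hi = mid - mu, mid + mu
    b[LOC1], b[LOC2] = (hi, lo) if bit else (lo, hi)
    return D.T @ b @ D            # inverse DCT back to pixel values


def extract_bit(block):
    b = D @ np.asarray(block, dtype=float) @ D.T
    return 1 if b[LOC1] > b[LOC2] else 0


rng = np.random.default_rng(0)
block = rng.integers(50, 200, size=(N, N)).astype(float)
stego_one = embed_bit(block, 1)
stego_zero = embed_bit(block, 0)
```

A larger `mu` makes the bit more robust to quantization but increases the visible distortion, which is the trade-off the slide points out.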

Wavelet Steganography

• Many different schemes have been proposed

• Wang and Kao give a multithreshold wavelet coding scheme where coefficients with high values are used to store information

• These coefficients are assumed to keep their relative values even after multiple image processing operations

• If the coefficients change value much, the visual difference is noticeable in the image

• Can be used for textured and natural images

Example of Image and Its Wavelet Transform (no hidden data)


Discrete Fourier Transform

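The formulae for the DFT and its inverse are

$$F(u,v) = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} a(x,y) \exp\!\left(-\frac{2\pi j\, ux}{N}\right) \exp\!\left(-\frac{2\pi j\, vy}{N}\right)$$

$$a(x,y) = \frac{1}{N^2} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} F(u,v) \exp\!\left(\frac{2\pi j\, ux}{N}\right) \exp\!\left(\frac{2\pi j\, vy}{N}\right)$$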

Discrete Fourier Transform Steganography

• The DFT has success when phase modulation is used to hide data

• Phase components have less visual impact than magnitude components

• Phase components are also more robust against noise distortion

• A DFT coefficient is used if its energy is high enough

Quantization Based Steganography

• The message is embedded through the choice of quantizer.

• Consider a uniform quantizer of step size Δ: odd reconstruction points represent message bit ‘1’ and even reconstruction points represent message bit ‘0’.

• If the value of the cover coefficient is 126, Δ = 10, and the message bit is ‘1’, then after embedding the message the stego coefficient is 130 (130 = 13Δ, an odd reconstruction point).
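The odd/even quantizer rule can be written out as a short sketch. The function names are hypothetical, and the choice of the nearest reconstruction point with the required parity is an assumption consistent with the 126 → 130 example above.

```python
import math


def qim_embed(coeff, bit, delta):
    """Move coeff to the nearest multiple of delta whose index parity equals bit."""
    q = math.floor(coeff / delta)
    if q % 2 != bit:
        # The two multiples bracketing coeff have opposite parity; take the other one.
        q += 1
    return q * delta


def qim_extract(coeff, delta):
    """Round to the nearest reconstruction point and read its parity."""
    return math.floor(coeff / delta + 0.5) % 2


stego = qim_embed(126, 1, 10)
print(stego, qim_extract(stego, 10))
```

Extraction still succeeds as long as later distortion moves the coefficient by less than Δ/2, so a larger Δ buys robustness at the cost of a larger embedding change.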

Selectively Embedding in Coefficients (SEC) Scheme

• The image is divided into 8 × 8 non-overlapping blocks, and an 8 × 8 discrete cosine transform (DCT) of each block is taken. Let a_ij denote the intensity values of an 8 × 8 block and c_ij the corresponding DCT coefficients, where i, j ∈ {0, 1, 2, …, 7}. Thus,

$$c = \mathrm{DCT2}(a)$$

where DCT2 denotes a two-dimensional DCT.

• Let the quantization matrix entries for a particular quality factor (QF) be $M^{QF}_{ij}$, where i, j ∈ {0, 1, 2, …, 7}. The coefficients used for information embedding are computed as

$$\tilde{c}_{ij} = \frac{c_{ij}}{M^{QF}_{ij}}, \qquad i, j \in \{0, 1, 2, \ldots, 7\}$$

Ref : K. Solanki, N. Jacobsen, U. Madhow, B. S. Manjunath and S. Chandrasekaran, "Robust Image-Adaptive Data Hiding Based on Erasure and Error Correction" IEEE Transactions on Image Processing, vol. 13, no. 12, pp. 1627-1639, Dec. 2004.

Selectively Embedding in Coefficients (SEC) Scheme

• The coefficients are scanned in zig-zag fashion, as in JPEG, to get a one-dimensional vector $\tilde{c}_k$, where 0 <= k <= 63, and only a predefined low-frequency band, excluding the DC coefficient (the k = 0 term), is considered for hiding (i.e., 1 <= k <= n).

• Quantize these coefficient values $\tilde{c}_k$ to the nearest integers and take their magnitudes to get $r_k$.

• The coefficient after embedding is obtained using the quantizer $Q_{b_l}$ (equation shown on the original slide), where $b_l$ is the message bit and $Q_{b_l}$ is the quantizer Q0 or Q1, depending upon the message bit.

Q0: quantize to an even number. Q1: quantize to an odd number.

• If the magnitude of the coefficient after embedding equals the threshold t, then the same message bit is embedded into the next qualified coefficient, to keep synchronization with the decoder.

Results : SEC Scheme

• Decoding is perfect when the JPEG compression applied is no more severe than the design QF.

[Figure: cover image ‘I13.jpg’ and stego image with 10000 bits embedded]

[Plots: variation of capacity with QF; variation of capacity with threshold]

Steganalysis

Definition: searching for the existence of hidden messages or stego-content in a given medium.

• Stego-only: only stego-medium is available for analysis

• Known cover: both original cover media and stego-media are used

• Known message: hidden message is revealed to facilitate review of media in preparation for future attacks

Goals

• Passive steganalysis
– Detect the presence or absence of a message

• Active steganalysis
– Estimate the message length and location
– Determine the algorithm/stego tool
– Estimate the secret key used in embedding
– Extract the message

Types of Steganalysis

Embedding algorithm specific Steganalysis

Universal Steganalysis

Universal Steganalysis Techniques

• Techniques which are independent of the embedding technique

• One approach – identify certain image features that reflect hidden message presence.

• Two steps:
– Extract ‘good’ features
– Find strong classification algorithms

Steganalysis in Practice

• Techniques designed for a specific steganography algorithm
– Good detection accuracy for the specific technique

• Universal steganalysis techniques
– Less accurate in detection
– Usable on new embedding techniques

Supervised learning based Steganalysis

• Supervised learning methods construct a classifier to differentiate between stego and non-stego images using training examples.

• Some features are first extracted and given as training inputs to a learning machine. These examples include both stego as well as non-stego examples.

• The learning classifier iteratively updates its classification rule based on its prediction and the ground truth. Upon convergence the final stego classifier is obtained.

Blind Identification based Steganalysis

This method can be clearly understood by the following block diagram:

Hence, by estimating the transformation A & its inverse the secret message can be obtained.

Statistical detection based Steganalysis

Here, 3 cases arise:

a) In the completely known statistics case, the parametric models for the stego image and the cover image are known.

b) In the partially known statistics case, the parametric probability models are available but not the exact parameter values. These parameters are estimated.

c) In the completely unknown case, Bayesian prior models are assumed and detectors are developed.

Universal Steganalysis Techniques

• Techniques which are independent of the embedding technique

• Identify certain image features that reflect hidden message presence.

• Two problems:
– Calculating features which are sensitive to the embedding process
– Finding strong classification algorithms which are able to classify the images using the calculated features

State-of-the-Art Steganalysis Techniques

• Wavelet-based methods: Farid and Lyu; Deepak Hinge

• Using Markov Random Fields (Sullivan)

• Using Binary Similarity Measures (N. Memon)

• Using Image Quality Metrics (Avcibas)

Fusing two or more of the above techniques can improve the detection accuracy.

Wavelet-based Universal Steganalysis

• The wavelet transform is used to obtain the features.

• The mean, variance, skewness and kurtosis of the subband coefficients at each orientation, scale and color channel form the features,

i.e. 12(n-1) features per color channel, where n is the number of scales.

Usually 4 scales are used, giving 36 features per color channel.
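The feature count 12(n-1) comes from 4 statistics × 3 detail subbands × (n-1) decomposition levels. The sketch below makes that concrete for one color channel. It is an illustration only: a simple averaging Haar transform is used as a stand-in for whatever wavelet filters the cited method uses, and all function names are hypothetical.

```python
import numpy as np


def haar_level(a):
    """One level of a 2-D averaging Haar transform; returns (LL, LH, HL, HH)."""
    a = a[: a.shape[0] // 2 * 2, : a.shape[1] // 2 * 2]   # crop to even size
    lo = (a[:, 0::2] + a[:, 1::2]) / 2.0                  # horizontal low-pass
    hi = (a[:, 0::2] - a[:, 1::2]) / 2.0                  # horizontal high-pass
    ll = (lo[0::2] + lo[1::2]) / 2.0
    lh = (lo[0::2] - lo[1::2]) / 2.0
    hl = (hi[0::2] + hi[1::2]) / 2.0
    hh = (hi[0::2] - hi[1::2]) / 2.0
    return ll, lh, hl, hh


def moments(x):
    """Mean, variance, skewness and kurtosis of the values in x."""
    x = np.asarray(x, dtype=float).ravel()
    mu, var = x.mean(), x.var()
    sd = np.sqrt(var)
    if sd == 0:
        return [mu, var, 0.0, 0.0]
    z = (x - mu) / sd
    return [mu, var, (z ** 3).mean(), (z ** 4).mean()]


def wavelet_features(channel, n_scales=4):
    """12 * (n_scales - 1) statistics of the detail subbands of one color channel."""
    a = np.asarray(channel, dtype=float)
    feats = []
    for _ in range(n_scales - 1):
        a, lh, hl, hh = haar_level(a)      # recurse on the low-pass band
        for band in (lh, hl, hh):
            feats.extend(moments(band))
    return np.array(feats)


rng = np.random.default_rng(1)
features = wavelet_features(rng.integers(0, 256, size=(64, 64)))
```

For a color image the same function would be applied to each of the three channels and the results concatenated before being passed to a classifier.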

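Wavelet-based Universal Steganalysis

• In order to capture higher-order statistical correlations, a second set of 36 features per color channel is found based on the errors in a linear predictor of coefficient magnitude.

• For the green channel at scale i, the magnitude of each subband coefficient is predicted as a linear combination of the magnitudes of neighboring coefficients (predictor equation shown on the original slide).

• This can be written in matrix form as

$$V = Qw$$

where, in the notation of Farid and Lyu, V holds the coefficient magnitudes, Q the neighboring magnitudes, and w the predictor weights.

• w is found by minimizing

$$E(w) = \lVert V - Qw \rVert^2$$

• Therefore w is found by solving

$$Q^T (V - Qw) = 0$$

which yields

$$w = (Q^T Q)^{-1} Q^T V$$

• The log error between the actual and predicted coefficients is

$$E = \log_2(V) - \log_2(|Qw|)$$

• The mean, variance, skewness and kurtosis of this log error are then used as another 36 features per color channel.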

Steganalysis: Binary Similarity Measures

• Motivation: embedding leaves statistical artifacts.

• The correlation between the low bit planes of a cover image differs from that of a stego image.

• A set of Binary Similarity Measures (BSMs) is used to detect the artifacts.

• A feature vector is generated using the BSMs.

Bit planes

Each bit-plane is a binary image in itself.

11010011 00011011 00011010

11010010 00000110 00011000

11011111 11010100 00011000

Value:   1 1 0 1 0 0 1 1

Bit no.: 1 2 3 4 5 6 7 8

Bitplane-1, Bitplane-2, …, Bitplane-7, Bitplane-8
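Splitting an image into bit planes is a one-line bit operation. The sketch below follows the numbering used on this slide, where bit 1 is the leftmost (most significant) bit and bit 8 is the least significant; the function name `bit_plane` is illustrative and NumPy is assumed.

```python
import numpy as np


def bit_plane(img, k):
    """Return bit plane k of an 8-bit image as a binary array (k = 1 is the MSB, k = 8 the LSB)."""
    if not 1 <= k <= 8:
        raise ValueError("k must be between 1 and 8")
    img = np.asarray(img, dtype=np.uint8)
    return (img >> (8 - k)) & 1


# The 3 x 3 example block from the slide.
block = np.array([[0b11010011, 0b00011011, 0b00011010],
                  [0b11010010, 0b00000110, 0b00011000],
                  [0b11011111, 0b11010100, 0b00011000]], dtype=np.uint8)

plane_1 = bit_plane(block, 1)   # most significant bits
plane_8 = bit_plane(block, 8)   # least significant bits, where LSB embedding acts
```

The binary similarity measures are computed on planes 7 and 8, which are the ones most disturbed by LSB-style embedding.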

Binary texture Statistics

Let x_i = { x_{i,k} | k = 1, 2, …, K } be the sequence of bits representing the K neighborhood pixels {N, E, W, S} of pixel i, where i runs over all the image pixels and M × N is the size of the image.

Binary texture Statistics


We define an agreement variable for pixel x_i (equation shown on the original slide), with j = 1, …, 4, K = 4, and i = 1, …, M × N, where δ is the Kronecker delta function.

Now, we can calculate the one step co-occurrence values :-

Now, we define 3 types of binary similarity measures :-

The first group consists of the computed similarity differences dm_i = m_i^{7th} – m_i^{8th}, i = 1, …, 10, across the 7th and 8th bit planes. These use { a, b, c, d }.

The second group consists of histogram and entropic features. We first normalize the histograms of the agreement scores for the two bit planes (7th and 8th). Then, based on these values, we define the similarity measures.

The third set of measures is somewhat different, as it uses a neighborhood weighting mask. For each binary image, we compute a 512-bin histogram based on the weighted neighborhood, where the score is given by weighting the eight directional neighbors with a mask (shown on the original slide).

We get 18 such measures for grayscale images and 54 for color images.

Commercial Watermarking / Steganography Tools

• Digimarc ImageBridge
– Inserts imperceptible digital watermarks into images

• Digimarc MarcSpider
– Tracks all images carrying Digimarc’s watermark on the Internet
– Searches over 50 million images on the Internet

• Digimarc provides secure identification solutions to over 200 government units in over 24 countries, including the states of New Jersey, Vermont, and Michigan

• Philips Digital Network WaterCast for videos

• Companies which mark their products: Corbis, workbookstock.com, The British Library (Digimarc); BBC, Reuters, Universal Studios (Philips WaterCast)…

• Some success stories: Corbis
– Identifies up to 50 cases of unauthorized commercial use of its images per month
– Settled 28 cases in and out of court in 8 months
– Movie Market paid 1 million for the settlement

• Playboy
– Webbworld paid $310,000, as well as reasonable attorney’s fees, for using 62 of Playboy’s images

Conclusion

• Steganography has its place in security. The field is very young. On its own it offers limited protection, but layered with cryptography it leads to greater security.

• Far-reaching applications in privacy protection and intellectual property rights protection.

• Research is going on in both directions:
– One is how to incorporate hidden or visible copyright information in various media that will be published.
– At the same time, in the opposite direction, researchers are working on how to detect the trafficking of illicit material and covert messages published by certain outlawed groups.

On-line Sources

• Stego-Tools: <http://www.stegoarchive.com/>

Lots of freeware (and commercial) tools for hiding information in text, audio, video, and image files

Famous stego tools for images: Outguess+, F5+, S-Tools, etc.

• Helpful steganalysis programs
– WinHex (www.winhex.com)
– Hiderman
– Stegspy
– Etc.
