
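XML Watermarking & Information Hiding. Sun Xingming (孙星明), Ph.D., Professor, Doctoral Supervisor, School of Computer and Communication, Hunan University; Hunan Provincial Key Laboratory of Network and Information Security.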

Jan 17, 2016


Frank Webb
Page 1:

XML Watermarking & Information Hiding

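Sun Xingming (孙星明), Ph.D., Professor, Doctoral Supervisor

School of Computer and Communication, Hunan University; Hunan Provincial Key Laboratory of Network and Information Security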

Page 2:

Markup Language

SGML (Standard Generalized Markup Language)

XML (Extensible Markup Language)

HTML (HyperText Markup Language)

XHTML

Page 3:

Publishing Information in WWW

Page 4:

Publishing Information in WWW

Page 5:

XML Document

XML element type

text

image

Video

Audio

Executable code

Corresponding watermarking and information hiding techniques can be employed for each element type.

Can we use the XML document's own information (its markup) for watermarking or information hiding?

Page 6:

Known content-based technique

Change font size or color

Append white space at the end of a line:

0: space

1: tab
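The trailing-white-space scheme above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, one bit is carried per line, and the cover text is assumed not to end any line with a space or tab already.

```python
def embed_bits(text: str, bits: str) -> str:
    """Append one trailing character per line: space for '0', tab for '1'."""
    lines = text.split("\n")
    if len(bits) > len(lines):
        raise ValueError("not enough lines to carry the payload")
    marked = [line + (" " if bit == "0" else "\t") for line, bit in zip(lines, bits)]
    # Lines beyond the payload are left untouched.
    return "\n".join(marked + lines[len(bits):])


def extract_bits(text: str) -> str:
    """Read back one bit from every line that ends in a space or a tab."""
    bits = []
    for line in text.split("\n"):
        if line.endswith("\t"):
            bits.append("1")
        elif line.endswith(" "):
            bits.append("0")
    return "".join(bits)


marked = embed_bits("first line\nsecond line\nthird line", "10")
print(repr(marked))
print(extract_bits(marked))
```

Every marked line grows by one character, so the page gets larger as the payload grows.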

Page 7:

Shortcomings

Drawbacks of appending white space at the end of a line:

Increases page size

Layout might be changed

Very easy to detect by selecting the text

Page 8:

Specification

Element (entity), three forms:

<name attribute1 … attributen> contents </name>

<name attribute1 … attributen> </name>

<name attribute1 … attributen>

Attribute: name="value"

Example: <font face="Verdana" size="4" color="#FFFF00">Student Number: </font>

Page 9:

Properties of markup labels

Property 1: In HTML, element and attribute names are case-insensitive, so the following are equivalent:

<font face="Verdana" size="4" color="#FFFF00">Student Number: </font>

<Font face="Verdana" size="4" color="#FFFF00">Student Number: </font>

<font face="Verdana" size="4" color="#FFFF00">Student Number: </Font>

<Font face="Verdana" size="4" color="#FFFF00">Student Number: </Font>
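Property 1 can carry one bit per tag. The sketch below is an assumption about how such a case-change scheme might look, not the presenter's implementation: it capitalizes the first letter of an opening tag name for 1 and lowercases the name for 0. It applies to HTML only, because XML names are case-sensitive. The regular expression and function names are illustrative.

```python
import re

# Matches the name of an opening tag such as "<font"; "</font>" is not matched.
TAG = re.compile(r"<([A-Za-z][A-Za-z0-9]*)")


def embed_bits(html: str, bits: str) -> str:
    """Encode one bit per opening tag in the case of its first letter."""
    remaining = iter(bits)

    def recase(match):
        bit = next(remaining, None)
        if bit is None:  # payload exhausted: leave the tag as it is
            return match.group(0)
        name = match.group(1).lower()
        return "<" + (name.capitalize() if bit == "1" else name)

    return TAG.sub(recase, html)


def extract_bits(html: str, count: int) -> str:
    """Read back the first `count` bits from the opening tags."""
    bits = ("1" if m.group(1)[0].isupper() else "0" for m in TAG.finditer(html))
    return "".join(bits)[:count]


page = '<font face="Verdana">A</font><td>B</td>'
marked = embed_bits(page, "10")
print(marked)
print(extract_bits(marked, 2))
```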

Page 10:

Properties of markup labels

Property 2: Attributes are order-insensitive

<font face="Verdana" size="4" color="#FFFF00">Student Number: </font>

<font size="4" face="Verdana" color="#FFFF00">Student Number: </font>

Page 11:

Pair attributes technique

Pair attributes order (Corinna John)

Each pair consists of a key attribute and a corresponding attribute.

The order key / corresponding encodes 1; the order corresponding / key encodes 0.

1: <font face="Verdana" size="4" color="#FFFF00">Student Name:</font>

0: <font size="4" face="Verdana" color="#FFFF00">Student Name:</Font>

Requires a key / corresponding table.

Page size is unchanged, and the mark is difficult to detect.
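A minimal sketch of the pair-attribute idea, assuming a hypothetical table in which `face` is the key attribute and `size` its corresponding attribute (the slides do not give the actual table). Attributes are represented as a list of (name, value) pairs so that their order is explicit.

```python
# Hypothetical key -> corresponding attribute table, for illustration only.
PAIR_TABLE = {"face": "size"}


def find_pair(names, table):
    """Return the first (key, corresponding) pair present among the names."""
    for key, corresponding in table.items():
        if key in names and corresponding in names:
            return key, corresponding
    raise ValueError("element has no key/corresponding attribute pair")


def embed_bit(attrs, bit, table=PAIR_TABLE):
    """Reorder one attribute pair: key first encodes 1, corresponding first encodes 0."""
    names = [name for name, _ in attrs]
    key, corresponding = find_pair(names, table)
    values = dict(attrs)
    # Reuse the two slots the pair already occupies; other attributes stay put.
    first_slot, second_slot = sorted((names.index(key), names.index(corresponding)))
    first, second = (key, corresponding) if bit == 1 else (corresponding, key)
    reordered = list(attrs)
    reordered[first_slot] = (first, values[first])
    reordered[second_slot] = (second, values[second])
    return reordered


def extract_bit(attrs, table=PAIR_TABLE):
    """Return 1 if the key attribute precedes its corresponding attribute, else 0."""
    names = [name for name, _ in attrs]
    key, corresponding = find_pair(names, table)
    return 1 if names.index(key) < names.index(corresponding) else 0


attrs = [("face", "Verdana"), ("size", "4"), ("color", "#FFFF00")]
print(embed_bit(attrs, 0))
print(extract_bit(embed_bit(attrs, 0)))
```

Each pair carries a single bit, which is why this scheme has lower capacity than permuting all attributes of an element.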

Page 12:

Attributes permutation technique

Equivalent attribute permutations:

<font face="Verdana" size="4" color="#FFFF00">Student Name:</font>

<font face="Verdana" color="#FFFF00" size="4">Student Name:</font>

<font size="4" face="Verdana" color="#FFFF00">Student Name:</font>

<font size="4" color="#FFFF00" face="Verdana" >Student Name:</font>

<font color="#FFFF00" face="Verdana" size="4" >Student Name:</font>

<font color="#FFFF00" size="4" face="Verdana" >Student Name:</font>

Lexicographic (alphabetic) order: a permutation f precedes a permutation g iff f(k) < g(k) for the smallest k such that f(k) ≠ g(k).

Page 13:

Attributes permutation technique

Generating attributes permutations in lexicographical order

<font color="#FFFF00" face="Verdana" size="4" >Student Name:</font>

<font color="#FFFF00" size="4" face="Verdana" >Student Name:</font>

<font face="Verdana" color="#FFFF00" size="4">Student Name:</font>

<font face="Verdana" size="4" color="#FFFF00">Student Name:</font>

<font size="4" color="#FFFF00" face="Verdana" >Student Name:</font>

<font size="4" face="Verdana" color="#FFFF00">Student Name:</font>

Attribute permutation: order number

color face size: 0

color size face: 1

face color size: 2

face size color: 3

size color face: 4

size face color: 5
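The mapping between order numbers and attribute permutations can be sketched with `itertools.permutations`, which emits tuples in lexicographic order when its input is sorted. The helper names below are illustrative, and building the full table is only practical for the small attribute counts found in real tags.

```python
from itertools import permutations


def permutation_table(names):
    """List every ordering of the attribute names in lexicographic order."""
    return list(permutations(sorted(names)))


def embed_number(attrs, number):
    """attrs maps name -> value; return (name, value) pairs in ordering `number`."""
    order = permutation_table(attrs)[number]
    return [(name, attrs[name]) for name in order]


def extract_number(ordered_names):
    """Recover the order number from the attribute order found in a tag."""
    return permutation_table(ordered_names).index(tuple(ordered_names))


def render(tag, pairs, content):
    """Serialize a tag with its attributes in the given order."""
    attr_text = " ".join(f'{name}="{value}"' for name, value in pairs)
    return f"<{tag} {attr_text}>{content}</{tag}>"


attrs = {"face": "Verdana", "size": "4", "color": "#FFFF00"}
print(render("font", embed_number(attrs, 3), "Student Name:"))
print(extract_number(["face", "size", "color"]))
```

With three attributes there are 3! = 6 orderings, so a single tag can carry any number from 0 to 5.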

Page 14:

Attributes permutation technique

If an element has two or more attributes, it can be used to embed hidden information or a watermark.

Let $\{E_i\}_{i=1}^{n}$ be the elements in a web page whose number of attributes satisfies $|E_i| \ge 2$. The embedding capacity, in bits, is

$$\sum_{i=1}^{n} \log_2(|E_i|!)$$
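As a rough illustration of the capacity formula, the sketch below counts attributes per start tag with the standard-library `html.parser.HTMLParser` and sums log2(|E_i|!) over tags that have at least two attributes. The class and function names are hypothetical, and the result is the ideal capacity in bits, not necessarily what a practical encoder achieves.

```python
from html.parser import HTMLParser
from math import factorial, log2


class AttributeCounter(HTMLParser):
    """Collect the number of attributes of every start tag in a page."""

    def __init__(self):
        super().__init__()
        self.counts = []

    def handle_starttag(self, tag, attrs):
        self.counts.append(len(attrs))


def embedding_capacity(attribute_counts):
    """Sum log2(k!) over elements with k >= 2 attributes (capacity in bits)."""
    return sum(log2(factorial(k)) for k in attribute_counts if k >= 2)


counter = AttributeCounter()
counter.feed('<font face="Verdana" size="4" color="#FFFF00">Student Name:</font><b>x</b>')
print(counter.counts)
print(embedding_capacity(counter.counts))
```

For the single `font` tag above, the capacity is log2(3!) = log2(6), roughly 2.58 bits.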

Page 15:

Embedded capacity example

Web page: capacity (bytes)

www.163.com: 48

www.sina.com.cn: 279

www.sohu.com.cn: 338

www.microsoft.com: 15

www.ebay.com: 78

www.yahoo.com: 33

Page 16:

Perceivability

Cannot be perceived when browsing the page

Hard to perceive by reading the source code

Page 17:

Robust or resistant against editing

Text contents can be changed without affecting the attribute order

Page 18:

Robust or resistant against editing

Font, size, and color values can be changed without affecting the attribute order

Page 19:

Security

Attribute permutation: order number

color face size: 0

color size face: 1

face color size: 2

face size color: 3

size color face: 4

size face color: 5

Apply a hash to the concatenation of the attributes and a key to get the order number:

$\mathrm{hash}(\text{attribute} \,\|\, \text{key})$
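The slide does not spell out how the hash produces an order number, so the following is one possible reading, not the presenter's method. It assumes each attribute name is ranked by SHA-256 of the name concatenated with a secret key, and that permutations are then numbered relative to that keyed ranking instead of the alphabet. SHA-256 and all function names are choices made for this sketch.

```python
import hashlib
from itertools import permutations


def keyed_order(names, key):
    """Sort attribute names by SHA-256(name + key) instead of alphabetically."""
    def digest(name):
        return hashlib.sha256((name + key).encode("utf-8")).hexdigest()
    return sorted(names, key=digest)


def keyed_table(names, key):
    """Number the permutations relative to the keyed ranking of the names."""
    return list(permutations(keyed_order(names, key)))


def embed_number(names, number, key):
    """Return the attribute order that encodes `number` under `key`."""
    return list(keyed_table(names, key)[number])


def extract_number(ordered_names, key):
    """Recover the number from an attribute order; requires the same key."""
    return keyed_table(ordered_names, key).index(tuple(ordered_names))


names = ["color", "face", "size"]
order = embed_number(names, 4, "secret")
print(order)
print(extract_number(order, "secret"))
```

Under this reading, a reader without the key still sees that attributes are ordered unusually but cannot tell which number a given order stands for.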

Page 20:

Performance comparison

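Columns: type; size change; perceivable by view; perceivable by code; capacity (bit); extra payload

White space: size change Y; by view easy; by code easy; capacity: page lines; extra payload N

Case change: size change N; by view N; by code easy; capacity: tags; extra payload N

Attribute pair: size change N; by view N; by code hard; capacity: $\sum_{i=1}^{n} |E_i|/2$; extra payload: pair table

Equivalent attributes: size change N; by view N; by code hard; capacity: $\sum_{i=1}^{n} \log_2(|E_i|!)$; extra payload N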

Page 21:

Other potential properties

String delimiters:

name="value"

name='value'

White space between the element's name and the first attribute:

<font face="verdana" size="3">

<font  face="verdana" size="3">

White space between attributes:

<font face="verdana" size="3">

<font face="verdana"  size="3">
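Of the properties listed above, the string delimiter is the simplest to sketch. The slides assign no bit values to the two quote styles, so the mapping used here (double quote for 1, single quote for 0) is an assumption, as are the function names and the simple regular expression.

```python
import re

# Matches name="value" or name='value'; group 2 is the quote character used.
ATTR = re.compile(r"""(\w+)=(["'])(.*?)\2""")


def embed_bits(tag: str, bits: str) -> str:
    """Encode one bit per attribute in its quote style."""
    remaining = iter(bits)

    def requote(match):
        bit = next(remaining, None)
        if bit is None:  # payload exhausted: keep the attribute unchanged
            return match.group(0)
        name, value = match.group(1), match.group(3)
        quote = '"' if bit == "1" else "'"
        if quote in value:
            raise ValueError("value contains the quote character needed for this bit")
        return f"{name}={quote}{value}{quote}"

    return ATTR.sub(requote, tag)


def extract_bits(tag: str) -> str:
    """Read one bit from the quote style of every attribute."""
    return "".join("1" if m.group(2) == '"' else "0" for m in ATTR.finditer(tag))


marked = embed_bits('<font face="verdana" size="3">', "01")
print(marked)
print(extract_bits(marked))
```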

Page 22:

Other potential properties

White space after "=":

<font face="verdana" size="3">

<font face= "verdana" size="3">

White Space Between Elements

<td>con1</td><td>con2</td>

<td>con1</td> <td>con2</td>

Page 23:

Other potential properties

The default value of an attribute

<font face="verdana" size="3">

<font face="verdana">

Page 24:

Current progress

Introduce insignificant attributes:

<font face="verdana">

<font face="verdana" xyz="abcd">

Break through the capacity bottleneck

Web page watermarking

Text watermarking

Capacity of attribute permutation: $\sum_{i=1}^{n} \log_2(|E_i|!)$ bits
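A short worked step shows why insignificant attributes raise capacity. It assumes each inserted attribute simply joins the set of attributes being permuted, which the slides do not state explicitly. Adding one attribute to an element that already has $k$ attributes increases that element's capacity by $\log_2(k+1)$ bits:

```latex
\log_2\bigl((k+1)!\bigr) - \log_2(k!) = \log_2\frac{(k+1)!}{k!} = \log_2(k+1)
```

For example, going from the single attribute face to face plus xyz raises the capacity of that tag from $\log_2(1!) = 0$ to $\log_2(2!) = 1$ bit.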

Page 25:

Our focus on watermarking

Text content security: funded by NSFC Key Project 60736016 and NSFC 60373062

Software watermarking: funded by NSFC 60573045

Wireless sensor network security: funded by 973 Project 2006CB303000 and NSFC 60873198

Steganalysis: funded by 115 Project

Page 26:

Thank you

Phone: 0731-8821341, 13875971258

Email : [email protected]

http://nisl.hnu.cn/

Page 27:

HyperText Markup Language (HTML), version 4.0, the publishing language of the World Wide Web

Recall that in HTML, element and attribute names are case-insensitive; the convention is meant to encourage readability.

Element and attribute names in this document have been marked up and may be rendered specially by some user agents.

http://www.w3.org/TR/1998/REC-html40-19980424/about.html#h-1.2.1

Page 28:

http://www.w3.org/TR/html/#xhtml

HTML 4 [HTML4] is an SGML (Standard Generalized Markup Language) application conforming to International Standard ISO 8879, and is widely regarded as the standard publishing language of the World Wide Web.

SGML is a language for describing markup languages, particularly those used in electronic document exchange, document management, and document publishing. HTML is an example of a language defined in SGML.

SGML has been around since the middle 1980's and has remained quite stable. Much of this stability stems from the fact that the language is both feature-rich and flexible. This flexibility, however, comes at a price, and that price is a level of complexity that has inhibited its adoption in a diversity of environments, including the World Wide Web.

HTML, as originally conceived, was to be a language for the exchange of scientific and other technical documents, suitable for use by non-document specialists. HTML addressed the problem of SGML complexity by specifying a small set of structural and semantic tags suitable for authoring relatively simple documents. In addition to simplifying the document structure, HTML added support for hypertext. Multimedia capabilities were added later.

In a remarkably short space of time, HTML became wildly popular and rapidly outgrew its original purpose. Since HTML's inception, there has been rapid invention of new elements for use within HTML (as a standard) and for adapting HTML to vertical, highly specialized, markets. This plethora of new elements has led to interoperability problems for documents across different platforms.

Page 29:

XML™ is the shorthand name for Extensible Markup Language [XML].

XML was conceived as a means of regaining the power and flexibility of SGML without most of its complexity. Although a restricted form of SGML, XML nonetheless preserves most of SGML's power and richness, and yet still retains all of SGML's commonly used features.

While retaining these beneficial features, XML removes many of the more complex features of SGML that make the authoring and design of suitable software both difficult and costly.

Page 30:

XHTML is a family of current and future document types and modules that reproduce, subset, and extend HTML 4 [HTML4]. XHTML family document types are XML based, and ultimately are designed to work in conjunction with XML-based user agents. The details of this family and its evolution are discussed in more detail in [XHTMLMOD].

XHTML 1.0 (this specification) is the first document type in the XHTML family. It is a reformulation of the three HTML 4 document types as applications of XML 1.0 [XML]. It is intended to be used as a language for content that is both XML-conforming and, if some simple guidelines are followed, operates in HTML 4 conforming user agents. Developers who migrate their content to XHTML 1.0 will realize the following benefits:

XHTML documents are XML conforming. As such, they are readily viewed, edited, and validated with standard XML tools.

XHTML documents can be written to operate as well or better than they did before in existing HTML 4-conforming user agents as well as in new, XHTML 1.0 conforming user agents.

XHTML documents can utilize applications (e.g. scripts and applets) that rely upon either the HTML Document Object Model or the XML Document Object Model [DOM].

As the XHTML family evolves, documents conforming to XHTML 1.0 will be more likely to interoperate within and among various XHTML environments.

The XHTML family is the next step in the evolution of the Internet. By migrating to XHTML today, content developers can enter the XML world with all of its attendant benefits, while still remaining confident in their content's backward and future compatibility.

Page 31:

Terrorism

http://www.arabteam2000-forum.com/

Partial English translation of a Jihad training manual (in Arabic) on information hiding techniques

Page 32:

Watermark embedding

Page 33:

Watermark detection

Page 34:

Classification of watermarking by host

Image

Audio

Video

Text (Document)

Software / Executable code

Database

Page 35:

Text watermarking & Information Hiding

email

web

book

PDF, Word, WPS, PS, etc.

TXT (unformatted)

Watermarking

Information hiding

Page 36:

Any redundancy?

Character to code: one-to-one mapping

No

Page 37:

Utilize format information

Line-shift Coding

vertically displacing an entire text line

Word-shift Coding

horizontally shifting the location of a word within a text line

Character feature coding

altering a particular feature of an individual character

Page 38:

Utilize language information

Synonym substitution

Syntactic transform

TMR tree (text meaning representation)

Add spaces at the end of a line

Page 39:

Text recoverable watermarking

Format based watermarking?

Natural language watermarking?

How to combine??

Text recoverable watermarking???