KI2 - 1 Kunstmatige Intelligentie / RuG Structural Pattern Recognition Marius Bulacu & prof. dr. Lambert Schomaker.

Post on 20-Dec-2015



Classification – First Step to Intelligence

The intelligent agent confronts an overwhelming confusion of sensory data and pattern classification is the first crucial step in making sense of the world.

The nature of classification and decision has been a central theme in philosophical epistemology, the study of the nature of knowledge.

The foundations of pattern recognition can be traced back to Plato and later Aristotle, who distinguished between:

- an “essential property” – shared by all members in a class or “natural kind”

- an “accidental property” – which would differ among members in the class

Pattern recognition can be cast as the problem of finding such essential properties of a category.

General Structure of a Pattern Recognition System

[Figure: processing chain input → sensing → segmentation → feature extraction → classification → post-processing → decision, with additional labels costs, context and missing features. Annotation: DEEP PROBLEM!]

Statistical vs. Structural Pattern Recognition

Statistical: patterns are represented by an n-dimensional feature vector.

Each feature is a numerical measure of a characteristic of the pattern.

The statistical variations of the features within each class are described and evaluated.

The feature space is partitioned into mutually exclusive regions with each region belonging to a specific class.

Recognition of an unknown sample is performed by determining in which region it falls and therefore to which pattern class it belongs.
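As a minimal sketch of the statistical view, the block below partitions the feature space with a nearest-mean rule: each class is summarised by the mean of its training vectors, and an unknown sample is assigned to the class whose mean is closest. The nearest-mean rule is only one simple way to obtain mutually exclusive regions and is an assumption here, as are the names `fit_centroids` and `classify`.

```python
import math


def fit_centroids(samples):
    """Compute one mean feature vector per class.

    samples: list of (feature_vector, label) pairs (hypothetical format).
    """
    sums, counts = {}, {}
    for vec, label in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for k, value in enumerate(vec):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def classify(vec, centroids):
    """Return the label of the region (nearest class mean) that vec falls in."""
    return min(centroids, key=lambda label: math.dist(vec, centroids[label]))


training = [
    ([1.0, 1.0], "a"),
    ([1.0, 3.0], "a"),
    ([5.0, 5.0], "b"),
    ([7.0, 5.0], "b"),
]
centroids = fit_centroids(training)
label = classify([2.0, 2.0], centroids)
```

The regions induced by this rule are the sets of points closer to one class mean than to any other, so every point in the feature space belongs to exactly one class (ties aside).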

Structural: patterns are represented by knowledge about how sub-pattern primitives relate to each other and must be combined to make up the entire pattern.

The primitives are simple and invariant sub-pattern formations that have no direct relation to the structure of the entire pattern.

Patterns are modeled in terms of primitives and their relations and are usually represented as strings, trees or graphs.

Recognition of an unknown pattern is performed by finding the most similar sample from a database of known objects using error-tolerant string or graph matching.


Knowledge-based Symbolic Methods

Assumption: the Turing / Von Neumann computer is a universal computation engine…

…therefore it can be used at all levels of information processing:

provided an appropriate algorithm can be designed which operates on appropriate representations


…provided an appropriate algorithm can be designed…

Mechanisms:
- recursion, hierarchic procedures
- search algorithms
- parsers
- matching algorithms
- string manipulation
- numerical computing
- signal processing
- image processing
- statistical processing

…which operates on appropriate representations…

- stacks
- linear strings and arrays
- matrices
- linked lists
- trees

…is indeed successful in many information processing problems.

Example: double spiral problem

In inner or outer spiral?

Difficult for, e.g., neural nets.

How?
- flood-fill algorithm
- other?

Find the right representation! Count the intersections: the odd/even count is not sensitive to shape variations of the spiral, so it is a general solution.

Answer: outside
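The odd/even intersection count can be sketched as below. The sketch assumes the boundary is available as a closed polygon (a list of vertices) and shoots a horizontal ray from the query point to the right; an odd number of crossings means inside, an even number means outside. The function names and the C-shaped test outline are hypothetical stand-ins for the spiral on the slide.

```python
def crossings(point, polygon):
    """Count how many polygon edges a ray from point towards +x crosses."""
    px, py = point
    count = 0
    n = len(polygon)
    for k in range(n):
        x1, y1 = polygon[k]
        x2, y2 = polygon[(k + 1) % n]
        # The edge straddles the horizontal line through the point
        # (half-open test, so a vertex is never counted twice).
        if (y1 > py) != (y2 > py):
            # x coordinate where the edge meets that horizontal line
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > px:
                count += 1
    return count


def is_inside(point, polygon):
    """Odd number of crossings: inside. Even number: outside."""
    return crossings(point, polygon) % 2 == 1


# Hypothetical C-shaped outline; the notch between the arms is outside.
c_shape = [(0, 0), (5, 0), (5, 1), (1, 1), (1, 4), (5, 4), (5, 5), (0, 5)]
in_notch = is_inside((3, 2.5), c_shape)
in_spine = is_inside((0.5, 2.5), c_shape)
```

Only the parity of the count matters, which is why the test does not depend on how many turns the outline makes.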


Culture

- If it doesn’t work, you didn’t think hard enough.
- You have to know what you do.
- You have to prove that & why it works.
- Even neural networks work on top of the Turing/von Neumann engine (it will always win).
- If you’re smart, you can often avoid NP-completeness.
- Use of probabilities is a sign of weakness.

Strong Points

- Scalability is often possible
- Convenience: little context dependence, no training
- Reusability
- Transformability (compilation)
- Algorithmic refinement once it is known how to do a trick (e.g., graphics cards and DSPs in mobile phones: ugly code but highly efficient)

Challenges

Knowledge dependence is expensive
– not a problem in “IT” application design
– a challenge to AI

Uncertainty

Noise

Brittleness

Solutions

More and more representational weight (UML, Semantic Web, XML solves everything)

Symbolic learning mechanisms:
– induction: version spaces, grammar inference
– decision tree learning
– rewriting formalisms

Active hypothesis testing (what if…, assume X…)


Example 1

Primitives:

horizontal stick - H

vertical stick - V

loop west - W

loop east - E

loop north - N

loop south - S

closed loop - C

Patterns:

a - WC
b - VC
c - E
d - CV
e - CE
h - VS
s - EW
z - WE
x - NWES
T - HV
6 - EC
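A structural pattern of this kind is simply a string over the primitive alphabet, which the sketch below stores in a lookup table. The pairing of characters and primitive strings is an assumption reconstructed from the slide (only "6 - EC" appears there as an explicit pair), and the names `PRIMITIVES`, `PATTERNS`, `describe` and `recognize` are hypothetical.

```python
PRIMITIVES = {
    "H": "horizontal stick",
    "V": "vertical stick",
    "W": "loop west",
    "E": "loop east",
    "N": "loop north",
    "S": "loop south",
    "C": "closed loop",
}

# Assumed pairing of primitive strings and characters (see lead-in).
PATTERNS = {
    "WC": "a", "VC": "b", "E": "c", "CV": "d", "CE": "e", "VS": "h",
    "EW": "s", "WE": "z", "NWES": "x", "HV": "T", "EC": "6",
}


def describe(primitive_string):
    """Spell out a primitive string, e.g. 'VC' -> vertical stick, closed loop."""
    return [PRIMITIVES[p] for p in primitive_string]


def recognize(primitive_string):
    """Exact lookup; returns None when no stored pattern matches."""
    return PATTERNS.get(primitive_string)


b_parts = describe("VC")
b_label = recognize("VC")
```

An exact lookup fails as soon as one primitive is missed or misdetected, which is the motivation for error-tolerant matching.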


Example 2

In Reading Systems (optical character recognition), only a small part of the algorithm concerns problems of image processing and character classification.

Most of the code is concerned with the structure of the text image:
– where are the blobs?
– are these blobs text, photo or graphics?
– how to segment into meaningful chunks: characters, words?
– what is the logical organization (reading order) in the physical organization of pixels?

Knowledge-based approaches are a necessity!

[Figure: page image with labelled text blocks: name of conference, programme committee, brief description of conference, submission details]

Example of layout analysis

Knowing the type of a text block strongly reduces the number of possible interpretations

Example: “address block”

Address:
– name of person
– street, number
– postal code, city

prof dr. L.R.B. Schomaker
Grote Appelstraat 23
9712 TS Groningen
Nederland

Amsterdam 7/7/2003

address
- person name: titles, initials, surname
- street: street, street, ..., digits
- codes+city: 4 digits, 2 upper case, city name
- country: country name

Content:

<address>
  <person>
    <title></title>
    <initials or first name></initials or first name>
    <surname></surname>
  </person>
  <home>
    <street name></street name>
    <number></number>
  </home>
  <city>
    <postal code>
      <four digits></four digits>
      <white space></white space>
      <two upper-case letters> ….
    </postal code>
  </city>
  <country></country>
</address>

etc.

Layout:

(address
  (title is-left-of initials is-left-of surname)
  is-above (street name is-left-of number)
  is-above (city)
  is-above (country))

etc.

HELPS TEXT CLASSIFICATION

HELPS TEXT SEGMENTATION
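To illustrate how knowing the block type constrains interpretation, the sketch below checks a four-line text block against the address template: titles, initials and surname on the first line, street name and number on the second, four digits plus two upper-case letters plus city on the third, and the country on the fourth. The regular expressions, the list of title abbreviations and the function name `parse_address` are assumptions made for this example, not the method used in the reading system itself.

```python
import re

# Assumed title abbreviations; the slide only shows "prof" and "dr.".
TITLE = r"(?:prof|dr|ir|mr)\.?"

PERSON = re.compile(
    rf"^(?P<titles>(?:{TITLE}\s+)*)(?P<initials>(?:[A-Z]\.)+)\s+(?P<surname>.+)$"
)
STREET = re.compile(r"^(?P<street>\D+?)\s+(?P<number>\d+)$")
CITY = re.compile(r"^(?P<digits>\d{4})\s+(?P<letters>[A-Z]{2})\s+(?P<city>.+)$")


def parse_address(block):
    """Return the labelled parts of a four-line address block, or None."""
    lines = [line.strip() for line in block.strip().splitlines()]
    if len(lines) != 4:
        return None
    person = PERSON.match(lines[0])
    street = STREET.match(lines[1])
    city = CITY.match(lines[2])
    if not (person and street and city):
        return None
    return {
        "titles": person["titles"].split(),
        "initials": person["initials"],
        "surname": person["surname"],
        "street": street["street"],
        "number": street["number"],
        "postal_code": f"{city['digits']} {city['letters']}",
        "city": city["city"],
        "country": lines[3],
    }


sample = "prof dr. L.R.B. Schomaker\nGrote Appelstraat 23\n9712 TS Groningen\nNederland"
parsed = parse_address(sample)
```

A block that does not fit the template, such as the place-and-date line of the letter, is rejected rather than forced into an address reading.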


Spatial Relations in the XY Plane Between Rectangles

1) left-disjoint
2) left-touch
3) left-overlap
4) included-touch_left
5) included / includes
6) included-touch_right
7) right-overlap
8) right-touch
9) right-disjoint

[Figure: the nine relations drawn for rectangle A along the X axis (similar on the Y axis)]

Constructing a Graph Representation

[Figure: three rectangles A, B and C, and the graph built from them: one node per rectangle, each edge labelled with a pair of relation codes, here (4, 9), (7, 9) and (9, 4). The codes 1) to 9) are the spatial relations listed above.]
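A possible way to build such a graph is sketched below: the nine relation codes are computed per axis from the interval end points, and every pair of rectangles gets an edge labelled with the (X code, Y code) pair. Several details are assumptions, since the slides do not define them: which rectangle serves as the reference, that "left" on the Y axis means smaller y, and that any shared left or right edge maps to code 4 or 6. The rectangle coordinates are invented, so the labels differ from those in the figure.

```python
from itertools import combinations


def relation_1d(p0, p1, q0, q1):
    """Code 1..9 for interval [p0, p1] relative to interval [q0, q1]."""
    if p1 < q0:
        return 1  # left-disjoint
    if p1 == q0:
        return 2  # left-touch
    if p0 > q1:
        return 9  # right-disjoint
    if p0 == q1:
        return 8  # right-touch
    if p0 == q0:
        return 4  # included-touch_left (assumed for any shared left edge)
    if p1 == q1:
        return 6  # included-touch_right (assumed for any shared right edge)
    if p0 < q0 and p1 < q1:
        return 3  # left-overlap
    if p0 > q0 and p1 > q1:
        return 7  # right-overlap
    return 5  # included / includes


def build_graph(rects):
    """rects maps a name to (x0, y0, x1, y1); returns {(p, q): (x_code, y_code)}."""
    edges = {}
    for p, q in combinations(sorted(rects), 2):
        px0, py0, px1, py1 = rects[p]
        qx0, qy0, qx1, qy1 = rects[q]
        edges[(p, q)] = (
            relation_1d(px0, px1, qx0, qx1),
            relation_1d(py0, py1, qy0, qy1),
        )
    return edges


# Hypothetical layout of three rectangles.
rects = {"A": (0, 0, 4, 2), "B": (0, 3, 2, 5), "C": (3, 3, 6, 5)}
graph = build_graph(rects)
```

The resulting labelled graph is a purely symbolic description of the layout, so two pages can be compared by graph matching rather than pixel by pixel.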

Example 3

Primitives:
Head – H
Arm – A
Leg – L
Tail – T

Patterns:
HAALL
HTTLLLL

How similar are these two patterns?

What is a proper similarity / distance measure between strings?

String Matching with Errors – Edit Distance

Edit distance = how many fundamental operations are required to transform one string x into another string y

The fundamental operations are:

Substitution: a character in x is replaced by the corresponding character in y

Insertion: a character of y is inserted into x

Deletion: a character in x is deleted

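Edit Distance – The Cost Matrix

Cost matrix for x = vriendelijk (columns, index i) and y = friendly (rows, index j). The source is the top-left cell, the sink is the bottom-right cell.

          v   r   i   e   n   d   e   l   i   j   k
      0   1   2   3   4   5   6   7   8   9  10  11
  f   1   1   2   3   4   5   6   7   8   9  10  11
  r   2   2   1   2   3   4   5   6   7   8   9  10
  i   3   3   2   1   2   3   4   5   6   7   8   9
  e   4   4   3   2   1   2   3   4   5   6   7   8
  n   5   5   4   3   2   1   2   3   4   5   6   7
  d   6   6   5   4   3   2   1   2   3   4   5   6
  l   7   7   6   5   4   3   2   2   2   3   4   5
  y   8   8   7   6   5   4   3   3   3   3   4   5

d(x, y) = d(y, x) = 5

$$C[i, j] = \min\bigl(\, C[i-1, j] + 1,\;\; C[i, j-1] + 1,\;\; C[i-1, j-1] + 1 - \delta(x[i], y[j]) \,\bigr)$$

where δ(a, b) = 1 if a = b and 0 otherwise. The three terms are:
- C[i-1, j] + 1: deletion (remove letter of x)
- C[i, j-1] + 1: insertion (insert letter of y into x)
- C[i-1, j-1] + 1 - δ(x[i], y[j]): substitution (replace letter of x by letter of y), or no change when x[i] = y[j]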
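The cost-matrix recurrence can be filled in row by row, as in the minimal sketch below. It assumes unit cost for deletion, insertion and substitution and zero cost when the two characters match; the function name `edit_distance` is hypothetical. It is applied to the two word pairs that appear in the slides.

```python
def edit_distance(x, y):
    """Number of deletions, insertions and substitutions turning x into y."""
    # C[i][j] is the cost of transforming the first i letters of x
    # into the first j letters of y.
    C = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
    for i in range(1, len(x) + 1):
        C[i][0] = i  # delete all i letters of x
    for j in range(1, len(y) + 1):
        C[0][j] = j  # insert all j letters of y
    for i in range(1, len(x) + 1):
        for j in range(1, len(y) + 1):
            same = 1 if x[i - 1] == y[j - 1] else 0
            C[i][j] = min(
                C[i - 1][j] + 1,             # deletion
                C[i][j - 1] + 1,             # insertion
                C[i - 1][j - 1] + 1 - same,  # substitution / no change
            )
    return C[len(x)][len(y)]


d_words = edit_distance("vriendelijk", "friendly")
d_shapes = edit_distance("HAALL", "HTTLLLL")
```

With unit costs the measure is symmetric, so swapping the two arguments gives the same value.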

Statistical vs. Structural Pattern Recognition

Statistical

vectors of real numbers

probability distributions

metrics

Structural

lists of nominal attributes

strings, trees, graphs

rules
