Two-dimensional pattern matching M.G.W.H. van de Rijdt 23 August 2005

Two-dimensional pattern matching

M.G.W.H. van de Rijdt

23 August 2005


Introduction

•Problem description

•Naive algorithm

•Filter-based algorithms

– A simple filter function

– Takaoka-Zhu

– Baker-Bird

•Baeza-Yates & Régnier

•Polcar

•Conclusions

•Future work

•Questions


Problem description

•One-dimensional pattern matching: finding all occurrences of a pattern string in a text string

•Two-dimensional pattern matching: finding all occurrences of a 2D pattern matrix in a 2D text matrix

•Applications: image processing, ...
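
The 2D problem can be stated a little more formally. The notation below is an assumption made for illustration and is not taken from the slides: T is the text matrix, P is a pattern matrix with m rows and n columns, and (i, j) is the position of the top-left corner of a candidate occurrence. P occurs in T at (i, j) exactly when

```latex
\[
  \forall r, c \;:\; 0 \le r < m,\ 0 \le c < n \;:\; T[i + r][j + c] = P[r][c]
\]
```

The task is to report every position (i, j) for which this holds.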


Naive algorithm

•Simply check for each position in the text whether there is a match there

•Most straightforward, but inefficient, solution

•Better algorithms

– use gathered information to disregard a larger area of the text at once

and/or

– precompute information to determine more quickly whether a match exists at a position in the text
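
A minimal sketch of the naive approach, assuming both matrices are given as lists of equal-length strings (one string per row). The function name and this representation are illustrative choices, not part of the original slides.

```python
def naive_2d_match(text, pattern):
    """Return all (row, column) positions where pattern occurs in text."""
    n_rows, n_cols = len(text), len(text[0])
    m_rows, m_cols = len(pattern), len(pattern[0])
    matches = []
    # Try every top-left corner at which the pattern still fits in the text.
    for i in range(n_rows - m_rows + 1):
        for j in range(n_cols - m_cols + 1):
            # Compare the pattern row by row against the text window.
            if all(text[i + r][j:j + m_cols] == pattern[r]
                   for r in range(m_rows)):
                matches.append((i, j))
    return matches
```

Every candidate position is compared from scratch, so nothing learned at one position is reused at the next. That is the inefficiency the later algorithms address.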


Filter-based algorithms (0)

•Define a “filter function”, which transforms each row of the pattern matrix to a single value

•Using this function, reduce the pattern matrix to a single (column) vector


Filter-based algorithms (1)

•Apply the filter function to every partial row of the text matrix that is as long as a pattern row; the result is the reduced text

•There can only be an occurrence where the pattern’s column vector occurs in a column of the reduced text

•Use 1D pattern matching to find those occurrences


Filter-based algorithms: a simple filter function

•A simple example of a filter function: f(x) = x[0]

•Pattern:

a a a
b a b
a a a

reduced to the column vector:

a
b
a

•Text:

a a a b a a b b
b a b b a a a b
a a a a b a b b
a b b b a a a a
b b b a b b a b
b a b a b a a a

reduced to:

a a a b a a
b a b b a a
a a a a b a
a b b b a a
b b b a b b
b a b a b a
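
The filter-based scheme can be sketched as follows, with f(x) = x[0] as the default filter. This is an illustration under stated assumptions, not the derivation from the slides: matrices are lists of equal-length strings, the names `first_symbol` and `filter_match` are hypothetical, and the 1D search down each column is a plain slice comparison standing in for a real 1D matcher. Because f(x) = x[0] maps different rows to the same value, each candidate is verified against the full pattern.

```python
def first_symbol(x):
    """The simple filter function f(x) = x[0]."""
    return x[0]


def filter_match(text, pattern, f=first_symbol):
    """Return (candidates, matches) found with filter function f."""
    n_rows, n_cols = len(text), len(text[0])
    m_rows, m_cols = len(pattern), len(pattern[0])
    # Reduce the pattern to a column vector, one value per pattern row.
    reduced_pattern = [f(row) for row in pattern]
    # Reduce the text: apply f to every partial row of pattern width.
    reduced_text = [[f(row[j:j + m_cols])
                     for j in range(n_cols - m_cols + 1)]
                    for row in text]
    candidates, matches = [], []
    for j in range(n_cols - m_cols + 1):
        column = [reduced_text[i][j] for i in range(n_rows)]
        # 1D search for the reduced pattern in this column (naive stand-in).
        for i in range(n_rows - m_rows + 1):
            if column[i:i + m_rows] == reduced_pattern:
                candidates.append((i, j))
                # The filter is not injective, so confirm the candidate.
                if all(text[i + r][j:j + m_cols] == pattern[r]
                       for r in range(m_rows)):
                    matches.append((i, j))
    return candidates, matches
```

Passing a different `f` changes how many false candidates survive the filter; the set of verified matches stays the same.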


Filter-based algorithms: Takaoka-Zhu

•Filter function: hash function from the (1D) Karp-Rabin algorithm

Text:

a a a b a a b b
b a b b a a a b
a a a a b a b b
a b b b a a a a
b b b a b b a b
b a b a b a a a

$$f(x) = \Bigl(\sum_{j=0}^{\mathrm{col}(P)-1} x[j] \cdot |V|^{\mathrm{col}(P)-1-j}\Bigr) \bmod q$$
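
A sketch of how a Karp-Rabin hash can serve as the filter function. Several details are assumptions made for illustration: symbols are mapped to integers by their sorted order, the base is the alphabet size, the modulus q = 101 is an arbitrary small prime, and the column search is a plain comparison rather than a dedicated 1D matcher. All function names are hypothetical. Equal hash values do not imply equal rows, so candidates are verified.

```python
def kr_hash(x, sym, base, q):
    """Karp-Rabin hash of string x; the first symbol has the highest weight."""
    h = 0
    for ch in x:
        h = (h * base + sym[ch]) % q
    return h


def window_hashes(row, width, sym, base, q):
    """Hashes of all length-`width` windows of row, using a rolling update."""
    top = pow(base, width - 1, q)  # weight of the symbol leaving the window
    h = kr_hash(row[:width], sym, base, q)
    hashes = [h]
    for j in range(width, len(row)):
        # Drop the leftmost symbol, shift, and append the new symbol.
        h = ((h - sym[row[j - width]] * top) * base + sym[row[j]]) % q
        hashes.append(h)
    return hashes


def hash_filter_match(text, pattern, q=101):
    """2D matching with row hashes as the filter; candidates are verified."""
    symbols = sorted(set("".join(text)) | set("".join(pattern)))
    sym = {ch: k for k, ch in enumerate(symbols)}
    base = max(len(symbols), 2)
    m_rows, m_cols = len(pattern), len(pattern[0])
    reduced_pattern = [kr_hash(row, sym, base, q) for row in pattern]
    reduced_text = [window_hashes(row, m_cols, sym, base, q) for row in text]
    matches = []
    for i in range(len(text) - m_rows + 1):
        for j in range(len(reduced_text[0])):
            if all(reduced_text[i + r][j] == reduced_pattern[r]
                   for r in range(m_rows)):
                # Hash collisions are possible, so check the real symbols.
                if all(text[i + r][j:j + m_cols] == pattern[r]
                       for r in range(m_rows)):
                    matches.append((i, j))
    return matches
```

The rolling update is what makes this filter cheap: each further window of a text row costs a constant number of arithmetic operations instead of a full rehash.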


Filter-based algorithms: Baker-Bird (0)

•Based on the Aho-Corasick automaton

– Aho-Corasick is an algorithm for (1D) multipattern matching

– It uses a special automaton, based on the pattern strings

•Filter function for Baker-Bird: state in the Aho-Corasick automaton, based on the pattern’s rows


Filter-based algorithms: Baker-Bird (1)

•Pattern:

a a a
b a b
a a a

•Trie based on pattern rows {aaa, bab}:

q0 -a-> q1 -a-> q2 -a-> q3
q0 -b-> q4 -a-> q5 -b-> q6


Filter-based algorithms: Baker-Bird (2)

•Pattern:

a a a
b a b
a a a

reduced to the state vector:

q3
q6
q3

•Aho-Corasick automaton based on pattern rows {aaa, bab}:

state   on a   on b
q0      q1     q4
q1      q2     q4
q2      q3     q4
q3      q3     q4
q4      q5     q4
q5      q2     q6
q6      q5     q4


Filter-based algorithms: Baker-Bird (3)

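•Pattern:

a a a
b a b
a a a

reduced to the state vector:

q3
q6
q3

•Text:

a a a b a a b b
b a b b a a a b
a a a a b a b b
a b b b a a a a
b b b a b b a b
b a b a b a a a

reduced to (automaton state after each symbol, from the third column on):

q3 q4 q5 q2 q4 q4
q6 q4 q5 q2 q3 q4
q3 q3 q4 q5 q6 q4
q4 q4 q5 q2 q3 q3
q4 q5 q6 q4 q5 q6
q6 q5 q6 q5 q2 q3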
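
The Baker-Bird idea can be sketched in a few lines. The sketch assumes matrices given as lists of equal-length strings, and it searches each column of states with a plain slice comparison where a real implementation would use a 1D matcher. The function names are hypothetical. Because all pattern rows have the same length, reaching the final state of a row means the last symbols read are exactly that row, so no verification step is needed.

```python
from collections import deque


def build_ac_dfa(words, alphabet):
    """Aho-Corasick automaton with a complete transition function."""
    goto = [{}]      # goto[state][symbol] -> next state; state 0 is the root
    state_of = {}    # final trie state of each word
    for w in words:
        s = 0
        for ch in w:
            if ch not in goto[s]:
                goto.append({})
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        state_of[w] = s
    fail = [0] * len(goto)
    queue = deque()
    for ch in alphabet:
        if ch in goto[0]:
            queue.append(goto[0][ch])
        else:
            goto[0][ch] = 0          # stay in the root on unknown symbols
    while queue:                      # breadth-first over the trie
        s = queue.popleft()
        for ch in alphabet:
            if ch in goto[s]:
                t = goto[s][ch]
                fail[t] = goto[fail[s]][ch]
                queue.append(t)
            else:
                goto[s][ch] = goto[fail[s]][ch]   # fill in missing edge
    return goto, state_of


def baker_bird(text, pattern):
    """Return all (row, column) positions where pattern occurs in text."""
    alphabet = sorted(set("".join(text)) | set("".join(pattern)))
    goto, state_of = build_ac_dfa(pattern, alphabet)
    reduced_pattern = [state_of[row] for row in pattern]
    states = []                       # states[i][j]: state after text[i][j]
    for row in text:
        s, row_states = 0, []
        for ch in row:
            s = goto[s][ch]
            row_states.append(s)
        states.append(row_states)
    m_rows, m_cols = len(pattern), len(pattern[0])
    matches = []
    for j in range(m_cols - 1, len(text[0])):
        column = [states[i][j] for i in range(len(text))]
        for i in range(len(text) - m_rows + 1):
            if column[i:i + m_rows] == reduced_pattern:
                # Column j is the right edge of the occurrence.
                matches.append((i, j - m_cols + 1))
    return sorted(matches)
```

With the rows inserted in the order aaa, bab, the state numbers produced by `build_ac_dfa` coincide with q1 to q6 on the slides.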


Baeza-Yates & Régnier (0)

•Say our pattern has m rows

•In the text, each occurrence of the pattern intersects with exactly one row of the form i * m - 1 (i = 1, 2, ...)

(Figure: text matrix with rows 0, m-1, 2*m-1 and 3*m-1 labelled)


Baeza-Yates & Régnier (1)

•Algorithm idea:

– use 1D multipattern matching to search for occurrences of any pattern row in these rows of the text

– where such a match occurs, check if there is a match with the entire pattern in the surrounding area

Pattern:

a a a
b a b
a a a

Text:

a a a b a a b b
b a b b a a a b
a a a a b a b b
a b b b a a a a
b b b a b b a b
b a b a b a a a
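
A sketch of this idea, assuming matrices given as lists of equal-length strings. The multipattern search is replaced by a direct comparison of each window with every pattern row, which is a simplification and not the Aho-Corasick search a real implementation would use. The function name is hypothetical.

```python
def sampled_row_match(text, pattern):
    """Search only text rows m-1, 2m-1, ... and verify around each hit."""
    m_rows, m_cols = len(pattern), len(pattern[0])
    n_rows, n_cols = len(text), len(text[0])
    matches = set()
    for i in range(m_rows - 1, n_rows, m_rows):       # rows m-1, 2m-1, ...
        for j in range(n_cols - m_cols + 1):
            window = text[i][j:j + m_cols]
            # A pattern row may occur more than once in the pattern,
            # so every row index r with pattern[r] == window is tried.
            for r, pattern_row in enumerate(pattern):
                if window != pattern_row:
                    continue
                top = i - r                           # implied top row
                if top + m_rows > n_rows:
                    continue                          # would leave the text
                if all(text[top + k][j:j + m_cols] == pattern[k]
                       for k in range(m_rows)):
                    matches.add((top, j))
    return sorted(matches)
```

On the example above only text rows 2 and 5 are scanned; the remaining rows are read only while verifying a hit.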


Polcar (0)

•In some 1D pattern matching algorithms, we view an occurrence of the pattern as a suffix of a prefix of the text

•For Polcar, we do the same in two dimensions


Polcar (1)

•For each prefix A of the text, we compute the set of suffixes of A that are also a prefix of the pattern: $\mathit{suff}(A) \cap \mathit{pref}(P)$


Polcar (2)

•In derivations of the corresponding 1D pattern matching algorithms, sets of prefixes of the pattern are represented by their element of maximum length

•In 2D there is not always one unique maximum

•But these sets of matrices can be represented by their maximal elements
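
A brute-force illustration of these sets and their maximal elements. It rests on assumptions that the slides do not spell out: a prefix of a matrix is taken to be a submatrix anchored at its top-left corner, a suffix one anchored at its bottom-right corner, empty matrices are left out, and "maximal" is read with respect to the componentwise order on (rows, columns). The function names are hypothetical, and the code recomputes everything from scratch, which is not how Polcar's algorithm works.

```python
def suffix_prefix_sizes(text, pattern, i, j):
    """Sizes (h, w) of the non-empty matrices in suff(A) and pref(P).

    A is the prefix of the text whose bottom-right corner is text[i][j].
    """
    sizes = set()
    for h in range(1, min(len(pattern), i + 1) + 1):
        for w in range(1, min(len(pattern[0]), j + 1) + 1):
            # Bottom-right h x w corner of A against top-left h x w of P.
            if all(text[i - h + 1 + r][j - w + 1:j + 1] == pattern[r][:w]
                   for r in range(h)):
                sizes.add((h, w))
    return sizes


def maximal_elements(sizes):
    """Sizes not dominated componentwise by another size in the set."""
    return sorted((h, w) for (h, w) in sizes
                  if not any(h2 >= h and w2 >= w and (h2, w2) != (h, w)
                             for (h2, w2) in sizes))
```

For the text rows `ba`, `aa`, the pattern rows `aa`, `ab` and the position (1, 1), the set of sizes is {(1, 1), (1, 2), (2, 1)}, and it has the two maximal elements (1, 2) and (2, 1). Neither contains the other, which is the situation the second bullet describes. A full occurrence of the pattern ends at (i, j) exactly when the size of the whole pattern is in the set.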


Conclusions

•Presentation of several 2D pattern matching algorithms

•All of them have been formally derived

– derivation is a formal proof

– derivations show the major design decisions

•Similarities between the filter-based algorithms

•Several improvements to existing algorithms

– most notably: in Polcar’s algorithm, sets of matrices can be represented by their maximal elements


Future work

•Derive other existing algorithms

•Construct a taxonomy

•Find new algorithms

•Expand existing pattern matching toolkits (SPARE Time / SPARE Parts) or create a new 2D pattern matching toolkit

•Thorough performance analysis

•Further generalisations of the 2D pattern matching problem

– Multipattern matching

– More than two dimensions

– Approximate 2D pattern matching

– Patterns of non-rectangular shapes

– ...


Questions
