CSC401 – Analysis of Algorithms, Chapter 9: Text Processing

Dec 31, 2015

Page 1: Text Processing

CSC401 – Analysis of Algorithms
Chapter 9: Text Processing

Objectives:

• Strings
• Pattern matching algorithms
  • Brute-force algorithm
  • Boyer-Moore algorithm
  • Knuth-Morris-Pratt algorithm
• Tries
  • Standard tries
  • Compressed tries
  • Suffix tries
• Huffman encoding algorithm

Page 2: Strings

A string is a sequence of characters.

Examples of strings:
– Java program
– HTML document
– DNA sequence
– Digitized image

An alphabet Σ is the set of possible characters for a family of strings.

Examples of alphabets:
– ASCII
– Unicode
– {A, C, G, T}

Let P be a string of size m:
– A substring P[i .. j] of P is the subsequence of P consisting of the characters with ranks between i and j
– A prefix of P is a substring of the type P[0 .. i]
– A suffix of P is a substring of the type P[i .. m − 1]

Given strings T (text) and P (pattern), the pattern matching problem consists of finding a substring of T equal to P.

Applications: text editors, search engines, biological research
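The substring, prefix and suffix definitions above map directly onto slicing. The following is a minimal Python sketch; the helper names `substring`, `prefixes` and `suffixes` are illustrative and not part of the slides.

```python
def substring(p, i, j):
    """Return P[i .. j], the characters with ranks i through j inclusive."""
    return p[i:j + 1]


def prefixes(p):
    """All prefixes P[0 .. i] for i = 0 .. m - 1."""
    return [p[:i + 1] for i in range(len(p))]


def suffixes(p):
    """All suffixes P[i .. m - 1] for i = 0 .. m - 1."""
    return [p[i:] for i in range(len(p))]


print(substring("abacab", 1, 3))
print(prefixes("abc"))
print(suffixes("abc"))
```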

Page 3: Brute-Force Algorithm

The brute-force pattern matching algorithm compares the pattern P with the text T for each possible shift of P relative to T, until either
– a match is found, or
– all placements of the pattern have been tried

Brute-force pattern matching runs in time O(nm).

Example of worst case:
– T = aaa … ah
– P = aaah
– may occur in images and DNA sequences
– unlikely in English text

Algorithm BruteForceMatch(T, P)
    Input: text T of size n and pattern P of size m
    Output: starting index of a substring of T equal to P, or −1 if no such substring exists
    for i ← 0 to n − m
        { test shift i of the pattern }
        j ← 0
        while j < m ∧ T[i + j] = P[j]
            j ← j + 1
        if j = m
            return i { match at i }
        else
            continue with the next shift { mismatch }
    return −1 { no match anywhere }
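The pseudocode above translates almost line for line into Python. This is a minimal sketch; the function name `brute_force_match` is an illustrative choice.

```python
def brute_force_match(text, pattern):
    """Return the first index where pattern occurs in text, or -1."""
    n, m = len(text), len(pattern)
    for i in range(n - m + 1):          # test shift i of the pattern
        j = 0
        while j < m and text[i + j] == pattern[j]:
            j += 1
        if j == m:                      # all m characters matched
            return i
    return -1                           # no match anywhere


# Worst-case style input from the slide: T = aaa...ah, P = aaah
print(brute_force_match("aaaaaaah", "aaah"))
```

On inputs of this shape almost every shift compares nearly all m characters before failing, which is where the O(nm) bound comes from.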

Page 4: Boyer-Moore Heuristics

The Boyer-Moore pattern matching algorithm is based on two heuristics.

Looking-glass heuristic: Compare P with a subsequence of T moving backwards.

Character-jump heuristic: When a mismatch occurs at T[i] = c
– If P contains c, shift P to align the last occurrence of c in P with T[i]
– Else, shift P to align P[0] with T[i + 1]

Example: T = "a pattern matching algorithm", P = "rithm"

[Figure: seven successive alignments of P against T, with the character comparisons numbered 1 to 11.]

Page 5: Last-Occurrence Function

Boyer-Moore's algorithm preprocesses the pattern P and the alphabet Σ to build the last-occurrence function L mapping Σ to integers, where L(c) is defined as
– the largest index i such that P[i] = c, or
– −1 if no such index exists

Example:
– Σ = {a, b, c, d}
– P = abacab

c      a   b   c   d
L(c)   4   5   3   −1

The last-occurrence function can be represented by an array indexed by the numeric codes of the characters.

The last-occurrence function can be computed in time O(m + s), where m is the size of P and s is the size of Σ.
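The last-occurrence function can be sketched in a few lines of Python. The version below uses a dictionary rather than an array indexed by character codes; that choice and the name `last_occurrence` are assumptions made for illustration.

```python
def last_occurrence(pattern, alphabet):
    """Map each character of the alphabet to its last index in pattern, or -1."""
    last = {c: -1 for c in alphabet}    # O(s) initialisation
    for i, c in enumerate(pattern):     # O(m) scan; later indices overwrite earlier ones
        last[c] = i
    return last


print(last_occurrence("abacab", "abcd"))
```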

Page 6: The Boyer-Moore Algorithm

Algorithm BoyerMooreMatch(T, P, Σ)
    L ← lastOccurrenceFunction(P, Σ)
    i ← m − 1
    j ← m − 1
    repeat
        if T[i] = P[j]
            if j = 0
                return i { match at i }
            else
                i ← i − 1
                j ← j − 1
        else
            { character-jump }
            l ← L[T[i]]
            i ← i + m − min(j, 1 + l)
            j ← m − 1
    until i > n − 1
    return −1 { no match }

Case 1: j ≤ 1 + l. [Figure: the last occurrence of the mismatched text character a in P is not to the left of position j; i advances by m − j.]

Case 2: 1 + l ≤ j. [Figure: the last occurrence of a in P is to the left of position j and is aligned with T[i]; i advances by m − (1 + l).]
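A minimal Python sketch of the algorithm above follows. It builds the last-occurrence table as a dictionary over the pattern's own characters and treats every other character as −1; that shortcut, the empty-pattern guard and the name `boyer_moore_match` are assumptions for illustration.

```python
def boyer_moore_match(text, pattern):
    """Return the first index where pattern occurs in text, or -1."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0                                  # assumed convention for an empty pattern
    last = {c: k for k, c in enumerate(pattern)}  # last-occurrence function
    i = j = m - 1
    while i <= n - 1:
        if text[i] == pattern[j]:
            if j == 0:
                return i                          # match at i
            i -= 1                                # looking-glass: move backwards
            j -= 1
        else:
            l = last.get(text[i], -1)             # character-jump
            i += m - min(j, 1 + l)
            j = m - 1
    return -1                                     # no match


print(boyer_moore_match("abacaabadcabacabaabb", "abacab"))
```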

Page 7: Example

T = a b a c a a b a d c a b a c a b a a b b
P = a b a c a b

[Figure: six successive alignments of P against T, with the character comparisons numbered 1 to 13.]

Page 8: Analysis

Boyer-Moore's algorithm runs in time O(nm + s).

Example of worst case:
– T = aaa … a
– P = baaa

The worst case may occur in images and DNA sequences but is unlikely in English text.

Boyer-Moore's algorithm is significantly faster than the brute-force algorithm on English text.

[Figure: T = a a a a a a a a a, P = b a a a a a; four successive alignments of P against T, with the character comparisons numbered 1 to 24.]
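To make the worst case concrete, the sketch below instruments the Boyer-Moore loop with a comparison counter. The function name `boyer_moore_count` and its return shape are hypothetical; the dictionary-based last-occurrence table is an illustrative shortcut.

```python
def boyer_moore_count(text, pattern):
    """Return (match index or -1, number of character comparisons made)."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0, 0                               # assumed convention for an empty pattern
    last = {c: k for k, c in enumerate(pattern)}
    comparisons = 0
    i = j = m - 1
    while i <= n - 1:
        comparisons += 1                          # one comparison of T[i] with P[j]
        if text[i] == pattern[j]:
            if j == 0:
                return i, comparisons
            i -= 1
            j -= 1
        else:
            i += m - min(j, 1 + last.get(text[i], -1))
            j = m - 1
    return -1, comparisons


# Worst-case figure: every alignment scans all of P before failing on 'b'.
print(boyer_moore_count("aaaaaaaaa", "baaaaa"))
# English-like text: most alignments fail after a single comparison.
print(boyer_moore_count("a pattern matching algorithm", "rithm"))
```

In the first call each of the n − m + 1 = 4 alignments costs m = 6 comparisons, which is the O(nm) behaviour described above.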

Page 9: The KMP Algorithm - Motivation

Knuth-Morris-Pratt's algorithm compares the pattern to the text left-to-right, but shifts the pattern more intelligently than the brute-force algorithm.

When a mismatch occurs, what is the most we can shift the pattern so as to avoid redundant comparisons?

Answer: the largest prefix of P[0..j] that is a suffix of P[1..j]

[Figure: the pattern a b a a b a is aligned with the text . . a b a a b x . . . . . and mismatches the text character x at pattern position j. After the shift, the matched prefix is marked "No need to repeat these comparisons" and the position of x is marked "Resume comparing here".]

Page 10: KMP Failure Function

Knuth-Morris-Pratt's algorithm preprocesses the pattern to find matches of prefixes of the pattern with the pattern itself.

The failure function F(j) is defined as the size of the largest prefix of P[0..j] that is also a suffix of P[1..j].

Knuth-Morris-Pratt's algorithm modifies the brute-force algorithm so that if a mismatch occurs at P[j] ≠ T[i] we set j ← F(j − 1).

j      0   1   2   3   4   5
P[j]   a   b   a   a   b   a
F(j)   0   0   1   1   2   3

[Figure: the pattern a b a a b a mismatches the text character x at pattern position j and is shifted so that its prefix of length F(j − 1) stays aligned with the text.]

Page 11: The KMP Algorithm

The failure function can be represented by an array and can be computed in O(m) time.

At each iteration of the while-loop, either
– i increases by one, or
– the shift amount i − j increases by at least one (observe that F(j − 1) < j)

Hence, there are no more than 2n iterations of the while-loop.

Thus, KMP's algorithm runs in optimal time O(m + n).

Algorithm KMPMatch(T, P)
    F ← failureFunction(P)
    i ← 0
    j ← 0
    while i < n
        if T[i] = P[j]
            if j = m − 1
                return i − j { match }
            else
                i ← i + 1
                j ← j + 1
        else
            if j > 0
                j ← F[j − 1]
            else
                i ← i + 1
    return −1 { no match }
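The KMPMatch pseudocode can be sketched in Python as follows. The block includes its own failure-function helper so that it runs on its own; the names `failure_function` and `kmp_match` and the empty-pattern guard are illustrative assumptions.

```python
def failure_function(pattern):
    """F[j] = size of the largest prefix of P[0..j] that is also a suffix of P[1..j]."""
    m = len(pattern)
    f = [0] * m
    i, j = 1, 0
    while i < m:
        if pattern[i] == pattern[j]:
            f[i] = j + 1                # we have matched j + 1 characters
            i += 1
            j += 1
        elif j > 0:
            j = f[j - 1]                # fall back to a shorter prefix
        else:
            f[i] = 0                    # no match
            i += 1
    return f


def kmp_match(text, pattern):
    """Return the first index where pattern occurs in text, or -1."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0                        # assumed convention for an empty pattern
    f = failure_function(pattern)
    i = j = 0
    while i < n:
        if text[i] == pattern[j]:
            if j == m - 1:
                return i - j            # match
            i += 1
            j += 1
        elif j > 0:
            j = f[j - 1]                # shift the pattern, keep i in place
        else:
            i += 1
    return -1                           # no match


print(kmp_match("abacaabaccabacabaabb", "abacab"))
```

Note that i never decreases, so the text is scanned strictly left to right.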

Page 12: Computing the Failure Function

The failure function can be represented by an array and can be computed in O(m) time.

The construction is similar to the KMP algorithm itself.

At each iteration of the while-loop, either
– i increases by one, or
– the shift amount i − j increases by at least one (observe that F(j − 1) < j)

Hence, there are no more than 2m iterations of the while-loop.

Algorithm failureFunction(P)
    F[0] ← 0
    i ← 1
    j ← 0
    while i < m
        if P[i] = P[j]
            { we have matched j + 1 chars }
            F[i] ← j + 1
            i ← i + 1
            j ← j + 1
        else if j > 0 then
            { use failure function to shift P }
            j ← F[j − 1]
        else
            F[i] ← 0 { no match }
            i ← i + 1
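The bound of 2m iterations can be checked empirically. The sketch below is the same construction with an added iteration counter; the counter and the name `failure_function_counted` are additions for illustration only.

```python
def failure_function_counted(pattern):
    """Return (failure array F, number of while-loop iterations)."""
    m = len(pattern)
    f = [0] * m
    i, j = 1, 0
    iterations = 0
    while i < m:
        iterations += 1
        if pattern[i] == pattern[j]:
            f[i] = j + 1                # we have matched j + 1 characters
            i += 1
            j += 1
        elif j > 0:
            j = f[j - 1]                # use the failure function to shift P
        else:
            f[i] = 0                    # no match
            i += 1
    return f, iterations


for p in ("abaaba", "abacab", "aaaaab"):
    f, steps = failure_function_counted(p)
    print(p, f, steps, "<=", 2 * len(p))
```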

Page 13: Example

T = a b a c a a b a c c a b a c a b a a b b
P = a b a c a b

j      0   1   2   3   4   5
P[j]   a   b   a   c   a   b
F(j)   0   0   1   0   1   2

[Figure: five successive alignments of P against T, with the character comparisons numbered 1 to 19.]

Page 14: Preprocessing Strings

Preprocessing the pattern speeds up pattern matching queries.
– After preprocessing the pattern, KMP's algorithm performs pattern matching in time proportional to the text size

If the text is large, immutable and searched for often (e.g., works by Shakespeare), we may want to preprocess the text instead of the pattern.

A trie is a compact data structure for representing a set of strings, such as all the words in a text.
– A trie supports pattern matching queries in time proportional to the pattern size

Page 15: Standard Trie (1)

The standard trie for a set of strings S is an ordered tree such that:
– Each node but the root is labeled with a character
– The children of a node are alphabetically ordered
– The paths from the root to the external nodes yield the strings of S

Example: standard trie for the set of strings
S = { bear, bell, bid, bull, buy, sell, stock, stop }

[Figure: standard trie for S, one character per node.]
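A standard trie can be sketched with nested dictionaries, one dictionary per node keyed by child character. The end-of-word marker `END` and the function names are assumptions made for this sketch; the marker is only needed when one string may be a prefix of another.

```python
END = "$"  # assumed end-of-word marker; must not occur in the stored words


def insert(root, word):
    """Add word to the trie rooted at root."""
    node = root
    for ch in word:
        node = node.setdefault(ch, {})
    node[END] = {}


def contains(root, word):
    """True if word was inserted (not merely a prefix of an inserted word)."""
    node = root
    for ch in word:
        if ch not in node:
            return False
        node = node[ch]
    return END in node


def words(root, prefix=""):
    """Yield the stored words by visiting children in alphabetical order."""
    for label in sorted(root):
        if label == END:
            yield prefix
        else:
            yield from words(root[label], prefix + label)


S = ["bear", "bell", "bid", "bull", "buy", "sell", "stock", "stop"]
trie = {}
for w in S:
    insert(trie, w)
print(list(words(trie)))
print(contains(trie, "bell"), contains(trie, "be"))
```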

Page 16: Standard Trie (2)

A standard trie uses O(n) space and supports searches, insertions and deletions in time O(dm), where:
– n = total size of the strings in S
– m = size of the string parameter of the operation
– d = size of the alphabet

[Figure: the standard trie for S = { bear, bell, bid, bull, buy, sell, stock, stop }.]
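The O(n) space bound can be illustrated by counting nodes: each character of each string creates at most one node. The sketch below assumes a dictionary-per-node trie without end markers, which is adequate here because no string in S is a prefix of another; the helper names are illustrative.

```python
def build_trie(strings):
    """Build a nested-dictionary trie; assumes no string is a prefix of another."""
    root = {}
    for s in strings:
        node = root
        for ch in s:
            node = node.setdefault(ch, {})
    return root


def count_nodes(node):
    """Number of nodes in the subtree rooted at node, including node itself."""
    return 1 + sum(count_nodes(child) for child in node.values())


S = ["bear", "bell", "bid", "bull", "buy", "sell", "stock", "stop"]
n = sum(len(s) for s in S)              # total size of the strings in S
nodes = count_nodes(build_trie(S))
print(nodes, "nodes for n =", n)
```

Shared prefixes such as "be" and "sto" are stored once, so the node count stays below n + 1.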

Page 17: Word Matching with a Trie

We insert the words of the text into a trie. Each leaf stores the occurrences of the associated word in the text.

Text (characters indexed 0 to 88):

see a bear? sell stock! see a bull? buy stock! bid stock! bid stock! hear the bell? stop!

Word    Occurrences (starting indices)
bear    6
bell    78
bid     47, 58
bull    30
buy     36
hear    69
see     0, 24
sell    12
stock   17, 40, 51, 62
stop    84

[Figure: trie of the words above; each leaf stores the occurrence list of its word.]
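The word index can be sketched as a trie whose word-ending nodes carry a list of starting positions. The regular expression used to split words, the `"#"` key for the occurrence list and the function names are assumptions for illustration. Unlike the figure, this sketch also indexes the short words "a" and "the".

```python
import re

TEXT = ("see a bear? sell stock! see a bull? buy stock! "
        "bid stock! bid stock! hear the bell? stop!")


def build_word_index(text):
    """Insert every lowercase word of text into a trie with its start index."""
    root = {}
    for match in re.finditer(r"[a-z]+", text):
        node = root
        for ch in match.group():
            node = node.setdefault(ch, {})
        node.setdefault("#", []).append(match.start())  # "#" holds occurrences
    return root


def occurrences(root, word):
    """Starting indices of word in the indexed text; time proportional to len(word)."""
    node = root
    for ch in word:
        node = node.get(ch)
        if node is None:
            return []
    return node.get("#", [])


index = build_word_index(TEXT)
print(occurrences(index, "stock"))
print(occurrences(index, "see"))
```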

Page 18: Compressed Trie

A compressed trie has internal nodes of degree at least two.

It is obtained from the standard trie by compressing chains of "redundant" nodes.

[Figure: the compressed trie for S = { bear, bell, bid, bull, buy, sell, stock, stop }, with node labels b, e, ar, ll, id, u, ll, y, s, ell, to, ck, p, shown above the standard trie it is obtained from.]
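Compressing chains of redundant nodes can be sketched as a recursive pass that merges every single-child node into its parent's label. The sketch assumes a nested-dictionary trie over a set in which no string is a prefix of another; `build_trie` and `compress` are illustrative names.

```python
def build_trie(strings):
    """Standard trie as nested dictionaries (no end markers)."""
    root = {}
    for s in strings:
        node = root
        for ch in s:
            node = node.setdefault(ch, {})
    return root


def compress(node):
    """Return a trie in which every internal node has at least two children."""
    out = {}
    for label, child in node.items():
        # Absorb a chain of single-child nodes into one multi-character label.
        while len(child) == 1:
            (next_label, grandchild), = child.items()
            label += next_label
            child = grandchild
        out[label] = compress(child)
    return out


S = ["bear", "bell", "bid", "bull", "buy", "sell", "stock", "stop"]
compressed = compress(build_trie(S))
print(compressed)
```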

Page 19: Compact Representation

Compact representation of a compressed trie for an array of strings:
– Stores at the nodes ranges of indices instead of substrings
– Uses O(s) space, where s is the number of strings in the array
– Serves as an auxiliary index structure

S[0] = see
S[1] = bear
S[2] = sell
S[3] = stock
S[4] = bull
S[5] = buy
S[6] = bid
S[7] = hear
S[8] = bell
S[9] = stop

[Figure: compressed trie for S in which each node stores a triple (i, j, k) standing for the substring S[i][j..k]. The triples in the figure are (1, 0, 0), (0, 0, 0), (7, 0, 3), (1, 1, 1), (4, 1, 1), (6, 1, 2), (0, 1, 1), (3, 1, 2), (1, 2, 3), (8, 2, 3), (4, 2, 3), (5, 2, 2), (0, 2, 2), (2, 2, 3), (3, 3, 4), (9, 3, 3). For example, (1, 2, 3) stands for "ar" and (3, 3, 4) for "ck".]

Page 20: Suffix Trie (1)

The suffix trie of a string X is the compressed trie of all the suffixes of X.

Example: X = minimize

m i n i m i z e
0 1 2 3 4 5 6 7

[Figure: suffix trie of X. The children of the root are labeled e, i, mi, nimize, ze; the children of i are labeled mize, nimize, ze; the children of mi are labeled nimize, ze.]

Page 21: Suffix Trie (2)

Compact representation of the suffix trie for a string X of size n from an alphabet of size d
– Uses O(n) space
– Supports arbitrary pattern matching queries in X in O(dm) time, where m is the size of the pattern

m i n i m i z e
0 1 2 3 4 5 6 7

[Figure: the suffix trie of X = minimize with each label replaced by its index range in X. The children of the root are (7, 7), (1, 1), (0, 1), (2, 7), (6, 7); the children of (1, 1) are (4, 7), (2, 7), (6, 7); the children of (0, 1) are (2, 7), (6, 7).]
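Pattern matching with suffixes can be illustrated with a deliberately simplified structure. The sketch below builds an uncompressed suffix trie with one character per node, which takes O(n²) space in the worst case; it is not the compact O(n) representation described above, and the function names are illustrative.

```python
def build_suffix_trie(x):
    """Insert every suffix of x into a nested-dictionary trie (uncompressed)."""
    root = {}
    for start in range(len(x)):
        node = root
        for ch in x[start:]:
            node = node.setdefault(ch, {})
    return root


def occurs(root, pattern):
    """True if pattern is a substring of the indexed string."""
    node = root
    for ch in pattern:                  # one step per pattern character
        if ch not in node:
            return False
        node = node[ch]
    return True


trie = build_suffix_trie("minimize")
print(occurs(trie, "nimi"), occurs(trie, "mize"), occurs(trie, "zim"))
```

Every substring of X is a prefix of some suffix of X, which is why following the pattern from the root answers the query.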

Page 22: Encoding Trie (1)

A code is a mapping of each character of an alphabet to a binary code-word.

A prefix code is a binary code such that no code-word is the prefix of another code-word.

An encoding trie represents a prefix code:
– Each leaf stores a character
– The code-word of a character is given by the path from the root to the leaf storing the character (0 for a left child and 1 for a right child)

Character   a    b     c     d    e
Code-word   00   010   011   10   11

[Figure: the encoding trie for this code, with leaves a, b, c, d, e.]
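Because no code-word is a prefix of another, a bit string can be decoded greedily from left to right. The sketch below uses the code table above; the dictionary-based decoder and the names `encode` and `decode` are illustrative and stand in for walking the encoding trie.

```python
CODE = {"a": "00", "b": "010", "c": "011", "d": "10", "e": "11"}


def encode(text, code):
    """Concatenate the code-words of the characters of text."""
    return "".join(code[ch] for ch in text)


def decode(bits, code):
    """Decode a bit string produced by encode; relies on the prefix property."""
    inverse = {word: ch for ch, word in code.items()}
    out, word = [], ""
    for bit in bits:
        word += bit
        if word in inverse:             # reached a leaf of the encoding trie
            out.append(inverse[word])
            word = ""
    if word:
        raise ValueError("trailing bits do not form a code-word")
    return "".join(out)


bits = encode("bad", CODE)
print(bits, decode(bits, CODE))
```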

Page 23: Encoding Trie (2)

Given a text string X, we want to find a prefix code for the characters of X that yields a small encoding for X.
– Frequent characters should have short code-words
– Rare characters should have long code-words

Example:
– X = abracadabra
– T1 encodes X into 29 bits
– T2 encodes X into 24 bits

[Figure: two encoding tries, T1 and T2, for the characters a, b, c, d, r.]

Page 24: Huffman's Algorithm

Given a string X, Huffman's algorithm constructs a prefix code that minimizes the size of the encoding of X.

It runs in time O(n + d log d), where n is the size of X and d is the number of distinct characters of X.

A heap-based priority queue is used as an auxiliary structure.

Algorithm HuffmanEncoding(X)
    Input: string X of size n
    Output: optimal encoding trie for X
    C ← distinctCharacters(X)
    computeFrequencies(C, X)
    Q ← new empty heap
    for all c ∈ C
        T ← new single-node tree storing c
        Q.insert(getFrequency(c), T)
    while Q.size() > 1
        f1 ← Q.minKey()
        T1 ← Q.removeMin()
        f2 ← Q.minKey()
        T2 ← Q.removeMin()
        T ← join(T1, T2)
        Q.insert(f1 + f2, T)
    return Q.removeMin()
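The pseudocode above can be sketched in Python with the standard `heapq` module as the priority queue. Representing a tree as either a character (leaf) or a pair of subtrees, the tie-breaking counter, and the names `huffman_trie` and `code_words` are assumptions made for this sketch; it also assumes X is non-empty.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_trie(x):
    """Return an encoding trie for non-empty x: a leaf is a character, an internal node a pair."""
    tie = count()                       # unique tie-breaker so trees are never compared
    heap = [(f, next(tie), c) for c, f in Counter(x).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (t1, t2)))  # join(T1, T2)
    return heap[0][2]


def code_words(tree, prefix=""):
    """Read code-words off the trie: 0 for a left child, 1 for a right child."""
    if isinstance(tree, str):
        return {tree: prefix or "0"}    # a single-character text still gets one bit
    left, right = tree
    codes = code_words(left, prefix + "0")
    codes.update(code_words(right, prefix + "1"))
    return codes


X = "abracadabra"
codes = code_words(huffman_trie(X))
print(codes)
print(sum(len(codes[ch]) for ch in X), "bits")
```

The exact code-words depend on how ties between equal frequencies are broken, but the total encoded length does not.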

Page 25: Example

X = abracadabra

Frequencies:

a   b   c   d   r
5   2   1   1   2

[Figure: construction of the Huffman trie. The single-node trees start with weights a = 5, b = 2, c = 1, d = 1, r = 2; c and d are joined into a tree of weight 2; b and r are joined into a tree of weight 4; these two trees are joined into a tree of weight 6; finally a is joined with it to form the root of weight 11.]
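The size of the resulting encoding can be worked out from the final tree. In the sketch below the leaf depths are read off the construction in the example (a directly under the root, the other four characters three levels down); the variable names are illustrative.

```python
freq = {"a": 5, "b": 2, "c": 1, "d": 1, "r": 2}
depth = {"a": 1, "b": 3, "c": 3, "d": 3, "r": 3}   # code-word lengths in the final tree

# Encoded size = sum over characters of frequency * code-word length.
bits = sum(freq[ch] * depth[ch] for ch in freq)

# The same total is the sum of the weights of the internal nodes created by the joins.
internal_weights = [2, 4, 6, 11]

print(bits, sum(internal_weights))
```

Both computations give the same number of bits, which is smaller than the 24 bits quoted for T2 earlier in the chapter.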