Chapter 2.8 Search Algorithms
Search Algorithms
• Array Search
  – An array contains a certain number of records
  – Each record is identified by a certain key
  – One searches for the record with a given key
• String Search
  – A text is represented by an array of characters
  – One searches for one or all occurrences of a certain string
Array Search
PROCEDURE Search
– Two parameters:
  • The Array to be searched
    – Contains n Items
    – Index in the range 0..n
    – The Item with index 0 is not used
  • The Key of the Item to be found
– Search returns a CARDINAL value:
  • the index of the Item where the key has been found
  • 0 if the key has not been found
Array Search
TYPE
  ArrayOfItems = ARRAY [0..n] OF Item;
    (* By convention, the element with index 0 is not used *)
  Item = RECORD
           Key : KeyType;  (* any ordinal type *)
           (* other record fields *)
         END;
Straight Search
PROCEDURE Search(A: ARRAY OF Item; Key: KeyType): CARDINAL;
  VAR i : CARDINAL;
BEGIN
  i := HIGH(A);
  (* scan downwards; index 0 is unused, so i = 0 means "not found" *)
  WHILE (A[i].Key # Key) AND (i # 0) DO
    DEC(i)
  END;
  RETURN i
END Search;
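The straight search can also be sketched in Python (not the language of the slides). This is a minimal illustration under two assumptions: the list holds bare keys instead of records, and slot 0 is an unused placeholder as in the slide convention. The name `straight_search` is hypothetical.

```python
def straight_search(a, key):
    """Return the index of key in a[1:], or 0 if it is absent.

    Assumption: a[0] is an unused placeholder, as on the slides.
    """
    i = len(a) - 1                      # counterpart of HIGH(A)
    while a[i] != key and i != 0:       # two tests per step: key and index
        i -= 1
    return i


items = [None, 17, 4, 23, 8]            # slot 0 is not used
found = straight_search(items, 23)
missing = straight_search(items, 5)
print(found, missing)
```

Each loop step performs two tests, one on the key and one on the index.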
Sentinel Search
PROCEDURE Search(VAR A: ARRAY OF Item; Key: KeyType): CARDINAL;
  VAR i : CARDINAL;
BEGIN
  i := HIGH(A);
  A[0].Key := Key;  (* sentinel: the loop is guaranteed to stop at index 0 *)
  WHILE A[i].Key # Key DO DEC(i) END;
  RETURN i
END Search;
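A Python sketch of the same idea (not the language of the slides) may help to see why the index test can be dropped. It assumes a list of bare keys whose slot 0 is free to be overwritten; `sentinel_search` is a hypothetical name.

```python
def sentinel_search(a, key):
    """Return the index of key in a[1:], or 0 if it is absent.

    Assumption: a[0] is a free slot that may be overwritten.
    """
    a[0] = key                  # plant the sentinel
    i = len(a) - 1
    while a[i] != key:          # only one test per step
        i -= 1
    return i


items = [None, 17, 4, 23, 8]
hit = sentinel_search(items, 8)
miss = sentinel_search(items, 99)
print(hit, miss)
```

Because the key is always present in slot 0, the loop needs no bounds check, and a result of 0 signals that only the sentinel matched. The price is that the array must be writable, which is why the slide passes `A` as a `VAR` parameter.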
Binary Search
(Flowchart)
# elements > 1 ?
  Yes: Key < Key of middle element ?
         Yes: BinarySearch left half
         No:  BinarySearch right half
  No:  Key = Key of element ?
         Yes: Found
         No:  Not Found
Binary Search (1)
PROCEDURE Search(VAR a: ARRAY OF Item; Key: KeyType): CARDINAL;
  VAR Min, Max, m : CARDINAL;
  PROCEDURE src(Min, Max: CARDINAL);
  …
  END src;
BEGIN
  Min := 1; Max := HIGH(a);
  src(Min, Max);  (* narrows the range to one element and leaves its index in m *)
  IF a[m].Key = Key THEN RETURN m ELSE RETURN 0 END
END Search;
Binary Search (2)
PROCEDURE Src(Min,Max : CARDINAL);BEGIN m := (Min+Max) DIV 2; IF Min # Max THEN IF a[m].Key >= Key THEN src(Min,m) ELSE src(m+1,Max) END; ENDEND Src;
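The recursive scheme can be sketched in Python (not the language of the slides). This is an illustration only: it assumes a list of bare keys sorted in ascending order in slots 1..n, with slot 0 unused. Unlike the slide code, the hypothetical inner function `src` returns the final index instead of leaving it in a shared variable `m`.

```python
def binary_search_recursive(a, key):
    """Return the index of key in the sorted slots a[1:], or 0 if absent."""

    def src(lo, hi):
        # Narrow a[lo..hi] down to a single candidate position.
        if lo == hi:
            return lo
        m = (lo + hi) // 2
        if a[m] >= key:
            return src(lo, m)        # key, if present, lies in a[lo..m]
        return src(m + 1, hi)        # key, if present, lies in a[m+1..hi]

    if len(a) < 2:                   # no elements besides the unused slot 0
        return 0
    m = src(1, len(a) - 1)
    return m if a[m] == key else 0


primes = [None, 2, 3, 5, 7, 11, 13]
print(binary_search_recursive(primes, 7), binary_search_recursive(primes, 4))
```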
Iterative Binary Search
PROCEDURE Search(VAR a: ARRAY OF Item; Key: KeyType): CARDINAL;
  VAR Min, Max, m : CARDINAL;
BEGIN
  Min := 1; Max := HIGH(a);
  WHILE Min < Max DO
    m := (Min+Max) DIV 2;
    IF a[m].Key >= Key
      THEN Max := m
      ELSE Min := m+1
    END (* IF *)
  END; (* WHILE *)
  (* Test a[Min], not a[m]: after "Min := m+1" the candidate is m+1, *)
  (* and m is undefined if the loop body never ran.                  *)
  IF (Min = Max) AND (a[Min].Key = Key) THEN RETURN Min ELSE RETURN 0 END
END Search;
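A Python sketch of the iterative variant (not the language of the slides) shows how the loop ends with one candidate. It assumes bare keys sorted in ascending order in slots 1..n with slot 0 unused; `binary_search` is a hypothetical name. The final test uses the position where `lo` and `hi` meet, which is the only candidate left when the loop ends.

```python
def binary_search(a, key):
    """Return the index of the first occurrence of key in a[1:], or 0."""
    lo, hi = 1, len(a) - 1
    while lo < hi:
        m = (lo + hi) // 2
        if a[m] >= key:
            hi = m                  # keep m: it may hold the key
        else:
            lo = m + 1              # a[m] is too small, discard it
    if lo == hi and a[lo] == key:   # lo > hi only when there are no elements
        return lo
    return 0


print(binary_search([None, 1, 2], 2))
```

With duplicate keys this variant returns the leftmost matching index, because equal keys always send the search into the left half.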
Array Search Performance
(Approximate number of comparisons in the worst case)
• Unordered Array
  – Straight search : 2n
  – Sentinel search : n
• Ordered Array
  – Binary search : log₂ n
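These orders of magnitude can be checked by counting comparisons in a small experiment. The Python sketch below (not the language of the slides) instruments the three searches for a key that is absent, which is the worst case. The function names, the choice n = 1000 and the use of bare integer keys are assumptions made for the illustration.

```python
import math


def count_straight(a, key):
    """Comparisons made by straight search: a key test, then an index test."""
    count, i = 0, len(a) - 1
    while True:
        count += 1                  # key test
        if a[i] == key:
            return count
        count += 1                  # index test
        if i == 0:
            return count
        i -= 1


def count_sentinel(a, key):
    """Comparisons made by sentinel search: key tests only."""
    a[0] = key
    count, i = 0, len(a) - 1
    while True:
        count += 1
        if a[i] == key:
            return count
        i -= 1


def count_binary(a, key):
    """Key comparisons made by the iterative binary search."""
    count, lo, hi = 0, 1, len(a) - 1
    while lo < hi:
        m = (lo + hi) // 2
        count += 1
        if a[m] >= key:
            hi = m
        else:
            lo = m + 1
    return count + 1                # final equality test


n = 1000
data = [None] + list(range(1, n + 1))   # sorted keys 1..n, slot 0 unused
absent = n + 1                          # worst case: key not present
straight = count_straight(list(data), absent)
sentinel = count_sentinel(list(data), absent)
binary = count_binary(data, absent)
print(straight, sentinel, binary)
```

The counts come out close to 2n, n and log₂ n respectively; the small excess in each case comes from slot 0 and from the final equality test.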
String Search
• The problem: Find a given string in a text.
• Data structures:
  Text : ARRAY[1..TextSize] OF CHAR;
  String : ARRAY[1..StringSize] OF CHAR;
• The Algorithms:
  – Brute Force
  – Knuth, Morris & Pratt (KMP - 1974)
  – Boyer & Moore (BM - 1977)
String Search by Brute Force
Example text:   this algorithm tries to find a string
Example string: string
(Flowchart)
REPEAT
  WHILE current char. in Text # String[1]
    Move to next character in Text
  End of Text reached ?
    No: WHILE char. in Text = char. in String
          Move to next character pair
UNTIL String matched OR Text exhausted
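Brute Force String Search
(Straightforward coding)
PROCEDURE Search (VAR Text: TextType; TLength: CARDINAL;
                  VAR String: StringType; SLength: CARDINAL): CARDINAL;
  (* Returns the position in Text where String first occurs, or 0 if it does not occur *)
  VAR i, j, jmax : CARDINAL;
BEGIN
  IF (SLength = 0) OR (SLength > TLength) THEN RETURN 0 END;
  jmax := TLength - SLength + 1;  (* last possible start position *)
  j := 1;
  LOOP
    (* skip positions whose character differs from String[1] *)
    WHILE (j <= jmax) AND (Text[j] # String[1]) DO j := j+1 END;
    IF j > jmax THEN RETURN 0 END;
    (* compare the remaining character pairs *)
    i := 2;
    WHILE (i <= SLength) AND (Text[j+i-1] = String[i]) DO i := i+1 END;
    IF i > SLength THEN RETURN j END;
    j := j + 1
  END
END Search;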
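The brute force method can be sketched in Python (not the language of the slides). This is an illustration only: `brute_force_search` is a hypothetical name, and the convention of returning a 1-based start position, or 0 when there is no match, is an assumption chosen to mirror the array searches.

```python
def brute_force_search(text, pattern):
    """Return the 1-based start of the first occurrence of pattern, or 0."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return 0
    for j in range(n - m + 1):          # j: 0-based candidate start
        i = 0
        while i < m and text[j + i] == pattern[i]:
            i += 1                      # move to the next character pair
        if i == m:                      # every pair matched
            return j + 1
    return 0


sample = "this algorithm tries to find a string"
print(brute_force_search(sample, "string"))
```

In the worst case every candidate position is compared over almost the full pattern length, which gives on the order of TLength × SLength character comparisons.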
String Search (1) by the KMP algorithm
GATCGATCAGCAATCATCATCACATC
ATCATCACAT
Mismatch in first position of string; move string 1 position in text
String Search (2) by the KMP algorithm
GATCGATCAGCAATCATCATCACATC
 ATCATCACAT
Mismatch in fourth position of string; move string 4 positions in text
String Search (3) by the KMP algorithm
GATCGATCAGCAATCATCATCACATC
     ATCATCACAT
Mismatch in fifth position of string; move string 4 positions in text
String Search (4) by the KMP algorithm
GATCGATCAGCAATCATCATCACATC
         ATCATCACAT
Mismatch in first position of string; move string 1 position in text
String Search (5) by the KMP algorithm
GATCGATCAGCAATCATCATCACATC
          ATCATCACAT
Mismatch in first position of string; move string 1 position in text
String Search (6) by the KMP algorithm
GATCGATCAGCAATCATCATCACATC
           ATCATCACAT
Mismatch in second position of string; move string 1 position in text
String Search (7) by the KMP algorithm
GATCGATCAGCAATCATCATCACATC
            ATCATCACAT
Mismatch in eighth position of string; move string 3 positions in text
The KMP algorithm
The Next function
Step[i] is the number of positions the string moves after a mismatch in position i of the string.
In each row the text has matched String[1..i-1] and differs from String[i] (shown in brackets); x is an unknown text character.
String: A T C A T C A C A T
 i   Text                               Step[i]
 1   x x x [A]                          1
 2   x x x A [T]                        1
 3   x x x A T [C]                      2
 4   x x x A T C [A]                    4
 5   x x x A T C A [T]                  4
 6   x x x A T C A T [C]                5
 7   x x x A T C A T C [A]              7
 8   x x x A T C A T C A [C]            3
 9   x x x A T C A T C A C [A]          9
10   x x x A T C A T C A C A [T]        9
The KMP algorithm
The Next function
i       =  1  2  3  4  5  6  7  8  9 10
String  :  A  T  C  A  T  C  A  C  A  T
Step    :  1  1  2  4  4  5  7  3  9  9
Next    :  0  1  1  0  1  1  0  5  0  1
Next[i] = i - Step[i]
Computation of the Next table
i := 1; k := 0; Next[1] := 0;
WHILE i < SLength DO
  WHILE (k > 0) AND (String[i] # String[k]) DO
    k := Next[k]
  END; (* WHILE *)
  k := k + 1; i := i + 1;
  IF String[i] = String[k]
    THEN Next[i] := Next[k]
    ELSE Next[i] := k
  END (* IF *)
END; (* WHILE *)
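The computation can be replayed in Python (not the language of the slides) to check the table for the example string. This sketch keeps the 1-based indexing of the slides by padding the string and leaving slot 0 of the table unused; `build_next` is a hypothetical name.

```python
def build_next(pattern):
    """Return the Next table for pattern; entry 0 is unused (1-based indexing)."""
    m = len(pattern)
    s = " " + pattern               # padding so that s[1] is the first character
    nxt = [0] * (m + 1)
    i, k = 1, 0
    while i < m:
        while k > 0 and s[i] != s[k]:
            k = nxt[k]              # fall back to a shorter matching prefix
        k += 1
        i += 1
        if s[i] == s[k]:
            nxt[i] = nxt[k]         # same character would fail again: skip ahead
        else:
            nxt[i] = k
    return nxt


nxt = build_next("ATCATCACAT")
step = [i - nxt[i] for i in range(1, len(nxt))]   # Step[i] = i - Next[i]
print(nxt[1:])   # [0, 1, 1, 0, 1, 1, 0, 5, 0, 1]
print(step)      # [1, 1, 2, 4, 4, 5, 7, 3, 9, 9]
```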
The KMP algorithm
Building the Next function
Trace of the Next table computation for the string A T C A T C A C A T.
Initially: i = 1, k = 0, Next[1] = 0.
 i on entry   k in inner WHILE   after k := k+1; i := i+1   assignment
 1            0                  k = 1, i = 2               Next[2] := 1
 2            1 > 0              k = 1, i = 3               Next[3] := 1
 3            1 > 0              k = 1, i = 4               Next[4] := 0
 4            1                  k = 2, i = 5               Next[5] := 1
 5            2                  k = 3, i = 6               Next[6] := 1
 6            3                  k = 4, i = 7               Next[7] := 0
 7            4                  k = 5, i = 8               Next[8] := 5
 8            5 > 1 > 0          k = 1, i = 9               Next[9] := 0
 9            1                  k = 2, i = 10              Next[10] := 1
Result: Next = 0 1 1 0 1 1 0 5 0 1
Actual KMP search
j := 1; i := 1;
WHILE (i <= SLength) AND (j <= TLength) DO
  WHILE (i > 0) AND (Text[j] # String[i]) DO
    i := Next[i]
  END; (* WHILE *)
  i := i + 1; j := j + 1
END; (* WHILE *)
IF i > SLength
  THEN RETURN j - SLength  (* position in Text where the match starts *)
  ELSE RETURN 0            (* not found; 0 is never a valid position *)
END; (* IF *)
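Putting the table and the search loop together, the whole method can be sketched in Python (not the language of the slides) and run on the example text. The name `kmp_search` and the convention of returning 0 when there is no match are assumptions for this illustration; 1-based indexing is kept by padding both strings.

```python
def kmp_search(text, pattern):
    """Return the 1-based start of the first occurrence of pattern, or 0."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    t, s = " " + text, " " + pattern    # padding for 1-based indexing

    # Build the Next table as on the slides (entry 0 unused).
    nxt = [0] * (m + 1)
    i, k = 1, 0
    while i < m:
        while k > 0 and s[i] != s[k]:
            k = nxt[k]
        k += 1
        i += 1
        nxt[i] = nxt[k] if s[i] == s[k] else k

    # Scan the text; j never moves backwards.
    i = j = 1
    while i <= m and j <= n:
        while i > 0 and t[j] != s[i]:
            i = nxt[i]                  # shift the string, keep j in place
        i += 1
        j += 1
    return j - m if i > m else 0


dna = "GATCGATCAGCAATCATCATCACATC"
print(kmp_search(dna, "ATCATCACAT"))    # 16
```

Since `j` only ever advances, the text is read once from left to right, which is the main practical advantage over the brute force method.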