CSE 326: Data Structures: Advanced Topics

Lecture 26: Wednesday, March 12th, 2003
Transcript
Page 1: CSE 326: Data Structures:  Advanced Topics


CSE 326: Data Structures: Advanced Topics

Lecture 26:Wednesday, March 12th, 2003

Page 2: CSE 326: Data Structures:  Advanced Topics


Today

• Dynamic programming for ordering matrix multiplication– Very similar to Query Optimization in

databases

• String processing

• Final review

Page 3: CSE 326: Data Structures:  Advanced Topics


Ordering Matrix Multiplication

• Need to compute A B C D, where A is 3×2, B is 2×4, C is 4×2, and D is 2×3 (the dimensions used in the cost calculations below)

Page 4: CSE 326: Data Structures:  Advanced Topics


Ordering Matrix Multiplication

• One solution: (A B) (C D)

Cost: (3·2·4) + (4·2·3) + (3·4·3) = 24 + 24 + 36 = 84

Page 5: CSE 326: Data Structures:  Advanced Topics


Ordering Matrix Multiplication

• Another solution: (A (B C)) D

Cost: (2·4·2) + (3·2·2) + (3·2·3) = 16 + 12 + 18 = 46
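To make the cost comparison concrete, here is a minimal Python sketch (the dictionary dims and the helper name mult_cost are mine, with dimensions chosen to be consistent with the costs above); multiplying an a×b matrix by a b×c matrix is charged a·b·c scalar multiplications:

# cost of multiplying an (a x b) matrix by a (b x c) matrix = a*b*c scalar multiplications
dims = {"A": (3, 2), "B": (2, 4), "C": (4, 2), "D": (2, 3)}

def mult_cost(x, y):
    """Return (cost, dimensions) of multiplying matrices with dimensions x and y."""
    (xr, xc), (yr, yc) = x, y
    assert xc == yr                              # inner dimensions must agree
    return xr * xc * yc, (xr, yc)

# (A B) (C D)
c1, ab = mult_cost(dims["A"], dims["B"])         # 3*2*4 = 24
c2, cd = mult_cost(dims["C"], dims["D"])         # 4*2*3 = 24
c3, _  = mult_cost(ab, cd)                       # 3*4*3 = 36
print(c1 + c2 + c3)                              # 84

# (A (B C)) D
c4, bc  = mult_cost(dims["B"], dims["C"])        # 2*4*2 = 16
c5, abc = mult_cost(dims["A"], bc)               # 3*2*2 = 12
c6, _   = mult_cost(abc, dims["D"])              # 3*2*3 = 18
print(c4 + c5 + c6)                              # 46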

Page 6: CSE 326: Data Structures:  Advanced Topics


Ordering Matrix Multiplication

Problem:
• Given A1 A2 . . . An, compute optimal ordering

Solution:
• Dynamic programming
• Compute cost[i][j]
  – the minimum cost to compute Ai Ai+1 . . . Aj
• Proceed iteratively, increasing the gap = j – i

Page 7: CSE 326: Data Structures:  Advanced Topics


Ordering Matrix Multiplication

/* initialize */
for i = 1 to n do
    cost[i][i] = 0                  /* why ? */

/* dynamic programming */
for gap = 1 to n-1 do {
    for i = 1 to n - gap do {
        j = i + gap;
        c = ∞;
        for k = i to j-1 do
            /* how much would it cost to do (Ai . . . Ak) (Ak+1 . . . Aj) ? */
            c = min(c, cost[i][k] + cost[k+1][j]
                       + A[i].rows * A[k].columns * A[j].columns);
        cost[i][j] = c;
    }
}

/* note: A[k].columns = A[k+1].rows */
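For reference, the same recurrence in runnable form: a Python sketch (0-based indices; dims[i] = (rows, columns) of Ai is an input format I am assuming, and this is an illustration rather than the course's reference code):

import math

def matrix_chain_cost(dims):
    """dims[i] = (rows, columns) of A_i, with dims[i][1] == dims[i+1][0].
    Returns the minimum number of scalar multiplications for the whole chain."""
    n = len(dims)
    cost = [[0] * n for _ in range(n)]          # cost[i][i] = 0: a single matrix is free
    for gap in range(1, n):                     # increasing gap = j - i
        for i in range(n - gap):
            j = i + gap
            c = math.inf
            for k in range(i, j):               # split as (A_i .. A_k)(A_{k+1} .. A_j)
                c = min(c, cost[i][k] + cost[k + 1][j]
                           + dims[i][0] * dims[k][1] * dims[j][1])
            cost[i][j] = c
    return cost[0][n - 1]

# the four-matrix example from the earlier slides: A 3x2, B 2x4, C 4x2, D 2x3
print(matrix_chain_cost([(3, 2), (2, 4), (4, 2), (2, 3)]))   # 46

Tracking the k that achieves the minimum for each (i, j) would also recover the optimal parenthesization itself.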

Page 8: CSE 326: Data Structures:  Advanced Topics


Ordering Matrix Multiplication

• Running time: O(n³)

Important variation:

• Database systems do join reordering

• A very similar algorithm

• Come to CSE 544...

Page 9: CSE 326: Data Structures:  Advanced Topics


String Matching

• The problem:
  – given a text T[1], T[2], ..., T[n] and a pattern P[1], P[2], ..., P[m]
  – find all positions s such that P “occurs” in T at position s:
    (T[s], T[s+1], ..., T[s+m-1]) = (P[1], ..., P[m])

• Where do we need this?
  – text editors (e.g. emacs)
  – grep
  – XML processing

Page 10: CSE 326: Data Structures:  Advanced Topics


String Matching

• Example:

T= b a c b a b a b a b a c a b a b a c b a

P= a b a b a c a

Page 11: CSE 326: Data Structures:  Advanced Topics


Naive String Matching

for i = 1 to n-m+1 do
    if (T[i], T[i+1], ..., T[i+m-1]) = (P[1], P[2], ..., P[m])
        then print i

running time: O(mn)
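In Python, the naive matcher is a one-liner over slices (0-based positions; the name naive_match is illustrative):

def naive_match(T, P):
    """Return all 0-based positions s where P occurs in T; O(m*n) time."""
    n, m = len(T), len(P)
    return [s for s in range(n - m + 1) if T[s:s + m] == P]

print(naive_match("bacbabababacababacba", "ababaca"))   # [6], i.e. position 7 in 1-based terms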

Page 12: CSE 326: Data Structures:  Advanced Topics


Knuth-Morris-Pratt String Matching

• main idea: reuse the work, after a failure

T= b a c b a b a b a b a c a b a b a c b a

P= a b a b a c a

fail !

P= a b a b a c a

reuse ! precompute on P

Page 13: CSE 326: Data Structures:  Advanced Topics


Knuth-Morris-Pratt String Matching

• The prefix function π:

π[q] = the largest k < q s.t. (P[1], P[2], ..., P[k-1]) = (P[q-k+1], P[q-k+2], ..., P[q-1])

Page 14: CSE 326: Data Structures:  Advanced Topics


Example: P = a b a b a c a (positions 1 .. 7)

π[1] = π[2] = π[3] = 1
π[4] = 2
π[5] = 3
π[6] = 4
π[7] = 1
π[8] = 2
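These values can be checked directly against the definition with a brute-force sketch (quadratic, not the linear-time computation shown on the next slides; the helper name is mine):

def prefix_by_definition(P, q):
    """Largest k < q with P[1..k-1] = P[q-k+1..q-1] (1-based, per the definition above)."""
    for k in range(q - 1, 0, -1):                # try the largest k first
        if P[:k - 1] == P[q - k:q - 1]:          # 0-based slices of the 1-based ranges
            return k
    return 1

P = "ababaca"
print([prefix_by_definition(P, q) for q in range(2, len(P) + 2)])
# [1, 1, 2, 3, 4, 1, 2]   i.e. π[2] .. π[8], as listed above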

Page 15: CSE 326: Data Structures:  Advanced Topics


Knuth-Morris-Pratt String Matching

/* compute π */ . . . .

/* do the matching */
q = 1;                                      /* q = where we are in P */
for i = 1 to n do {
    while (q > 1 and P[q] != T[i]) q = π[q];
    if (P[q] = T[i]) {
        if (q = m) { print(i - m + 1); q = π[q+1]; }
        else q = q + 1;
    }
}

Time = O(n) (why ?)

Page 16: CSE 326: Data Structures:  Advanced Topics


Knuth-Morris-Pratt String Matching

/* compute π */
π[1] = 0;
for q = 2 to m+1 do {
    k = π[q - 1];
    while (k > 1 and P[k] != P[q - 1]) k = π[k];
    if (k ≥ 1 and P[k] = P[q - 1]) then k = k + 1;
    else k = 1;
    π[q] = k;
}

/* do the matching */ . . .

Time = O(m) (why ?)

Total running time of the KMP algorithm: O(m+n)
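Putting the two pieces together, here is a Python sketch of the whole algorithm under the conventions above (strings padded so that T[1..n] and P[1..m] are 1-based as on the slides; assumes m ≥ 1; the function name kmp_search is mine):

def kmp_search(T, P):
    """Return all 1-based positions where P occurs in T, in O(n + m) time."""
    n, m = len(T), len(P)
    t, p = " " + T, " " + P                  # pad so t[1..n] and p[1..m] are 1-based

    # compute π[1..m+1]: π[q] = largest k < q with p[1..k-1] == p[q-k+1..q-1]
    pi = [0] * (m + 2)
    pi[1], pi[2] = 0, 1
    for q in range(3, m + 2):
        k = pi[q - 1]
        while k > 1 and p[k] != p[q - 1]:
            k = pi[k]
        if p[k] == p[q - 1]:
            k += 1
        pi[q] = k

    # do the matching: q = position in p we are trying to match next
    matches, q = [], 1
    for i in range(1, n + 1):
        while q > 1 and p[q] != t[i]:
            q = pi[q]
        if p[q] == t[i]:
            if q == m:                       # full occurrence ending at i
                matches.append(i - m + 1)
                q = pi[q + 1]                # fall back so overlapping matches are found
            else:
                q += 1
    return matches

print(kmp_search("bacbabababacababacba", "ababaca"))   # [7]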

Page 17: CSE 326: Data Structures:  Advanced Topics


Final Review

• Basic math
  – logs, exponents, summations
  – proof by induction

• Asymptotic analysis
  – big-oh, theta, omega
  – how to estimate running times
    • need sums
    • need recurrences

Page 18: CSE 326: Data Structures:  Advanced Topics


Final Review

• Lists, stacks, queues
  – ADT definition
  – array vs. pointer implementation
  – variations: headers, doubly linked, etc.

• Trees
  – definitions/terminology (root, parent, child, etc.)
  – relationship between depth and size of a tree
    • depth is between O(log N) and O(N)

Page 19: CSE 326: Data Structures:  Advanced Topics


Final Review

• Binary Search Trees
  – basic implementations of find, insert, delete
  – worst case performance: O(N)
  – average case performance: O(log N) (inserts only)

• AVL trees
  – balance factor +1, 0, -1
  – know single and double rotations to keep the tree balanced
  – all operations are O(log N) worst case time

• Splay trees
  – good amortized performance
  – a single operation may take O(N)
  – know the zig-zig, zig-zag, etc.

• B-trees: know the basic idea behind insert/delete

Page 20: CSE 326: Data Structures:  Advanced Topics


Final Review

• Priority Queues
  – binary heaps: insert/deleteMin, percolate up/down
  – array implementation
  – buildHeap takes only O(N)!! Used in HeapSort

• Binomial queues
  – merge is fast: O(log N)
  – insert, deleteMin are based on merge

Page 21: CSE 326: Data Structures:  Advanced Topics


Final Review

• Hashing
  – hash functions based on the mod function
  – collision resolution strategies
    • chaining, linear and quadratic probing, double hashing
  – load factor of a hash table

Page 22: CSE 326: Data Structures:  Advanced Topics


Final Review

• Sorting
  – elementary sorting algorithms: bubble sort, selection sort, insertion sort
  – heapsort: O(N log N)
  – mergesort: O(N log N)
  – quicksort: O(N log N) average
    • fastest in practice, but O(N²) worst case performance
    • pivot selection: median-of-three works well
  – know which of these are stable and in-place
  – lower bound on sorting
  – bucket sort, radix sort
  – external memory sort

Page 23: CSE 326: Data Structures:  Advanced Topics


Final Review

• Disjoint sets and Union-Find
  – up-trees and their array-based implementation
  – know how union-by-size and path compression work
  – know the running time (not the proof)

Page 24: CSE 326: Data Structures:  Advanced Topics


Final Review

• Graph algorithms
  – adjacency matrix vs. adjacency list representation
  – topological sort in O(n+m) time using a queue
  – Breadth-First-Search (BFS) for unweighted shortest path
  – Dijkstra’s shortest path algorithm
  – DFS
  – minimum spanning trees: Prim, Kruskal

Page 25: CSE 326: Data Structures:  Advanced Topics


Final Review

• Graph algorithms (cont’d)
  – Euler vs. Hamiltonian circuits
  – know what P, NP and NP-completeness mean

Page 26: CSE 326: Data Structures:  Advanced Topics


Final Review

• Algorithm design techniques
  – greedy: bin packing
  – divide and conquer
    • solving various types of recurrence relations for T(N)
  – dynamic programming (memoization)
    • DP-Fibonacci
    • ordering matrix multiplication
  – randomized data structures
    • treaps
    • primality testing

• String matching

• Backtracking and game trees

Page 27: CSE 326: Data Structures:  Advanced Topics


The Final

• Details:
  – covers chapters 1-10, 12.5, and some extra material
  – closed book, closed notes, except:
    • you may bring one sheet of notes
  – time: 1 hour and 50 minutes
  – Monday, 3/17/2003, 2:30 – 4:20, this room
  – bring pens/pencils/etc.
  – sleep well the night before

Page 28: CSE 326: Data Structures:  Advanced Topics


What About Friday ?

• I will cover some of the problems on the website

• I will take your questions