CS 231: Algorithmic Problem Solving
Naomi Nishimura

Module 5
Date of this version: November 22, 2019

WARNING: Drafts of slides are made available prior to lecture for your convenience. After lecture, slides will be updated to reflect material taught. Check the date on this page to make sure you have the correct, updated version.

WARNING: Slides do not include all class material; if you have missed a lecture, make sure to find out from a classmate what material was presented verbally or on the board.
CS 231 Module 5 1 / 39
Matrix-chain multiplication
Input: A sequence (chain) of matrices M0, . . . , Mn−1, where Mi has dimension di × di+1
Output: A parenthesization that results in the smallest number of multiplications of pairs of values
A generic optimal solution to matrix-chain multiplication
In an optimal solution, the very last multiplication will consist of multiplying a pair of matrices A and B such that for some value of k:
• A is formed by multiplying M0 through Mk
• B is formed by multiplying Mk+1 through Mn−1
We have already executed the first two steps in the Optimal Substructure recipe. We still need to show that if any piece O′ of O is not an optimal solution for a smaller input I′ of I, then O is not an optimal solution for I.
To be able to repeat the process for each smaller instance, we need to be able to break up each smaller instance in all possible ways.
The smaller pieces may not start at M0 or end at Mn−1.
We define m[i, j] to be the smallest cost to multiply Mi through Mj, where m[i, j] = mink{m[i, k] + m[k + 1, j] + di dk+1 dj+1}.
The matrix formed by multiplying Mi through Mk will have dimensions di and dk+1, and the matrix formed by multiplying Mk+1 through Mj will have dimensions dk+1 and dj+1.
Step 2: Determine what information should be stored in each table entry.
We store each value m[i, j].
Step 3: Determine the shape of the table or tables needed to store the solutions to the smaller instances.
A triangle (or a grid).
Step 4: Determine the base cases.
When m[i, j] requires no multiplications, the cost is 0. Thus, m[i, i] = 0 for all values of i.
Step 5: Choose an order of evaluation.
When we calculate m[i, j], we want to be able to look up the values of m[i, k] and m[k + 1, j] in the table.
m[i, j] = mink{m[i, k] + m[k + 1, j] + di dk+1 dj+1}
To ensure that the “smaller” problems have been stored before they are used, we view the “size” of m[i, j] as j − i and fill in entries in nondecreasing order of size.
To write an algorithm that fills in table entries, typically one loop is used for each dimension in the table.
For a one-dimensional table or list, the single loop will typically iterate over indices.
For a two-dimensional table:
• If filled row by row, typically the outer loop will iterate over row indices and the inner loop will iterate over column indices.
• If filled column by column, typically the outer loop will iterate over column indices and the inner loop will iterate over row indices.
• If filled diagonal by diagonal, typically the outer loop will iterate over diagonals and the inner loop will iterate either over row or column indices.
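As an illustration of the diagonal-by-diagonal order for the matrix-chain table, here is a minimal Python sketch (the function name, and the convention that the dimensions arrive as a list d with Mi of dimension d[i] × d[i+1], are our own):

```python
def matrix_chain_cost(d):
    """Minimum number of multiplications to compute M0 ... M(n-1),
    where Mi has dimension d[i] x d[i+1]."""
    n = len(d) - 1
    # m[i][j] = smallest cost to multiply Mi through Mj; m[i][i] = 0 (base case)
    m = [[0] * n for _ in range(n)]
    # Outer loop over diagonals: "size" of m[i][j] is j - i, filled in increasing order
    for size in range(1, n):
        for i in range(n - size):
            j = i + size
            m[i][j] = min(m[i][k] + m[k + 1][j] + d[i] * d[k + 1] * d[j + 1]
                          for k in range(i, j))
    return m[0][n - 1]
```

For example, matrix_chain_cost([10, 30, 5, 60]) returns 4500, the cost of the parenthesization (M0 M1) M2.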
2. Ensure that the algorithm produces the correct output for any instance.
For dynamic programming algorithms:
1. Typically, the cost is dominated by the product of the number of table entries and the cost of filling one entry. The cost of extracting a solution is usually no greater than the cost of filling the table.
2. Correctness is proved by showing that the problem has optimal substructure and that the order of evaluation ensures that the algorithm finds the correct smaller instances to check.
The space required is typically the number of table entries.
CS 231 Module 5 Analysis 21 / 39
Expressing the solution in terms of other solutions
Common methods based on types of inputs:
Set Determine an order on the elements. Smaller instances are defined using smaller subsets of the elements. Bigger instances are formed by adding elements one at a time.
Sequence or Grid Define a problem in terms of the position or positions in the sequence. Smaller instances are at earlier positions. Bigger instances are at later positions.
Tree Define a problem on a subtree. Smaller instances are on smaller subtrees. Bigger instances are on bigger subtrees.
CS 231 Module 5 General approaches to dynamic programming 22 / 39
Determining what information should be stored in each table entry
Common types of information stored:
Decision problem True or False (solutions to smaller instances)
Evaluation problem Values (solutions to smaller instances)
Search or constructive problem Information indicating which smaller instances led to the optimal solution
Because the cost of the algorithm will depend on the amount of information stored in each entry (cost of calculation as well as cost of reading time), often full solutions are not stored.
Instead, they can be reconstructed in asymptotically the same amount of time required to fill the table.
CS 231 Module 5 General approaches to dynamic programming 23 / 39
Common table shapes
1D Typically used when each problem has only one variable.
2 1D Typically used when each problem has two variables, but bigger instances depend only on one-smaller values of one variable. Examples: the input naturally has two variables (e.g. grids or graphs) or there are two inputs.
2D Typically used when each problem has two variables, and bigger instances depend on more than just one-smaller values of one variable. Sometimes a grid is used, and sometimes a triangle if, for example, M[i, j] = M[j, i] or one of the two is not defined.
2 2D Typically used in situations like for 2 1D tables, but this time for problems with three variables.
3D or higher Typically used when problems have three or more variables.
For a tree, often nodes are ordered from leaves to root.
CS 231 Module 5 General approaches to dynamic programming 24 / 39
Longest common subsequence
A subsequence of a string x0x1 . . . xn−1 is any string xi1xi2 . . . xij such that 0 ≤ i1 < i2 < · · · < ij ≤ n − 1.
Example: The string “abcde” is a subsequence of the string “000a0b0000cd0000e”.
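The definition can be checked directly (a sketch; the helper name is ours):

```python
def is_subsequence(z, x):
    """Return True if z is a subsequence of x: the symbols of z appear
    in x in the same order, but not necessarily contiguously."""
    it = iter(x)
    # "symbol in it" advances the iterator, so later symbols of z must
    # be found strictly after earlier ones
    return all(symbol in it for symbol in z)

# The example above:
assert is_subsequence("abcde", "000a0b0000cd0000e")
# Order matters: "ba" is not a subsequence of "ab"
assert not is_subsequence("ba", "ab")
```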
Longest common subsequence
Input: A string X of length m and a string Y of length n
Output: A string Z that is a subsequence of both X and Y and of maximum length
a r m a g e d d o n
a a r d v a r k s
(One longest common subsequence of “armageddon” and “aardvarks” is “ard”, of length 3.)
CS 231 Module 5 Longest common subsequence 25 / 39
Recipe Step 1 for LCS
Suppose Z is an optimal solution. Where in X and Y can the last symbol in Z be found?
• Matching the last symbols in both X and Y.
• Matching a non-last symbol in X.
• Matching a non-last symbol in Y.
[Figure: three diagrams of strings Z, X, and Y, one for each case.]
CS 231 Module 5 Longest common subsequence 26 / 39
Defining smaller instances
[Figure: diagrams of strings Z, X, and Y showing the smaller instances for each case.]
CS 231 Module 5 Longest common subsequence 27 / 39
Recipe Step 2 for LCS
We’ll use Python slice notation to represent smaller strings, such as
• Z[:k] is the first k symbols (positions 0 to k − 1)
• Z[k:] is all but the first k symbols (positions k and on)
• Z[j:k] is symbols j to k − 1
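With a concrete string:

```python
Z = "abcdef"
# Z[:k] is the first k symbols (positions 0 to k-1)
assert Z[:3] == "abc"
# Z[k:] is all but the first k symbols (positions k and on)
assert Z[3:] == "def"
# Z[j:k] is symbols j to k-1
assert Z[2:5] == "cde"
```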
Suppose Z[:k] is an LCS of X[:i] and Y[:j].
Claim
Case 1: xi−1 = yj−1 ⇒ Z[:k − 1] is an LCS of X[:i − 1] and Y[:j − 1].
Case 2: xi−1 ≠ yj−1 and xi−1 ≠ zk−1 ⇒ Z[:k] is an LCS of X[:i − 1] and Y[:j].
Case 3: xi−1 ≠ yj−1 and yj−1 ≠ zk−1 ⇒ Z[:k] is an LCS of X[:i] and Y[:j − 1].
For C[i, j] the length of the LCS of X[:i] and Y[:j]:
C[i, j] = C[i − 1, j − 1] + 1 if xi−1 = yj−1
C[i, j] = max{C[i − 1, j], C[i, j − 1]} otherwise
CS 231 Module 5 Longest common subsequence 28 / 39
Steps 2 through 5
Step 2: Determine what information should be stored in each table entry.
Store the length of the subsequence as well as whether the best value of C[i, j] was found using C[i − 1, j − 1], C[i − 1, j], or C[i, j − 1].
Step 3: Determine the shape of the table or tables needed to store the solutions to the smaller instances.
A grid of dimensions (m + 1) × (n + 1)
Step 4: Determine the base cases.
Use value 0 to denote the empty string, so C[0, 0] = C[0, j] = C[i, 0] = 0
Step 5: Choose an order of evaluation.
C[i, j] = C[i − 1, j − 1] + 1 if xi−1 = yj−1
C[i, j] = max{C[i − 1, j], C[i, j − 1]} otherwise
We need to know C[i − 1, j − 1], C[i − 1, j], and C[i, j − 1] before C[i, j].
Use increasing values of i and then increasing values of j (hence row by row, and within each row column by column).
CS 231 Module 5 Longest common subsequence 29 / 39
Completion and analysis
Step 6: Extract the solution from the table.
C[m, n] contains the length of the longest common subsequence of the inputs.
To extract the sequence, store information on which of the three values led to the best answer.
Analysis:
Space: O(mn)
Time: O(mn)
CS 231 Module 5 Longest common subsequence 30 / 39
Filling in the table for LCS
[Figure: grid with entries labeled in the order they are filled, diagonal by diagonal.]
• Number of diagonals
• Starting and ending number of columns
• For each diagonal and column, the row of the entry in the given diagonal
CS 231 Module 5 Longest common subsequence 31 / 39
Completing the implementation
Possible implementation choices:
• Use a Grid or two to store values and to keep track of which gave a best solution (plus the character if a match was found).
• Extract the solution by working backwards from C[m, n] and following “arrows” to previous best solutions, adding characters when found.
See sample code on website for calculating just the value and also calculating the value and extracting the subsequence.
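Along those lines, here is a self-contained sketch (our own, not the posted sample code) that fills the table and then walks backwards from C[m][n] to recover one LCS:

```python
def lcs(X, Y):
    """Return one longest common subsequence of X and Y."""
    m, n = len(X), len(Y)
    # C[i][j] = length of an LCS of X[:i] and Y[:j]
    C = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                C[i][j] = C[i - 1][j - 1] + 1
            else:
                C[i][j] = max(C[i - 1][j], C[i][j - 1])
    # Work backwards from C[m][n], retracing which value gave each entry
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            out.append(X[i - 1])       # matched character is part of the LCS
            i, j = i - 1, j - 1
        elif C[i - 1][j] >= C[i][j - 1]:
            i -= 1                     # best value came from C[i-1][j]
        else:
            j -= 1                     # best value came from C[i][j-1]
    return "".join(reversed(out))
```

Recomputing the entry to decide which neighbor gave the best value, as here, avoids storing arrows at all; the extraction still takes only O(m + n) steps after the O(mn) fill.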
CS 231 Module 5 Longest common subsequence 32 / 39
All-pairs cheapest paths
All-pairs cheapest paths
Input: A graph G with non-negative edge weights
Output: The least-cost paths between each pair of vertices in G
Note: Other methods needed when there can be negative cycles.
Naïve method: Use Dijkstra n times.
Floyd-Warshall dynamic programming algorithm
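A sketch of Floyd-Warshall, assuming the graph arrives as an n × n matrix of edge weights with float('inf') for missing edges and 0 on the diagonal (this representation is our assumption):

```python
def floyd_warshall(w):
    """All-pairs cheapest path costs for a weight matrix w."""
    n = len(w)
    dist = [row[:] for row in w]  # copy so the input is not modified
    # After round k, dist[i][j] is the cheapest path from i to j
    # using only vertices 0..k as intermediate vertices
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist
```

The "smaller instance" here is the set of allowed intermediate vertices, which grows by one vertex per round.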
CS 231 Module 5 All-pairs cheapest paths 33 / 39
Knapsack (or 0-1 knapsack)
Note: In this version, each entire object is either included in or excluded from the knapsack.
Knapsack
Input: A set of n types of objects, where object i has positive integer weight wi and positive integer value vi, and an integer weight bound W
Output: A subset of objects with total weight at most W and the maximum total value
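A standard 0-1 knapsack table fill, as a sketch (names are ours; best[i][j] is the best value using only the first i objects with capacity j):

```python
def knapsack(weights, values, W):
    """Maximum total value of a subset of objects with total weight at most W."""
    n = len(weights)
    # best[i][j]: best value using objects 0..i-1 with weight bound j
    best = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(W + 1):
            best[i][j] = best[i - 1][j]          # exclude object i-1
            if weights[i - 1] <= j:              # or include it, if it fits
                best[i][j] = max(best[i][j],
                                 best[i - 1][j - weights[i - 1]] + values[i - 1])
    return best[n][W]
```

This is a "2D" table in the terminology above: one variable for the prefix of objects, one for the remaining capacity.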
CS 231 Module 5 Knapsack 34 / 39
Options for dynamic programming on graphs
Idea:
• Removing a set of one or more vertices to break a graph into smaller graphs
• Solve instances on smaller graphs that can be combined to solve the original instance on a larger graph
• Solutions to additional problems on smaller graphs may also be needed
Challenges in splitting up a graph:
• Using a large set of vertices to break up the graph may make it expensive to account for solutions that involve vertices in the set
• Breaking a graph into a large number of smaller graphs may make it expensive to combine the information in the smaller graphs
CS 231 Module 5 Graphs 35 / 39
Using dynamic programming on assignments and exams
In many of the examples seen in lectures, the biggest challenge is figuring out how to express an optimal solution in terms of optimal solutions to smaller instances.
It is not expected that on your own you would be able to devise expressions for problems as challenging as, for example, all-pairs cheapest paths.
For assignments and exams, you can expect either to be provided with the expression or to be asked a question in which finding such an expression is relatively simple.
You will be expected to understand the rest of the steps in the recipe to be able to figure those out on your own.
CS 231 Module 5 Summary 36 / 39
Comparing paradigms
Aspect Exhaustive Greedy D-and-C DP
Feasible solutions All Some Some Some
Applicability Wide Narrow Medium Medium
Speed Slow Fast Medium Medium
Divide-and-conquer:
• assumes you know which problems are needed
• works top down
• may repeat smaller instances
Dynamic programming:
• does not assume you know which smaller instances are needed
• works bottom up (a top-down version, memoization, also exists)
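As an illustration of the top-down variant, the LCS length recurrence from earlier in the module can be evaluated with memoization (a sketch using functools.lru_cache; the function names are ours):

```python
from functools import lru_cache

def lcs_length_memo(X, Y):
    """Top-down (memoized) evaluation of the LCS length recurrence."""
    @lru_cache(maxsize=None)
    def C(i, j):
        # C(i, j) = length of an LCS of X[:i] and Y[:j]
        if i == 0 or j == 0:
            return 0                      # empty-string base cases
        if X[i - 1] == Y[j - 1]:
            return C(i - 1, j - 1) + 1
        return max(C(i - 1, j), C(i, j - 1))
    return C(len(X), len(Y))
```

The cache plays the role of the table: each smaller instance is computed at most once, but only the instances actually reached from C(m, n) are ever filled in.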