Chapter 4: Divide and Conquer - Introdjmoon/algs-g/algs-notes/c4... · 2019. 2. 18. · Chapter 4: Divide and Conquer - The Max Subarray Problem Consider a vector A[n] of numeric

Chapter 4: Divide and Conquer - Intro

• A recurrence equation (recurrence) is an equation or inequality that describes a function in terms of its values on smaller inputs

• There are various forms:

1. Recurse on subproblems of equal size (typical of divide and conquer)

\[ T(n) = c\,T\!\left(\frac{n}{c}\right) + f(n) \]

2. Recurse on subproblems of unequal size (typical of divide and conquer)

\[ T(n) = T\!\left(\frac{c_1 n}{c_3}\right) + T\!\left(\frac{(c_3 - c_1)\,n}{c_3}\right) + f(n) \]

3. Recurse on a single smaller problem size (typical of decrease and conquer)

\[ T(n) = T(n - c) + f(n) \]

• Issues:

1. Sometimes recurrences are expressed as inequalities

– If T(n) ≤ some recurrence function, the solution will be in terms of big-oh notation

– If T(n) ≥ some recurrence function, the solution will be in terms of big-omega notation

2. For exact solutions, must usually consider floor and ceiling functions

– For example, when n is not even

\[ T(n) = T\!\left(\left\lceil \frac{n}{2} \right\rceil\right) + T\!\left(\left\lfloor \frac{n}{2} \right\rfloor\right) + f(n) \]

3. Boundary conditions are generalized

– Rather than specify an exact function, usually represent the aspects of minor importance using asymptotic notation

– Unless an exact solution is required, the base case need not be represented by an exact function


Chapter 4: Divide and Conquer - The Max Subarray Problem

• Consider a vector A[n] of numeric values

• The maximum subarray problem is to find indices $i_0, j_0$ with $1 \le i_0 \le j_0 \le n$ such that $\sum_{k=i_0}^{j_0} A[k]$ is greater than or equal to the sum for any other choice of $i$ and $j$

• The brute force approach would be to identify all pairs ⟨i, j⟩ that meet the above criteria and find the largest sum over those pairs

– The number of candidate pairs $(i_0, j_0)$ is $\binom{n}{2} + n \in \Theta(n^2)$

– The value of each subarray can be computed in constant time using the values of previously computed subarrays, so the overall cost is $\in \Theta(n^2)$
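A minimal Python sketch of the brute force idea, assuming 0-based indexing and a non-empty list. The function name `max_subarray_brute_force` is illustrative and does not come from the notes. The running total is what makes each subarray value a constant-time update.

```python
def max_subarray_brute_force(a):
    """Return (i, j, total) maximizing sum(a[i..j]) over 0 <= i <= j < len(a)."""
    best = (0, 0, a[0])
    for i in range(len(a)):
        running = 0
        for j in range(i, len(a)):
            # sum(a[i..j]) is obtained from sum(a[i..j-1]) in O(1)
            running += a[j]
            if running > best[2]:
                best = (i, j, running)
    return best


sample = [13, -3, -25, 20, -3, -16, -23, 18, 20, -7, 12, -5, -22, 15, -4, 7]
result = max_subarray_brute_force(sample)
```

Two nested loops over at most n positions each give the Θ(n²) bound stated above.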

• Divide and conquer approach

– Strategy:

∗ Divide array in half

∗ The max subarray is either

1. Entirely in the left half

2. Entirely in the right half

3. Spanning the middle

∗ The algorithm will

1. Recurse on the left and right halves

2. Find the largest subarray that spans the middle

3. Return the largest of the three


Chapter 4: Divide and Conquer - The Max Subarray Problem (2)

– The crossing algorithm:

    FindMaxCrossingSubarray (A, low, mid, high)
        left_sum = -infinity
        sum = 0
        for (i = mid down to low)
            sum = sum + A[i]
            if (sum > left_sum)
                left_sum = sum
                max_left = i
        right_sum = -infinity
        sum = 0
        for (j = mid + 1 to high)
            sum = sum + A[j]
            if (sum > right_sum)
                right_sum = sum
                max_right = j
        return (max_left, max_right, left_sum + right_sum)

∗ Analysis:

· Body of each for loop ∈ Θ(1)

· Total number of iterations of the two loops is (mid − low + 1) + (high − mid) = high − low + 1 = n

· Total effort is n ∗ Θ(1) = Θ(n)


Chapter 4: Divide and Conquer - The Max Subarray Problem (3)

– The main algorithm:

    FindMaxSubarray (A, low, high)
        if (high == low)
            return (low, high, A[low])    // base case: one element
        else
            mid = floor((low + high)/2)
            left_low, left_high, left_sum = FindMaxSubarray (A, low, mid)
            right_low, right_high, right_sum = FindMaxSubarray (A, mid + 1, high)
            cross_low, cross_high, cross_sum = FindMaxCrossingSubarray (A, low, mid, high)
            if ((left_sum >= right_sum) && (left_sum >= cross_sum))
                return (left_low, left_high, left_sum)
            else if ((right_sum >= left_sum) && (right_sum >= cross_sum))
                return (right_low, right_high, right_sum)
            else
                return (cross_low, cross_high, cross_sum)

∗ Analysis:

· Let $n = 2^i$, $i \ge 0$

· $T(1) \in \Theta(1)$

· Each recursion is on n/2 elements

· Return statements, conditionals ∈ Θ(1)

· Recursion:

\[ T(n) = \begin{cases} \Theta(1) & n = 1 \\ 2T\!\left(\frac{n}{2}\right) + \Theta(n) + \Theta(1) & n > 1 \end{cases} \]

where the first term represents the recursive calls, the second term finding the spanning subarray, and the third the remaining instructions

∗ This is the same recurrence obtained for merge sort, so $T(n) \in \Theta(n \lg n)$
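The two routines can be sketched in Python as follows, assuming 0-based inclusive indices `low..high`. The snake_case names are adaptations of the pseudocode names, and using `max` with a key function is a stylistic substitution for the chain of comparisons; on ties it prefers left, then right, then crossing, as the pseudocode does.

```python
import math


def find_max_crossing_subarray(a, low, mid, high):
    """Best subarray a[i..j] with low <= i <= mid < j <= high."""
    left_sum, total, max_left = -math.inf, 0, mid
    for i in range(mid, low - 1, -1):        # scan leftwards from mid
        total += a[i]
        if total > left_sum:
            left_sum, max_left = total, i
    right_sum, total, max_right = -math.inf, 0, mid + 1
    for j in range(mid + 1, high + 1):       # scan rightwards from mid + 1
        total += a[j]
        if total > right_sum:
            right_sum, max_right = total, j
    return max_left, max_right, left_sum + right_sum


def find_max_subarray(a, low, high):
    """Return (i, j, total) for a maximum subarray of a[low..high]."""
    if low == high:                          # base case: one element
        return low, high, a[low]
    mid = (low + high) // 2
    left = find_max_subarray(a, low, mid)
    right = find_max_subarray(a, mid + 1, high)
    cross = find_max_crossing_subarray(a, low, mid, high)
    # max() returns the first maximal candidate, so ties go left, right, cross
    return max(left, right, cross, key=lambda t: t[2])


sample = [13, -3, -25, 20, -3, -16, -23, 18, 20, -7, 12, -5, -22, 15, -4, 7]
result = find_max_subarray(sample, 0, len(sample) - 1)
```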


Chapter 4: Divide and Conquer - Matrix Multiplication Intro

• Consider two n × n matrices A and B

• Let C = A ∗ B

• Then

\[ c_{ij} = \sum_{k=1}^{n} a_{ik}\, b_{kj} \]

• For a 2× 2 matrix, this requires 8 multiplications and 4 additions

• A brute force algorithm that implements the above:

    Square-Matrix-Multiply (A, B)
        n = A.rows
        C[n][n] = new array
        for (i = 1 to n)
            for (j = 1 to n)
                C[i][j] = 0
                for (k = 1 to n)
                    C[i][j] = C[i][j] + A[i][k] * B[k][j]
        return C

– By virtue of the triply-nested loops, each of which executes n times, Square-Matrix-Multiply $\in \Theta(n^3)$


Chapter 4: Divide and Conquer - Matrix Multiplication Recursive Approach

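• Again consider square matrices A and B of size n ≥ 2

• Each matrix can be considered to be a matrix of square submatrices; e.g.,

\[ A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \]

• We can then represent matrix multiplication in terms of these submatrices:

\[ \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \ast \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix} \]

• The multiplication is achieved recursively, where C11 = A11 ∗ B11 + A12 ∗ B21, etc.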

• The algorithm:

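    Square-Matrix-Multiply-Recursive (A, B)
     1  n = A.rows
     2  C[n][n] = new array
     3  if (n == 1)
     4      C[1][1] = A[1][1] * B[1][1]
        else
     5      partition A, B, C into 4 sub matrices each
     6      // eight recursive products, combined by four matrix additions
     7      C11 = Square-Matrix-Multiply-Recursive(A11, B11) +
                  Square-Matrix-Multiply-Recursive(A12, B21)
     8      C12 = Square-Matrix-Multiply-Recursive(A11, B12) +
                  Square-Matrix-Multiply-Recursive(A12, B22)
     9      C21 = Square-Matrix-Multiply-Recursive(A21, B11) +
                  Square-Matrix-Multiply-Recursive(A22, B21)
    10      C22 = Square-Matrix-Multiply-Recursive(A21, B12) +
                  Square-Matrix-Multiply-Recursive(A22, B22)
        return C
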

Chapter 4: Divide and Conquer - Matrix Multiplication Recursive Approach (2)

• Analysis

– The partitioning (line 5) can be done in Θ(1) time if index calculations are used instead of copying the elements into new matrices (which would require $\Theta(n^2)$ time)

– Base case when n = 1 ∈ Θ(1) time

– Recurse eight times on matrices of size n/2 × n/2 (lines 7 - 10)

– Four additions performed on matrices that hold $n^2/4$ elements ($\Theta(n^2)$, lines 7 - 10)

– The recursion is then

\[ T(n) = \begin{cases} \Theta(1) & n = 1 \\ 8T\!\left(\frac{n}{2}\right) + \Theta(n^2) & n > 1 \end{cases} \]


Chapter 4: Divide and Conquer - Strassen’s Algorithm

• Strassen’s algorithm improves on this by making the recursion tree slightly less bushy at the expense of increased additions and subtractions

– But these are negligible with respect to asymptotic growth, being subsumed by the multiplications

• As in the recursive algorithm, n × n matrices A, B, C are partitioned into n/2 × n/2 submatrices:

\[ \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \ast \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix} \]

• The four submatrices of C are computed in terms of n/2 × n/2 matrices Pi:

\[ C = \begin{pmatrix} P_5 + P_4 - P_2 + P_6 & P_1 + P_2 \\ P_3 + P_4 & P_5 + P_1 - P_3 - P_7 \end{pmatrix} \]

• The Pi are defined as

1. P1 = A11 ∗ S1 = A11 ∗B12 − A11 ∗B22

2. P2 = S2 ∗B22 = A11 ∗B22 + A12 ∗B22

3. P3 = S3 ∗B11 = A21 ∗B11 + A22 ∗B11

4. P4 = A22 ∗ S4 = A22 ∗B21 − A22 ∗B11

5. P5 = S5 ∗ S6 = A11 ∗B11 + A11 ∗B22 + A22 ∗B11 + A22 ∗B22

6. P6 = S7 ∗ S8 = A12 ∗B21 + A12 ∗B22 − A22 ∗B21 − A22 ∗B22

7. P7 = S9 ∗ S10 = A11 ∗B11 + A11 ∗B12 − A21 ∗B11 − A21 ∗B12


Chapter 4: Divide and Conquer - Strassen’s Algorithm (2)

• The Si are defined as

1. S1 = B12 −B22

2. S2 = A11 + A12

3. S3 = A21 + A22

4. S4 = B21 −B11

5. S5 = A11 + A22

6. S6 = B11 +B22

7. S7 = A12 − A22

8. S8 = B21 +B22

9. S9 = A11 − A21

10. S10 = B11 +B12

• This requires only 7 multiplications, but 18 additions/subtractions

• The multiplications will be performed recursively

• Analysis:

– The steps involved:

1. Divide A, B, C into n/2 × n/2 subarrays: Θ(1), as discussed in the straight recursive approach

2. Create ten n/2 × n/2 Si arrays: $\Theta(n^2)$

3. Recursively create seven n/2 × n/2 Pi arrays: $7T(n/2)$

4. Combine the Pi into C11, C12, C21, C22 n/2 × n/2 arrays: $\Theta(n^2)$

– Run time is then

\[ T(n) = \begin{cases} \Theta(1) & n = 1 \\ 7T\!\left(\frac{n}{2}\right) + \Theta(n^2) & n > 1 \end{cases} \]
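A minimal Python sketch of the four steps, assuming n is a power of 2 and matrices are stored as lists of lists. The helper names `add`, `sub`, `split`, and `naive` are illustrative and not from the notes. As an assumption made for brevity, `split` copies the quadrants instead of using index calculations; that costs Θ(n²) per call, which leaves the Θ(n²) non-recursive term unchanged.

```python
def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split(M):
    """Return the quadrants M11, M12, M21, M22 (copies, for simplicity)."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def strassen(A, B):
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # Seven recursive products; each argument sum/difference is one S_i
    P1 = strassen(A11, sub(B12, B22))
    P2 = strassen(add(A11, A12), B22)
    P3 = strassen(add(A21, A22), B11)
    P4 = strassen(A22, sub(B21, B11))
    P5 = strassen(add(A11, A22), add(B11, B22))
    P6 = strassen(sub(A12, A22), add(B21, B22))
    P7 = strassen(sub(A11, A21), add(B11, B12))
    # Combine the P_i into the four quadrants of C
    C11 = add(sub(add(P5, P4), P2), P6)
    C12 = add(P1, P2)
    C21 = add(P3, P4)
    C22 = sub(sub(add(P5, P1), P3), P7)
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom


def naive(A, B):
    """Triple-loop product, used only as a reference for checking."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


product = strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```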


Chapter 4: Divide and Conquer - Strassen’s Algorithm (3)

• Correctness

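1. C11 = P5 + P4 − P2 + P6

\[
\begin{array}{rl}
P_5: & A_{11}B_{11} + A_{11}B_{22} + A_{22}B_{11} + A_{22}B_{22} \\
+P_4: & -A_{22}B_{11} + A_{22}B_{21} \\
-P_2: & -A_{11}B_{22} - A_{12}B_{22} \\
+P_6: & -A_{22}B_{22} - A_{22}B_{21} + A_{12}B_{22} + A_{12}B_{21} \\ \hline
 & A_{11}B_{11} + A_{12}B_{21}
\end{array}
\]

2. C12 = P1 + P2

\[
\begin{array}{rl}
P_1: & A_{11}B_{12} - A_{11}B_{22} \\
+P_2: & +A_{11}B_{22} + A_{12}B_{22} \\ \hline
 & A_{11}B_{12} + A_{12}B_{22}
\end{array}
\]

3. C21 = P3 + P4

\[
\begin{array}{rl}
P_3: & A_{21}B_{11} + A_{22}B_{11} \\
+P_4: & -A_{22}B_{11} + A_{22}B_{21} \\ \hline
 & A_{21}B_{11} + A_{22}B_{21}
\end{array}
\]

4. C22 = P5 + P1 − P3 − P7

\[
\begin{array}{rl}
P_5: & A_{11}B_{11} + A_{11}B_{22} + A_{22}B_{11} + A_{22}B_{22} \\
+P_1: & -A_{11}B_{22} + A_{11}B_{12} \\
-P_3: & -A_{22}B_{11} - A_{21}B_{11} \\
-P_7: & -A_{11}B_{11} - A_{11}B_{12} + A_{21}B_{11} + A_{21}B_{12} \\ \hline
 & A_{22}B_{22} + A_{21}B_{12}
\end{array}
\]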


Chapter 4: Divide and Conquer - Analysis Techniques

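• Comparisons of order of growth (OoG)

\[ \lim_{n\to\infty} \frac{t(n)}{g(n)} = \begin{cases} 0 & \Rightarrow\ t(n)\text{'s OoG} < g(n)\text{'s OoG} \\ c > 0 & \Rightarrow\ t(n)\text{'s OoG} = g(n)\text{'s OoG} \\ \infty & \Rightarrow\ t(n)\text{'s OoG} > g(n)\text{'s OoG} \end{cases} \]

• L’Hôpital’s Rule

\[ \lim_{n\to\infty} \frac{t(n)}{g(n)} = \lim_{n\to\infty} \frac{t'(n)}{g'(n)} \]

– Useful for evaluating the limit above when it has the indeterminate form ∞/∞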
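As a brief worked illustration, compare t(n) = n lg n with g(n) = n². These two functions are chosen here for the example and do not come from the notes. The limit has the form ∞/∞ after simplification, so the rule applies:

```latex
\[
\lim_{n\to\infty} \frac{n \lg n}{n^2}
  = \lim_{n\to\infty} \frac{\lg n}{n}
  = \lim_{n\to\infty} \frac{1/(n \ln 2)}{1}
  = 0
\]
```

The limit is 0, so n lg n has a smaller order of growth than n².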

• Forward substitution

– Given a recurrence of the form

\[ T(n) = c\,T\!\left(\frac{n}{d}\right) + f(n) \]

– Let $n = d^i$, and iteratively compute $T(d^{i+1})$ in terms of $T(d^i)$:

1. Start with the base condition/time

2. Determine the value of the caller by substituting the value of the base condition into the recursion equation

3. Continue in this manner for a few iterations in the hope that a formula will emerge

– For example, consider

\[ T(n) = c\,T\!\left(\frac{n}{d}\right) + f(n) \]

Let $n = d^i$ with base case $T(1) = T(d^0) = b$ (step i = 0)

Then,

\[
\begin{aligned}
i = 1:\quad T(d^1) &= cT(d^1/d) + f(d^1) = cT(d^0) + f(d^1) = cb + f(d^1) \\
i = 2:\quad T(d^2) &= cT(d^2/d) + f(d^2) = cT(d^1) + f(d^2) = c[cb + f(d^1)] + f(d^2) = c^2 b + c f(d^1) + f(d^2)
\end{aligned}
\]

etc.

– Will ultimately express i in terms of n (i.e., $i = \log_d n$)
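Forward substitution can be tabulated mechanically. The sketch below is a hypothetical helper (`forward_table` is not from the notes) and the concrete instance c = d = 2, f(n) = n, b = 1 is an assumption chosen for illustration. The table suggests the formula T(n) = n(lg n + 1).

```python
def forward_table(c, d, f, b, steps):
    """Return [T(d**0), ..., T(d**steps)] for T(n) = c*T(n/d) + f(n), T(1) = b."""
    values = [b]                                  # step i = 0: the base case
    for i in range(1, steps + 1):
        # T(d**i) is computed from the previously tabulated T(d**(i-1))
        values.append(c * values[-1] + f(d ** i))
    return values


table = forward_table(c=2, d=2, f=lambda n: n, b=1, steps=5)
# Conjectured closed form n * (lg n + 1), evaluated at n = 2**i
closed_form = [2 ** i * (i + 1) for i in range(6)]
```

Agreement on a few values only suggests the formula; it still has to be proved, for example by induction.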


Chapter 4: Divide and Conquer - Analysis Techniques (2)

• Backward substitution

– Start at the top level (n elements) and work towards the base case

– Substitute the value from the next recursion into the recursion equation; i.e., express $T(d^i)$ in terms of $T(d^{i-1})$ and substitute in the original recurrence relation

– Repeat with successively smaller terms until a formula that expresses the sequence can be determined

– For example, consider

\[ T(n) = c\,T\!\left(\frac{n}{d}\right) + f(n) \]

Let $n = d^k$ with base case $T(1) = T(d^0) = b$ (reached at step i = k)

Then,

i = 1: $T(d^k) = cT(d^k/d) + f(d^k) = cT(d^{k-1}) + f(d^k)$

Since $T(d^{k-1}) = cT(d^{k-1}/d) + f(d^{k-1}) = cT(d^{k-2}) + f(d^{k-1})$, we can rewrite $T(d^k)$ as

i = 2: $T(d^k) = c[cT(d^{k-2}) + f(d^{k-1})] + f(d^k) = c^2 T(d^{k-2}) + c f(d^{k-1}) + f(d^k)$

etc.

– Will ultimately express k in terms of n (i.e., $k = \log_d n$)
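Continuing the pattern for i steps and then setting i = k gives a general unrolled form. The specialization to c = d = 2, f(n) = n, b = 1 in the last line is an assumption chosen for illustration:

```latex
\[
\begin{aligned}
T(d^k) &= c^{i}\, T(d^{k-i}) + \sum_{j=0}^{i-1} c^{j} f(d^{k-j}) \\
       &= c^{k} b + \sum_{j=0}^{k-1} c^{j} f(d^{k-j}) && (i = k) \\
       &= 2^{k} + \sum_{j=0}^{k-1} 2^{j}\, 2^{k-j} = 2^{k} + k\, 2^{k} = n + n \lg n && (c = d = 2,\ f(n) = n,\ b = 1)
\end{aligned}
\]
```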


Chapter 4: Divide and Conquer - Analysis Techniques, The Substitution Method

• This technique involves

1. Guess a form for the equation (i.e., g(n))

2. Use induction to find asymptotic constant(s) and show that the guess works

• The guess might be based on

1. A previous analysis that had a similar equation

2. Result of another proof technique

• The proof itself consists of

1. Substituting the guess for the recurrence in the equation

2. Find a constant (could be two) that satisfies the relation in the asymptotic definition

3. Demonstrate that the guess holds for the base case too

• Example: Consider (pp 83 - 84)

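\[ T(n) = 2T\!\left(\left\lfloor \frac{n}{2} \right\rfloor\right) + n \]

– The guess is $T(n) = O(n \lg n)$

– To prove the guess correct, must show that $T(n) \le cn \lg n$ for some c > 0

– Assume that this holds for all m < n, in particular for $m = \lfloor n/2 \rfloor$

– Substitute $cm \lg m$ for $T(m)$ on the right-hand side of the recurrence, with $m = \lfloor n/2 \rfloor$

– Then,

\[
\begin{aligned}
T(n) &\le 2\left(c \left\lfloor \frac{n}{2} \right\rfloor \lg \left\lfloor \frac{n}{2} \right\rfloor\right) + n \\
     &\le cn \lg\!\left(\frac{n}{2}\right) + n \\
     &= cn \lg n - cn \lg 2 + n \\
     &= cn \lg n - cn + n \\
     &\le cn \lg n
\end{aligned}
\]

which holds for c ≥ 1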


Chapter 4: Divide and Conquer - Analysis Techniques, The Substitution Method (2)

– Next, need to demonstrate that the guess holds for the base case

∗ This step can sometimes require some ingenuity

∗ Assume that T (1) = 1 for the example above; then

\[ T(1) = 1 \le c \cdot 1 \cdot \lg 1 = 0 \]

∗ This is false for every c, so the hypothesis fails at n = 1

∗ What we must do is distinguish between the base case of the equation and the base case of the proof

∗ While the base case of the recursion is n = 1, for the proof we can use larger values (because we only have to show that it holds for all n ≥ some n0, which we essentially get to choose)

∗ Since n = 1 is problematic, judicious choices for base cases are n = 2 and n = 3

· These are the two values that result in recursion with n = 1

∗ Using these values as base cases,

· T(2) = 4 (i.e., 2T(1) + 2 = 2 + 2 ≤ c · 2 lg 2)

· T(3) = 5 (i.e., 2T(1) + 3 = 2 + 3 ≤ c · 3 lg 3)

· Both inequalities hold for any c ≥ 2
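A quick numerical sanity check of the bound, assuming T(1) = 1 and the constant c = 2. The range of n tested is arbitrary, and a check like this complements the induction without replacing it.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n):
    """T(n) = 2*T(floor(n/2)) + n with T(1) = 1."""
    return 1 if n == 1 else 2 * T(n // 2) + n


c = 2
# The bound c*n*lg(n) is checked from the proof's base case n0 = 2 upward
holds_from_2 = all(T(n) <= c * n * math.log2(n) for n in range(2, 5000))
# At n = 1 the right-hand side is 0, so the bound cannot hold there
holds_at_1 = T(1) <= c * 1 * math.log2(1)
```

`holds_from_2` comes out true and `holds_at_1` false, matching the argument for moving the proof's base case away from n = 1.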

• Heuristics for making a good guess:

1. If you know the bounds for a similar recursion, try them

2. Prove weak upper and lower bounds, then iteratively try to improve them by tightening them

• Issues

1. May have the correct bound but the induction doesn’t work

– The reason might be that the inductive hypothesis isn’t strong enough to prove the detailed (exact) bound

– Can sometimes subtract a lower-order term to make it work


Chapter 4: Divide and Conquer - Analysis Techniques, The Substitution Method (3)

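– For example (p 85)

\[ T(n) = T\!\left(\left\lfloor \frac{n}{2} \right\rfloor\right) + T\!\left(\left\lceil \frac{n}{2} \right\rceil\right) + 1 \]

Guess $T(n) = O(n)$

Need to show that $T(n) \le cn$

On substitution:

\[ T(n) \le c\left\lfloor \frac{n}{2} \right\rfloor + c\left\lceil \frac{n}{2} \right\rceil + 1 = cn + 1 \]

This does not imply $T(n) \le cn$ for any choice of c

– The problem is the constant term 1

– To remedy the situation, subtract a constant d ≥ 0

– The revised guess is $T(n) \le cn - d$

\[ T(n) \le \left(c\left\lfloor \frac{n}{2} \right\rfloor - d\right) + \left(c\left\lceil \frac{n}{2} \right\rceil - d\right) + 1 = cn - 2d + 1 \le cn - d \]

which holds when d ≥ 1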
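The strengthened guess can be checked numerically. The sketch assumes T(1) = 1, in which case the recurrence has the exact solution T(n) = 2n − 1, i.e., the bound cn − d with c = 2 and d = 1 is met with equality.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n):
    """T(n) = T(floor(n/2)) + T(ceil(n/2)) + 1 with T(1) = 1."""
    if n == 1:
        return 1
    return T(n // 2) + T(n - n // 2) + 1   # n - n//2 equals ceil(n/2)


c, d = 2, 1
exact = all(T(n) == c * n - d for n in range(1, 2000))
```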


Chapter 4: Divide and Conquer - Analysis Techniques, The Substitution Method (4)

2. You must prove the exact form of the inductive hypothesis

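– For example, if you want to prove $T(n) \in O(n)$, you must show $T(n) \le cn$

– Consider the following common example of incorrect logic

∗ Want to show

\[ T(n) = 2T\!\left(\left\lfloor \frac{n}{2} \right\rfloor\right) + n \in O(n) \]

∗ Then

\[ T(n) \le 2\left(c\left\lfloor \frac{n}{2} \right\rfloor\right) + n \le cn + n \in O(n) \]

∗ But this reasoning is fallacious because we have not shown that $T(n) \le cn$; we have only shown $T(n) \le (c + 1)n$, which is not the inductive hypothesis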

3. Sometimes changing variables makes proofs easier

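– Consider

\[ T(n) = 2T\!\left(\left\lfloor \sqrt{n} \right\rfloor\right) + \lg n \]

– Let $m = \lg n \Rightarrow n = 2^m$

– Then

\[ T(2^m) = 2T\!\left(2^{m/2}\right) + m \]

– Then, let $S(m) = T(2^m)$, which lets us rewrite the above as

\[ S(m) = 2S\!\left(\frac{m}{2}\right) + m \]

– This has the same form as

\[ T(n) = 2T\!\left(\left\lfloor \frac{n}{2} \right\rfloor\right) + n \in O(n \lg n) \]

– So we can guess a solution of

\[ S(m) = O(m \lg m) = O((\lg n)(\lg \lg n)) \]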


Chapter 4: Divide and Conquer - Analysis Techniques, Recursion Trees

• This approach can be used to find a good guess at the OoG to be used in the substitution method, or, if done in an exacting manner, it can be a proof in itself

• The approach:

1. Create a recursion tree

2. Calculate the cost of each level

3. Calculate the cost over all levels

• Consider

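\[ T(n) = 3T\!\left(\left\lfloor \frac{n}{4} \right\rfloor\right) + O(n^2) \]

– Consider the following examples of ’sloppiness’ (i.e., laxness) that are allowable since we are looking for a guesstimate

∗ Let $n = 4^i$, eliminating the need for the floor function in the above

∗ Replace $O(n^2)$ by $cn^2$

– This generates a recursion tree in which the root costs $cn^2$, every internal node has three children, and each node at depth i costs $c(n/4^i)^2$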


Chapter 4: Divide and Conquer - Analysis Techniques, Recursion Trees (2)

– Let i represent the level of the tree; then

    Level    Cost
    i = 0    $c\,3^0 (n/4^0)^2 = cn^2$
    i = 1    $c\,3^1 (n/4^1)^2 = (3/16)\,cn^2$
    i = 2    $c\,3^2 (n/4^2)^2 = (3/16)^2\,cn^2$
    ...      ...

– The leaf level corresponds to $4^i = n \Rightarrow \log_4(4^i) = \log_4 n$

Thus $i = \log_4 n$, which corresponds to the height of the tree

– Cost of the leaf level is $3^{\log_4 n}\,T(1) = cn^{\log_4 3}$

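– The total cost of the tree is

\[
\begin{aligned}
T(n) &= \sum_{i=0}^{\log_4 n - 1} \left(\frac{3}{16}\right)^{i} cn^2 + cn^{\log_4 3} \\
     &< \sum_{i=0}^{\infty} \left(\frac{3}{16}\right)^{i} cn^2 + cn^{\log_4 3} \\
     &= \frac{1}{1 - \frac{3}{16}}\, cn^2 + cn^{\log_4 3} \\
     &= \frac{16}{13}\, cn^2 + cn^{\log_4 3} \\
     &\in O(n^2)
\end{aligned}
\]

– The root of the tree contributes the greatest amount to the total cost ($cn^2$ out of at most $\frac{16}{13}cn^2$, with at most $\frac{3}{13}cn^2$ contributed from the lower levels of the tree)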
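The level-by-level accounting can be reproduced numerically. The sketch below assumes c = 1, T(1) = 1, and n an exact power of 4; the names `tree_costs` and `T` are illustrative. Integer arithmetic keeps the level costs exact.

```python
def tree_costs(k, c=1):
    """Per-level costs of the recursion tree for n = 4**k (leaves excluded)."""
    n = 4 ** k
    # Level i has 3**i nodes, each costing c * (n / 4**i)**2
    levels = [c * 3 ** i * (n // 4 ** i) ** 2 for i in range(k)]
    leaves = c * 3 ** k            # 3**(log_4 n) leaves, each costing T(1) = c
    return n, levels, leaves


def T(n):
    """Direct evaluation of T(n) = 3*T(n/4) + n**2 with T(1) = 1."""
    return 1 if n == 1 else 3 * T(n // 4) + n * n


n, levels, leaves = tree_costs(5)
total = sum(levels) + leaves
bound = 16 / 13 * n * n + leaves   # geometric-series bound with c = 1
```

Here `total` equals `T(n)` and stays below `bound`, with the root level `levels[0]` accounting for most of it.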


Chapter 4: Divide and Conquer - Analysis Techniques, Recursion Trees (3)

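• To prove that $T(n) \in O(n^2)$ (via substitution), must show that $T(n) \le dn^2$ for some d > 0

\[
\begin{aligned}
T(n) &\le 3T\!\left(\left\lfloor \frac{n}{4} \right\rfloor\right) + cn^2 \\
     &\le 3d\left\lfloor \frac{n}{4} \right\rfloor^2 + cn^2 \\
     &\le 3d\left(\frac{n}{4}\right)^2 + cn^2 \\
     &= \frac{3}{16}\, dn^2 + cn^2 \\
     &\le dn^2
\end{aligned}
\]

when $d \ge \frac{16}{13}c$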


Chapter 4: Divide and Conquer - Analysis Techniques, The Master Theorem

• The Master Theorem allows easy solutions to recursions of the form

\[ T(n) = a\,T\!\left(\frac{n}{b}\right) + f(n) \]

– f(n) represents the cost of dividing and combining

– Omitting floors and ceilings has no effect on the results

• The Master Theorem:

Let

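1. a ≥ 1, b > 1 be constants

2. f(n) be a function that is asymptotically positive

3. T(n) be defined for integers n ≥ 0 as

\[ T(n) = a\,T\!\left(\frac{n}{b}\right) + f(n) \]

where $n/b$ is either $\lceil n/b \rceil$ or $\lfloor n/b \rfloor$

Then T(n) has the following asymptotic bounds:

\[
T(n) = \begin{cases}
\Theta(n^{\log_b a}) & \text{when } f(n) \in O(n^{\log_b a - \varepsilon}) \text{ for constant } \varepsilon > 0 \\
\Theta(n^{\log_b a} \lg n) & \text{when } f(n) \in \Theta(n^{\log_b a}) \\
\Theta(f(n)) & \text{when } f(n) \in \Omega(n^{\log_b a + \varepsilon}) \text{ for constant } \varepsilon > 0, \text{ and } a f(n/b) \le c f(n) \\
 & \text{for some constant } c < 1 \text{ and all sufficiently large } n
\end{cases}
\]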

• Points re the theorem:

1. Form of solution based on comparison of f(n) with $n^{\log_b a}$, the larger being the determining factor

– In the first case, f(n) can’t be just smaller, but must be so by a factor of $n^{\varepsilon}$

– In the third case, f(n) can’t be just larger, but must be so by a factor of $n^{\varepsilon}$ AND must satisfy $a f(n/b) \le c f(n)$

2. Not all situations are covered

– For example, when f(n) is larger than $n^{\log_b a}$ but not polynomially larger (and similarly for the smaller case)


Chapter 4: Divide and Conquer - Analysis Techniques, The Master Theorem (2)

• Examples:

1. $T(n) = 9T(n/3) + n$

a = 9, b = 3, f(n) = n

$n^{\log_b a} = n^{\log_3 9} = n^2$

Since $f(n) \in O(n^{2 - \varepsilon})$, where ε = 1, the Master Theorem applies (case 1): $T(n) \in \Theta(n^2)$

2. $T(n) = T(2n/3) + 1$

a = 1, b = 3/2, f(n) = 1

$n^{\log_b a} = n^{\log_{3/2} 1} = n^0 = 1$

Since $f(n) \in \Theta(1)$, the Master Theorem applies (case 2): $T(n) \in \Theta(\lg n)$

3. $T(n) = 3T(n/4) + n \lg n$

a = 3, b = 4, f(n) = n lg n

$n^{\log_b a} = n^{\log_4 3} \approx n^{0.793}$

Since $f(n) \in \Omega(n^{\log_4 3 + \varepsilon})$, where ε ≈ 0.2, and $a f(n/b) = 3(n/4)\lg(n/4) \le \frac{3}{4} n \lg n = c f(n)$ for $c = \frac{3}{4}$, the Master Theorem applies (case 3): $T(n) \in \Theta(n \lg n)$

4. $T(n) = 2T(n/2) + n \lg n$

a = 2, b = 2, f(n) = n lg n

$n^{\log_b a} = n^{\log_2 2} = n$

n lg n is asymptotically greater than n, but not polynomially larger:

\[ \frac{f(n)}{n^{\log_b a}} = \frac{n \lg n}{n} = \lg n \]

which is asymptotically smaller than $n^{\varepsilon}$ for any ε > 0

The Master Theorem does not apply
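For the common special case f(n) = Θ(n^k), the case analysis reduces to comparing k with log_b a. The sketch below is a hypothetical helper (`master_polynomial` is not from the notes) and deliberately handles only polynomial driving functions, so examples 3 and 4 above, which involve n lg n, are outside its scope. For f(n) = n^k the regularity condition holds automatically in case 3, since a f(n/b) = (a / b^k) n^k and a / b^k < 1 exactly when k > log_b a.

```python
import math


def master_polynomial(a, b, k):
    """Classify T(n) = a*T(n/b) + Theta(n**k).

    Returns (case, e): cases 1 and 3 mean T(n) is Theta(n**e),
    case 2 means T(n) is Theta(n**e * lg n).
    """
    crit = math.log(a) / math.log(b)          # the critical exponent log_b(a)
    if math.isclose(k, crit, abs_tol=1e-9):   # tolerate floating-point error
        return 2, k
    if k < crit:
        return 1, crit
    return 3, k


example_1 = master_polynomial(9, 3, 1)        # T(n) = 9T(n/3) + n
example_2 = master_polynomial(1, 1.5, 0)      # T(n) = T(2n/3) + 1
strassen_case = master_polynomial(7, 2, 2)    # T(n) = 7T(n/2) + Theta(n^2)
```

Applied to the Strassen recurrence, the helper reports case 1 with exponent log_2 7 ≈ 2.81.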


Chapter 4: Divide and Conquer - Analysis Techniques, Proof of the Master Theorem

