Algorithm Analysis
Data Structures and Algorithms (60-254)
Quantification of Performance
• We want to understand the behaviour of our algorithms (both runtime and space utilization) on different inputs.
• This will enable us to compare different algorithms that solve the same problem.
• To do this, we characterize the performance as a function of the input size (why? reasonable? pitfalls?)
• The functions may be complex. We want to understand their growth rate in simpler terms.
Growth of Functions
Let f(n) be a function of a positive integer n.
The dominant term of f(n) determines the behavior of f(n) as n → ∞.
For example, let f(n) = 2n^3 + 3n^2 + 4n + 1.
The dominant term of f(n) is 2n^3.
This means that as n becomes large (n → ∞):
• 2n^3 dominates the behavior of f(n).
• The other terms’ contributions become much less significant.
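A quick numerical illustration of the dominant term, as a sketch: the helper names f and dominant are made up here. If 2n^3 really dominates, the ratio f(n) / 2n^3 should approach 1 as n grows.

```python
def f(n):
    """The example function f(n) = 2n^3 + 3n^2 + 4n + 1."""
    return 2 * n**3 + 3 * n**2 + 4 * n + 1


def dominant(n):
    """The dominant term 2n^3 on its own."""
    return 2 * n**3


# Print the ratio f(n) / 2n^3 for increasingly large n.
for n in (1, 10, 100, 1000, 10000):
    print(n, f(n) / dominant(n))
```

The ratio starts well above 1 for small n and shrinks toward 1, which is what "the other terms become much less significant" means in practice.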
More Examples
Example 2: The term […] dominates the behavior of f(n) as n → ∞.
Example 3: The term 2n dominates the behavior of f(n) as n → ∞.
Rate of Growth
The rate of growth means how a function behaves as n → ∞.
It is determined by its dominant term.
The big-Oh notation is a shorthand way of expressing this.
The relationship “f(n) is O(n^2)” is interpreted as:
• f(n) grows no faster than n^2 as n becomes large (n → ∞).
• The dominant term of f(n) does not grow faster than n^2.
More Examples
Also, “f(n) is O(n^2)” is true for: […]
We could make up as many such functions as we wish.
Definition of Big Oh
Problem: Find the most accurate description for any function in terms of the big-Oh notation.
A formal definition of “f(n) is O(g(n))” is that the inequality
f(n) ≤ c * g(n)
holds for all n ≥ n_0, where n_0 and c are positive constants.
f(n) and g(n) are functions mapping nonnegative integers to real numbers.
Informally: “f(n) is order g(n)”.
Graphical Interpretation of Big Oh
Example
f(n) = 7n^2 + 0.5n + 6 and g(n) = n^2
f(n) is O(n^2), taking c = 10 and n_0 = 2.
[Figure: plot of 7n^2 + 0.5n + 6 against 10n^2, with n_0 = 2 marked]
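The constants in this example can be sanity-checked numerically. The sketch below uses a hypothetical helper, holds_from, that tests the inequality f(n) ≤ c * g(n) over a finite range. A finite check can refute a choice of c and n_0, but it cannot prove the bound for all n.

```python
def holds_from(f, g, c, n0, upto=10_000):
    """Return True if f(n) <= c * g(n) for every integer n with n0 <= n <= upto."""
    return all(f(n) <= c * g(n) for n in range(n0, upto + 1))


def f(n):
    return 7 * n**2 + 0.5 * n + 6


def g(n):
    return n**2


print(holds_from(f, g, c=10, n0=2))  # the constants from the example
print(holds_from(f, g, c=10, n0=1))  # n0 = 1 is too small: f(1) = 13.5 > 10
```

The first check succeeds and the second fails, which shows why n_0 = 2 is needed when c = 10.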
In General
In general, if f(n) = a_0 + a_1 n + … + a_{d-1} n^{d-1} + a_d n^d,
then f(n) is O(n^d).
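One way to see why, sketched for n ≥ 1. The constant chosen here is just one convenient choice, and it assumes at least one coefficient is nonzero so that c is positive.

```latex
\begin{align*}
f(n) &= a_0 + a_1 n + \dots + a_{d-1} n^{d-1} + a_d n^d \\
     &\le |a_0| + |a_1|\, n + \dots + |a_{d-1}|\, n^{d-1} + |a_d|\, n^d \\
     &\le \bigl(|a_0| + |a_1| + \dots + |a_d|\bigr)\, n^d
        \qquad \text{since } n^k \le n^d \text{ for } 0 \le k \le d \text{ and } n \ge 1 .
\end{align*}
```

So c = |a_0| + |a_1| + … + |a_d| and n_0 = 1 satisfy the definition of big-Oh.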
We will see other functions too, for example O(log n), O(n log n), etc.
Having defined the big-Oh notation, now …
Quantifying Performance
We want to quantify the behavior of an algorithm, and use this to compare the efficiency of two algorithms on the same problem.
Observations:
1. A program (algorithm) consumes resources: time and space.
2. The amount of resources is directly related to the size of the input.
Say we have the following table:
Input size    Time
I_1           T_1
I_2           T_2
…             …
How can we derive the function T(I)?
Problems:
• Too many parameters for using interpolation
• It depends on
  • the machine on which the program is run
  • the compiler used
  • the programming language
  • the programmer who writes the code
Solution: Imagine our algorithm runs on an “algorithm machine” that accepts pseudo-code.
Assumptions:
• READs and WRITEs take constant time
• Arithmetic operations take constant time
• Logical operations take constant time
Worst Case
• Even though we made various assumptions, it is still complicated.
• Instead, we quantify our algorithm on the worst-case input.
• This is called “worst-case analysis”.
• “Average-case analysis” also exists:
  • It requires a probability distribution over the set of inputs, which is usually unknown.
  • Not studied in this course.
Input Size
• not always easy to determine, and
• problem dependent
Some examples:
• Graph-theoretic problem: number of vertices, V, and number of edges, E.
• Matrix multiplication: number of rows and columns of the input matrices.
• Sorting: the number of elements, n.
Reporting Performance
• Typically we don’t try to find exactly what T(I) is.
• Instead, we can say: T(I) is O(g(I))
• For example, the time complexity of mergesort on n elements: T(n) is O(n log n)
• That is, the running time of mergesort is at most a constant times n log n for all n ≥ n_0.
Analysis of Examples
1. Given a list of n elements, find the minimum (or maximum).
Then, T(n) is O(n).
We look at all elements to determine the minimum (maximum).
2. Given n points in the plane, find the closest pair of points.
In this case, T(n) is O(n^2).
Why? A brute-force algorithm looks at all pairs of points, and there are n(n-1)/2 of them, which is O(n^2).
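Both examples can be sketched in a few lines of Python. The function names find_min and closest_pair are illustrative, and points are assumed to be (x, y) tuples. The linear scan touches each element once, while the brute-force search visits every unordered pair.

```python
from itertools import combinations
from math import dist


def find_min(items):
    """Linear scan over a non-empty list: O(n) comparisons."""
    smallest = items[0]
    for x in items[1:]:
        if x < smallest:
            smallest = x
    return smallest


def closest_pair(points):
    """Brute force over all n(n-1)/2 unordered pairs of points: O(n^2)."""
    return min(combinations(points, 2), key=lambda pair: dist(*pair))


print(find_min([7, 3, 9, 1, 4]))
print(closest_pair([(0, 0), (5, 5), (1, 1), (9, 0)]))
```

Faster closest-pair algorithms exist, which is why O(n^2) here describes this particular brute-force algorithm rather than the problem itself.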
3. Given n points in a plane, determine if any three points are contained in a straight line.
In this case, T(n) is O(n^3).
Why? A brute-force algorithm searches all triples of points, and there are n(n-1)(n-2)/6 of them, which is O(n^3).
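A minimal sketch of such a brute-force search, assuming the points are distinct (x, y) tuples with integer coordinates so that the collinearity test is exact. The name has_collinear_triple is made up for this example.

```python
from itertools import combinations


def has_collinear_triple(points):
    """Return True if some three of the given points lie on one straight line.

    Every unordered triple is tried, so the running time is O(n^3).
    """
    for (x1, y1), (x2, y2), (x3, y3) in combinations(points, 3):
        # Twice the signed area of the triangle; zero means the points are collinear.
        cross = (x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1)
        if cross == 0:
            return True
    return False


print(has_collinear_triple([(0, 0), (1, 1), (2, 2), (0, 5)]))
print(has_collinear_triple([(0, 0), (1, 0), (0, 1), (1, 1)]))
```

The first call finds the three points on the line y = x; the second finds none, because no three corners of a square are collinear.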
WAIT A MINUTE!
• What is T(n) for finding the GCD of m and n?
• The naïve brute force algorithm was O(n) but the Euclidean algorithm was O(log n)? Hmmm…
• And how about this? Find gcd(m = 1989, n = 1590).
  Algorithm:
  Step 1. Output 3.
• So T(n) is O(1).
• We must distinguish between the complexity of an algorithm and the complexity of a class of problems.
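The three "algorithms" from this slide can be put side by side. This is a sketch for positive integers only. The brute-force version shown is one plausible reading of "naïve brute force" (try every candidate divisor), not necessarily the one presented earlier in the course.

```python
def gcd_brute_force(m, n):
    """Try every candidate divisor from min(m, n) down to 1: O(min(m, n)) steps."""
    for d in range(min(m, n), 0, -1):
        if m % d == 0 and n % d == 0:
            return d


def gcd_euclid(m, n):
    """Euclid's algorithm: the number of iterations is logarithmic in the inputs."""
    while n != 0:
        m, n = n, m % n
    return m


def gcd_of_1989_and_1590():
    """The 'Step 1. Output 3.' algorithm: O(1), but it solves one instance only."""
    return 3


print(gcd_brute_force(1989, 1590), gcd_euclid(1989, 1590), gcd_of_1989_and_1590())
```

All three agree on this instance, yet only the first two solve the general problem. That is the distinction between the complexity of an algorithm and the complexity of a class of problems.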
Maximum Contiguous Subsequence (MCS) Problem
Given a sequence of n integers: a_1, a_2, a_3, …, a_{n-1}, a_n
a contiguous subsequence is: a_i, a_{i+1}, …, a_{j-1}, a_j, where 1 ≤ i ≤ j ≤ n.
The problem: Determine a contiguous subsequence such that:
a_i + a_{i+1} + … + a_{j-1} + a_j ≥ 0 is maximal.
Some examples:
-1, -2, -3, -4, -5, -6
MCS is empty, has value 0 by definition.
More Examples
For the sequence -1, 2, 3, -3, 2, an MCS is 2, 3, whose value is 2 + 3 = 5.
Note: There may be more than one MCS. For example, -1, 1, -1, 1, -1, 1 has six MCSs, each of value 1.
An O(n^2) Algorithm for MCS
Search problems have an associated search space.
To figure out: how large the search space is.
For the MCS problem: how many subsequences need to be examined?
For example, for -1, 2, 3, -3, 2, the subsequences that begin with -1 are:
-1
-1, 2
-1, 2, 3
-1, 2, 3, -3
-1, 2, 3, -3, 2
The ones beginning with 2 are:
2
2, 3
2, 3, -3
2, 3, -3, 2
Those beginning with 3 are:
3
3, -3
3, -3, 2
The ones beginning with -3:
-3
-3, 2
and beginning with the final 2, just one:
2
Then, including the empty sequence, a total of 16 subsequences are examined.
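The count can be reproduced by enumerating the search space directly. The generator below (the name contiguous_subsequences is made up for this sketch) yields the empty sequence and then every contiguous subsequence, grouped by starting position as in the listing above.

```python
def contiguous_subsequences(seq):
    """Yield the empty sequence, then every contiguous subsequence of seq."""
    yield []
    for i in range(len(seq)):
        for j in range(i, len(seq)):
            yield seq[i:j + 1]


subs = list(contiguous_subsequences([-1, 2, 3, -3, 2]))
print(len(subs))  # 5 + 4 + 3 + 2 + 1 non-empty ones, plus the empty sequence
```

For a sequence of length n, the same enumeration produces n(n+1)/2 + 1 subsequences.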
In general, given a_1, a_2, a_3, …, a_{n-1}, a_n
We have n sequences beginning with a_1:
a_1
a_1, a_2
a_1, a_2, a_3
…
a_1, a_2, a_3, …, a_{n-1}, a_n
n-1 beginning with a_2:
a_2
a_2, a_3
…
a_2, a_3, …, a_{n-1}, a_n
and so on. Then, two subsequences beginning with a_{n-1}:
a_{n-1}
a_{n-1}, a_n
and, finally, one beginning with a_n:
a_n
Total number of possible subsequences: 1 + 2 + … + (n-1) + n + 1 = n(n+1)/2 + 1, where the final 1 counts the empty sequence.
Analysis: The dominant term is n^2/2, hence the search space is O(n^2). A “brute-force” algorithm follows…
Algorithm MCSBruteForce
Input: A sequence a_1, a_2, a_3, …, a_{n-1}, a_n.
Output: value, start and end of MCS.
Set maxSum ← 0
for i = 1 to n do
    Set sum ← 0
    for j = i to n do
        sum ← sum + a_j
        if (sum > maxSum)
            maxSum ← sum
            start ← i
            end ← j
Print start, end, maxSum and STOP.
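A direct Python transcription of the brute-force pseudocode, as a sketch. It returns the result instead of printing it, and it keeps the 1-based start and end positions. Returning None for start and end when the MCS is empty is a convention assumed here, since the pseudocode leaves them unset in that case.

```python
def mcs_brute_force(a):
    """Return (max_sum, start, end) for a maximum contiguous subsequence of a.

    start and end are 1-based and inclusive; both are None if the MCS is empty.
    """
    max_sum, start, end = 0, None, None
    n = len(a)
    for i in range(n):
        running = 0  # sum of a[i..j], extended by one element per step
        for j in range(i, n):
            running += a[j]
            if running > max_sum:
                max_sum, start, end = running, i + 1, j + 1
    return max_sum, start, end


print(mcs_brute_force([-1, 2, 3, -3, 2]))
print(mcs_brute_force([-1, -2, -3, -4, -5, -6]))
```

Because the comparison is strict, the first MCS found is the one reported when several subsequences share the maximal value.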
Improved MCS Algorithm
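The idea: avoid looking at all the subsequences. To do so, introduce the following notion.
Given: a_i, a_{i+1}, …, a_k, a_{k+1}, …, a_j (1)
the subsequence: a_i, a_{i+1}, …, a_k
is a prefix of (1), where i ≤ k ≤ j.
The prefix sum is: a_i + a_{i+1} + … + a_k
Observation: In an MCS no prefix sum can be negative. If a prefix had a negative sum, dropping that prefix would leave a contiguous subsequence with a larger sum.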
In the previous example, -1, 2, 3, -3, 2, we exclude:
-1
-1, 2
-1, 2, 3
-1, 2, 3, -3
-1, 2, 3, -3, 2
and
-3
-3, 2
as possible candidates.
In general:
• If ever sum < 0, skip over index positions i+1, …, j.
• Also, if sum ≥ 0 always for a starting position i, none of the positions i+1, …, n is a candidate start position, since all prefix sums are non-negative.
The improved MCS algorithm inspects each a_i just once. The algorithm follows…
Algorithm MCSImproved
Set i ← 1; Set start ← end ← 1
Set maxSum ← sum ← 0
for j = 1 to n do
    sum ← sum + a_j
    if (sum > maxSum)
        maxSum ← sum
        start ← i
        end ← j
    if (sum < 0)
        i ← j + 1
        sum ← 0
Print start, end, maxSum and STOP.
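The improved pseudocode translates almost line for line into Python. This sketch returns the result instead of printing it and keeps the pseudocode's initial values, so an empty MCS is reported as value 0 with start = end = 1. The one-pass technique is often called Kadane's algorithm.

```python
def mcs_improved(a):
    """One-pass MCS: return (max_sum, start, end), 1-based and inclusive."""
    i = 1                      # candidate start position of the current run
    start = end = 1
    max_sum = running = 0
    for j, value in enumerate(a, start=1):
        running += value
        if running > max_sum:
            max_sum, start, end = running, i, j
        if running < 0:
            # A negative prefix cannot begin an MCS: restart after position j.
            i = j + 1
            running = 0
    return max_sum, start, end


print(mcs_improved([-1, 2, 3, -3, 2]))
print(mcs_improved([-1, 1, -1, 1, -1, 1]))
```

Each element is added to the running sum exactly once, which is where the single loop over n elements comes from.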
Analysis of the Algorithms
Algorithm MCSBruteForce:
• The outer loop is executed n times.
• For each i, the inner loop is executed n - i + 1 times.
• Thus, the total number of times the inner loop is executed: n + (n-1) + … + 2 + 1 = n(n+1)/2, which is O(n^2).
Algorithm MCSImproved:
• It has a single for loop, which visits all n elements.
• Hence, T(n) is O(n).