1
TCSS 342 Lecture Notes
Course Overview, Review of Math Concepts, Algorithm Analysis and Big-Oh Notation
Weiss book, Chapter 5
These lecture notes are copyright (C) Marty Stepp, 2005. They may not be rehosted, sold, or modified without expressed permission from the author. All rights reserved.
2
Lecture outline
Introduction and course objectives
Mathematical background review
  exponents and logarithms
  arithmetic and geometric series
Algorithm analysis and Big-Oh notation
  the RAM model
  Big-Oh
  Big-Omega, Big-Theta, Little-Oh, Little-Omega
Comparison of algorithms to solve the Maximum Contiguous Subsequence Sum problem
3
Course objectives
(broad) prepare you to be a good software engineer
(specific) learn basic data structures and algorithms
  data structures – how data is organized
  algorithms – unambiguous sequence of steps to compute something
  algorithm analysis – determining how long an algorithm will take to solve a problem
Who cares? Aren't computers fast enough and getting faster?
"Data Structures + Algorithms = Programs" -- Niklaus Wirth, author of the Pascal language
4
An example
Given an array of 1,000,000 integers (indexes 0, 1, 2, ..., 999,999), find the maximum integer in the array.
Now suppose we are asked to find the kth largest element. (The Selection Problem)
5
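Candidate solutions
candidate solution 1
  sort the entire array (from small to large), using Java's Arrays.sort()
  pick out the element at index 1,000,000 – k (counting from 0)
candidate solution 2
  place the first k elements into a sorted array of size k
  for each of the remaining 1,000,000 – k elements, keep the k largest in the array: if the new element is larger than the smallest of the k survivors, toss that smallest one and insert the new element in sorted position
  when finished, pick out the smallest of the k survivors; it is the kth largest element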
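The two candidate solutions can be sketched in Java roughly as follows. This is only an illustration, not the code developed in lecture: the class and method names are made up here, and both methods assume 1 <= k <= data.length.

```java
import java.util.Arrays;

class SelectionSketch {
    // Candidate solution 1: sort everything, then index from the end.
    // Assumes 1 <= k <= data.length.
    static int kthLargestBySorting(int[] data, int k) {
        int[] copy = Arrays.copyOf(data, data.length); // leave the caller's array untouched
        Arrays.sort(copy);                              // ascending order
        return copy[copy.length - k];                   // k = 1 gives the maximum
    }

    // Candidate solution 2: keep only the k largest values seen so far,
    // stored in ascending order so that keep[0] is the smallest survivor.
    static int kthLargestByKeepingK(int[] data, int k) {
        int[] keep = Arrays.copyOf(data, k);
        Arrays.sort(keep);
        for (int i = k; i < data.length; i++) {
            int x = data[i];
            if (x <= keep[0]) {
                continue; // x is not among the k largest so far
            }
            // Toss keep[0]: shift smaller survivors left to make room for x.
            int pos = 0;
            while (pos + 1 < k && keep[pos + 1] < x) {
                keep[pos] = keep[pos + 1];
                pos++;
            }
            keep[pos] = x;
        }
        return keep[0]; // smallest of the k survivors = kth largest overall
    }

    public static void main(String[] args) {
        int[] data = {7, 2, 9, 4, 9, 1};
        System.out.println(kthLargestBySorting(data, 3));
        System.out.println(kthLargestByKeepingK(data, 3));
    }
}
```

Both methods should agree on every input; they differ in how much work they do, which is the question the following slides raise.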
6
Is either solution good?
  What advantages and disadvantages does each solution have?
Is there a better solution?
What makes a solution "better" than another? Is it entirely based on runtime?
How would you go about determining which solution is better?
  could code them, test them
  could somehow make predictions and analysis of each solution, without coding
7
Why algorithm analysis?
as computers get faster and problem sizes get bigger, analysis will become more important
the difference between good and bad algorithms will get bigger
being able to analyze algorithms will help us identify good ones without having to program them and test them first
8
Why data structures?
when programming, you are an engineer
engineers have a bag of tools and tricks – and the knowledge of which tool is the right one for a given problem
Program loop runtimes
for (int i = 0; i < n; i += c)   // O(n)
    statement(s);
Adding to the loop counter means that the loop runtime grows linearly when compared to its maximum value n.
The loop executes its body ⌈n / c⌉ times (exactly n / c when c divides n).

for (int i = 1; i < n; i *= c)   // O(log n)
    statement(s);
Multiplying the loop counter means that the maximum value n must grow exponentially to linearly increase the loop runtime; therefore, it is logarithmic. (The counter must start at 1 and c must be at least 2: a counter that starts at 0 stays 0 when multiplied, so the loop would never end.)
The loop executes its body ⌈log_c n⌉ times.

for (int i = 0; i < n * n; i += c)   // O(n^2)
    statement(s);
The loop maximum is n^2, so the runtime is quadratic.
The loop executes its body ⌈n^2 / c⌉ times.
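One way to check these counts empirically is to replace the loop body with a counter. The sketch below does that for the three loop shapes. The class and method names are invented for illustration, the multiplicative loop starts its counter at 1 (a counter starting at 0 would never grow), and long is used so that n * n does not overflow for moderate n.

```java
class LoopCounts {
    // Additive step: counts iterations of for (i = 0; i < n; i += c).
    static long countAdditive(long n, long c) {
        long count = 0;
        for (long i = 0; i < n; i += c) {
            count++;
        }
        return count;
    }

    // Multiplicative step: counts iterations of for (i = 1; i < n; i *= c).
    // Assumes c >= 2; otherwise the counter never reaches n.
    static long countMultiplicative(long n, long c) {
        long count = 0;
        for (long i = 1; i < n; i *= c) {
            count++;
        }
        return count;
    }

    // Additive step with bound n * n: counts iterations of for (i = 0; i < n * n; i += c).
    static long countQuadratic(long n, long c) {
        long count = 0;
        for (long i = 0; i < n * n; i += c) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Doubling n roughly doubles the first count, adds one to the second,
        // and quadruples the third.
        for (long n = 1000; n <= 8000; n *= 2) {
            System.out.println(n + ": " + countAdditive(n, 2) + " "
                    + countMultiplicative(n, 2) + " " + countQuadratic(n, 2));
        }
    }
}
```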
43
More loop runtimes
Nesting loops multiplies their runtimes.
for (int i = 0; i < n; i += c) {       // O(n^2)
    for (int j = 0; j < n; j += c) {
        statement;
    }
}
Loops in sequence add together their runtimes, which means the loop set with the larger runtime dominates.
for (int i = 0; i < n; i += c) {       // O(n)
    statement;
}
for (int i = 0; i < n; i += c) {       // O(n log n)
    for (int j = 1; j < n; j *= c) {
        statement;
    }
}
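The multiply and add rules can be checked the same way, by counting how often the innermost statement runs. In this sketch the names are hypothetical, the inner loops update their own counter j, and the multiplicative inner loop starts at 1 so that it terminates.

```java
class NestedLoopCounts {
    // Two nested additive loops: the iteration counts multiply.
    static long countNested(long n, long c) {
        long count = 0;
        for (long i = 0; i < n; i += c) {
            for (long j = 0; j < n; j += c) {
                count++;
            }
        }
        return count;
    }

    // A linear loop followed by a linear-times-logarithmic nest:
    // the counts add, and the larger term dominates as n grows.
    // Assumes c >= 2 so that the inner loop terminates.
    static long countSequence(long n, long c) {
        long count = 0;
        for (long i = 0; i < n; i += c) {
            count++;
        }
        for (long i = 0; i < n; i += c) {
            for (long j = 1; j < n; j *= c) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countNested(100, 10));
        System.out.println(countSequence(1024, 2));
    }
}
```

For n = 1024 and c = 2, the first loop of countSequence contributes 512 steps and the nested pair contributes 512 * 10 = 5120, which shows how the O(n log n) part already outweighs the O(n) part.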
44
Loop runtime problems
Compute the exact value of the variable sum after the following code fragment, as a closed-form expression in terms of input size n.
int sum = 0;
for (int i = 1; i <= n; i *= 2) {
    sum++;
}
for (int i = 1; i <= 1000; i++) {
    sum++;
    sum++;
}
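One possible answer can be checked by running the fragment next to a proposed formula. The reasoning assumed here: the first loop visits i = 1, 2, 4, ... up to n, which is floor(log2 n) + 1 values when n >= 1, and the second loop always adds 2 * 1000. The class and method names are invented, and the check assumes n < 2^30 so that doubling i never overflows an int.

```java
class LoopProblemOne {
    // The fragment from the slide, wrapped in a method.
    // Assumes n < 2^30 so that i *= 2 never overflows.
    static int runFragment(int n) {
        int sum = 0;
        for (int i = 1; i <= n; i *= 2) {
            sum++;
        }
        for (int i = 1; i <= 1000; i++) {
            sum++;
            sum++;
        }
        return sum;
    }

    // Proposed closed form: floor(log2 n) + 1 + 2000 for n >= 1, and 2000 otherwise.
    static int closedForm(int n) {
        if (n < 1) {
            return 2000;
        }
        int floorLog2 = 31 - Integer.numberOfLeadingZeros(n);
        return floorLog2 + 1 + 2000;
    }

    public static void main(String[] args) {
        for (int n : new int[] {0, 1, 8, 1000}) {
            System.out.println(n + ": " + runFragment(n) + " vs " + closedForm(n));
        }
    }
}
```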
45
Loop runtime problems
Compute the exact value of the variable sum after the following code fragment, as a closed-form expression in terms of input size n.
int sum = 0;
for (int i = 1; i <= n; i++) {
    for (int j = 1; j <= i / 2; j += 2) {
        sum++;
    }
}
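A candidate closed form for this fragment can be tested against the loops themselves. The derivation assumed here: for a given i the inner loop visits the odd values j = 1, 3, 5, ... up to floor(i / 2), which is floor((i + 2) / 4) values; summing that over i = 1..n and writing n + 2 = 4q + r with 0 <= r <= 3 gives 2q(q - 1) + q(r + 1). The names below are hypothetical, and sum is widened to long only to keep the check safe for larger n.

```java
class LoopProblemTwo {
    // The fragment from the slide, wrapped in a method (sum widened to long).
    static long runFragment(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= i / 2; j += 2) {
                sum++;
            }
        }
        return sum;
    }

    // Proposed closed form, with n + 2 = 4q + r and 0 <= r <= 3. Assumes n >= 0.
    static long closedForm(int n) {
        long q = (n + 2L) / 4;
        long r = (n + 2L) % 4;
        return 2 * q * (q - 1) + q * (r + 1);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2, 6, 10, 100}) {
            System.out.println(n + ": " + runFragment(n) + " vs " + closedForm(n));
        }
    }
}
```

The leading term is about n^2 / 8, so the fragment is O(n^2) even though the inner loop skips every other value and only runs up to i / 2.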
46
"Maximum subsequence sum" problem
47
Maximum subsequence sum
The maximum subsequence sum problem:
Given a sequence of integers A_0, A_1, ..., A_{n-1}, find the maximum value of
$\sum_{k=i}^{j} A_k$
for any integers 0 ≤ i ≤ j < n.
(This sum is zero if all numbers in the sequence are negative.)
For example, the maximum subseq. sum of this list is 33:
[2, 9, -4, 1, -20, 3, 15, -10, 20, 5, -100, 10, 7]
48
Thoughts about the problem
This is a maximization problem.
There is a set of possible answers (sums of contiguous subsequences of the original array), and we are being asked to find the largest one.
When solving maximization problems, one possible technique is to generate all possible solutions, compare them, and choose the best one.
In the context of this problem, that would mean generating all possible contiguous subsequences of the original array, adding up each one to determine its sum, and comparing these sums to choose the largest.
brute force algorithm: A method for solving a problem that proceeds in a simple and obvious way, possibly requiring more steps than necessary.
49
First algorithm (brute force)
generate and test all possible subsequences

// implement together
function maxSubsequence(array[]):
    max sum = 0
    for each starting index i,
        for each ending index j,
            add up the sum from A_i to A_j
            if this sum is bigger than max sum, max sum = this sum
    return max sum

Using the RAM model, how many steps does this algorithm require?
What do you believe is the runtime (Big-Oh) of this algorithm? How could we empirically test this?
What is wasteful about the algorithm? How could it be improved?
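A direct Java rendering of the pseudocode might look like the sketch below. The class name is an assumption, and the result is defined as 0 for an empty or all-negative array, matching the problem statement. The three nested loops make the O(n^3) cost visible.

```java
class MaxSubsequenceBruteForce {
    // Tries every (i, j) pair and re-adds A[i..j] from scratch each time.
    static int maxSubsequence(int[] a) {
        int maxSum = 0;
        for (int i = 0; i < a.length; i++) {          // each starting index
            for (int j = i; j < a.length; j++) {      // each ending index
                int sum = 0;
                for (int k = i; k <= j; k++) {        // add up A[i] through A[j]
                    sum += a[k];
                }
                if (sum > maxSum) {
                    maxSum = sum;
                }
            }
        }
        return maxSum;
    }

    public static void main(String[] args) {
        int[] list = {2, 9, -4, 1, -20, 3, 15, -10, 20, 5, -100, 10, 7};
        System.out.println(maxSubsequence(list)); // prints 33
    }
}
```

To test the runtime empirically, one option is to time the method on random arrays of size n, 2n, and 4n and see whether the time grows by roughly a factor of 8 per doubling.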
50
More observations
The preceding algorithm is a brute force algorithm, but it isn't even the best brute force algorithm.
We are redundantly re-computing a large number of values many times.
For example, we compute the sum of the subsequence between indexes 2 and 5:
    A[2] + A[3] + A[4] + A[5]
Next we compute the sum of the subsequence between indexes 2 and 6:
    A[2] + A[3] + A[4] + A[5] + A[6]
We had already computed the sum of 2-5, but we compute it again as part of the 2-6 computation.
Assuming from the RAM model that each addition takes 1 unit of time, we are wasting 4 add operations!
51
Second algorithm (improved)
still try all possible combinations, but don't redundantly add the sums
key observation:
$\sum_{k=i}^{j} A_k = A_j + \sum_{k=i}^{j-1} A_k$
in other words, we don't need to throw away partial sums
can we use this information to remove one of the loops from our algorithm?
// implement together
function maxSubsequence2(array[]):
What is the runtime (Big-Oh) of this new algorithm? Can it still be improved further?
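One way to fill in maxSubsequence2 is sketched below; it is not necessarily the version written in class. It keeps a running sum for each starting index i, so extending the end index j by one costs a single addition, which removes the innermost loop and leaves an O(n^2) algorithm. The class name is an assumption.

```java
class MaxSubsequenceQuadratic {
    // For each start i, the sum of A[i..j] is built from the sum of A[i..j-1].
    static int maxSubsequence2(int[] a) {
        int maxSum = 0;
        for (int i = 0; i < a.length; i++) {
            int sum = 0;                       // partial sum of A[i..j], reused as j grows
            for (int j = i; j < a.length; j++) {
                sum += a[j];                   // one addition instead of a whole loop
                if (sum > maxSum) {
                    maxSum = sum;
                }
            }
        }
        return maxSum;
    }

    public static void main(String[] args) {
        int[] list = {2, 9, -4, 1, -20, 3, 15, -10, 20, 5, -100, 10, 7};
        System.out.println(maxSubsequence2(list)); // prints 33
    }
}
```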
52
Observations: Claim 1
To make our code even faster, we must avoid trying all possible combinations; to do this, we must find a way to broadly eliminate many potential combinations from consideration.
Claim #1: A subsequence with a negative sum cannot be the start of the maximum-sum subsequence.
53
Claim 1, continued
Claim #1, more formally: If A_{i,j} is a subsequence such that $\sum_{k=i}^{j} A_k < 0$, then there is no q > j such that A_{i,q} is the maximum-sum subsequence.
Proof: (do it together in class)
Can this help us produce a better algorithm?
54
Claim 2
Claim #2: when examining subsequences left-to-right, for some starting index i, if A_{i,j} becomes the first subsequence starting with i such that $\sum_{k=i}^{j} A_k < 0$, then no part of A_{i,j} can be part of the maximum-sum subsequence.
(Why is this a stronger claim than Claim #1?)
Proof: (do it together in class)
55
Claim 2, continued
These figures show the possible contents of Ai,j
56
Third (best) algorithm
Can we eliminate another loop from our algorithm?
// implement together
function maxSubsequence3(array[]):
What is its runtime (Big-Oh)?
Is there an even better algorithm than this third algorithm? Can you make a strong argument about why or why not?
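A possible single-loop version that follows from the two claims is sketched here; treat it as one reasonable reading rather than the code from lecture. Whenever the running sum drops below zero, everything accumulated so far is discarded and the next element becomes the new starting point, so each element is examined once and the runtime is O(n). The class name is an assumption.

```java
class MaxSubsequenceLinear {
    // Single pass: a running sum that goes negative can never help a later
    // subsequence, so it is reset to 0 and the scan continues.
    static int maxSubsequence3(int[] a) {
        int maxSum = 0;
        int sum = 0;
        for (int x : a) {
            sum += x;
            if (sum > maxSum) {
                maxSum = sum;
            } else if (sum < 0) {
                sum = 0; // start over at the next element
            }
        }
        return maxSum;
    }

    public static void main(String[] args) {
        int[] list = {2, 9, -4, 1, -20, 3, 15, -10, 20, 5, -100, 10, 7};
        System.out.println(maxSubsequence3(list)); // prints 33
    }
}
```

Any correct algorithm has to read every element at least once, which is one argument for why a linear scan cannot be beaten by more than a constant factor.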
57
Express the running time as f(N), where N is the size of the input
worst case: your enemy gets to pick the input
average case: need to assume a probability distribution on the inputs
amortized: your enemy gets to pick the inputs/operations, but you only have to guarantee speed over a large number of operations