Discrete Math

Discrete Mathematics: Week 1

Reference:

Johnsonbaugh, R., Discrete Mathematics, (6th edition), Pearson Prentice Hall, 2005.

The 5th edition of Johnsonbaugh may be used, but the 6th edition has some notation changes and some different problem numbering.

Logic and Proofs

Propositions

Logic is the study of reasoning. We can look at examples involving everyday sentences, and proceed to more formal mathematical approaches.

Consider the following sentences:

Adelaide is the capital of South Australia.

There are 30 hours in a day.

The square of 12 is 144.

Every even number greater than 2 can be expressed as the sum of two prime numbers (Goldbach’s conjecture).

Each of the statements is either true or false. The first and third are obviously true, and the second is obviously false. What do you think about the last?

A proposition is a statement that is either true or false (but not both). Whichever of these (true or false) is the case is called the truth value of the proposition.

Some statements cannot be considered as propositions e.g.

Fred is a nerd.

The truth value is not well defined. However propositions in mathematics are well defined.

Definition. Let p and q be propositions.

The conjunction of p and q, denoted p ∧ q, is the proposition

p and q.

The disjunction of p and q, denoted p ∨ q, is the proposition

p or q.


Example 1. If

p : It is raining,

q : It is cold,

then the conjunction of p and q is

p ∧ q : It is raining and it is cold.

The disjunction of p and q is

p ∨ q : It is raining or it is cold.

A binary operator on a set X assigns to each pair of elements in X an element of X. The operator ∧ assigns to each pair of propositions p and q the proposition p ∧ q. Thus, ∧ and ∨ are binary operators on propositions.

Definition. The truth value of the proposition p ∧ q is defined by the truth table

p q p ∧ q

T T T
T F F
F T F
F F F

In essence, p ∧ q is true provided that both p and q are true, and is false otherwise.

Definition. The truth value of the proposition p ∨ q is defined by the truth table

p q p ∨ q

T T T
T F T
F T T
F F F

In essence, p ∨ q is true provided that p or q (or both) are true, and is false otherwise.

Definition. The negation of p, denoted ¬ p, is the proposition

not p.

The truth value of the proposition ¬ p is defined by the truth table

p ¬ p

T F
F T

In English, we sometimes write ¬ p as “It is not the case that p.”
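The three truth tables above can be generated mechanically. The following is a minimal sketch, assuming Python's built-in `and`, `or` and `not` model ∧, ∨ and ¬; the helper name `truth_table` is hypothetical and not part of the notes.

```python
from itertools import product

def truth_table(op, arity):
    """Return rows (inputs..., result) for a Boolean operator of the given arity."""
    rows = []
    # product([True, False], ...) lists T before F, matching the tables above.
    for values in product([True, False], repeat=arity):
        rows.append(values + (op(*values),))
    return rows

conjunction = truth_table(lambda p, q: p and q, 2)   # p ∧ q
disjunction = truth_table(lambda p, q: p or q, 2)    # p ∨ q
negation = truth_table(lambda p: not p, 1)           # ¬ p

for row in conjunction:
    print(" ".join("T" if v else "F" for v in row))
```

Each row lists the inputs followed by the value of the compound proposition, in the same order as the tables in the definitions.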


Example 2. We have

p : The digit 1 occurs twice in the first 13 digits of π,

q : The digit 7 does not occur in the first 13 digits of π,

r : The first 13 digits of π sum to 60.

The compound proposition is

“Either 1 occurs twice in the first 13 digits of π and the digit 7 occurs at least once in the first 13 digits of π or the first 13 digits of π sum to 60.”

The proposition can be written symbolically as

(p ∧ ¬ q) ∨ r.

The first digits of π are

π = 3.141592653589 79323846 . . . .

Then

(p ∧ ¬ q) ∨ r = (T ∧ ¬T ) ∨ F

= (T ∧ F ) ∨ F

= F ∨ F

= F,

and so the compound proposition is false.
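The truth values used in Example 2 can be checked directly from the digits. This is a small sketch that hard-codes the first 13 digits of π as a string (3 followed by the 12 decimals shown above); the variable names are chosen only for illustration.

```python
digits = [int(c) for c in "3141592653589"]  # first 13 digits of pi

p = digits.count(1) == 2      # the digit 1 occurs twice
q = 7 not in digits           # the digit 7 does not occur
r = sum(digits) == 60         # the digits sum to 60 (they actually sum to 61)

result = (p and not q) or r   # (p ∧ ¬ q) ∨ r
print(p, q, r, result)        # True True False False
```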

Conditional Propositions and Logical Equivalence

Definition. If p and q are propositions, the proposition

if p then q

is called a conditional proposition and is denoted

p → q.

The proposition p is called the hypothesis (or antecedent) and the proposition q is called the conclusion (or consequent).

Example 3. The lecturer states that if a student gets more than 50% then the student will pass the course.

p : The student gets more than 50%,

q : The student passes the course.

If p and q are both true then p → q is true.

If p is true and q is false then p → q is false.


If p is false then p → q does not depend on the conclusion’s truth value, and so is regarded as true.

This last case often presents some difficulty. We can think of it in this way. If the student does not get more than 50%, the lecturer’s statement has not been broken, so we cannot regard p → q as false, and so it is considered true. This gives the following truth table.

Definition. The truth value of the conditional proposition p → q is defined by the following truth table:

p q p → q

T T T
T F F
F T T
F F T

Note that

p only if q

is considered logically the same as

if p then q.

An example of this is the two statements

“The student is eligible to take Maths 3 only if the student has passed Maths 2”

and

“If the student is eligible to take Maths 3 then the student has passed Maths 2,”

which are logically equivalent.

Definition. If p and q are propositions, the proposition

p if and only if q

is called a biconditional proposition and is denoted

p ↔ q.

The truth value of the proposition p ↔ q is defined by the following truth table:

p q p ↔ q

T T T
T F F
F T F
F F T

Note that p ↔ q means that p is a necessary and sufficient condition for q. The proposition “p if and only if q” can be written “p iff q”.

Definition. Suppose that the propositions P and Q are made up of the propositions $p_1, \ldots, p_n$. We say that P and Q are logically equivalent and write

P ≡ Q,

provided that, given any truth values of $p_1, \ldots, p_n$, either P and Q are both true, or P and Q are both false.


Example 4. Verify the first of De Morgan’s laws

¬ (p ∨ q) ≡ ¬ p ∧ ¬ q.

The second,

¬ (p ∧ q) ≡ ¬ p ∨ ¬ q,

will be a tutorial exercise.

Let

P = ¬ (p ∨ q),

Q = ¬ p ∧ ¬ q.

p q p ∨ q ¬ p ¬ q ¬ (p ∨ q) ¬ p ∧ ¬ q

T T T F F F F
T F T F T F F
F T T T F F F
F F F T T T T

The last two columns agree in every row.

Thus P and Q are logically equivalent.
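Logical equivalence can also be tested by brute force, since there are only finitely many rows in a truth table. The sketch below assumes propositions are modelled as Python functions of Boolean arguments; the helper `equivalent` is a hypothetical name introduced here.

```python
from itertools import product

def equivalent(P, Q, n):
    """True if P and Q agree for every assignment of truth values to n variables."""
    return all(P(*v) == Q(*v) for v in product([True, False], repeat=n))

# First De Morgan law: ¬(p ∨ q) ≡ ¬p ∧ ¬q
first_law = equivalent(lambda p, q: not (p or q),
                       lambda p, q: (not p) and (not q), 2)

# A non-equivalence for contrast: ¬(p ∨ q) versus ¬p ∨ ¬q
not_equiv = equivalent(lambda p, q: not (p or q),
                       lambda p, q: (not p) or (not q), 2)

print(first_law, not_equiv)  # True False
```

The check enumerates all $2^n$ rows, so it is practical only for a small number of variables.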

Definition. The converse of the conditional proposition p → q is the proposition q → p.

The contrapositive (or transposition) of the conditional proposition p → q is the proposition ¬ q → ¬ p.

Theorem 1. The conditional proposition p → q and its contrapositive ¬ q → ¬ p are logically equivalent.

Proof:

The truth table

p q p → q ¬ q ¬ p ¬ q → ¬ p

T T T F F T
T F F T F F
F T T F T T
F F T T T T

shows that p → q and ¬ q → ¬ p are logically equivalent.

Some theorems in mathematics are best proved by using the contrapositive. It is likely that you have seen some in matriculation mathematics or Engineering Mathematics 1 or 2E.

Exercise: Show that p → q ≡ ¬ p ∨ q.

Definition. A compound proposition is a tautology if it is true regardless of the truth values of its component propositions.

A compound proposition is a contradiction if it is false regardless of the truth values of its component propositions.

Example 5.

p ¬ p p ∨ ¬ p p ∧ ¬ p

T F T F
F T T F

So p ∨ ¬ p is a tautology, and p ∧ ¬ p is a contradiction.

Exercise: Show that (p ∧ (p → q)) → q is a tautology.


Quantifiers

Consider the statement

p : n is a prime number.

The statement p is not a proposition, because its truth value depends on n: p is true if n = 7, and false if n = 8.

Definition. Let P (x) be a statement involving the variable x and let D be a set. We call P a propositional function or predicate (with respect to D) if for each x in D, P (x) is a proposition. We call D the domain of discourse of P.

Example 6. Let P (n) be the statement

P (n) : n is a prime number,

and let D be the set of positive integers.

Then P is a propositional function with domain of discourse D since for each n in D, P (n) is a proposition which is either true or false. P (n) is true for n = 2, 3, 5, 7, . . . , and is false for n = 1, 4, 6, 8, . . . .

A propositional function P by itself is neither true nor false, but is true or false for each x in its domain of discourse.

Example 7. Let P (x) be the statement

P (x) : $x^2 - 5x + 6 = 0$,

and let D be the set of positive integers.

Then P is a propositional function and is true for x = 2 or x = 3, and is false for all other positive integers.

Definition. Let P be a propositional function with domain of discourse D. The statement

for every x, P (x)

is said to be a universally quantified statement. The symbol ∀ means “for every”. Thus the statement

for every x, P (x)

may be written

∀xP (x).

The symbol ∀ is called a universal quantifier.

The statement

∀xP (x)

is true if P (x) is true for every x in D. The statement

∀xP (x)

is false if P (x) is false for at least one x in D.


Example 8. The universally quantified statement “for every positive integer n greater than 1, $2^n - 1$ is prime” is false.

n = 2: $2^2 - 1 = 3$

n = 3: $2^3 - 1 = 7$

n = 4: $2^4 - 1 = 15 = 3 \times 5$.

We only need one counterexample to prove a universally quantified statement false. To prove it true, we must show that P (x) is true for every x in the domain of discourse.
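A search for a counterexample is easy to automate over a finite range. The sketch below is illustrative only: `is_prime` and `first_counterexample` are hypothetical helpers, and the range 2 to 19 is an arbitrary choice. A finite search can refute a universally quantified statement, but it cannot prove one over an infinite domain.

```python
def is_prime(k):
    """Trial-division primality test; adequate for small k."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True

def first_counterexample(P, domain):
    """Return the first x in domain with P(x) false, or None if there is none."""
    for x in domain:
        if not P(x):
            return x
    return None

n = first_counterexample(lambda n: is_prime(2**n - 1), range(2, 20))
print(n, 2**n - 1)  # 4 15
```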

Definition. Let P be a propositional function with domain of discourse D. The statement

there exists x, P (x)

is said to be an existentially quantified statement. The symbol ∃ means “there exists”. Thus the statement

there exists x, P (x)

may be written

∃xP (x).

The symbol ∃ is called an existential quantifier.

The statement

∃xP (x)

is true if P (x) is true for at least one x in D. The statement

∃xP (x)

is false if P (x) is false for every x in D.

Example 9. The existentially quantified statement “for some positive integer n, $2^n - 1$ is divisible by 11” is true.

n = 1: $2^1 - 1 = 1$

n = 2: $2^2 - 1 = 3$

n = 3: $2^3 - 1 = 7$

n = 4: $2^4 - 1 = 15$

n = 5: $2^5 - 1 = 31$

n = 6: $2^6 - 1 = 63$

n = 7: $2^7 - 1 = 127$

n = 8: $2^8 - 1 = 255$

n = 9: $2^9 - 1 = 511$

n = 10: $2^{10} - 1 = 1023 = 11 \times 93$.

The first case where $2^n - 1$ is divisible by 11 is n = 10.


The variable x in the propositional function P (x) is called a free variable, that is x is “free” to roam over the domain of discourse.

The variable x in the universally quantified statement

∀xP (x)

or in the existentially quantified statement

∃xP (x)

is a bound variable, in that x is “bound” by the quantifier.

Example 10. Verify that the existentially quantified statement “for some real number x, $\frac{1}{x^2 + 1} > 1$” is false.

We must show that $\frac{1}{x^2 + 1} > 1$ is false for all real numbers x.

Now $\frac{1}{x^2 + 1} > 1$ is false when $\frac{1}{x^2 + 1} \le 1$ is true. For every real number x,

$0 \le x^2$

$1 \le x^2 + 1$

$\frac{1}{x^2 + 1} \le 1$,

and so $\frac{1}{x^2 + 1} \le 1$ is true for all real numbers x.

Theorem 2. Generalised De Morgan Laws for Logic

If P is a propositional function, each pair of propositions in (a) and (b) has the same truth values (i.e. either both are true or both are false).

(a) ¬ (∀xP (x)); ∃x¬P (x)

(b) ¬ (∃xP (x)); ∀x¬P (x)

Proof of (a):

If ¬ (∀xP (x)) is true, then ∀xP (x) is false.

Hence P (x) is false for at least one x in the domain of discourse, and ¬P (x) is true for at least one x in the domain of discourse. Hence ∃x¬P (x) is true.

Similarly, if ¬ (∀xP (x)) is false, then ∃x¬P (x) is false.


Example 10. (Continued)

We have that P (x) is the statement $\frac{1}{x^2 + 1} > 1$, and aim to show that for all real numbers x, P (x) is false, i.e.

∃xP (x)

is false. We do this by verifying that for every real number x, ¬P (x) is true i.e.

∀x¬P (x)

is true. Then

∀x¬P (x) is true

¬ (∀x¬P (x)) is false

∃x¬ (¬P (x)) is false (by De Morgan’s laws)

∃xP (x) is false

Example 11. Consider the well-known proverb

All that glitters is not gold.

This can be interpreted in English in two ways:

Every object that glitters is not gold.

Some object that glitters is not gold.

The intention is that the second is correct.

If we let P (x) be the propositional function “x glitters” and Q(x) be the propositional function “x is gold”, then the first interpretation is represented as

∀x (P (x) → ¬Q(x)),

and the second interpretation is represented as

∃x (P (x) ∧ ¬Q(x)).

In a similar way in which the logical equivalence of the Exercise

p → q ≡ ¬ p ∨ q

was shown, we can show that

¬ (p → q) ≡ p ∧ ¬ q.

Hence

∃x (P (x) ∧ ¬Q(x)) ≡ ∃x¬ (P (x) → Q(x))

≡ ¬ (∀x (P (x) → Q(x)))

by De Morgan’s laws.

We can read this last line as “it is not true that for all x, if x glitters then x is gold”. This has been shown to be logically equivalent to “some object that glitters is not gold”.

The ambiguity comes from applying the negative to Q(x), rather than to the whole statement.



Nested Quantifiers

Example 1. Consider the statement

The sum of any two positive real numbers is positive.

This can be restated as: If x > 0 and y > 0, then x + y > 0. We need two universal quantifiers, and can write the statement symbolically as

∀x∀y ((x > 0) ∧ (y > 0) → (x + y > 0)).

The domain of discourse is the set of all real numbers. Multiple quantifiers such as ∀x∀y are said to be nested quantifiers.

The statement

∀x∀y P (x, y),

with domain of discourse D, is true if, for every x and for every y in D, P (x, y) is true.

The statement

∀x∀y P (x, y),

is false if there is at least one x and at least one y in D such that P (x, y) is false.


Example 2. Consider the statement

For any real number x, there is at least one real number y such that x + y = 0.

We know that this is true, as we can always choose y to be −x. We can write the statement symbolically as

∀x ∈ R (∃y ∈ R, x + y = 0).

The domain of discourse is the set of all real numbers.

The statement

∀x∃y P (x, y),

with domain of discourse D, is true if, for every x in D, there is at least one y in D for which P (x, y) is true.

The statement

∀x∃y P (x, y),

is false if there is at least one x in D such that P (x, y) is false for every y in D.


Example 3. Consider the nested quantifier

∃y ∈ R, (∀x ∈ R, x + y = 0).

This can be stated as

There is some real number y such that x + y = 0 for all real numbers x.

This is false: for any given y, choosing x to be 1 − y gives x + y = 1, which is not 0.

The statement

∃x∀y P (x, y),

with domain of discourse D, is true if there is at least one x in D such that P (x, y) is true for every y in D.

The statement

∃x∀y P (x, y),

is false if, for every x in D, there is at least one y in D such that P (x, y) is false.


Example 4. Consider the statement

∃x∃y ((x > 1) ∧ (y > 1) ∧ (xy = 6)),

with domain of discourse the set of positive integers. This statement is true as there is at least one positive integer x and at least one positive integer y, both greater than 1, such that xy = 6 e.g. x = 2, y = 3.

The statement

∃x∃y P (x, y),

with domain of discourse D, is true if there is at least one x in D and at least one y in D such that P (x, y) is true.

The statement

∃x∃y P (x, y),

is false if, for every x in D and for every y in D, P (x, y) is false.

Example 5. Using the generalized De Morgan laws for logic, the negation of

∀x∃y P (x, y)

is

¬ (∀x∃y P (x, y)) ≡ ∃x¬ (∃y P (x, y)) ≡ ∃x∀y ¬P (x, y).

Note that in the negation, ∀ and ∃ are interchanged.
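On a finite domain, nested quantifiers correspond to nested `all` and `any` calls in Python. The sketch below assumes a small finite stand-in for the domain of discourse (the integers from −3 to 3) rather than the real numbers, so it illustrates the definitions without proving anything about R.

```python
D = range(-3, 4)  # finite stand-in for the domain of discourse (an assumption)

def P(x, y):
    return x + y == 0

forall_exists = all(any(P(x, y) for y in D) for x in D)   # ∀x∃y P(x, y)
exists_forall = any(all(P(x, y) for x in D) for y in D)   # ∃y∀x P(x, y)

# Negation of ∀x∃y P(x, y), written in the form ∃x∀y ¬P(x, y).
negated = any(all(not P(x, y) for y in D) for x in D)

print(forall_exists, exists_forall, negated)  # True False False
```

Swapping the order of the quantifiers changes the truth value, and the negated form always has the opposite truth value to ∀x∃y P (x, y).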


Proofs

A mathematical system consists of

axioms which are assumed true;

definitions which are used to create new concepts in terms of existing ones;

undefined terms which are not specifically defined but which are implicitly defined by the axioms.

Within a mathematical system we can derive theorems.

A theorem is a proposition that has been proved to be true.

A lemma is a theorem that is not too interesting in its own right but is useful in proving another theorem.

A corollary is a theorem that follows quickly from another theorem.

A proof is an argument that establishes the truth of a theorem.

Example 6. The real numbers furnish an example of a mathematical system. Among the axioms are:

• For all real numbers x and y, xy = yx.

• There is a subset P of real numbers satisfying

(a) If x and y are in P , then x + y and xy are in P .

(b) If x is a real number, then exactly one of the following statements is true:

x is in P , x = 0, −x is in P .

Multiplication is implicitly defined by the first axiom.

The elements of P are called positive real numbers.

The absolute value |x| of a real number x is defined to be x if x is positive or 0, and −x otherwise.

Example 7. Theorems about real numbers are

• x · 0 = 0 for every real number x.

• For all real numbers x, y and z, if x ≤ y and y ≤ z, then x ≤ z.

Example 8. An example of a lemma about real numbers is

• If n is a positive integer, then either n − 1 is a positive integer or n − 1 = 0.

This is not too interesting in its own right, but it can be used to prove other results.


Theorems are often of the form

For all $x_1, x_2, \ldots, x_n$, if $p(x_1, x_2, \ldots, x_n)$, then $q(x_1, x_2, \ldots, x_n)$.

This universally quantified statement is true provided that the conditional proposition

if $p(x_1, x_2, \ldots, x_n)$, then $q(x_1, x_2, \ldots, x_n)$

is true for all $x_1, x_2, \ldots, x_n$ in the domain of discourse.

A direct proof assumes that $p(x_1, x_2, \ldots, x_n)$ is true and, using this together with axioms, definitions and previously derived theorems, shows directly that $q(x_1, x_2, \ldots, x_n)$ is true.

Example 9. A particular lemma is

The product of two odd integers is odd.

Proof:

Let the two odd integers be 2m + 1 and 2n + 1, where m and n are integers. Their product is

(2m + 1)(2n + 1) = 4mn + 2m + 2n + 1

= 2(2mn + m + n) + 1,

which is odd.

A second technique of proof is proof by contradiction.

A proof by contradiction establishes p → q by assuming that the hypothesis p is true and that the conclusion q is false, and then, using p and ¬ q as well as axioms, definitions and theorems, deriving a contradiction. A contradiction is a proposition of the form r ∧ ¬ r. This is sometimes called an indirect proof.

Proof by contradiction is justified by noting that the propositions

p → q and p ∧ ¬ q → r ∧ ¬ r

are equivalent. The truth table is

p q r p → q p ∧ ¬ q r ∧ ¬ r p ∧ ¬ q → r ∧ ¬ r

T T T T F F T
T T F T F F T
T F T F T F F
T F F F T F F
F T T T F F T
F T F T F F T
F F T T F F T
F F F T F F T


Example 10. Prove that the root mean square of two numbers a and b, a > 0 and b > 0, is greater than or equal to the arithmetic mean, i.e.

$\sqrt{\frac{a^2 + b^2}{2}} \ge \frac{a + b}{2}$.

Proof:

Assume

$\sqrt{\frac{a^2 + b^2}{2}} < \frac{a + b}{2}$.

Since both sides are positive, we can square without changing the direction of the inequality.

$\frac{a^2 + b^2}{2} < \frac{(a + b)^2}{4}$

$2(a^2 + b^2) < a^2 + 2ab + b^2$

$a^2 - 2ab + b^2 < 0$

$(a - b)^2 < 0$.

This is a contradiction, since the square of a real number is never negative, and hence

$\sqrt{\frac{a^2 + b^2}{2}} \ge \frac{a + b}{2}$.

Example 11.

Theorem: $\sqrt{2}$ is irrational, that is $\sqrt{2}$ cannot be represented as $\frac{m}{n}$, where m and n are integers.

The hypotheses are that rational numbers ($\frac{m}{n}$ where m and n are integers with no common factors, n ≠ 0) and square root are defined.

Proof:

Assume that $\sqrt{2}$ is rational, so that $\sqrt{2} = \frac{m}{n}$, where m and n are integers with no common factors and n ≠ 0. Then

$\sqrt{2} = \frac{m}{n}$

$2 = \frac{m^2}{n^2}$

$m^2 = 2n^2$.

It is an easily proved lemma that if $m^2$ is even, then m is even. Since $m^2 = 2n^2$ is even, m is even, so m = 2k for some integer k, and $m^2 = 4k^2$. Then

$4k^2 = 2n^2$

$2k^2 = n^2$,

and so $n^2$ is even, and hence n is even. Hence m and n have a common factor, namely 2, and so there is a contradiction. Hence $\sqrt{2}$ is irrational.


Proof by contrapositive is based on the fact that

p → q ≡ ¬ q → ¬ p.

The idea is to show that the negation of the conclusion implies the negation of the hypothesis.

Example 12.

Theorem: If x and y are real numbers and x + y ≥ 2, then either x ≥ 1 or y ≥ 1.

Proof:

Let p be “x + y ≥ 2” and q be “either x ≥ 1 or y ≥ 1”.

Assume ¬ q: x < 1 and y < 1.

Then x + y < 1 + 1, or x + y < 2, which is ¬ p. Proven.

Proof by cases is used when the original hypothesis naturally divides into various cases.

Example 13.

Theorem: |x + y| ≤ |x| + |y| for all real x and y.

Proof:

Consider the four cases as follows, where each of x, y is nonnegative or negative.

1. x, y ≥ 0:

Then x + y ≥ 0, so |x + y| = x + y = |x| + |y|.

2. x, y < 0:

Then x + y < 0, so |x + y| = −(x + y) = −x − y = |x| + |y|.

3. x ≥ 0, y < 0:

Then x + y < x < |x| + |y|, and

−(x + y) = −x − y ≤ −y = |y| ≤ |x| + |y|.

Since |x + y| equals either x + y or −(x + y), it follows that |x + y| ≤ |x| + |y|.

4. x < 0, y ≥ 0:

The same as case 3, swapping the roles of x and y.

Another form of proof is called an existence proof. An example of this is if we wanted to show that

∃xP (x)

is true. It is only necessary to find a member x in the domain of discourse for which P (x) is true.


Definition. An argument is a sequence of propositions written

$p_1$

$p_2$

⋮

$p_n$

∴ q

or

$p_1, p_2, \ldots, p_n$ / ∴ q.

The propositions $p_1, p_2, \ldots, p_n$ are called the hypotheses and the proposition q is called the conclusion. The argument is valid provided that if $p_1, p_2, \ldots, p_n$ are all true, then q must be true; otherwise the argument is invalid (or a fallacy).

Example 14. Determine whether the argument

p → q

¬ q

∴ ¬ p

is valid.

(a) Using a truth table:

p q p → q ¬ q ¬ p

T T T F F
T F F T F
F T T F T
F F T T T

Note that the important line is the last. Why?

(b) A verbal argument proceeds: ¬ q is true when q is false, so p → q is only true when p is false (as q is false). Hence ¬ p is true.

Example 15. Is the following argument valid?

If I don’t study hard, then I don’t get high distinctions.
I study hard.
∴ I get high distinctions.

Let p be “I study hard”, and let q be “I get high distinctions”. Then the argument is

¬ p → ¬ q

p

∴ q

(a) The truth table is

p q ¬ p → ¬ q p q

T T T T T
T F T T F
F T F F T
F F T F F

The first two lines, where both hypotheses are true, are the important ones. In the second line the conclusion q is false, so it is an invalid argument.

(b) Alternatively, assume that p is true and q is false. Then ¬ p → ¬ q ≡ F → T ≡ T, so that both hypotheses can be true with a false conclusion.
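The truth-table test for validity used in Examples 14 and 15 can be written as a short program. This is a minimal sketch, assuming hypotheses and conclusion are modelled as Python functions of Boolean arguments; `implies` and `is_valid` are hypothetical helper names, and `implies` uses the equivalence p → q ≡ ¬ p ∨ q stated earlier as an exercise.

```python
from itertools import product

def implies(a, b):
    """Truth value of a → b, using a → b ≡ ¬a ∨ b."""
    return (not a) or b

def is_valid(hypotheses, conclusion, n):
    """Valid if the conclusion is true in every row where all hypotheses are true."""
    for v in product([True, False], repeat=n):
        if all(h(*v) for h in hypotheses) and not conclusion(*v):
            return False  # found a row with true hypotheses and a false conclusion
    return True

# Example 14: p → q, ¬q / ∴ ¬p
example_14 = is_valid([lambda p, q: implies(p, q), lambda p, q: not q],
                      lambda p, q: not p, 2)

# Example 15: ¬p → ¬q, p / ∴ q
example_15 = is_valid([lambda p, q: implies(not p, not q), lambda p, q: p],
                      lambda p, q: q, 2)

print(example_14, example_15)  # True False
```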

Mathematical Induction

Example 16. An arithmetic teacher sets his class the problem of adding up all the integers from 1 to 100. Can this be done in under 10 seconds? There is a piece of folklore relating this to the mathematician Karl Friedrich Gauss (1777–1855) as a boy.

If we pair the numbers

1 + 2 + 3 + · · · + 49 + 50
100 + 99 + 98 + · · · + 52 + 51

we can see that each vertical pair adds to 101. Since there are 50 pairs, the sum is 50 × 101 = 5050.

The general case, the sum of all integers from 1 to n, is

$1 + 2 + 3 + \cdots + n = \frac{n(n + 1)}{2}$, n = 1, 2, 3, . . . .

The Principle of Mathematical Induction is a process by which a set of theorems corresponding to the non-negative integers can be proven.

Definition. Suppose that we have a propositional function S(n) whose domain of discourse is the set of positive integers. Suppose that

S(1) is true;

for all n ≥ 1, if S(n) is true, then S(n + 1) is true.

Then S(n) is true for every positive integer n.

The first part, S(1) is true, is called the Basis Step, and the second part is called the Inductive Step. It is not necessary to start with n = 1; sometimes a starting value of n = 0, 2, 3, . . . will occur.


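For the sum of the integers from 1 to n, let S(n) be the statement $1 + 2 + 3 + \cdots + n = \frac{n(n + 1)}{2}$.

Basis step: For n = 1, LHS = 1 and RHS = $\frac{1 \times 2}{2} = 1$. True.

Inductive step: Assume that

$1 + 2 + 3 + \cdots + n = \frac{n(n + 1)}{2}$.

Then

$1 + 2 + 3 + \cdots + n + (n + 1) = \frac{n(n + 1)}{2} + (n + 1)$, by the assumption

$= \frac{1}{2}(n + 1)(n + 2)$, factorise whenever possible

$= \frac{(n + 1)(n + 2)}{2}$, correct form.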

Hence, by the Principle of Mathematical Induction, S(n) is true for all n ≥ 1.

Example 17. Show that

$1 + 4 + 7 + 10 + \cdots + (3n - 2) = \frac{n(3n - 1)}{2}$, n ≥ 1.

Basis step: For n = 1, LHS = 1 and RHS = $\frac{1 \times 2}{2} = 1$. True.

Inductive step: Assume that

$1 + 4 + 7 + \cdots + (3n - 2) = \frac{n(3n - 1)}{2}$.

Then

$1 + 4 + 7 + \cdots + (3n - 2) + (3n + 1) = \frac{n(3n - 1)}{2} + (3n + 1)$, by the assumption

$= \frac{1}{2}\left(3n^2 - n + 6n + 2\right)$, can’t factorise here

$= \frac{1}{2}\left(3n^2 + 5n + 2\right)$

$= \frac{(n + 1)(3n + 2)}{2}$

$= \frac{(n + 1)(3(n + 1) - 1)}{2}$, correct form.

Hence, by the Principle of Mathematical Induction, S(n) is true for all n ≥ 1.



Mathematical Induction (Continued)

In the examples so far, the correct formula was given in advance. How do we find the correct formula?

Experiment to find a pattern e.g. sum of the odd integers.

$S_{2n-1} = 1 + 3 + 5 + 7 + \cdots + (2n - 1)$.

n $S_{2n-1}$

1 1
2 4
3 9
4 16
⋮ ⋮

It appears that $S_{2n-1} = n^2$.
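Experimenting of this kind is quick to do by computer. The sketch below tabulates the sum of the first n odd integers and compares it with $n^2$ over an arbitrarily chosen range; the function name `odd_sum` is hypothetical. Agreement over a finite range suggests the pattern but does not prove it, which is why induction is still needed.

```python
def odd_sum(n):
    """Sum of the first n odd positive integers, 1 + 3 + ... + (2n - 1)."""
    return sum(2 * k - 1 for k in range(1, n + 1))

table = [(n, odd_sum(n)) for n in range(1, 7)]
print(table)  # [(1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36)]

# Compare with the conjectured closed form n^2 for n = 1, ..., 199.
matches = all(odd_sum(n) == n * n for n in range(1, 200))
print(matches)  # True
```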


Example 1. Show that

S(n) : $1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n + 1)(2n + 1)}{6}$, n ≥ 1,

is true.

Basis Step: For n = 1, LHS = 1 and RHS = $\frac{1 \times 2 \times 3}{6} = 1$. True.

Inductive Step: Assume that

$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n + 1)(2n + 1)}{6}$.

Then

$1^2 + 2^2 + 3^2 + \cdots + n^2 + (n + 1)^2$

$= \frac{n(n + 1)(2n + 1)}{6} + (n + 1)^2$, by the assumption

$= \frac{1}{6}(n + 1)\left(2n^2 + n + 6(n + 1)\right)$, factorize whenever possible

$= \frac{1}{6}(n + 1)\left(2n^2 + 7n + 6\right)$

$= \frac{(n + 1)(n + 2)(2n + 3)}{6}$

$= \frac{(n + 1)(n + 2)[2(n + 1) + 1]}{6}$, correct form.

Since the Basis Step and Inductive Step have been verified, by the Principle of Mathematical Induction, S(n) is true for all n ≥ 1.



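Example 2. Show that

S(n) : $\frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{n(n + 1)} = \frac{n}{n + 1}$, n ≥ 1,

is true.

Work a few terms:

$\frac{1}{1 \cdot 2} = \frac{1}{2}$

$\frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} = \frac{1}{2} + \frac{1}{6} = \frac{2}{3}$

$\frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} = \frac{1}{2} + \frac{1}{6} + \frac{1}{12} = \frac{3}{4}$

Basis Step: For n = 1, LHS = $\frac{1}{1 \cdot 2} = \frac{1}{2}$ and RHS = $\frac{1}{2}$. True.

Inductive Step: Assume that

$\frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{n(n + 1)} = \frac{n}{n + 1}$.

Then

$\frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{n(n + 1)} + \frac{1}{(n + 1)(n + 2)}$

$= \frac{n}{n + 1} + \frac{1}{(n + 1)(n + 2)}$, by the assumption

$= \frac{1}{(n + 1)(n + 2)}\left[n(n + 2) + 1\right]$, factorize whenever possible

$= \frac{1}{(n + 1)(n + 2)}(n + 1)^2$

$= \frac{n + 1}{n + 2}$, correct form.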


Example 3. Divisibility example, Johnsonbaugh 1.7.5.

Show that $5^n - 1$ is divisible by 4 for all n ≥ 1.

Basis Step: For n = 1, $5^1 - 1 = 4$ is divisible by 4. True.

Inductive Step: Assume that $5^n - 1$ is divisible by 4.

Then we wish to prove that $5^{n+1} - 1$ is divisible by 4.

$5^{n+1} - 1 = 5 \times 5^n - 1$

$= (5^n - 1) + 4 \times 5^n$.

The first part is divisible by 4 by the assumption, and $4 \times 5^n$ is divisible by 4, hence true.

Alternatively, put $5^n - 1 = 4m$, where m is an integer. Then

$5^{n+1} - 1 = 5 \times 5^n - 1$

$= 5 \times (4m + 1) - 1$, by the assumption

$= 4(5m) + 4$,

which is divisible by 4.


Example 4. Geometric sum, Johnsonbaugh 1.7.4.

Use induction to show that, if r ≠ 1,

$a + ar^1 + ar^2 + \cdots + ar^n = \frac{a(r^{n+1} - 1)}{r - 1}$

for all n ≥ 0.

Basis Step: For n = 0, LHS = a and RHS = $\frac{a(r - 1)}{r - 1} = a$. True.

Inductive Step: Assume that

$a + ar^1 + ar^2 + \cdots + ar^n = \frac{a(r^{n+1} - 1)}{r - 1}$.

Then

$a + ar^1 + ar^2 + \cdots + ar^n + ar^{n+1}$

$= \frac{a(r^{n+1} - 1)}{r - 1} + ar^{n+1}$, by the assumption

$= \frac{a(r^{n+1} - 1) + ar^{n+1}(r - 1)}{r - 1}$

$= \frac{a(r^{n+1} - 1 + r^{n+2} - r^{n+1})}{r - 1}$

$= \frac{a(r^{n+2} - 1)}{r - 1}$.


Example 5. Show that

S(n) : $4^n > 5n^2$, n ≥ 3,

is true.

This time the formula is true for n ≥ 3. For n = 1 we have $4^1 \not> 5$ and for n = 2, $4^2 \not> 5 \times 2^2$.

Basis Step: For n = 3, LHS = $4^3 = 64$ and RHS = $5 \times 3^2 = 45$. True.

Inductive Step: Assume that $4^n > 5n^2$, n ≥ 3.

We want to show that $4^{n+1} > 5(n + 1)^2$. Then

$4^{n+1} = 4(4^n)$

$> 4(5n^2)$, by the assumption

$= 5(n^2 + 2n^2 + n^2)$

$> 5(n^2 + 2n + 1)$, since n ≥ 3

$= 5(n + 1)^2$.


Example 6. Tiling with trominos.

Solomon W. Golomb introduced polyominos in 1954. They can be used in tiling problems. For example, the game Tetris uses tetrominos, which will not tile a rectangle but which can tile the plane. There are five free tetrominos, and seven if they are considered to be one-sided (see figure).

The Dutch artist M.C. Escher produced many woodcut and lithograph art works of tiling problems.

There are two trominos, the right and straight trominos. We will henceforth refer to the right tromino just as a tromino.


Trominos can tile a deficient board, that is an n × n board with one square missing, provided n ≠ 3k. We can see that if n = 3k + 1, then

$n^2 - 1 = (3k + 1)^2 - 1$

$= 9k^2 + 6k$,

and is divisible by 3.

If n = 3k + 2, then

$n^2 - 1 = (3k + 2)^2 - 1$

$= 9k^2 + 12k + 3$,

and is divisible by 3.

We can tile all deficient boards with n ≠ 3k, except for some deficient 5 × 5 boards; see the figure below. Here we will prove by mathematical induction that all $2^n \times 2^n$ deficient boards can be tiled with trominos.

Basis Step: For n = 1, the 2 × 2 deficient board is a tromino and can be tiled.

Inductive Step: Assume that any $2^n \times 2^n$ deficient board can be tiled.

Then we can divide the $2^{n+1} \times 2^{n+1}$ deficient board into four $2^n \times 2^n$ boards. One of these contains the missing square, which may be anywhere. Placing one tromino at the centre, as shown in the figure, covers the corner square of each of the other three boards, so each of them can be treated as a deficient board with its missing square in a corner.

We can tile all four $2^n \times 2^n$ deficient boards by the hypothesis, and can hence tile the $2^{n+1} \times 2^{n+1}$ deficient board.

Since the Basis Step and Inductive Step have been verified, by the Principle of Mathematical Induction, we can tile any $2^n \times 2^n$ deficient board.
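The inductive step is constructive, so it translates directly into a recursive tiling procedure. The following is a sketch under stated assumptions: the board is a list of lists indexed by (row, column), the missing square is labelled 0, each tromino gets its own positive label, and the names `tile` and `solve` are hypothetical. A 1 × 1 board consisting only of the missing square is used as the stopping case, so a 2 × 2 board receives exactly one tromino.

```python
def tile(size, missing):
    """Tile a size x size board (size a power of 2) with one missing square.

    Returns a grid in which the missing square is 0 and the three cells of
    each tromino share a positive label.
    """
    board = [[None] * size for _ in range(size)]
    board[missing[0]][missing[1]] = 0
    counter = [0]  # number of trominos placed so far

    def solve(top, left, n, hole_r, hole_c):
        if n == 1:
            return  # a single square that is already covered or missing
        half = n // 2
        counter[0] += 1
        label = counter[0]
        mid_r, mid_c = top + half, left + half
        # Visit the four quadrants: (0,0) top-left, (0,1) top-right, ...
        for dr, dc in ((0, 0), (0, 1), (1, 0), (1, 1)):
            q_top, q_left = top + dr * half, left + dc * half
            in_quadrant = (q_top <= hole_r < q_top + half and
                           q_left <= hole_c < q_left + half)
            if in_quadrant:
                solve(q_top, q_left, half, hole_r, hole_c)
            else:
                # The central tromino covers this quadrant's corner square
                # nearest the centre; treat it as the quadrant's missing square.
                r, c = mid_r - 1 + dr, mid_c - 1 + dc
                board[r][c] = label
                solve(q_top, q_left, half, r, c)

    solve(0, 0, size, missing[0], missing[1])
    return board

for row in tile(4, (0, 1)):
    print(" ".join(f"{v:2d}" for v in row))
```

A board of side $2^n$ uses $(4^n - 1)/3$ trominos, one for each recursive call on a board of side at least 2.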

All examples so far have involved the Weak Form of Mathematical Induction, where we show that if S(n) is true, then S(n + 1) is true. The Strong Form of Mathematical Induction allows us to assume the truth of all of the preceding statements.

Definition. Strong Form of Mathematical Induction

Suppose that we have a propositional function S(n) whose domain of discourse is the set of integers greater than or equal to $n_0$. Suppose that

S(n0) is true;

for all n > n0, if S(k) is true for all k, n0 ≤ k < n, then S(n) is true.

Then S(n) is true for every integer n ≥ n0.

Example 7. Recurrence relation example.

Consider the sequence $a_1, a_2, a_3, \ldots$ with $a_1 = 1$, $a_2 = 3$ and $a_{n+1} = 3a_n - 2a_{n-1}$. Then

$a_3 = 3 \times 3 - 2 \times 1 = 7$

$a_4 = 3 \times 7 - 2 \times 3 = 15$

$a_5 = 3 \times 15 - 2 \times 7 = 31$.

It would seem that $a_n = 2^n - 1$.

Basis Steps: For n = 1, 2, $a_1 = 1 = 2^1 - 1$ and $a_2 = 3 = 2^2 - 1$. We require the two preceding statements to be true, because the recurrence uses the two preceding terms.

Inductive Step: Assume that $a_i = 2^i - 1$ for 1 ≤ i ≤ n, where n ≥ 2.

We want to prove that $a_{n+1} = 2^{n+1} - 1$. Now

$a_{n+1} = 3a_n - 2a_{n-1}$, using the definition

$= 3(2^n - 1) - 2(2^{n-1} - 1)$, using the assumption

$= 2^n(3 - 1) - 3 + 2$

$= 2^{n+1} - 1$.

Since the Basis Steps and Inductive Step have been verified, by the Principle of Mathematical Induction, the formula $a_n = 2^n - 1$ is true for all n ≥ 1.
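The conjectured closed form can be compared with the recurrence numerically. This sketch computes terms iteratively from the two starting values; the function name `a` mirrors the sequence name, and the range checked is an arbitrary choice.

```python
def a(n):
    """n-th term of a_1 = 1, a_2 = 3, a_{n+1} = 3 a_n - 2 a_{n-1}."""
    prev, curr = 1, 3          # a_1 and a_2
    if n == 1:
        return prev
    for _ in range(n - 2):     # advance from a_2 up to a_n
        prev, curr = curr, 3 * curr - 2 * prev
    return curr

terms = [a(n) for n in range(1, 8)]
print(terms)  # [1, 3, 7, 15, 31, 63, 127]
print(all(a(n) == 2**n - 1 for n in range(1, 50)))  # True
```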

Example 8. Show that postage of six cents or more can be achieved by using only 2-cent and 7-cent stamps.

Basis Steps: For n = 6, 7. For six cent postage, use three 2-cent stamps and for seven cent postage, use one 7-cent stamp.

Inductive Step: Assume n ≥ 8 and assume that postage of k cents can be achieved by using only 2-cent and 7-cent stamps for 6 ≤ k < n.

Since n − 2 ≥ 6, by the assumption we can make postage of n − 2 cents. Then add a 2-cent stamp to make postage of n cents. The inductive step is complete.

Since the Basis Steps and Inductive Step have been verified, by the Principle of Mathematical Induction we can make postage for all n ≥ 6.
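The proof tells us how to build the postage: the two basis cases give explicit answers, and every other amount is obtained from the amount two cents smaller. A minimal recursive sketch follows; the function name `stamps` and its return convention (number of 2-cent stamps, number of 7-cent stamps) are assumptions made for illustration.

```python
def stamps(n):
    """Return (twos, sevens) with 2*twos + 7*sevens == n, for n >= 6."""
    if n < 6:
        raise ValueError("the argument covers postage of 6 cents or more")
    if n == 6:
        return (3, 0)   # basis: three 2-cent stamps
    if n == 7:
        return (0, 1)   # basis: one 7-cent stamp
    twos, sevens = stamps(n - 2)  # inductive step: n - 2 >= 6
    return (twos + 1, sevens)

print(stamps(6), stamps(7), stamps(13))  # (3, 0) (0, 1) (3, 1)
```

The recursion mirrors the strong induction: even amounts reduce to the basis case 6 and odd amounts to the basis case 7.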

Example 9. Prime numbers example.

Show that every positive integer greater than 1 is the product of primes. Let S(n) be the statement “n is the product of primes”.

Basis Step: For n = 2, 2 is prime, and is the product of one number, itself.

Inductive Step: Assume that i is the product of primes for 2 ≤ i ≤ n.

If n + 1 is prime, it is the product of one prime.

If n + 1 is not prime, then n + 1 = pq where p and q are integers,

2 ≤ p ≤ q ≤ n.

Since by the assumption, p and q are the product of primes, then n + 1 = pq is the product of primes.

Since the Basis Step and Inductive Step have been verified, by the Principle of Mathematical Induction S(n) is true for all n ≥ 2.
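The strong-induction argument is also a recipe for factorising: either the number is prime, or it splits into two smaller factors that are factorised in turn. The sketch below follows that structure with trial division; `prime_factors` is a hypothetical name, and the method is only practical for small inputs.

```python
def prime_factors(n):
    """Return a list of primes whose product is n, for n >= 2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    p = 2
    while p * p <= n:
        if n % p == 0:
            # n = p * q with 2 <= p <= q < n: factor both parts recursively,
            # as the inductive step does.
            return prime_factors(p) + prime_factors(n // p)
        p += 1
    return [n]  # no divisor found, so n is prime: a product of one prime

print(prime_factors(360))  # [2, 2, 2, 3, 3, 5]
print(prime_factors(97))   # [97]
```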


The Language of Mathematics

Sets

A set is a collection of objects, known as elements. It is described by listing the elements in braces e.g.

A = {1, 2, 3, 4}.

Order does not matter e.g.

A = {1, 2, 3, 4} = {2, 1, 4, 3}.

Elements are assumed different e.g.

A = {1, 2, 3, 4} = {1, 2, 2, 3, 4}.

A large or infinite set can be defined by a property e.g.

B = {1, 3, 5, 7, 9, 11, 13, 15, 17, 19},

or

B = {x | x is a positive odd integer less than 20},

and

C = {x | x is a positive integer divisible by 3}.

Read the symbol | as “such that”.

If X is a finite set, we let

|X| = the number of elements of X.

e.g. |A| = 4, |B| = 10.

If x is an element of A, we write x ∈ A. If not, we write x ∉ A. e.g.

4 ∈ A, 4 ∉ B, 4 ∉ C.

The empty set is the set with no elements, denoted ∅. Hence ∅ = {}, |∅| = 0. e.g.

{x | x ∈ R and $x^2 + 1 = 0$}.

What is |{∅}|?

Two sets X and Y are equal if they have the same elements.

X = Y if for every x, if x ∈ X then x ∈ Y and for every x, if x ∈ Y then x ∈ X, i.e.

X = Y if ∀x ((x ∈ X → x ∈ Y ) ∧ (x ∈ Y → x ∈ X)).


Example 1. X = {x | $x^2 + x - 6 = 0$}, Y = {2, −3}.

If x ∈ X, then

$x^2 + x - 6 = 0$

(x − 2)(x + 3) = 0

x = 2 or x = −3

x ∈ Y.

If x ∈ Y, then if x = 2,

$x^2 + x - 6 = 0$

x ∈ X,

and if x = −3, then

$x^2 + x - 6 = 0$

x ∈ X.

Hence X = Y.

X is a subset of Y if every element of X is an element of Y . We write X ⊆ Y .

In symbols, X is a subset of Y if

∀x (x ∈ X → x ∈ Y ),

e.g. {1, 2} ⊆ {1, 2, 3, 4}.

We have

X ⊆ X

∅ ⊆ X

i.e. ∀x (x ∈ ∅ → x ∈ X)

x ∈ ∅ is false, hence x ∈ ∅ → x ∈ X is true.

If X ⊆ Y and X ≠ Y, then X is a proper subset of Y, denoted X ⊂ Y. We have

If X ⊆ Y and Y ⊆ X, then X = Y .

If X ⊆ Y and Y is finite, then |X| ≤ |Y |.

If X ⊆ Y , Y is finite, and |X| = |Y |, then X = Y .



Sets (Continued)

The power set of the set X, denoted P(X), is the set of all subsets of X.

Example 2. B = {a, b}, A = {a, b, c}.

P(B) = {∅, {a}, {b}, {a, b}}, |P(B)| = 4.

P(A) = {∅, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}}, |P(A)| = 8.

Theorem. If |X| = n, then |P(X)| = 2n.

Proof: Johnsonbaugh uses a proof by mathematical induction. The idea is that exactly half of the subsets of X contain a particular element of X. This can be seen by pairing each subset that omits the element with the subset obtained by adding it, e.g. for the set A in Example 2 and the element a:

∅          {a}
{b}        {a, b}
{c}        {a, c}
{b, c}     {a, b, c}

Basis Step: If n = 0, X is the empty set, which has only one subset, itself. Then |P(∅)| = 1 = 2⁰. True.

Inductive Step: Assume that a set with n elements has a power set of size 2ⁿ.

Let X be a set with n + 1 elements. Remove one element, x, from X to form a set Y. Then Y has n elements, and by the assumption

|P(Y)| = 2ⁿ.

By the pairing argument, the subsets of X that omit x are exactly the subsets of Y, and they make up half of the subsets of X, so

|P(X)| = 2 |P(Y)|,

and then

|P(X)| = 2ⁿ⁺¹.

Since the Basis Step and Inductive Step have been verified, by the Principle of Mathematical Induction, if |X| = n, then |P(X)| = 2ⁿ for all n ≥ 0.

An alternative proof is to suppose that a 1 represents the presence of an element in a subset, and a 0 represents its absence.

Then each subset of X corresponds to exactly one binary string of length |X| = n, and there are 2ⁿ such strings.
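The binary-string argument can be sketched in Python. This is an illustration only: the function name power_set and the use of an integer as the bit string are choices made here, not part of Johnsonbaugh's text.

```python
def power_set(elements):
    """Return all subsets of `elements`, one per bit string of length n."""
    items = list(elements)
    n = len(items)
    subsets = []
    for mask in range(2 ** n):  # mask runs over all n-bit strings
        # Bit k of mask is 1 exactly when items[k] belongs to the subset.
        subset = frozenset(items[k] for k in range(n) if (mask >> k) & 1)
        subsets.append(subset)
    return subsets


subsets_of_A = power_set(["a", "b", "c"])
print(len(subsets_of_A))  # the number of subsets of a 3-element set
```

Each value of mask produces a different subset, which is why the count is 2ⁿ.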


Set Operations

Given two sets X and Y :

Their union is X ∪ Y = {x | x ∈ X or x ∈ Y}.

Their intersection is X ∩ Y = {x | x ∈ X and x ∈ Y}.

Their difference is X − Y = {x | x ∈ X and x ∉ Y}.

Example 3. A = {a, b, c, d, e} and B = {1, 2, 3, 4, a, b}.

A ∪ B = {a, b, c, d, e, 1, 2, 3, 4}

A ∩ B = {a, b}

A − B = {c, d, e}

B − A = {1, 2, 3, 4}

Sets X and Y are disjoint if X ∩ Y = ∅. For example,

X = {1, 2, 3} and Y = {4, 5}

are disjoint.

A collection of sets S is pairwise disjoint if X and Y are disjoint for distinct X, Y in S. For example,

S = {{a, b}, {c, d}, {e, f}}

is pairwise disjoint.

If we deal with sets which are subsets of a set U , then U is called the universal set.

The set U − X is the complement of X, denoted $\overline{X}$.

Example 4. U = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

A = {1, 3, 5, 7, 9}, B = {1, 2, 3, 8}, C = {3, 6, 8, 9}

Then

A ∪ B = {1, 2, 3, 5, 7, 8, 9}

$\overline{A \cup B} = \{0, 4, 6\}$

A ∪ B ∪ C = {1, 2, 3, 5, 6, 7, 8, 9}

$\overline{A \cup B \cup C} = \{0, 4\}$

$C \cap \overline{A \cup B} = \{6\}$

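Venn diagrams provide a pictorial view of sets. U, the universal set, is depicted as a rectangle. Sets A, B, C, contained in U, are drawn as circles.

[Figure: Venn diagrams with the regions A ∪ B, A ∩ B, A − B, $\overline{A}$, and $C \cap \overline{A \cup B}$ shaded.]

In Example 4, we have

[Figure: Venn diagram of A, B, and C inside U for Example 4; the elements 0 and 4 lie outside all three circles.]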

Theorem 2.1.12 Let U be a universal set and let A, B, and C be subsets of U. The following properties hold.

(a) Associative laws: (A ∪ B) ∪ C = A ∪ (B ∪ C), (A ∩ B) ∩ C = A ∩ (B ∩ C)

(b) Commutative laws: A ∪ B = B ∪ A, A ∩ B = B ∩ A

(c) Distributive laws: A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C), A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)

(d) Identity laws: A ∪ ∅ = A, A ∩ U = A

(e) Complement laws: $A \cup \overline{A} = U$, $A \cap \overline{A} = \emptyset$

(f) Idempotent laws: A ∪ A = A, A ∩ A = A

(g) Bound laws: A ∪ U = U, A ∩ ∅ = ∅

(h) Absorption laws: A ∪ (A ∩ B) = A, A ∩ (A ∪ B) = A

(i) Involution law: $\overline{\overline{A}} = A$

(j) 0/1 laws: $\overline{\emptyset} = U$, $\overline{U} = \emptyset$

(k) De Morgan's laws for sets: $\overline{A \cup B} = \overline{A} \cap \overline{B}$, $\overline{A \cap B} = \overline{A} \cup \overline{B}$

Proof of the first distributive law.

We have the Venn diagrams:

[Figure: Venn diagrams showing B ∪ C and A ∩ (B ∪ C), followed by A ∩ B and A ∩ C.]

Mathematically, the proof is as follows. Let

x ∈ A ∩ (B ∪ C).

Then

x ∈ A and x ∈ B ∪ C

x ∈ A and (x ∈ B or x ∈ C)

(x ∈ A and x ∈ B) or (x ∈ A and x ∈ C)

x ∈ A ∩ B or x ∈ A ∩ C,

so that x ∈ (A ∩ B) ∪ (A ∩ C).

This only proves that A ∩ (B ∪ C) ⊆ (A ∩ B) ∪ (A ∩ C). For the reverse inclusion, let x ∈ (A ∩ B) ∪ (A ∩ C).

Then

x ∈ A ∩ B or x ∈ A ∩ C

(x ∈ A and x ∈ B) or (x ∈ A and x ∈ C)

x ∈ A and (x ∈ B or x ∈ C)

x ∈ A and x ∈ B ∪ C,

so that x ∈ A ∩ (B ∪ C). Hence the two sets are equal.

Proof of the first De Morgan law.

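We have the Venn diagrams:

[Figure: Venn diagrams showing A ∪ B and $\overline{A \cup B} = \overline{A} \cap \overline{B}$, followed by $\overline{A}$ and $\overline{B}$.]

Mathematically, the proof is as follows. Let

$x \in \overline{A \cup B}$.

Then

$x \notin A \cup B$

$x \notin A$ and $x \notin B$

$x \in \overline{A}$ and $x \in \overline{B}$

$x \in \overline{A} \cap \overline{B}$.

This only proves that $\overline{A \cup B} \subseteq \overline{A} \cap \overline{B}$. For the reverse inclusion, let $x \in \overline{A} \cap \overline{B}$.

Then

$x \in \overline{A}$ and $x \in \overline{B}$

$x \notin A$ and $x \notin B$

$x \notin A \cup B$

$x \in \overline{A \cup B}$.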
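As a sanity check (not a proof), the first distributive law and the first De Morgan law can be tested on the sets of Example 4 with Python's built-in set type. The helper name complement is an assumption made for this sketch.

```python
U = set(range(10))              # universal set of Example 4
A = {1, 3, 5, 7, 9}
B = {1, 2, 3, 8}
C = {3, 6, 8, 9}


def complement(X, universe=U):
    """Complement of X relative to the universal set."""
    return universe - X


# First distributive law: A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
distributive_holds = A & (B | C) == (A & B) | (A & C)

# First De Morgan law: complement of A ∪ B equals the intersection of the complements
de_morgan_holds = complement(A | B) == complement(A) & complement(B)

print(distributive_holds, de_morgan_holds)
print(sorted(complement(A | B)))
```

A check on one example can only catch a mistake; the element-wise argument is still what establishes the laws for all sets.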

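If $S = \{A_1, A_2, \ldots, A_n\}$ is a collection of sets, then the union of S is

$\bigcup S = \{x \mid x \in A_i \text{ for some } A_i \in S\}$.

The intersection of S is

$\bigcap S = \{x \mid x \in A_i \text{ for all } A_i \in S\}$.

We write

$\bigcup S = \bigcup_{i=1}^{n} A_i, \qquad \bigcap S = \bigcap_{i=1}^{n} A_i$.

For infinitely many sets $S = \{A_1, A_2, A_3, \ldots\}$ this becomes

$\bigcup S = \bigcup_{i=1}^{\infty} A_i, \qquad \bigcap S = \bigcap_{i=1}^{\infty} A_i$.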

Example 5. Let S = {A₁, A₂, A₃, ...}, where Aₙ = {n, n + 1, n + 2, ...}.

That is,

A₁ = {1, 2, 3, ...}

A₂ = {2, 3, 4, ...}

A₃ = {3, 4, 5, ...},

etc.

Obviously ⋃S = {1, 2, 3, ...} = A₁, and ⋂S = ∅, as

1 ∉ ⋂S as 1 ∉ A₂, A₃, ...

2 ∉ ⋂S as 2 ∉ A₃, A₄, ...

etc.

A collection S of non-empty subsets of X is a partition of X if each element of X belongs to exactly one member of S. That is, S is pairwise disjoint and

⋃S = X.

Example 6.

(a) X = {1, 2, 3, 4, 5, 6, 7, 8}

Then S = {{2, 4, 8}, {1, 3, 5, 7}, {6}}

is a partition of X.

(b) X = {x | x ∈ R}

S = {{x | x is rational}, {x | x is real}}

is not a partition of X, since every rational number belongs to both members.

T = {{x | x is rational}, {x | x is irrational}}

is a partition of X.
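For finite sets the definition can be checked mechanically. The sketch below is one possible test; the name is_partition is hypothetical, and it relies on the fact that finite blocks are pairwise disjoint exactly when their sizes add up to the size of their union.

```python
def is_partition(blocks, X):
    """Return True if the finite collection `blocks` is a partition of X."""
    blocks = [set(b) for b in blocks]
    if any(len(b) == 0 for b in blocks):
        return False                      # members must be non-empty
    union = set().union(*blocks)
    # Pairwise disjoint exactly when no element is counted twice.
    pairwise_disjoint = sum(len(b) for b in blocks) == len(union)
    return pairwise_disjoint and union == set(X)


X = {1, 2, 3, 4, 5, 6, 7, 8}
print(is_partition([{2, 4, 8}, {1, 3, 5, 7}, {6}], X))
print(is_partition([{2, 4, 8}, {1, 2, 3, 5, 7}, {6}], X))
```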

Sets are unordered collections of elements. Sometimes order is important.

An ordered pair of elements, written (a, b), is different from the ordered pair (b, a) (unless a = b).

Alternatively, (a, b) = (c, d) if and only if a = c and b = d.

If X and Y are sets, we let X × Y denote the set of all ordered pairs (x, y) where x ∈ X and y ∈ Y.

We call X × Y the Cartesian product of X and Y .

Example 7. The Last Duck Vietnamese Restaurant and Takeaway sells

Entrees (set E)

a: Chicken Cold Roll

b: Vietnamese Spring Roll

c: Steamed Pork Dumplings

Mains (set M)

1: Twice Cooked Duck Leg Curry

2: Char-grilled Lemongrass Pork

3: Steamed Ginger Infused Chicken

4: Whole Prawns with Fanta Fish Sauce

Then E = {a, b, c} and M = {1, 2, 3, 4}.

The Cartesian product lists the 12 possible dinners consisting of one entree and one main course. So

E × M = {(a, 1), (a, 2), (a, 3), (a, 4), (b, 1), (b, 2), (b, 3), (b, 4), (c, 1), (c, 2), (c, 3), (c, 4)}.

This is actually a cut-down version of the menu. In fact, there are 5 entrees and 8 main courses. How many dinners are possible?

For the general case, |X × Y| = |X| · |Y|.
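The product can be generated with itertools.product from the Python standard library. In this sketch the full menu is represented only by its sizes, since the notes give the counts (5 and 8) but not the dishes; the integer labels are placeholders.

```python
from itertools import product

E = ["a", "b", "c"]          # entrees
M = [1, 2, 3, 4]             # mains

dinners = list(product(E, M))     # all ordered pairs (entree, main)
print(len(dinners) == len(E) * len(M))

# Full menu: placeholder labels 0..4 and 0..7 stand in for the unnamed dishes.
full_menu = list(product(range(5), range(8)))
print(len(full_menu))
```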


Relations

Relations Johnsonbaugh 3.1

We can consider a relation as a table linking the elements of two sets e.g. product vs price.

Definition. A (binary) relation R from a set X to a set Y is a subset of the Cartesian product X × Y.

If (x, y) ∈ R, we write x R y and say that x is related to y. If X = Y, we call R a (binary) relation on X.

The set {x ∈ X | (x, y) ∈ R for some y ∈ Y}

is called the domain of R.

The set {y ∈ Y | (x, y) ∈ R for some x ∈ X}

is called the range of R.

In simpler terms, a (binary) relation R connects elements in X to elements in Y. For example, with X = {1, 2, 3, 4} and Y = {a, b, c}, we have pictorially

[Figure: arrow diagram from X to Y with arrows 1 → a, 2 → c, 3 → b.]

R = {(1, a), (2, c), (3, b)}.

This is called an arrow diagram.

The domain is all elements of X that occur in R, i.e. {1, 2, 3}.

The range is all elements of Y that occur in R, i.e. {a, b, c}.

Example 1. Johnsonbaugh 3.1.3

X = {2, 3, 4} and Y = {3, 4, 5, 6, 7}

Define a relation from X to Y by (x, y) ∈ R if x divides y (with 0 remainder). Therefore

R = {(2, 4), (2, 6), (3, 3), (3, 6), (4, 4)}.

We could write R as a table:

X   Y
2   4
2   6
3   3
3   6
4   4

The domain of R is {2, 3, 4}, and the range of R is {3, 4, 6}.
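The divides relation, its domain, and its range can be built directly from the definitions with set comprehensions. The variable names here are illustrative.

```python
X = {2, 3, 4}
Y = {3, 4, 5, 6, 7}

# (x, y) is in R when x divides y with zero remainder.
R = {(x, y) for x in X for y in Y if y % x == 0}

domain = {x for x, _ in R}       # first components that occur in R
range_of_R = {y for _, y in R}   # second components that occur in R

print(sorted(R))
print(sorted(domain), sorted(range_of_R))
```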


Example 2. Let R be a relation on X = {1, 2, 3, 4} defined by (x, y) ∈ R if x ≤ y, x, y ∈ X.

R = {(1, 1), (1, 2), (1, 3), (1, 4), (2, 2), (2, 3), (2, 4), (3, 3), (3, 4), (4, 4)}.

The domain and range of R are both X.

A relation on a set can be described by its digraph.

Draw dots or vertices as elements of X. If (x, y) ∈ R, draw an arrow from x to y (a directed edge).

An element (x, x) is called a loop.

The digraph of Example 2 is:

[Figure: digraph on the vertices 1, 2, 3, 4 with a loop at every vertex and a directed edge from x to y whenever x < y.]

Example 3. The relation R on X = {a, b, c, d} is

R = {(a, a), (b, c), (c, b), (d, d)}.

The digraph is:

[Figure: digraph on the vertices a, b, c, d with loops at a and d and directed edges from b to c and from c to b.]


Definition. A relation R on a set X is called reflexive if (x, x) ∈ R for every x ∈ X.

Example 2 is reflexive. There is a loop on every vertex.

Example 3 is not reflexive.

Definition. A relation R on a set X is called symmetric if for all x, y ∈ X, if (x, y) ∈ R then (y, x) ∈ R.

Example 3 is symmetric. Directed edges go both ways between vertices.

Definition. A relation R on a set X is called antisymmetric if for all x, y ∈ X, if (x, y) ∈ R and x ≠ y, then (y, x) ∉ R.

Example 2 is antisymmetric. Between any two distinct vertices there is at most one directed edge.

Can a relation R be both symmetric and antisymmetric?

Definition. A relation R on a set X is called transitive if for all x, y, z ∈ X, if (x, y) ∈ R and (y, z) ∈ R, then (x, z) ∈ R.

Example 2 is transitive. We need to list all pairs to verify.

(x, y)   (y, z)   (x, z)
(1, 1)   (1, 1)   (1, 1)
(1, 1)   (1, 2)   (1, 2)
(1, 1)   (1, 3)   (1, 3)
(1, 1)   (1, 4)   (1, 4)
(1, 2)   (2, 2)   (1, 2)
(1, 2)   (2, 3)   (1, 3)
etc.

Do we really need to list all pairs? If x = y or y = z, and (x, y), (y, z) ∈ R, then (x, z) ∈ R is automatically true. Hence the table need be only

(x, y)   (y, z)   (x, z)
(1, 2)   (2, 3)   (1, 3)
(1, 2)   (2, 4)   (1, 4)
(1, 3)   (3, 4)   (1, 4)
(2, 3)   (3, 4)   (2, 4)

Example 3 is not transitive, e.g. (b, c), (c, b) ∈ R, but (b, b) ∉ R.

The digraph of a transitive relation has the property that whenever there are directed edges from x to y and from y to z, there is a directed edge from x to z.
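The four definitions translate almost word for word into Python when a relation is stored as a set of pairs. The function names below are choices made for this sketch.

```python
def is_reflexive(R, X):
    return all((x, x) in R for x in X)


def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)


def is_antisymmetric(R):
    # For x != y, (x, y) and (y, x) may not both be present.
    return all(x == y or (y, x) not in R for (x, y) in R)


def is_transitive(R):
    # Check every chain (x, y), (y, z) in R.
    return all((x, z) in R for (x, y) in R for (w, z) in R if y == w)


X2 = {1, 2, 3, 4}
R2 = {(x, y) for x in X2 for y in X2 if x <= y}          # Example 2
X3 = {"a", "b", "c", "d"}
R3 = {("a", "a"), ("b", "c"), ("c", "b"), ("d", "d")}    # Example 3

for R, X in ((R2, X2), (R3, X3)):
    print(is_reflexive(R, X), is_symmetric(R), is_antisymmetric(R), is_transitive(R))
```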



Relations (Continued)

Relations can be used to order elements of a set. For example, the relation R on the positive integers defined by

(x, y) ∈ R if x ≤ y

orders the integers, and is reflexive, antisymmetric and transitive, i.e.

reflexive : x ≤ x

antisymmetric : if x ≤ y and x ≠ y, then y ≰ x

transitive : if x ≤ y and y ≤ z, then x ≤ z.

Definition. A relation R on a set X is called a partial order if R is reflexive, antisymmetric and transitive.

Example 4. X = {2, 3, 4, 5, 6}

R is the relation defined by (x, y) ∈ R if y is larger than x by an even number or zero. Hence

R = {(2, 2), (2, 4), (2, 6), (3, 3), (3, 5), (4, 4), (4, 6), (5, 5), (6, 6)}.

The digraph is

[Figure: digraph on the vertices 2, 3, 4, 5, 6 with a loop at every vertex and directed edges 2 → 4, 2 → 6, 4 → 6, and 3 → 5.]

This is

reflexive : loops on all vertices

antisymmetric : at most one directed arc between each pair of distinct vertices

transitive : need only check that (2, 4), (4, 6) ∈ R is matched by (2, 6) ∈ R.

If R is a partial order on X, we often write x ⪯ y when (x, y) ∈ R.

If x, y ∈ X and either x ⪯ y or y ⪯ x, then we say x and y are comparable.

If x, y ∈ X, x ⋠ y and y ⋠ x, then we say x and y are incomparable.

If every pair of elements in X is comparable, we call R a total order.

Example 2 is a total order. Either (x, y) ∈ R or (y, x) ∈ R for all x, y ∈ {1, 2, 3, 4}.

More generally, the less than or equals relation on the positive integers is a total order, since either x ≤ y or y ≤ x (or both if x = y).

Example 4 is not a total order. Why?

Partial orders can be used in task scheduling.


The set T of tasks in taking an indoor flash photograph is as follows.

1. Remove lens cap.

2. Focus camera.

3. Turn off safety lock.

4. Turn on flash unit.

5. Push photo button.

Some tasks must be done before others, some can be done in either order.

Define the relation R on T by

i R j if i = j or task i must be done before task j.

Then

R = {(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (1, 2), (1, 5), (2, 5), (3, 5), (4, 5)}.

R is reflexive, antisymmetric and transitive, and is a partial order. Why is R not a total order?

Possible orders in which to carry out the tasks include 1, 2, 3, 4, 5 or 3, 4, 1, 2, 5.
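A proposed order of tasks is acceptable when it never places task j ahead of task i for a pair (i, j) ∈ R. The following sketch checks this; the name respects is hypothetical.

```python
R = {(1, 1), (2, 2), (3, 3), (4, 4), (5, 5),
     (1, 2), (1, 5), (2, 5), (3, 5), (4, 5)}


def respects(order, relation):
    """True if, for every (i, j) in relation, i is not placed after j."""
    position = {task: k for k, task in enumerate(order)}
    return all(position[i] <= position[j] for (i, j) in relation)


print(respects([1, 2, 3, 4, 5], R))
print(respects([3, 4, 1, 2, 5], R))
print(respects([2, 1, 3, 4, 5], R))   # focuses before removing the lens cap
```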

Definition. Let R be a relation from X to Y. The inverse of R, denoted R⁻¹, is the relation from Y to X defined by

R⁻¹ = {(y, x) | (x, y) ∈ R}.

As in Example 1, let X = {2, 3, 4} and Y = {3, 4, 5, 6, 7}.

We have (x, y) ∈ R if x divides y, so

R = {(2, 4), (2, 6), (3, 3), (3, 6), (4, 4)},

then

R⁻¹ = {(4, 2), (6, 2), (3, 3), (6, 3), (4, 4)}

and might be described as (y, x) ∈ R⁻¹ if y is divisible by x.

Definition. Let R₁ be a relation from X to Y and R₂ be a relation from Y to Z. The composition of R₁ and R₂, denoted R₂ ∘ R₁, is the relation from X to Z defined by

R₂ ∘ R₁ = {(x, z) | (x, y) ∈ R₁ and (y, z) ∈ R₂ for some y ∈ Y}.

Example 6. A = {1, 2, 3, 4}, B = {a, b, c, d}, C = {x, y, z}

R = {(1, a), (2, d), (3, a), (3, b), (3, d)}

S = {(b, x), (b, z), (c, y), (d, z)}

The arrow diagram represents this as:

[Figure: arrow diagram with arrows from A to B for R and from B to C for S.]

The composition of the relations is

S ∘ R = {(2, z), (3, x), (3, z)}.
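Both the inverse and the composition follow directly from their definitions when relations are stored as sets of pairs. The function names inverse and compose are assumptions of this sketch; note that compose(S, R) applies R first, matching the order in S ∘ R.

```python
def inverse(R):
    """Swap the components of every pair in R."""
    return {(y, x) for (x, y) in R}


def compose(S, R):
    """Pairs (x, z) with (x, y) in R and (y, z) in S for some y."""
    return {(x, z) for (x, y) in R for (w, z) in S if y == w}


R = {(1, "a"), (2, "d"), (3, "a"), (3, "b"), (3, "d")}
S = {("b", "x"), ("b", "z"), ("c", "y"), ("d", "z")}

print(sorted(compose(S, R)))
print(sorted(inverse(S)))
```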


Matrices of Relations

For a quick revision on matrices, read Johnsonbaugh Appendix A.

A matrix is a convenient way to represent a relation R from X to Y .

Label the rows with the elements of X, label the columns with the elements of Y, and make the entry in row x column y a 1 if x R y and a 0 otherwise.

This is the matrix of the relation R.

Example 2. (from §3.1, revisited)

X = {1, 2, 3, 4} and (x, y) ∈ R if x ≤ y.

R = {(1, 1), (1, 2), (1, 3), (1, 4), (2, 2), (2, 3), (2, 4), (3, 3), (3, 4), (4, 4)}.

Then the matrix is

    1  2  3  4
1   1  1  1  1
2   0  1  1  1
3   0  0  1  1
4   0  0  0  1

The matrix depends on the ordering of the elements in the sets X and Y.

A relation R on X has a square matrix.

The relation R on a set X which has a matrix A is

• reflexive if and only if (iff) A has 1’s on the main diagonal. Recall

(x, x) ∈ R for all x ∈ X.

• symmetric if and only if A is symmetric. Recall

If (x, y) ∈ R then (y, x) ∈ R.

If the (i, j)th element of A is 1, so is the (j, i)th element.

• antisymmetric if and only if any 1 in the (i, j)th entry of A, in any position off the main diagonal, is matched by a 0 in the (j, i)th entry. Recall

If (x, y) ∈ R then (y, x) ∉ R for x ≠ y.


Consider again the sets and relations of Example 6:

A = {1, 2, 3, 4}, B = {a, b, c, d}, C = {x, y, z}

R = {(1, a), (2, d), (3, a), (3, b), (3, d)}

S = {(b, x), (b, z), (c, y), (d, z)}

A₁ = matrix of R =

    a  b  c  d
1   1  0  0  0
2   0  0  0  1
3   1  1  0  1
4   0  0  0  0

A₂ = matrix of S =

    x  y  z
a   0  0  0
b   1  0  1
c   0  1  0
d   0  0  1

A₁A₂ =

    x  y  z
1   0  0  0
2   0  0  1
3   1  0  2
4   0  0  0

What does the 2 in the (3, 3) entry of A₁A₂ (row 3, column z) mean?

The composition of the relations is

S ∘ R = {(2, z), (3, x), (3, z)}.

So if we convert the '2' to a '1', we have the matrix of S ∘ R.
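The product and the conversion of nonzero entries to 1 can be reproduced with plain nested lists. The helper name mat_mul is an assumption of this sketch, and no matrix library is required.

```python
A1 = [[1, 0, 0, 0],      # rows 1..4, columns a..d (matrix of R)
      [0, 0, 0, 1],
      [1, 1, 0, 1],
      [0, 0, 0, 0]]
A2 = [[0, 0, 0],         # rows a..d, columns x..z (matrix of S)
      [1, 0, 1],
      [0, 1, 0],
      [0, 0, 1]]


def mat_mul(P, Q):
    """Ordinary matrix product of P (r x m) and Q (m x c)."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]


prod = mat_mul(A1, A2)
# Replace each nonzero entry by 1 to obtain the matrix of the composition.
composition_matrix = [[1 if entry != 0 else 0 for entry in row] for row in prod]

for row in prod:
    print(row)
```

Entry (i, j) of the product counts the intermediate elements of B that link i to j, which is what the entry 2 records.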

Theorem 3.3.6 Johnsonbaugh

Let R₁ be a relation from X to Y and let R₂ be a relation from Y to Z. Choose the orderings of X, Y, and Z. Let A₁ be the matrix of R₁ and let A₂ be the matrix of R₂ with respect to the orderings selected. The matrix of the relation R₂ ∘ R₁ with respect to the orderings selected is obtained by replacing each nonzero term in the matrix product A₁A₂ by 1.

Discussion: Suppose that the (i, j)th entry in A₁A₂ is nonzero. We obtain this entry by multiplying the ith row of A₁ by the jth column of A₂. Therefore there must be at least one k for which the (i, k)th entry of A₁ and the (k, j)th entry of A₂ are both 1.

Then (i, k) ∈ R₁ and (k, j) ∈ R₂, so (i, j) ∈ R₂ ∘ R₁.

Note that the matrix sizes are automatically correct for matrix multiplication, since the number of columns of A₁ and the number of rows in A₂ are equal to the number of elements of Y.

Matrix Test for Transitivity

The theorem gives a test for a relation R on a set X being transitive.

Compute A² and compare A and A². The relation R is transitive if and only if whenever entry (i, j) in A² is nonzero, entry (i, j) in A is also nonzero.

Suppose that the entry (i, j) in A² is nonzero. Then there is at least one k for which the (i, k)th and (k, j)th entries of A are both 1. Hence (i, k) ∈ R and (k, j) ∈ R. If the (i, j)th entry of A is zero, then (i, j) ∉ R, and R is not transitive.


R is a relation on X = {1, 2, 3, 4} defined by (x, y) ∈ R if x ≤ y, for x, y ∈ X.

R = {(1, 1), (1, 2), (1, 3), (1, 4), (2, 2), (2, 3), (2, 4), (3, 3), (3, 4), (4, 4)}.

A =

    1  2  3  4
1   1  1  1  1
2   0  1  1  1
3   0  0  1  1
4   0  0  0  1

A² =

    1  2  3  4
1   1  2  3  4
2   0  1  2  3
3   0  0  1  2
4   0  0  0  1

Then A² has nonzero entries only in positions where A has 1's, and the relation is transitive.


R = {(a, a), (b, c), (c, b), (d, d)}

A =

    a  b  c  d
a   1  0  0  0
b   0  0  1  0
c   0  1  0  0
d   0  0  0  1

A² =

    a  b  c  d
a   1  0  0  0
b   0  1  0  0
c   0  0  1  0
d   0  0  0  1

Then A² has 1's in the (b, b) and (c, c) positions, whereas A does not. The relation is not transitive, i.e. (b, c) ∈ R and (c, b) ∈ R, but (b, b) ∉ R and (c, c) ∉ R.
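The matrix test is short enough to sketch in full. The function name is_transitive_matrix is hypothetical; the two matrices are those of the ≤ relation on {1, 2, 3, 4} and of the relation on {a, b, c, d} above.

```python
def is_transitive_matrix(A):
    """Matrix test: every nonzero entry of A*A must be nonzero in A."""
    n = len(A)
    A_squared = [[sum(A[i][k] * A[k][j] for k in range(n)) for j in range(n)]
                 for i in range(n)]
    return all(A[i][j] != 0
               for i in range(n) for j in range(n)
               if A_squared[i][j] != 0)


less_or_equal = [[1, 1, 1, 1],
                 [0, 1, 1, 1],
                 [0, 0, 1, 1],
                 [0, 0, 0, 1]]
swap_b_c = [[1, 0, 0, 0],
            [0, 0, 1, 0],
            [0, 1, 0, 0],
            [0, 0, 0, 1]]

print(is_transitive_matrix(less_or_equal))
print(is_transitive_matrix(swap_b_c))
```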


Functions Johnsonbaugh 2.2

Definition. Let X and Y be sets. A function f from X to Y is a subset of the Cartesian product X × Y having the property that for each x ∈ X, there is exactly one y ∈ Y with (x, y) ∈ f. We sometimes denote a function f from X to Y as f : X → Y. (We can also write y = f(x).)

The set X is called the domain of f. The set

{y | (x, y) ∈ f}

(which is a subset of Y) is called the range of f.

Example 1. The relation f = {(1, a), (2, b), (3, a)} from X = {1, 2, 3} to Y = {a, b, c} is a function. Why?

The relation R = {(1, a), (2, b), (3, c)} from X = {1, 2, 3, 4} to Y = {a, b, c} is not a function. Why?

The relation R = {(1, a), (1, b), (2, b), (3, c)} from X = {1, 2, 3} to Y = {a, b, c} is not a function. Why?

We can depict the situations using arrow diagrams. For the three cases we have:

[Figure: three arrow diagrams from X to Y, one for each of the relations above.]

There must be exactly one arrow from every element in the domain. There cannot be no arrow, or more than one arrow.

Some Useful Functions

Definition. If x is an integer and y is a positive integer, we define x mod y to be the remainder when x is divided by y.

Some simple examples:

6 mod 3 = ?

9 mod 10 = ?

14 mod 3 = ?

365 mod 7 = ?

This last result tells us that 29th March next year will be a Thursday.

Definition. The floor of x, denoted ⌊ x ⌋, is the greatest integer less than or equal to x.

The ceiling of x, denoted ⌈ x ⌉, is the least integer greater than or equal to x.

Some simple examples:

⌊1.4⌋ = ?      ⌈1.4⌉ = ?
⌊6⌋ = ?        ⌈6⌉ = ?
⌊−5.7⌋ = ?     ⌈−5.7⌉ = ?
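Answers to the mod, floor, and ceiling exercises can be checked in Python. For a positive divisor, Python's % operator returns the nonnegative remainder used in the definition of mod above, and math.floor and math.ceil are standard library functions. The dictionary names are choices made for this sketch.

```python
import math

# x mod y for the examples above (y is positive in every case)
mod_answers = {(x, y): x % y for (x, y) in [(6, 3), (9, 10), (14, 3), (365, 7)]}

# floor and ceiling for the examples above
floor_answers = {x: math.floor(x) for x in (1.4, 6, -5.7)}
ceiling_answers = {x: math.ceil(x) for x in (1.4, 6, -5.7)}

print(mod_answers)
print(floor_answers)
print(ceiling_answers)
```

Note that the floor of a negative non-integer moves away from zero, while the ceiling moves towards zero.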

The graphs of the floor and ceiling functions are shown below.

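[Figure: graph of the floor function y = ⌊x⌋: horizontal steps of unit length, each closed at its left end and open at its right end.]

[Figure: graph of the ceiling function y = ⌈x⌉: horizontal steps of unit length, each open at its left end and closed at its right end.]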

Example 2. Hash Functions Johnsonbaugh Example 2.2.14

We wish to store nonnegative integers in computer memory cells. A hash function computes the location from the data item.

For example, if the cells are labelled 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, we might use

h(n) = n mod 11.

Store the numbers 15, 558, 32, 132, 102, 5 in the eleven cells.

 15 =  1 × 11 + 4    so   15 mod 11 = 4
558 = 50 × 11 + 8    so  558 mod 11 = 8
 32 =  2 × 11 + 10   so   32 mod 11 = 10
132 = 12 × 11 + 0    so  132 mod 11 = 0
102 =  9 × 11 + 3    so  102 mod 11 = 3
  5 =  0 × 11 + 5    so    5 mod 11 = 5

These all store in the appropriate cells as shown below.

Cell   0    1    2    3    4    5    6    7    8    9    10
Item   132            102  15   5              558       32

Now store 257 = 23 × 11 + 4.

But location 4 is occupied. A collision has occurred.

The collision resolution policy is to use the next highest unoccupied cell. But if all higher numbered cells are occupied, start looking at the lowest numbered cell, so that cell 10 is followed by cell 0. Here cells 4 and 5 are occupied, so 257 is stored in cell 6.

To find the data item n, compute m = h(n) and look at location m. If n is not there, look at successively higher numbered cells, again continuing from cell 10 to cell 0.
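The storage and lookup procedure can be sketched as follows. The function names store and find, the use of None for an empty cell, and the rule that a search stops at the first empty cell are assumptions of this sketch (the last is valid only if items are never deleted).

```python
def store(table, n):
    """Store n using h(n) = n mod len(table), probing upwards on collision."""
    size = len(table)
    start = n % size
    for step in range(size):
        cell = (start + step) % size      # wraps from the last cell to cell 0
        if table[cell] is None:
            table[cell] = n
            return cell
    raise OverflowError("all cells are occupied")


def find(table, n):
    """Return the cell holding n, or None if n is not stored."""
    size = len(table)
    start = n % size
    for step in range(size):
        cell = (start + step) % size
        if table[cell] is None:           # an empty cell ends the search
            return None
        if table[cell] == n:
            return cell
    return None


table = [None] * 11
for item in (15, 558, 32, 132, 102, 5, 257):
    store(table, item)
print(table)
```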


Definition. A function f from X to Y is said to be one-to-one (or injective) if for each y ∈ Y, there is at most one x ∈ X with f(x) = y.

Example 3. f = {(1, a), (2, b), (3, c)} from X = {1, 2, 3} to Y = {a, b, c, d} is a one-to-one function.

f = {(1, a), (2, b), (3, a)} from X = {1, 2, 3} to Y = {a, b, c, d} is a function, but not one-to-one.

The arrow diagrams illustrate this.

[Figure: arrow diagrams from X to Y for the two functions of Example 3.]

Definition. If f is a function from X to Y and the range of f is Y, f is said to be onto Y (or an onto function or a surjective function).

Example 4. f = {(1, a), (2, b), (3, c)} from X = {1, 2, 3} to Y = {a, b, c} is onto. The arrow diagram is:

[Figure: arrow diagram from X to Y with arrows 1 → a, 2 → b, 3 → c.]

If Y = {a, b, c, d}, f is not onto.

Definition. A function that is both one-to-one and onto is called a bijection.

In the preceding example, the function

f = {(1, a), (2, b), (3, c)}

is a bijection.

Inverse Function

Suppose that f is a one-to-one, onto function from X to Y. The inverse relation

{(y, x) | (x, y) ∈ f}

is a one-to-one, onto function from Y to X, the inverse function f⁻¹.

Since f is onto, the range of f is Y, and hence the domain of f⁻¹ is Y.

Since f is one-to-one, for each y ∈ Y there is only one x ∈ X for which (y, x) ∈ f⁻¹, so f⁻¹ is a function.

We can obtain the arrow diagram for f⁻¹ by reversing the directions of the arrows for f.

Since f is a function, each x ∈ X has exactly one y ∈ Y with (x, y) ∈ f, so f⁻¹ is one-to-one. Since the domain of f is X, the range of f⁻¹ is X, and hence f⁻¹ is onto.

Example 5. R₁ = {(1, b), (2, a), (3, c)} from X = {1, 2, 3} to Y = {a, b, c, d}.

Or we could say

R₁ = {(1, b), (2, a), (3, c)} ⊆ {1, 2, 3} × {a, b, c, d}.

R₁ is a function, is one-to-one, but is not onto. This is evident also from the arrow diagram.

[Figure: arrow diagram of R₁ from X to Y; no arrow ends at d.]

R₁⁻¹ = {(a, 2), (b, 1), (c, 3)} ⊆ {a, b, c, d} × {1, 2, 3}

is not a function. Why?

R₂ = {(1, b), (2, c), (3, a), (4, d)} ⊆ {1, 2, 3, 4} × {a, b, c, d}.

The arrow diagram is

[Figure: arrow diagram of R₂ from X to Y with arrows 1 → b, 2 → c, 3 → a, 4 → d.]

R₂ is a function, is one-to-one, and is onto.

R₂⁻¹ = {(a, 3), (b, 1), (c, 2), (d, 4)} ⊆ {a, b, c, d} × {1, 2, 3, 4}.

R₂⁻¹ is a function, is one-to-one, and is onto.
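The three properties can be tested for a finite relation stored as a set of pairs. The function names below are hypothetical, and each function assumes that the relation is already a subset of X × Y.

```python
def is_function(f, X):
    """Each x in X must appear as a first component exactly once."""
    return all(sum(1 for (a, _) in f if a == x) == 1 for x in X)


def is_one_to_one(f):
    """No second component may be used twice."""
    seconds = [b for (_, b) in f]
    return len(seconds) == len(set(seconds))


def is_onto(f, Y):
    """The range must be all of Y."""
    return {b for (_, b) in f} == set(Y)


def inverse(f):
    return {(b, a) for (a, b) in f}


X1, Y = {1, 2, 3}, {"a", "b", "c", "d"}
R1 = {(1, "b"), (2, "a"), (3, "c")}
X2 = {1, 2, 3, 4}
R2 = {(1, "b"), (2, "c"), (3, "a"), (4, "d")}

print(is_function(R1, X1), is_one_to_one(R1), is_onto(R1, Y))
print(is_function(inverse(R1), Y))
print(is_function(inverse(R2), Y))
```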



Algorithms

Introduction

An algorithm is a step by step method of solving a problem i.e. a recipe.

Algorithms typically have the following characteristics.

• Input: The algorithm receives input.

• Output: The algorithm produces output.

• Precision: The steps are precisely stated.

• Determinism: The intermediate results of each step of execution are unique and are determined only by the inputs and the results of the preceding steps.

• Finiteness: The algorithm terminates; that is, it stops after finitely many instructions have been executed.

• Correctness: The output produced by the algorithm is correct; that is, the algorithm correctly solves the problem.

• Generality: The algorithm applies to a set of inputs.

Algorithms are written in pseudocode, which resembles real computer code. Johnsonbaugh in the 6th edition has rewritten algorithms to be more like Java.

Algorithm 4.1.1 Finding the Maximum of Three Numbers

This algorithm finds the largest of the numbers a, b, and c.

Input: a, b, c

Output: large (the largest of a, b, and c)

1. max3(a, b, c) {
2.     large = a
3.     if (b > large) // if b is larger than large, update large
4.         large = b
5.     if (c > large) // if c is larger than large, update large
6.         large = c
7.     return large
8. }

Algorithms consist of a title, a brief description of the algorithm, the input to and output from the algorithm, and functions containing the instructions of the algorithm. This algorithm has one function.

Sometimes lines are numbered to make it convenient to refer to them.

line 1 max3 is the name of the function, and a, b, c are input parameters or variables.

line 2 = is the assignment operator. Testing equality uses ==. large is assigned the value of a.

line 3 This introduces the if statement. The structure is

    if (condition)
        action

If condition is true, action is executed and control passes to the statement following action.

If condition is false, action is not executed and control passes immediately to the statement following action. For example,

    if (x == 0)
        y = 0
    z = x + y

If action consists of multiple statements, enclose them in braces.

    if (x ≥ 0) {
        y = 0
        z = x + 3
    }

We can use

arithmetic operators   +, −, ∗, /, ( , )
relational operators   ==, ¬=, <, >, ≤, ≥
logical operators      ∧, ∨, ¬

The Matlab equivalent command is

    if expression
        commands
    end

There is also the if else statement, which has the structure

    if (condition)
        action 1
    else
        action 2

The notation // signals that the rest of the line is a comment. Another common notation is %, which is used, for example, by Matlab and PostScript. The Matlab equivalent of the if else statement is

    if expression
        commands if true
    else
        commands if false
    end

Matlab can extend the if structure further with elseif.

line 4 This assigns large the value of b if b > large.

line 6 This assigns large the value of c if c > large.

line 7 A return statement on its own simply terminates the function.

The return large statement terminates the function and returns the value of large.

If there is no return statement, the closing brace terminates the function.

We can assign values to the input variables and use a simulation called a trace to evaluate the operation of the algorithm. For example, we set

a = 6, b = 1, c = 9.

Then

line 2 Set large to a, namely 6.

line 3 The if statement b > large is false, so line 4 is not executed.

line 5 The if statement c > large is true, so at line 6 large is set to the value of c, namely 9.

Algorithm 4.1.2 Finding the Maximum Value in a Sequence

This algorithm finds the largest of the numbers s₁, ..., sₙ.

Input: s, n

Output: large (the largest value in the sequence s)

max(s, n) {
    large = s₁
    for i = 2 to n
        if (sᵢ > large)
            large = sᵢ
    return large
}


The structure of the for loop is

    for var = init to limit
        action

If action consists of multiple statements, enclose them in braces.

The for loop specifies the initial and final integer values between which the operations are processed. Alternatively, we can use a while loop.

    while (condition)
        action

action is repeated as long as condition is true.

The Matlab equivalent of the for loop is

    for i = m:n
        commands
    end

Using a while loop, Algorithm 4.1.2 would become

max(s, n) {
    large = s₁
    i = 2
    while (i ≤ n) {
        if (sᵢ > large)
            large = sᵢ
        i = i + 1
    }
    return large
}

The Matlab equivalent of the while loop is

    while expression
        commands
    end
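For comparison, here is a Python sketch of both versions of Algorithm 4.1.2. Python lists are indexed from 0, so s₁ becomes s[0]; the function names max_for and max_while are chosen here to avoid shadowing Python's built-in max.

```python
def max_for(s):
    """Largest value in a non-empty sequence, using a for loop."""
    large = s[0]
    for i in range(1, len(s)):      # i = 2 to n in the 1-based pseudocode
        if s[i] > large:
            large = s[i]
    return large


def max_while(s):
    """The same algorithm written with a while loop."""
    large = s[0]
    i = 1
    while i < len(s):
        if s[i] > large:
            large = s[i]
        i = i + 1
    return large


print(max_for([6, 1, 9]), max_while([6, 1, 9]))
```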


Examples of Algorithms

Algorithm 4.2.1 Text Search

This algorithm searches for an occurrence of the pattern p in text t. It returns the smallest index i such that p occurs in t starting at index i. If p does not occur in t, it returns 0.

Input: p (indexed from 1 to m), m, t (indexed from 1 to n), n

Output: i

text_search(p, m, t, n) {
    for i = 1 to n − m + 1 {
        j = 1
        // i is the index in t of the first character of the substring
        // to compare with p, and j is the index in p
        // the while loop compares tᵢ ⋯ tᵢ₊ₘ₋₁ and p₁ ⋯ pₘ
        while (tᵢ₊ⱼ₋₁ == pⱼ) {
            j = j + 1
            if (j > m)
                return i
        }
    }
    return 0
}

The algorithm indexes the characters in the text with i and those in the pattern with j. The search starts with the first character in the text, and if the pattern is not found, finishes with character n − m + 1 in the text, when the last characters in the pattern and text coincide.

If there is a match for the first indices in the pattern and text, then the pattern index j is incremented by 1, and the next characters compared. This continues until either the pattern is found, or there is not a match. In the latter case, the text index i is incremented by 1, the pattern index j is reset to 1, and the match process repeats.
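A Python sketch of the text search algorithm follows. It keeps the 1-based indices i and j of the pseudocode and subtracts 1 only when indexing the Python strings; taking m and n from the string lengths, and assuming a non-empty pattern, are choices made for this sketch.

```python
def text_search(p, t):
    """Return the smallest 1-based index at which p occurs in t, or 0."""
    m, n = len(p), len(t)           # assumes m >= 1
    for i in range(1, n - m + 2):   # i = 1 to n - m + 1
        j = 1
        # Compare t_(i+j-1) with p_j; Python strings are 0-based.
        while t[i + j - 2] == p[j - 1]:
            j = j + 1
            if j > m:
                return i
    return 0


print(text_search("001", "010001"))
```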


Example 1. This shows the operation of the text search algorithm in a search for the string "001" in the string "010001". Each row gives the text index i, the pattern index j, the characters pⱼ and tᵢ₊ⱼ₋₁ being compared, and whether they match (Y or N).

Step   i   j   pⱼ   tᵢ₊ⱼ₋₁   Match
1      1   1   0    0        Y
2      1   2   0    1        N
3      2   1   0    1        N
4      3   1   0    0        Y
5      3   2   0    0        Y
6      3   3   1    0        N
7      4   1   0    0        Y
8      4   2   0    0        Y
9      4   3   1    1        Y

1. The text index i and the pattern index j are initially set to 1, and the first characters compared. There is a match.

2. The pattern index j is incremented to 2, and the pattern and text characters j and i + j − 1 compared. There is not a match.

3. The text index i is incremented to 2 and the pattern index j reset to 1. There is not a match in the corresponding characters.

4. The text index i is incremented to 3 and the pattern index j left at 1. There is a match.

5. The pattern index j is incremented to 2, and the pattern and text characters j and i + j − 1 compared. There is a match.

6. The pattern index j is incremented to 3, and the pattern and text characters j and i + j − 1 compared. There is not a match.

7. The text index i is incremented to 4 and the pattern index j reset to 1. There is a match.

8. The pattern index j is incremented to 2, and the pattern and text characters j and i + j − 1 compared. There is a match.

9. The pattern index j is incremented to 3, and the pattern and text characters j and i + j − 1 compared. There is a match, and the pattern is found.

Example 2. Algorithm Testing Whether a Positive Integer is Prime

This algorithm finds whether a positive integer is prime or composite.

Input: m, a positive integer
Output: true, if m is prime; false, if m is not prime

is_prime(m) {
    for i = 2 to ⌊√m⌋
        if (m mod i == 0)
            return false
    return true
}

Note that the trial divisors need only go up to √m, since if m is not prime, one factor will be less than or equal to √m and one factor will be greater than or equal to √m. If m mod i is 0, then i divides m, false is returned, and the function is terminated. If the for loop runs to its conclusion, control passes to the fifth line and true is returned.

Example 3. Algorithm Finding a Prime Larger Than a Given Integer

This algorithm finds the first prime larger than a given positive integer.

Input: n, a positive integer
Output: m, the smallest prime greater than n

large_prime(n) {
    m = n + 1
    while (¬is_prime(m))
        m = m + 1
    return m
}

The algorithm starts with m = n + 1, calls the function is_prime to test if m is prime, and continually increments m in a while loop as long as m is not prime. If the function is_prime returns false, then ¬false is true, and the while loop continues. If the function is_prime returns true, then ¬true is false, the while loop terminates, and m is returned.
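Both prime algorithms can be sketched in Python. math.isqrt (available from Python 3.8) gives the integer part of the square root exactly. As in the pseudocode, is_prime makes no special case for m = 1, so it should only be relied on for m ≥ 2; large_prime never passes it a smaller value when n ≥ 1.

```python
import math


def is_prime(m):
    """Trial division by 2, 3, ..., floor(sqrt(m)); intended for m >= 2."""
    for i in range(2, math.isqrt(m) + 1):
        if m % i == 0:
            return False
    return True


def large_prime(n):
    """Smallest prime greater than the positive integer n."""
    m = n + 1
    while not is_prime(m):
        m = m + 1
    return m


print(large_prime(10), large_prime(13))
```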


Analysis of Algorithms

Analysis of an algorithm involves deriving estimates of the time and storage space needed to execute it. Time is usually the more crucial. Table 4.3.1 shows how the time to execute an algorithm varies with the size of the input and the complexity of the algorithm. Note that lg n means the logarithm to the base 2 of n. We can see from the table, for example, that with an input of size n = 10⁶, the time to execute is 20 seconds if the algorithm behaves as n lg n, and twelve days if the algorithm behaves as n².

This has real implications. Cooley and Tukey derived the Fast Fourier Transform algorithm in the 1960s, which reduced the time for a numerical discrete Fourier transform from behaving as n² to behaving as n lg n.

Real-life algorithms are checked for the amount of time to execute. For example, it is pointless to have an air traffic control algorithm that takes two hours to run if an update is required every 15 minutes.

Suppose that we measure the size of the “input” as n.

Best-case time

The minimum time needed to execute the algorithm among all inputs of size n.

Worst-case time

The maximum time needed to execute the algorithm among all inputs of size n.

Average-case time

The average time needed to execute the algorithm among all inputs of size n.

Example 1.

(a) The algorithm is to find the largest element in a sequence.

The number of iterations of a while loop (or comparisons) is n − 1 for input of size n in all three cases.

(b) Search for a key word in a list of size n.

The best-case time is 1 if the key word is first in the list.

The worst-case time is n if the key word is last in the list, or not there at all.

The average-case time is the average over all n + 1 cases, where the key word is in position i, i = 1, 2, ..., n in the list or not there at all. Taking the cases to be equally likely, this is

(1 + 2 + 3 + ⋯ + n + n)/(n + 1) = (n(n + 1)/2 + n)/(n + 1) = (n² + 3n)/(2(n + 1)).
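The closed form can be checked against a direct average using exact fractions. The function names are hypothetical, and the n + 1 cases are weighted equally, as in the formula above.

```python
from fractions import Fraction


def average_comparisons(n):
    """Average of the costs 1, 2, ..., n (found at position i) and n (absent)."""
    costs = list(range(1, n + 1)) + [n]
    return Fraction(sum(costs), n + 1)


def closed_form(n):
    """The simplified expression (n^2 + 3n) / (2(n + 1))."""
    return Fraction(n * n + 3 * n, 2 * (n + 1))


agree = all(average_comparisons(n) == closed_form(n) for n in range(1, 51))
print(agree)
```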

(c) A set X contains red items and black items. Count all subsets that contain at least one red item. Since there are 2ⁿ subsets, the time taken behaves as 2ⁿ.

(d) In the travelling salesperson problem, the salesperson visits n towns in some order.

There are ½(n − 1)! possible tours of n towns.

Question: Which grows the faster, 2ⁿ or ½(n − 1)!?

Number of Steps
for Input            Time to Execute if n =
of Size n     3              6              9              12

1             10⁻⁶ sec       10⁻⁶ sec       10⁻⁶ sec       10⁻⁶ sec
lg lg n       10⁻⁶ sec       10⁻⁶ sec       2 × 10⁻⁶ sec   2 × 10⁻⁶ sec
lg n          2 × 10⁻⁶ sec   3 × 10⁻⁶ sec   3 × 10⁻⁶ sec   4 × 10⁻⁶ sec
n             3 × 10⁻⁶ sec   6 × 10⁻⁶ sec   9 × 10⁻⁶ sec   10⁻⁵ sec
n lg n        5 × 10⁻⁶ sec   2 × 10⁻⁵ sec   3 × 10⁻⁵ sec   4 × 10⁻⁵ sec
n²            9 × 10⁻⁶ sec   4 × 10⁻⁵ sec   8 × 10⁻⁵ sec   10⁻⁴ sec
n³            3 × 10⁻⁵ sec   2 × 10⁻⁴ sec   7 × 10⁻⁴ sec   2 × 10⁻³ sec
2ⁿ            8 × 10⁻⁶ sec   6 × 10⁻⁵ sec   5 × 10⁻⁴ sec   4 × 10⁻³ sec

Number of Steps
for Input            Time to Execute if n =
of Size n     50             100            1000

1             10⁻⁶ sec       10⁻⁶ sec       10⁻⁶ sec
lg lg n       2 × 10⁻⁶ sec   3 × 10⁻⁶ sec   3 × 10⁻⁶ sec
lg n          6 × 10⁻⁶ sec   7 × 10⁻⁶ sec   10⁻⁵ sec
n             5 × 10⁻⁵ sec   10⁻⁴ sec       10⁻³ sec
n lg n        3 × 10⁻⁴ sec   7 × 10⁻⁴ sec   10⁻² sec
n²            3 × 10⁻³ sec   0.01 sec       1 sec
n³            0.13 sec       1 sec          16.7 min
2ⁿ            36 yr          4 × 10¹⁶ yr    3 × 10²⁸⁷ yr

Number of Steps
for Input            Time to Execute if n =
of Size n     10⁵              10⁶

1             10⁻⁶ sec         10⁻⁶ sec
lg lg n       4 × 10⁻⁶ sec     4 × 10⁻⁶ sec
lg n          2 × 10⁻⁵ sec     2 × 10⁻⁵ sec
n             0.1 sec          1 sec
n lg n        2 sec            20 sec
n²            3 hr             12 days
n³            32 yr            31,710 yr
2ⁿ            3 × 10³⁰⁰⁸⁹ yr   3 × 10³⁰¹⁰¹⁶ yr

TABLE 4.3.1 Time to execute an algorithm if one step takes 1 microsecond to execute.

Often we are less interested in the best- or worst-case times for an algorithm to execute than in how the times grow as n increases.

Example 2. Suppose the worst-case time is

t(n) = n^3/3 + n^2 − n/3.

Then for n = 10, 100, 1,000, 10,000 we have the table

n         n^3/3 + n^2 − n/3    n^3/3

10        430                  333

100       343,300              333,333

1,000     334,333,000          333,333,333

10,000    3.334 × 10^11        3.333 × 10^11

For large n,

t(n) ≈ n^3/3.

Hence t(n) is of order n^3, ignoring the constant 1/3.
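The table in Example 2 can be reproduced with a few lines of code. This is only a sketch; the names t and leading_term are hypothetical, and exact integer arithmetic is used because n^3 + 3n^2 − n is always divisible by 3.

```python
def t(n: int) -> int:
    """Worst-case time t(n) = n^3/3 + n^2 - n/3, computed exactly."""
    return (n ** 3 + 3 * n ** 2 - n) // 3


def leading_term(n: int) -> float:
    """The dominant term n^3/3."""
    return n ** 3 / 3


if __name__ == "__main__":
    for n in (10, 100, 1_000, 10_000):
        # The ratio approaches 1 as n grows, which is why t(n) is of order n^3.
        print(n, t(n), round(leading_term(n)), t(n) / leading_term(n))
```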

Definition. Let f and g be functions with domain {1, 2, 3, . . .}.

We write
f(n) = O(g(n))
and say that f(n) is of order at most g(n) or f(n) is big oh of g(n) if there exists a positive constant C1 such that

|f(n)| ≤ C1 |g(n)|

for all but finitely many positive integers n.

We write
f(n) = Ω(g(n))
and say that f(n) is of order at least g(n) or f(n) is omega of g(n) if there exists a positive constant C2 such that

|f(n)| ≥ C2 |g(n)|

for all but finitely many positive integers n.

We write
f(n) = Θ(g(n))
and say that f(n) is of order g(n) or f(n) is theta of g(n) if f(n) = O(g(n)) and f(n) = Ω(g(n)).

Example 3. f(n) = 4n + 3. Then

f(n) ≤ 4n + 3n

= 7n,

so f(n) = O(n). Also

f(n) ≥ 4n,

so f(n) = Ω(n).

Therefore f(n) = Θ(n).

Example 4. f(n) = 2n^2 + 3n + 1. Then

f(n) ≤ 2n^2 + 3n^2 + n^2

= 6n^2,

so f(n) = O(n^2). Also

f(n) ≥ 2n^2,

so f(n) = Ω(n^2).

Therefore f(n) = Θ(n^2).

Example 2. (Continued) t(n) = n^3/3 + n^2 − n/3.

t(n) ≤ n^3/3 + n^2

≤ n^3/3 + n^3

= (4/3)n^3,

so t(n) = O(n^3). Also, since n ≤ n^2 for n ≥ 1,

t(n) ≥ n^3/3 + n^2 − n^2/3

= n^3/3 + (2/3)n^2

≥ n^3/3,

so t(n) = Ω(n^3).

Therefore t(n) = Θ(n^3).

Exercise: Try f(n) = 4n^3 − n^2 + 2n.

Theorem 4.3.4 Let

p(n) = a_k n^k + a_(k−1) n^(k−1) + · · · + a_1 n + a_0

be a polynomial in n of degree k, where each a_i is nonnegative (and a_k ≠ 0). Then

p(n) = Θ(n^k).

Proof:

p(n) = a_k n^k + a_(k−1) n^(k−1) + · · · + a_1 n + a_0

≤ a_k n^k + a_(k−1) n^k + · · · + a_1 n^k + a_0 n^k

= (a_k + a_(k−1) + · · · + a_1 + a_0) n^k

= C1 n^k,

so p(n) = O(n^k). Also

p(n) ≥ a_k n^k,

so p(n) = Ω(n^k).

Therefore p(n) = Θ(n^k).

[Figure: the curves y = 1, y = lg n, y = n, y = n lg n, y = n^2 and y = 2^n plotted against n.]

Figure 4.3.1 Growth of some common functions.

Example 5. f(n) = 2n + 3 lg n.

Remember that lg n represents log_2 n.

Does your calculator have an lg button? If not,

y = lg n

n = 2^y

ln n = ln (2^y)

= y ln 2, so

y = ln n / ln 2.

Now lg n < n for all n ≥ 1. (See the preceding graph.) Therefore

f(n) = 2n + 3 lg n

< 2n + 3n

= 5n,

so f(n) = O (n). Also

f(n) ≥ 2n,

so f(n) = Ω (n).

Therefore f(n) = Θ (n).
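The change of base used in Example 5 is easy to try out. The sketch below assumes only the standard math module; the names lg and f are hypothetical, and the loop checks the bounds 2n ≤ f(n) < 5n numerically for a range of n.

```python
import math


def lg(n: float) -> float:
    """Base-2 logarithm computed from natural logarithms: lg n = ln n / ln 2."""
    return math.log(n) / math.log(2)


def f(n: int) -> float:
    """The function of Example 5, f(n) = 2n + 3 lg n."""
    return 2 * n + 3 * lg(n)


if __name__ == "__main__":
    print(lg(8), lg(1000))
    # Numerical check of the two bounds used to show f(n) = Θ(n).
    print(all(2 * n <= f(n) < 5 * n for n in range(1, 1001)))
```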

Example 6. f(n) = 1 + 2 + 3 + · · · + (n − 1) + n.

This example assumes that we don’t know that the sum is n(n + 1)/2.

1 + 2 + 3 + · · · + (n − 1) + n ≤ n + n + n + · · · + n + n

= n^2,

so f(n) = O(n^2).

Also

1 + 2 + 3 + · · · + (n − 1) + n ≥ 1 + 1 + 1 + · · · + 1 + 1

= n,

so f(n) = Ω(n).

This seems to be too low an estimate. We can do better. Let’s be trickier, and throw away approximately the first half of the series.

1 + 2 + 3 + · · · + (n − 1) + n ≥ ⌈n/2⌉ + ⌈n/2 + 1⌉ + · · · + (n − 1) + n

≥ ⌈n/2⌉ + ⌈n/2⌉ + · · · + ⌈n/2⌉.

How many terms are there?

If n = 8, the terms are 4 + 5 + 6 + 7 + 8, i.e. 5 = ⌈(n + 1)/2⌉ terms.

If n = 9, the terms are 5 + 6 + 7 + 8 + 9, i.e. 5 = ⌈(n + 1)/2⌉ terms.

If n = 2k, the terms are k + (k + 1) + · · · + 2k, i.e. 2k − (k − 1) = k + 1 terms.

If n = 2k + 1, the terms are (k + 1) + (k + 2) + · · · + (2k + 1), i.e. (2k + 1) − k = k + 1 terms.

Hence there are ⌈(n + 1)/2⌉ terms. Therefore

f(n) ≥ ⌈(n + 1)/2⌉ ⌈n/2⌉

≥ (n/2) · (n/2)

= n^2/4,

so f(n) = Ω(n^2).

Therefore f(n) = Θ(n^2).

If we know that f(n) = n(n + 1)/2, then

f(n) = (1/2)n^2 + (1/2)n

≤ (1/2)n^2 + (1/2)n^2

= n^2,

so f(n) = O(n^2).

Also

f(n) ≥ (1/2)n^2,

so f(n) = Ω(n^2).

Therefore f(n) = Θ(n^2).
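The bounds found in Example 6 can be checked numerically. In this sketch the names f and bounds_hold are hypothetical; the check uses the constants 1/4 and 1 from the argument that discards the first half of the series.

```python
def f(n: int) -> int:
    """The sum 1 + 2 + ... + n, computed term by term."""
    return sum(range(1, n + 1))


def bounds_hold(n: int) -> bool:
    """Check n^2/4 <= f(n) <= n^2, the bounds that give f(n) = Θ(n^2)."""
    return n * n / 4 <= f(n) <= n * n


if __name__ == "__main__":
    print(f(100))
    print(all(bounds_hold(n) for n in range(1, 500)))
```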



Analysis of Algorithms (Continued)

Example 7. We can generalize Example 6 to kth powers.

f(n) = 1^k + 2^k + 3^k + · · · + (n − 1)^k + n^k.

Now

f(n) = 1^k + 2^k + 3^k + · · · + (n − 1)^k + n^k

≤ n^k + n^k + n^k + · · · + n^k + n^k

= n × n^k

= n^(k+1).

So f(n) = O(n^(k+1)).

f(n) ≥ ⌈n/2⌉^k + ⌈n/2 + 1⌉^k + · · · + (n − 1)^k + n^k

≥ ⌈n/2⌉^k + ⌈n/2⌉^k + · · · + ⌈n/2⌉^k

= ⌈(n + 1)/2⌉ ⌈n/2⌉^k

≥ ((n + 1)/2) (n/2)^k

= (1/2^(k+1)) [(n + 1) n^k]

= (1/2^(k+1)) (n^(k+1) + n^k)

≥ (1/2^(k+1)) n^(k+1).

So f(n) = Ω(n^(k+1)).

∴ f(n) = Θ(n^(k+1)).

Example 8. What is the order of n! ?

What is n! ? The basic definition is

n! = n × (n − 1) × (n − 2) × · · · × 3 × 2 × 1.

Hence 1! = 1

2! = 2

3! = 6

4! = 24

5! = 120

6! = 720

7! = 5040


etc.

My calculator runs out at 69! , as 70! > 10^100. What is the limit on your calculator?

Then 0! is defined to be 1. What is (1 1/2)! ?

Stirling’s formula gives an approximation for n! :

n! ≈ √(2πn) (n/e)^n.

Hence
ln n! ≈ (1/2) ln 2π + (1/2) ln n + n ln n − n.

So we suspect that
lg n! = Θ(n lg n).

Then, taking lg of n! ,

lg n! = lg n + lg (n − 1) + · · · + lg 2 + lg 1

≤ lg n + lg n + · · · + lg n + lg n

= n lg n.

So lg n! = O(n lg n). Finding a lower limit,

lg n! ≥ lg n + lg (n − 1) + · · · + lg ⌈n/2⌉

≥ lg ⌈n/2⌉ + lg ⌈n/2⌉ + · · · + lg ⌈n/2⌉

= ⌈(n + 1)/2⌉ lg ⌈n/2⌉

≥ (n/2) lg (n/2)

= (n/2)(lg n − lg 2)

= (n/2) [(lg n)/2 + (lg n)/2 − 1]

≥ (n/4) lg n, as for n ≥ 4, lg n ≥ 2.

So lg n! = Ω(n lg n).

∴ lg n! = Θ(n lg n).
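The two bounds on lg n! from Example 8 can be tested for small n. This is a sketch only; lg_factorial and bounds_hold are hypothetical names, and the lower bound is checked for n ≥ 4, where the argument applies.

```python
import math


def lg_factorial(n: int) -> float:
    """lg n! computed as lg 1 + lg 2 + ... + lg n."""
    return sum(math.log2(i) for i in range(1, n + 1))


def bounds_hold(n: int) -> bool:
    """Check (n/4) lg n <= lg n! <= n lg n for a given n >= 4."""
    return (n / 4) * math.log2(n) <= lg_factorial(n) <= n * math.log2(n)


if __name__ == "__main__":
    for n in (4, 10, 100):
        print(n, (n / 4) * math.log2(n), lg_factorial(n), n * math.log2(n))
```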

Definition. If an algorithm requires t(n) units of time to terminate in the best case for an input of size n, and

t(n) = O(g(n)),

we say that the best-case time required by the algorithm is of order at most g(n) or that the best-case time required by the algorithm is O(g(n)).

If an algorithm requires t(n) units of time to terminate in the worst case for an input of size n, and

t(n) = O(g(n)),

we say that the worst-case time required by the algorithm is of order at most g(n) or that the worst-case time required by the algorithm is O(g(n)).

If an algorithm requires t(n) units of time to terminate in the average case for an input of size n, and

t(n) = O(g(n)),

we say that the average-case time required by the algorithm is of order at most g(n) or that the average-case time required by the algorithm is O(g(n)).

Replace O by Ω and “at most” by “at least” to obtain the definition of what it means for the best-case, worst-case or average-case time of an algorithm to be of order at least g(n).

If the best-case time is O(g(n)) and Ω(g(n)), then the best-case time required by the algorithm is Θ(g(n)).

Similar definitions apply for the worst-case and average-case times.

Example 1. (b) (Continued)

Johnsonbaugh Algorithm 4.3.17. Searching for a key in an unordered sequence.

The best-case time is 1 i.e. Θ(1).

The worst-case time is n i.e. Θ(n).

The average-case time is

((1 + 2 + 3 + · · · + n) + n)/(n + 1) = (n^2 + 3n)/(2(n + 1))

i.e. Θ(n).
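The average-case count can be confirmed by brute force. The sketch below is not Johnsonbaugh's Algorithm 4.3.17 itself but a simple linear search written for illustration; the names comparisons and average_comparisons are hypothetical, and each of the n + 1 cases is assumed equally likely.

```python
from fractions import Fraction


def comparisons(seq, key) -> int:
    """Number of items examined by a linear search for key in seq."""
    for count, item in enumerate(seq, start=1):
        if item == key:
            return count
    return len(seq)  # key absent: every item was examined


def average_comparisons(n: int) -> Fraction:
    """Average over the n + 1 cases: key in position 1..n, or not present."""
    seq = list(range(1, n + 1))
    cases = seq + [0]  # 0 does not occur in seq
    total = sum(comparisons(seq, key) for key in cases)
    return Fraction(total, n + 1)


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(n, average_comparisons(n), Fraction(n * n + 3 * n, 2 * (n + 1)))
```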

Example 9. Consider the pseudocode

for i = 1 to n
    for j = 1 to i
        x = x + 1

Find a theta notation for the number of times the statement x = x + 1 is executed.

The number of times is
1 + 2 + 3 + · · · + n = (1/2)n^2 + (1/2)n,

which is Θ(n^2).
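The count in Example 9 can be observed directly by running the loops. A minimal sketch, with the hypothetical name count_nested:

```python
def count_nested(n: int) -> int:
    """Run the nested loops of Example 9 and return how often x = x + 1 executes."""
    x = 0
    for i in range(1, n + 1):
        for j in range(1, i + 1):
            x += 1
    return x


if __name__ == "__main__":
    for n in (1, 5, 10, 100):
        print(n, count_nested(n), n * (n + 1) // 2)
```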

Example 10. Consider the pseudocode

i = n
while (i ≥ 1)
    x = x + 1
    i = ⌊i/2⌋

Find a theta notation for the number of times the statement x = x + 1 is executed.

Suppose that n = 8. Then for

i = 8    x = x + 1 is executed
i = 4    x = x + 1 is executed
i = 2    x = x + 1 is executed
i = 1    x = x + 1 is executed
i = 0    x = x + 1 is not executed.

So the statement is executed 4 times.

Suppose n = 2^k. Then the statement is executed for

i = 2^k, 2^(k−1), 2^(k−2), . . . , 2, 2^0 = 1,

i.e. k + 1 = 1 + lg n times.

If
2^k ≤ n < 2^(k+1),
then after k iterations

1 ≤ i = ⌊n/2^k⌋ < 2,

so i = 1 and the statement is executed once more, k + 1 times in all.

So for all n, x = x + 1 is executed 1 + ⌊lg n⌋ times, which is Θ(lg n).
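The halving loop of Example 10 can be run as written. In this sketch count_halving is a hypothetical name; the comparison with int.bit_length relies on the fact that a positive integer n has exactly 1 + ⌊lg n⌋ binary digits.

```python
def count_halving(n: int) -> int:
    """Run the loop of Example 10 and return how often x = x + 1 executes."""
    x = 0
    i = n
    while i >= 1:
        x += 1
        i //= 2  # i = floor(i / 2)
    return x


if __name__ == "__main__":
    for n in (1, 8, 9, 1000):
        print(n, count_halving(n), n.bit_length())
```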

Example 11. Find a theta notation for the number of times the statement x = x + 1 is executed in the following pseudocode.

j = n
while (j ≥ 1)
    for i = 1 to j
        x = x + 1
    j = ⌊j/2⌋

Let t(n) denote the number of times the statement x = x + 1 is executed.

The first time in the while loop, it is executed n times.

∴ t(n) ≥ n, and t(n) = Ω(n).

The second time in the while loop, the statement is executed ⌊n/2⌋ times, and so on. Hence

t(n) ≤ n + n/2 + n/4 + n/8 + · · · + n/2^(k−1),

where n/2^k < 1. So

t(n) ≤ n(1 − 1/2^k)/(1 − 1/2)

= 2n(1 − 1/2^k)

≤ 2n.

∴ t(n) = O(n).

∴ t(n) = Θ(n).
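The bounds n ≤ t(n) ≤ 2n from Example 11 can be checked by running the pseudocode. A minimal sketch, with the hypothetical name count_shrinking:

```python
def count_shrinking(n: int) -> int:
    """Run the loops of Example 11 and return how often x = x + 1 executes."""
    x = 0
    j = n
    while j >= 1:
        for i in range(1, j + 1):
            x += 1
        j //= 2  # j = floor(j / 2)
    return x


if __name__ == "__main__":
    for n in (1, 8, 100, 1000):
        print(n, count_shrinking(n), 2 * n)
```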

A “good” algorithm has polynomial worst-case time, and problems that have such an algorithm are called feasible or tractable problems. Problems with exponential or factorial worst-case time are called intractable, and require a long time to execute even for relatively small n.

There are problems for which there is no algorithm. These are called unsolvable. Such a problem is the halting problem: given an arbitrary program and a set of inputs, will the program ever halt?

A large number of solvable problems have an undetermined status. They are thought to be intractable, but none have ever been proved to be intractable. Such a problem is the travelling salesperson problem.

Recursive Algorithms

A recursive function invokes itself. A recursive algorithm is an algorithm that contains a recursive function.

Example 1. Factorials. Johnsonbaugh Example 4.4.1

We know that n! = n(n − 1)!, and that 0! is defined as 1. We can resolve the problem of computing n! into the simpler problem of computing (n − 1)!, then into computing (n − 2)! and so on, until we get to 0!, which is known, i.e.

Problem    Simplified Problem
5!         5 · 4!
4!         4 · 3!
3!         3 · 2!
2!         2 · 1!
1!         1 · 0!
0!         None

Table 1: Decomposing the factorial problem.

Problem    Solution
0!         1
1!         1 · 0! = 1
2!         2 · 1! = 2
3!         3 · 2! = 6
4!         4 · 3! = 24
5!         5 · 4! = 120

Table 2: Combining subproblems of the factorial problem.

Algorithm 4.4.2 Computing n Factorial

This recursive algorithm computes n!.

Input: n, an integer greater than or equal to 0
Output: n!

1. factorial(n)
2.     if (n == 0)
3.         return 1
4.     return n ∗ factorial(n − 1)

We can see how the algorithm computes n!.

If n = 1, proceed to line 4 since n ≠ 0, and compute

n · (n − 1)! = 1 · 0! = 1 · 1 = 1.

If n = 2, proceed to line 4 since n ≠ 0, and compute

n · (n − 1)! = 2 · 1! = 2 · 1 = 2,

and so on.

Theorem: Algorithm 4.4.2 returns the value of n! , n ≥ 0.

Proof: Basis Step: For n = 0, the algorithm correctly returns 1.

Inductive Step: Assume that the algorithm correctly returns the value of (n − 1)! , n > 0.

Suppose n is input to the algorithm. Since n > 0, proceed to line 4. By the assumption, the algorithm correctly computes (n − 1)! . Hence the algorithm correctly computes n! = n · (n − 1)! .

Recursive algorithms and their proofs go hand-in-hand with mathematical induction.
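Algorithm 4.4.2 translates almost line for line into a programming language. The following Python version is a sketch for illustration; the comparison with math.factorial is included only as a check.

```python
import math


def factorial(n: int) -> int:
    """Recursive factorial following Algorithm 4.4.2; n must be >= 0."""
    if n == 0:          # basis step: 0! = 1
        return 1
    return n * factorial(n - 1)   # recursive step: n! = n * (n - 1)!


if __name__ == "__main__":
    for n in range(8):
        print(n, factorial(n), math.factorial(n))
```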

Example 2. This algorithm recursively finds the smallest of a finite sequence of numbers.

Algorithm Recursively Finding the Minimum Value in a Sequence

This algorithm finds the smallest of the numbers s1, s2, . . . , s_n.

Input: s, n

Output: small (the smallest value in the sequence s)

min(s, n)
    if (n == 1)
        small = s1
        return small
    else
        small = min(s, n − 1)
        if small ≤ s_n
            return small
        else
            small = s_n
            return small

If n = 1, there is only one number in the sequence, and it is returned.

If n = 2, then min(s, 1) is recursively called, and this returns s1. This is compared with s2, and the smaller returned.

If n = 3, then min(s, 2) is recursively called, and this recursively calls min(s, 1), which returns s1. Then min(s, 2) returns the smaller of s1 and s2, and this is compared with s3 and the smaller returned.

Theorem: This algorithm correctly returns the value of the smallest of a finite sequence of numbers.

Proof: Basis Step: For n = 1, the algorithm returns the only number in the sequence.

Inductive Step: Assume that the algorithm correctly returns the smallest value in a sequence of length n − 1.

Then for a sequence of length n, n > 1, the algorithm correctly computes the smallest of s1, s2, . . . , s_(n−1), compares this with s_n, and returns the smaller.

Therefore the algorithm correctly returns the smallest number in a sequence of length n.
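The recursive minimum can be sketched in Python as follows. The function is named smallest here (a hypothetical name, chosen to avoid shadowing Python's built-in min), and because Python lists are indexed from 0, the element s_n of the notes is s[n - 1] in the code.

```python
def smallest(s, n: int):
    """Return the smallest of s[0], ..., s[n - 1] recursively; n must be >= 1."""
    if n == 1:
        return s[0]
    small = smallest(s, n - 1)        # smallest of the first n - 1 values
    if small <= s[n - 1]:
        return small
    return s[n - 1]


if __name__ == "__main__":
    data = [7, 3, 9, 3, 5]
    print(smallest(data, len(data)))
```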

Example 3. Robot walk. Johnsonbaugh Example 4.4.5

A robot can take steps of 1 metre or 2 metres. In how many ways can the robot walk n metres? Let walk(n) denote the number of ways. Then walk(1) = 1 and walk(2) = 2.

Distance    Sequence of Steps                                       Number of Ways to Walk
1           1                                                       1
2           1, 1 or 2                                               2
3           1, 1, 1 or 1, 2 or 2, 1                                 3
4           1, 1, 1, 1 or 1, 1, 2 or 1, 2, 1 or 2, 1, 1 or 2, 2     5

Suppose n > 2. The first step is either 1 metre, leaving n − 1 metres to walk, or 2 metres, leaving n − 2 metres. Then

walk(n) = walk(n − 1) + walk(n − 2).

Algorithm 4.4.6 Robot Walking

This algorithm computes the function defined by

walk(n) = 1,                            n = 1
          2,                            n = 2
          walk(n − 1) + walk(n − 2),    n > 2.

Input: n

Output: walk(n)

walk(n)
    if (n == 1 ∨ n == 2)
        return n
    return walk(n − 1) + walk(n − 2)

We can see that the sequence generated is

1, 2, 3, 5, 8, 13, 21, 34, 55, . . . .
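Algorithm 4.4.6 can be run directly to generate this sequence. The Python version below is a sketch for illustration; note that this plain recursion recomputes the same values many times, so it is only practical for small n.

```python
def walk(n: int) -> int:
    """Number of ways the robot can walk n metres (n >= 1), as in Algorithm 4.4.6."""
    if n == 1 or n == 2:
        return n
    return walk(n - 1) + walk(n - 2)


if __name__ == "__main__":
    print([walk(n) for n in range(1, 10)])
```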

Apart from a missing initial 1, this is the Fibonacci sequence f_n, which is defined by the equations

f_1 = 1

f_2 = 1

f_n = f_(n−1) + f_(n−2), for all n ≥ 3,

so that walk(n) = f_(n+1).

The Fibonacci sequence arises in many natural situations, as well as mathematical ones. For example, a pine cone can have 13 clockwise spirals and 8 counterclockwise spirals.

The ratio of successive terms in the Fibonacci sequence has a limit of the golden ratio,

lim (n → ∞) f_n / f_(n−1) = φ = (1 + √5)/2 = 1.6180339887498 . . . .

For example, 55/34 = 1.6176 . . . .

A classical formula for the Fibonacci numbers is

f_n = (1/√5) [ ((1 + √5)/2)^n − ((1 − √5)/2)^n ].

All those √5 occurrences, and the answer is an integer!
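The closed formula can be compared with the recursive definition numerically. In the sketch below, fib and binet are hypothetical names; binet uses floating-point arithmetic and rounds the result, so it is only reliable for moderate n.

```python
import math


def fib(n: int) -> int:
    """Fibonacci number f_n with f_1 = f_2 = 1, computed iteratively."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a


def binet(n: int) -> int:
    """The classical closed formula, evaluated in floating point and rounded."""
    root5 = math.sqrt(5)
    return round((((1 + root5) / 2) ** n - ((1 - root5) / 2) ** n) / root5)


if __name__ == "__main__":
    print([fib(n) for n in range(1, 11)])
    print(fib(10) / fib(9), (1 + math.sqrt(5)) / 2)
```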



Graph Theory

Introduction to Graphs

Graph theory was introduced by Leonhard Euler in 1736, as a means of solving the Konigsberg bridge problem. There was a revival of graph theory in the 1920’s, with the first text produced in 1936. Graph theory involves a lot of terminology, as will be seen.

A graph is drawn with dots and lines, the dots being vertices and the lines being edges. The important information in the graph is the connections, not the positions of the vertices and edges.

Example 1. The following two graphs contain the same information.

[Figure: two drawings of the same graph, with vertices v1, v2, v3, v4, v5 and edges e1, e2, e3, e4, e5, e6.]

A path starts at one vertex v1, travels along an edge to vertex v2, and so on, arriving at v_n.

For a path to traverse every edge exactly once, and return to the original vertex, an even number of edges must touch each vertex, for example, the graph in Example 1. In the graph shown below, a path can traverse each edge exactly once, but will not return to the original vertex.

[Figure: a graph with vertices v1, v2, v3, v4.]

Definition 8.1.1

A graph (or undirected graph) G consists of a set V of vertices (or nodes) and a set E of edges (or arcs) such that each edge e ∈ E is associated with an unordered pair of vertices. If there is a unique edge e associated with the vertices v and w, we write e = (v, w) or e = (w, v). In this context, (v, w) denotes an edge between v and w in an undirected graph and not an ordered pair.

A directed graph (or digraph) G consists of a set V of vertices (or nodes) and a set E of edges (or arcs) such that each edge e ∈ E is associated with an ordered pair of vertices. If there is a unique edge e associated with the ordered pair (v, w) of vertices, we write e = (v, w), which denotes an edge from v to w.

An edge e in a graph (undirected or directed) that is associated with the pair of vertices v and w is said to be incident on v and w, and v and w are said to be incident on e and to be adjacent vertices.

If G is a graph (undirected or directed) with vertices V and edges E, we write G = (V, E).

Unless specified otherwise, the sets E and V are assumed to be finite and V is assumed to be nonempty.

Example 1. (Continued)
V = {v1, v2, v3, v4, v5}

E = {e1, e2, e3, e4, e5, e6}

e2 = (v2, v3) = (v3, v2)

Edge e4 is incident on vertices v2 and v4, and vertices v2 and v4 are adjacent.

Example 2. The graph shown below is a directed graph.

[Figure: a directed graph with vertices v1, . . . , v7 and edges e1, . . . , e7.]

Directed edges are indicated by arrows.

Edge e1 is associated with the ordered pair (v2, v1) of vertices.

Distinct edges can be associated with the same pair of vertices e.g. e3 and e4. These are parallel edges, and can also occur in undirected graphs.

[Figure: two parallel edges e1 and e2 joining vertices v1 and v2.]

An edge incident on a single vertex is called a loop e.g. e7.

A vertex not incident to any edge is called isolated e.g. v7.

A graph with neither loops nor parallel edges is called a simple graph.


Example 3. Holes are being bored in a sheet of metal by a drill press, and these can be considered as the vertices of a graph. There is a travel time associated with every edge between vertices, so that we have a weighted graph. This is shown in the diagram below.

[Figure: a weighted graph on the vertices a, b, c, d, e, with edge weights 2, 3, 4, 4, 5, 6, 6, 8, 9 and 12.]

If an edge e is labelled k, we say that the weight of edge e is k.

The length of a path is the sum of the weights of the edges in the path. A path of minimum length that visits each vertex exactly once represents the optimum path for the drill press.

For example, for a path starting at b and finishing at e, we have path lengths:

Path             Length
b, a, c, d, e    17
b, a, d, c, e    20
b, c, a, d, e    16
b, c, d, a, e    19
b, d, a, c, e    23
b, d, c, a, e    23

Additionally, all starting and finishing vertices need to be checked:

(a, b), (a, c), (a, d), (a, e), (b, c), (b, d), (b, e), (c, d), (c, e), (d, e).

It is expected that the reverse path will be of the same length.

Example 4. Johnsonbaugh Example 8.1.7: Similarity Graphs.

A particular algorithm in C++ is implemented by a number of persons. We wish to group “like” programs into classes based on program properties. The properties are:

1. The number of lines in the program

2. The number of return statements in the program

3. The number of function calls in the program.

Program    Number of        Number of            Number of
           program lines    return statements    function calls
1          66               20                   1
2          41               10                   2
3          68               5                    8
4          90               34                   5
5          75               12                   14

A similarity graph is constructed with vertices corresponding to the programs. A vertex is denoted (p1, p2, p3) where p_i is the value of property i.

A dissimilarity function s is defined as follows. For each pair of vertices v = (p1, p2, p3) and w = (q1, q2, q3) set

s(v, w) = |p1 − q1| + |p2 − q2| + |p3 − q3| .

In this example, we have

s(v1, v2) = 36    s(v2, v3) = 38    s(v3, v4) = 54
s(v1, v3) = 24    s(v2, v4) = 76    s(v3, v5) = 20
s(v1, v4) = 42    s(v2, v5) = 48    s(v4, v5) = 46
s(v1, v5) = 30

For a fixed number S, insert an edge between vertices v and w if s(v, w) < S. For example, if S = 25 we have the graph:

[Figure: the similarity graph on vertices v1, . . . , v5 for S = 25; its edges are (v1, v3) and (v3, v5).]

We say that v and w are in the same class if v = w or if there is a path from v to w. Here the classes are {1, 3, 5}, {2} and {4}. What are the classes if S = 40?
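The construction of the similarity graph and its classes can be sketched in code. The data are the five programs from the table; the names PROGRAMS, s and classes are hypothetical, and the classes are found by a simple search for connected pieces, which is one of several ways to do it.

```python
from itertools import combinations

# Program number -> (program lines, return statements, function calls)
PROGRAMS = {
    1: (66, 20, 1),
    2: (41, 10, 2),
    3: (68, 5, 8),
    4: (90, 34, 5),
    5: (75, 12, 14),
}


def s(v, w) -> int:
    """Dissimilarity: the sum of absolute differences of the three properties."""
    return sum(abs(p - q) for p, q in zip(v, w))


def classes(threshold: int):
    """Group programs joined by paths of edges with s(v, w) < threshold."""
    adjacent = {v: set() for v in PROGRAMS}
    for v, w in combinations(PROGRAMS, 2):
        if s(PROGRAMS[v], PROGRAMS[w]) < threshold:
            adjacent[v].add(w)
            adjacent[w].add(v)
    seen, result = set(), []
    for start in PROGRAMS:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in component:
                component.add(v)
                stack.extend(adjacent[v] - component)
        seen |= component
        result.append(sorted(component))
    return result


if __name__ == "__main__":
    print(classes(25))
```

Calling classes with a different threshold gives the classes for other values of S.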


Definition 8.1.9

The complete graph on n vertices, denoted K_n, is the simple graph with n vertices in which there is an edge between each pair of distinct vertices.

Example 5. The complete graph, K4, on 4 vertices is:

Definition 8.1.11

A graph G = (V, E) is bipartite if there exist subsets V1 and V2 (either possibly empty) of V such that V1 ∩ V2 = ∅, V1 ∪ V2 = V , and each edge in E is incident on one vertex in V1 and one vertex in V2.

Example 6. The graph in the figure below is bipartite for

V1 = {v1, v2, v3} and V2 = {v4, v5}.

[Figure: a bipartite graph with vertices v1, v2, v3 on one side and v4, v5 on the other.]

Note that it is not required that there is an edge between every vertex in V1 and every vertex in V2.

Example 7. The graph in the following figure is not bipartite.

[Figure: a graph with vertices v1, . . . , v9.]

It is often easier to prove that a graph is not bipartite by arguing a contradiction.

Suppose the graph is bipartite. Then we can partition the set of vertices into two subsets V1 and V2.

Consider v4, v5 and v6. Since v4 and v5 are adjacent, v4 is in V1 (say) and v5 is in V2.

Since v5 and v6 are adjacent, v6 is in V1.

But v4 and v6 are adjacent and both are in V1. Contradiction. Hence the graph is not bipartite.

Definition 8.1.15

The complete bipartite graph on m and n vertices, denoted K_(m,n), is the simple graph whose vertex set is partitioned into sets V1 with m vertices and V2 with n vertices in which the edge set consists of all edges of the form (v1, v2) with v1 ∈ V1 and v2 ∈ V2.

Example 8. The complete bipartite graph on two and four vertices, K2,4, is:

Paths and Cycles

Definition 8.2.1

Let v0 and v_n be vertices in a graph. A path from v0 to v_n of length n is an alternating sequence of n + 1 vertices and n edges beginning with vertex v0 and ending with vertex v_n,

(v0, e1, v1, e2, v2, . . . , v_(n−1), e_n, v_n),

in which edge e_i is incident on vertices v_(i−1) and v_i for i = 1, . . . , n.

The formalism in Definition 8.2.1 means: Start at vertex v0; go along edge e1 to v1; go along edge e2 to v2; and so on.

Definition 8.2.4

A graph G is connected if given any vertices v and w in G, there is a path from v to w.

Example 1. Consider the following graph.

[Figure: a graph with vertices v1, . . . , v9 and edges e1, . . . , e12.]

Paths joining v1 and v2 are (v1, v4, v2) of length 2 and (v1, v4, v5, v6, v7, v2) of length 5.

The graph is connected.

The graph

[Figure: a graph with vertices v1, . . . , v5 and edges e1, e2, e3.]

is not connected.

A connected graph consists of one “piece”, and a graph that is not connected consists of two or more “pieces”. These “pieces” are subgraphs of the original graph called components.

Definition 8.2.8

Let G = (V, E) be a graph. We call (V ′, E ′) a subgraph of G if

(a) V ′ ⊆ V and E ′ ⊆ E.

(b) For every edge e′ ∈ E ′, if e′ is incident on v′ and w′, then v′, w′ ∈ V ′.

Example 1. (Continued) A subgraph is

[Figure: the subgraph with vertices v1, v2, v3, v4, v5, v7 and edges e1, e2, e3, e4, e11.]

Definition 8.2.11

Let G be a graph and let v be a vertex in G. The subgraph G′ of G consisting of all edges and vertices in G that are contained in some path beginning at v is called the component of G containing v.

A connected graph, such as Example 1, has only one component, itself.

A graph such as

[Figure: a graph G with vertices v1, . . . , v5 and edges e1, e2, e3, drawn together with its two components G1 and G2.]

has two components G1 and G2.

We describe G2 as G2 = (V2, E2) with V2 = {v2, v3, v5} and E2 = {e2, e3}.

Definition 8.2.14

Let v and w be vertices in a graph G.

A simple path from v to w is a path from v to w with no repeated vertices.

A cycle (or circuit) is a path of nonzero length from v to v with no repeated edges.

A simple cycle is a cycle from v to v, in which, except for the beginning and ending vertices that are both equal to v, there are no repeated vertices.

Path                                    Simple Path?    Cycle?    Simple Cycle?
(v1, v4, v5, v6, v5)
(v1, v4, v6, v7, v2)
(v4, v1, v3, v5, v4, v6, v7, v2, v4)
(v4, v5, v6, v7, v2, v4)
(v4)

The Konigsberg Bridge Problem

Two islands in the Pregel River in Konigsberg (now Kaliningrad in Russia) were connected to each other and the river banks by seven bridges, as shown in the diagram below.

[Figure: the Pregel River with its two islands and seven bridges; the four land areas are labelled A, B, C and D.]

The problem is to start at any location, walk over each bridge exactly once, and finish at the starting location.

Leonhard Euler solved the problem in 1736. The problem can be represented as the following graph, where the vertices represent the locations and the edges represent the bridges.

[Figure: the corresponding graph with vertices A, B, C, D and seven edges.]

Euler showed that there is no solution, as all vertices have an odd number of incident edges.

A cycle in graph G that includes all of the edges and all of the vertices of G is called an Euler cycle, in honour of Euler.

We introduce the idea of the degree of a vertex v, δ(v). This is the number of edges incident on v.

Theorem 8.2.17

If a graph G has an Euler cycle, then G is connected and every vertex has even degree.

Proof: Suppose that G has an Euler cycle.

We have seen the argument that during the cycle, the path leaves each vertex for every time that it is entered, and so all vertices must be of even degree.

If v and w are vertices of G, then the portion of the Euler cycle between v and w is a path from v to w, and so G is connected.

Theorem 8.2.18

If G is a connected graph and every vertex has even degree, then G has an Euler cycle.

Proof: Johnsonbaugh gives a mathematical induction proof. See the text.

Example 2. For the Konigsberg bridge problem,

δ(A) = 3, δ(B) = 5, δ(C) = 3, δ(D) = 3,

so there is not an Euler cycle.

If the graph G is as shown below, then G is connected and every vertex has even degree.

δ(v1) = δ(v2) = δ(v3) = δ(v5) = 4, δ(v4) = 6, δ(v6) = δ(v7) = 2.

[Figure: the graph G with vertices v1, . . . , v7.]

Hence an Euler cycle exists. By inspection, one is

(v6, v4, v7, v5, v1, v3, v4, v1, v2, v5, v4, v2, v3, v6) .
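The cycle found by inspection can be checked mechanically. The sketch below assumes that the graph has exactly the edges traversed by the cycle and no parallel edges, which is consistent with the degrees listed above; the names CYCLE, edges_of, degrees and is_closed_trail are hypothetical.

```python
from collections import Counter

# The Euler cycle from the example, written as vertex numbers.
CYCLE = [6, 4, 7, 5, 1, 3, 4, 1, 2, 5, 4, 2, 3, 6]


def edges_of(cycle):
    """Edges traversed by the cycle, each as an unordered pair of vertices."""
    return [frozenset(pair) for pair in zip(cycle, cycle[1:])]


def degrees(edges):
    """Degree of each vertex: the number of edges incident on it."""
    deg = Counter()
    for edge in edges:
        for v in edge:
            deg[v] += 1
    return deg


def is_closed_trail(cycle) -> bool:
    """True if the walk returns to its start and repeats no edge."""
    edges = edges_of(cycle)
    return cycle[0] == cycle[-1] and len(set(edges)) == len(edges)


if __name__ == "__main__":
    edges = edges_of(CYCLE)
    print(is_closed_trail(CYCLE), len(edges))
    print(sorted(degrees(edges).items()))
```

The degrees computed this way also illustrate that the sum of the degrees is twice the number of edges.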


Theorem 8.2.21

If G is a graph with m edges and vertices v1, v2, . . . , v_n, then

Σ (i = 1 to n) δ(v_i) = 2m.

In particular, the sum of degrees of all of the vertices of the graph is even.

Proof: Each edge is counted twice, once from v_i to v_j, and once from v_j to v_i.

Corollary 8.2.22

In any graph, there are an even number of vertices of odd degree.

Theorem 8.2.23

A graph has a path with no repeated edges from v to w (v ≠ w) containing all the edges and vertices if and only if it is connected and v and w are the only vertices of odd degree.

For example, the graph below.

[Figure: a graph with vertices v1, v2, v3, v4, v5.]

Proof: Add an edge from v to w. Now there is an Euler cycle as all vertices are of even degree. Removing the added edge from this cycle leaves the required path from v to w.

Theorem 8.2.24

If a graph G contains a cycle from v to v, G contains a simple cycle from v to v.

Proof: If C is a cycle from v0 to v_n (v0 = v_n), and C is not a simple cycle, then there must be vertices v_i = v_j in the path where i < j < n. Remove the portion of the path between v_i and v_j, and repeat the procedure if necessary. The cycle is eventually reduced to a simple cycle.


Trees

Introduction

Example 1. Johnsonbaugh example on the draw for the semifinals and finals at Wimbledon (many years ago).

[Figure: the draw. Semifinals: Graf, Sabatini, Navratilova, Seles. Finals: Graf, Seles. Wimbledon champion: Graf.]

The draw can be represented as a graph called a tree. It is rotated clockwise through 90° so as to appear as is shown on the right. If it is rotated through 90° the other way then it would appear as a natural tree.

[Figure: the draw as a tree with vertices v1, . . . , v7, shown in two orientations, with level 0, level 1 and level 2 marked; height = 2.]

Definition 9.1.1

A (free) tree T is a simple graph satisfying the following: If v and w are vertices in T, then there is a unique simple path from v to w.

A rooted tree is a tree in which a particular vertex is designated the root.

The level of a vertex v is the length of the simple path from the root to v.

The height of a rooted tree is the maximum level number that occurs.

Example 2. Johnsonbaugh 9.1.4 The tree T shown below will become the rooted tree T′ if we designate vertex e as the root.

[Figure: the tree T and the rooted tree T′, each with vertices a, b, c, d, e, f, g, h, i, j.]

Huffman Codes

Character representation is often by fixed-length bit strings e.g. extended ASCII, where eight-bit strings are used for the 256-character set.

Huffman codes represent characters by variable-length bit strings. Short bit strings are used to represent the most frequently used characters.

A Huffman code is most easily defined by a rooted tree. To decode a bit string, begin at the root and move down the tree until a character is encountered. Each bit, 0 or 1, says whether to move right or left. When a character is encountered, begin again at the root.

Given a tree that defines a Huffman code, any bit string can be uniquely decoded even though the characters are represented by variable-length bit strings.

Example 3. Decode the string 010101110110 from the following tree.

[Figure: a rooted tree with the root labelled Root, leaves A, O, R, S and T, and each edge labelled 0 or 1.]

The string decodes as:

A      R    S       T
010    1    0111    0110
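The decoding procedure can be sketched with a table of code words in place of the tree. The code words for A, R, S and T below are read off the decoding shown above; the code word 00 for O is an assumption, inferred as the only bit string left unused by the other four. The names CODES and decode are hypothetical. Because no code word is a prefix of another, reading bits until a code word is matched is equivalent to walking down the tree to a leaf.

```python
# Code words for the tree of Example 3; "00" for O is an inferred assumption.
CODES = {"R": "1", "O": "00", "A": "010", "T": "0110", "S": "0111"}


def decode(bits: str, codes=CODES) -> str:
    """Decode a bit string by matching code words from left to right."""
    by_code = {code: ch for ch, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in by_code:       # reached a leaf: emit it, restart at the root
            out.append(by_code[current])
            current = ""
    if current:
        raise ValueError("bit string ends in the middle of a code word")
    return "".join(out)


if __name__ == "__main__":
    print(decode("010101110110"))
```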


Algorithm 9.1.9 Constructing an Optimal Huffman Code

This algorithm constructs an optimal Huffman code from a table giving the frequency of occurrence of the characters to be represented. The output is a rooted tree with the vertices at the lowest levels labelled with the frequencies and with the edges labelled with bits. The coding tree is obtained by replacing each frequency by a character having that frequency.

Input: A sequence of n frequencies, n ≥ 2
Output: A rooted tree that defines an optimal Huffman code

huffman(f, n)
    if (n == 2)
        let f1 and f2 denote the frequencies
        let T be as in the figure (on the left)
        return T
    let f_i and f_j denote the smallest frequencies
    replace f_i and f_j in the list by f_i + f_j
    T′ = huffman(f, n − 1)
    replace a vertex in T′ labelled f_i + f_j by the
        tree shown in the figure (on the right) to obtain the tree T
    return T

[Figure: on the left, a root with two children labelled f1 and f2; on the right, a root with two children labelled f_i and f_j. In each tree the two edges are labelled 0 and 1.]

Example 4. We have the following table of frequencies.

Letter    Frequency
A         10
B         12
C         17
D         8
E         22

The algorithm begins by repeatedly replacing the two smallest frequencies with the sum until a two-element sequence is obtained.

10, 12, 17, 8, 22    −→    18, 12, 17, 22
A, B, C, D, E              A + D, B, C, E

12, 17, 18, 22       −→    29, 18, 22
B, C, A + D, E             B + C, A + D, E

18, 22, 29           −→    40, 29
A + D, E, B + C            A + D + E, B + C

The algorithm then constructs trees working backward, beginning with the two-element sequence 29, 40.

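[Figure: the sequence of trees built working backward, with vertices labelled by the frequencies 8, 10, 12, 17, 18, 22, 29 and 40 and edges labelled 0 and 1.]

Now replace each frequency by a character having that frequency.

[Figure: the final coding tree with leaves A, B, C, D and E and edges labelled 0 and 1.]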

Then BED is coded as 11 | 00 | 011.
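The merging of the two smallest frequencies can be sketched with a priority queue. This is an iterative variant written for illustration, not the recursive Algorithm 9.1.9 itself, and it returns only the length of each code word, since the actual bits depend on how the edges are labelled. The name huffman_code_lengths is hypothetical.

```python
import heapq
from itertools import count


def huffman_code_lengths(freq):
    """Return {character: code length} by repeatedly merging the two smallest frequencies."""
    tiebreak = count()  # keeps heap entries comparable when frequencies are equal
    heap = [(f, next(tiebreak), {ch: 0}) for ch, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, depths1 = heapq.heappop(heap)
        f2, _, depths2 = heapq.heappop(heap)
        # Every character below the new vertex moves one level further from the root.
        merged = {ch: d + 1 for ch, d in {**depths1, **depths2}.items()}
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]


if __name__ == "__main__":
    frequencies = {"A": 10, "B": 12, "C": 17, "D": 8, "E": 22}
    print(huffman_code_lengths(frequencies))
```

For the frequencies of Example 4 the lengths agree with the coding of BED above, where B and E have two bits and D has three.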


Terminology and Characterizations of Trees

Definition 9.2.1

Let T be a tree with root v0. Suppose that x, y and z are vertices in T and that (v0, v1, . . . , v_n) is a simple path in T. Then

(a) v_(n−1) is the parent of v_n.

(b) v0, . . . , v_(n−1) are ancestors of v_n.

(c) v_n is a child of v_(n−1).

(d) If x is an ancestor of y, y is a descendant of x.

(e) If x and y are children of z, x and y are siblings.

(f) If x has no children, x is a terminal vertex (or a leaf).

(g) If x is not a terminal vertex, x is an internal (or branch) vertex.

(h) The subtree of T rooted at x is the graph with vertex set V and edge set E, where V is x together with the descendants of x and

E = {e | e is an edge on a simple path from x to some vertex in V}.

Example 1. Consider the following tree.

[Figure: a rooted tree with vertices a, b, c, d, e, f, g, h, i, j.]

The parent of b is: e

The ancestors of a are: b, e, g

The children of b are: a, c

The descendants of e are: d, f, b, a, c

The siblings of d are: f, b

The terminal vertices are: d, f, a, c, h, j

The internal vertices are: g, e, i, b

The subtree rooted at e is:

[Figure: the subtree with vertices a, b, c, d, e, f.]

Example 2. Greek gods. Johnsonbaugh Example 9.2.2

Uranus

Aphrodite Kronos Atlas Prometheus

Eros Zeus Poseidon Hades Ares

Apollo Athena Hermes Heracles

The parent of Eros is:

The ancestors of Hermes are:

The children of Zeus are:

The descendants of Kronos are:

The siblings of Atlas are:

The terminal vertices are:

The internal vertices are:

The subtree rooted at Kronos is:

Kronos

Zeus Poseidon Hades Ares

Apollo Athena Hermes Heracles


Theorem 9.2.3

Let T be a graph with n vertices. The following are equivalent.

(a) T is a tree.

(b) T is connected and acyclic.

(c) T is connected and has n − 1 edges.

(d) T is acyclic and has n − 1 edges.

Partial proof: The proof is as follows. We show

if (a), then (b); if (b), then (c); if (c), then (d); if (d), then (a),

and so all must be equivalent.

We will show:

If (a), then (b).

Let T be a tree. Then T is connected as there is a simple path from any vertex to any other vertex.

Suppose T contains a cycle. Then by a previous theorem, Theorem 8.2.24, T contains a simple cycle i.e.

C = (v0, v1, . . . , v_n)

with v0 = v_n.

C cannot be a loop since T is a simple graph (no loops or parallel edges). Hence C contains at least two distinct vertices v_i and v_j with i < j.

The paths
(v_i, v_(i+1), . . . , v_j) and (v_i, v_(i−1), . . . , v0, v_(n−1), . . . , v_j)
are distinct simple paths from v_i to v_j, contradicting the definition of a tree.

Therefore, a tree cannot contain a cycle.

Spanning Trees Johnsonbaugh 9.3

Definition 9.3.1

A tree T is a spanning tree of a graph G if T is a subgraph of G that contains all of the vertices of G.

Example 1. Johnsonbaugh Example 9.3.2

The spanning tree of the graph below is shown in black. Other spanning trees are possible.

[Figure: graph G on the vertices a, b, c, d, e, f, g, h, with the edges of a spanning tree drawn in black.]

Theorem 9.3.4

A graph G has a spanning tree if and only if G is connected.

Proof:

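If G has a spanning tree T, then any two vertices of G are joined by a path in T, which is also a path in G, so G is connected.

If G is connected, progressively remove edges from cycles until G is acyclic. Removing an edge from a cycle does not disconnect the graph, so the result is a connected, acyclic subgraph containing all of the vertices of G, that is, a spanning tree.

This procedure is inefficient in practice.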

Algorithm 9.3.6 Breadth-First Search for a Spanning Tree

This algorithm is formally given in Johnsonbaugh, but a more informal description of the procedure is as follows.

Select an ordering, e.g. abcdefgh of the vertices of G.

Select the first vertex as the root e.g. a.

The tree T initially consists of the single vertex a and no edges.

Add to T all edges (a, x) and the vertices on which they are incident, x = b to h, that do not produce a cycle when added to T. This gives all level 1 vertices.

Repeat with the vertices on level 1, then level 2, and so on, until no further edges can be added.



Select the ordering abcdefgh.

Select a as the root.

Add edges (a, b), (a, c), (a, g).

Add edges for vertices on level 1:
b: add (b, d)
c: add (c, e)
g: none

Add edges for vertices on level 2:
d: add (d, f)
e: none

Add edges for vertices on level 3:
f: add (f, h)

Add edges for vertices on level 4:
h: none

The spanning tree is shown in the previous diagram.
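The informal procedure can be sketched in Python as follows. This is an illustration rather than Johnsonbaugh's Algorithm 9.3.6 verbatim. The edge list EDGES is an assumption: the diagram is not reproduced here, so the edges were chosen to be consistent with the worked steps, and the function names are hypothetical.

```python
# Assumed edge list for the example graph (reconstructed from the worked
# steps, not read from the diagram).
EDGES = [("a", "b"), ("a", "c"), ("a", "g"), ("b", "d"), ("c", "d"),
         ("c", "e"), ("d", "f"), ("e", "f"), ("e", "g"), ("f", "h")]


def adjacency(edges):
    """Map each vertex to the set of its neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_spanning_tree(edges, order):
    """Grow the tree level by level, scanning vertices in the given order."""
    adj = adjacency(edges)
    in_tree = {order[0]}          # the root is the first vertex in the ordering
    tree_edges = []
    level = [order[0]]
    while level:
        next_level = []
        for u in sorted(level, key=order.index):
            for v in order:
                # An edge to a vertex not yet in the tree cannot close a cycle.
                if v in adj.get(u, ()) and v not in in_tree:
                    in_tree.add(v)
                    tree_edges.append((u, v))
                    next_level.append(v)
        level = next_level
    return tree_edges


print(bfs_spanning_tree(EDGES, "abcdefgh"))
```

With the assumed edge list, the edges are produced in the same order as in the worked steps: (a, b), (a, c), (a, g), (b, d), (c, e), (d, f), (f, h).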

Algorithm 9.3.7 Depth-First Search for a Spanning Tree


Select an ordering, e.g. abcdefgh of the vertices of G.

Select the first vertex as the root e.g. a, the current vertex.

At each step, add to the tree an edge incident to the current vertex that doesn’t create a cycle, using the vertex order to prioritize. Make the new vertex the current vertex.

If an edge can’t be added, backtrack by making the parent the current vertex.

Continue until the current vertex is again the root.


[Figure: graph on the vertices a, b, c, d, e, f, g, h.]


Select the ordering abcdefgh.

Select a as the root.

Add edges (a, b), (b, d), (d, c), (c, e), (e, f), (f, h).

Backtrack to f , then e. Add (e, g).

Backtrack to e, c, d, b, a. Ends.
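The depth-first procedure can be sketched in the same way. Again this is an illustration, not Johnsonbaugh's Algorithm 9.3.7 verbatim. The edge list EDGES is an assumption chosen to be consistent with the worked steps, and the function names are hypothetical.

```python
# Assumed edge list for the example graph (reconstructed from the worked
# steps, not read from the diagram).
EDGES = [("a", "b"), ("a", "c"), ("a", "g"), ("b", "d"), ("c", "d"),
         ("c", "e"), ("d", "f"), ("e", "f"), ("e", "g"), ("f", "h")]


def adjacency(edges):
    """Map each vertex to the set of its neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def dfs_spanning_tree(edges, order):
    """Extend a path as far as possible, backtracking when it is stuck."""
    adj = adjacency(edges)
    root = order[0]
    visited = {root}
    tree_edges = []
    path = [root]                 # path[-1] is the current vertex
    while path:
        current = path[-1]
        # First unvisited neighbour of the current vertex, in vertex order.
        nxt = next((v for v in order
                    if v in adj.get(current, ()) and v not in visited), None)
        if nxt is None:
            path.pop()            # backtrack to the parent
        else:
            visited.add(nxt)
            tree_edges.append((current, nxt))
            path.append(nxt)      # the new vertex becomes the current vertex
    return tree_edges


print(dfs_spanning_tree(EDGES, "abcdefgh"))
```

With the assumed edge list, the edges are added in the order (a, b), (b, d), (d, c), (c, e), (e, f), (f, h), (e, g), matching the worked steps. The loop stops once the search has backtracked past the root.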

Minimal Spanning Trees Johnsonbaugh 9.4

Definition 9.4.1

Let G be a weighted graph. A minimal spanning tree of G is a spanning tree of G with minimum weight.

Example 1. There are six cities 1–6, and the costs of building roads between certain pairs of them are shown on the following graph.

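[Figure: weighted graph on the vertices 1 to 6 with nine edges. The edge weights, as they appear in the tables of the Prim's Algorithm example, are:]

Edge    Weight
(1, 2)  4
(1, 3)  2
(1, 5)  3
(3, 4)  1
(3, 5)  6
(3, 6)  3
(4, 2)  5
(4, 6)  6
(5, 6)  2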

Breadth-First Search Select the order 123456.

Select 1 as the root.

Add edges (1, 2), (1, 3), (1, 5).

Add edges for vertices on level 1:
2: add (2, 4)
3: add (3, 6)
5: none

Add edges for vertices on level 2:
4: none
6: none

The weight is 17, and the spanning tree is as shown below.


[Figure: the breadth-first spanning tree, with edges (1, 2), (1, 3), (1, 5), (2, 4), (3, 6) of weights 4, 2, 3, 5, 3.]

Prim’s Algorithm Johnsonbaugh Algorithm 9.4.3


The algorithm begins with a single vertex. At each iteration, add to the current tree a minimum weight edge that does not complete a cycle.


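Begin with vertex 1. Edges with one vertex in the tree and one vertex not in the tree are:

Edge    Weight
(1, 2)  4
(1, 3)  2
(1, 5)  3

Select edge (1, 3) with minimum weight.

Edges with one vertex in the tree and one vertex not in the tree are:

Edge    Weight
(1, 2)  4
(1, 5)  3
(3, 4)  1
(3, 5)  6
(3, 6)  3

Select edge (3, 4) with minimum weight. The candidate edges are now:

Edge    Weight
(1, 2)  4
(1, 5)  3
(3, 5)  6
(3, 6)  3
(4, 2)  5
(4, 6)  6

Edges (1, 5) and (3, 6) share the minimum weight. A minimal spanning tree will be constructed when either (1, 5) or (3, 6) is selected. Select edge (1, 5). The candidate edges are now:

Edge    Weight
(1, 2)  4
(3, 6)  3
(4, 2)  5
(4, 6)  6
(5, 6)  2

Select edge (5, 6) with minimum weight. The candidate edges are now:

Edge    Weight
(1, 2)  4
(4, 2)  5

Select edge (1, 2) with minimum weight. All six vertices are now in the tree.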


The minimal spanning tree, shown below, has weight 12.

[Figure: the minimal spanning tree, with edges (1, 3), (3, 4), (1, 5), (5, 6), (1, 2) of weights 2, 1, 3, 2, 4.]
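The iterations of Prim's algorithm can be sketched in Python as follows. This is a straightforward illustration, not Johnsonbaugh's Algorithm 9.4.3 verbatim. The dictionary WEIGHTS collects the edges and weights listed in the tables of the example, the name prim is hypothetical, the graph is assumed to be connected, and ties are broken by taking the smaller edge tuple, which happens to reproduce the choice of (1, 5) over (3, 6).

```python
# Edge weights as listed in the tables of the example.
WEIGHTS = {
    (1, 2): 4, (1, 3): 2, (1, 5): 3,
    (3, 4): 1, (3, 5): 6, (3, 6): 3,
    (4, 2): 5, (4, 6): 6, (5, 6): 2,
}


def prim(weights, start):
    """Return the tree edges in the order Prim's algorithm selects them."""
    vertices = {v for edge in weights for v in edge}
    in_tree = {start}
    tree_edges = []
    while in_tree != vertices:    # assumes the graph is connected
        # Candidate edges have exactly one endpoint in the current tree,
        # so adding one of them cannot complete a cycle.
        candidates = [(w, edge) for edge, w in weights.items()
                      if (edge[0] in in_tree) != (edge[1] in in_tree)]
        w, edge = min(candidates)  # minimum weight; ties by edge tuple
        tree_edges.append(edge)
        in_tree.update(edge)
    return tree_edges


tree = prim(WEIGHTS, 1)
print(tree, sum(WEIGHTS[e] for e in tree))
```

Starting from vertex 1, the sketch selects (1, 3), (3, 4), (1, 5), (5, 6), (1, 2), for a total weight of 12. Rescanning every edge at each iteration keeps the code short; a priority queue would be the usual choice for large graphs.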



Binary Trees

Definition 9.5.1

A binary tree is a rooted tree in which each vertex has either no children, one child, or two children. If a vertex has one child, that child is designated as either a left child or a right child (but not both). If a vertex has two children, one child is designated a left child and the other child is designated a right child.

Example 1. Johnsonbaugh Example 9.5.2

In the binary tree below, b is the left child of vertex a, and c is the right child of vertex a.

Vertex d is the right child of vertex b; vertex b has no left child.

Vertex e is the left child of vertex c; vertex c has no right child.

[Figure: binary tree with root a on the vertices a, b, c, d, e, f, g.]

A full binary tree is a binary tree in which each vertex has either two children or zero children.

Theorem 9.5.4

If T is a full binary tree with i internal vertices, then T has i + 1 terminal vertices and 2i + 1 total vertices.

Proof:

The i internal vertices each have two children, so there are 2i children.

Only the root is a nonchild.

Therefore the total number of vertices is 2i + 1, and the number of terminal vertices is (2i + 1) − i = i + 1.
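The counts in Theorem 9.5.4 can be checked on a small case. The sketch below is only an illustration: the encoding (None for a terminal vertex, a pair for an internal vertex) and the name count are assumptions made for this example.

```python
def count(tree):
    """Return (internal, terminal) vertex counts of a full binary tree.

    A terminal vertex is written as None and an internal vertex as a
    pair (left, right), so every internal vertex has exactly two children.
    """
    if tree is None:
        return 0, 1
    left, right = tree
    left_internal, left_terminal = count(left)
    right_internal, right_terminal = count(right)
    return 1 + left_internal + right_internal, left_terminal + right_terminal


# A full binary tree with 4 internal vertices.
tree = ((None, None), ((None, None), None))
i, t = count(tree)
print(i, t, i + t)
```

For this tree i = 4, so the theorem predicts i + 1 = 5 terminal vertices and 2i + 1 = 9 vertices in total.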

Example 2. In a single elimination tournament, each contestant is eliminated after one loss. The tree structure is as shown below. Winners progress to the right, and there is a single winner eventually at the root.

Some contestants receive byes if there are not initially 2^n contestants.


[Figure: single elimination tournament tree with Contestants 1 to 7 as the terminal vertices and the Winner at the root.]

Theorem 9.5.6

If a binary tree of height h has t terminal vertices, then

lg t ≤ h.

(Or equivalently, t ≤ 2^h.)

Proof: We will prove t ≤ 2^h by mathematical induction on h.

Basis Step: If h = 0, the binary tree has a single vertex, and t = 1. Then t = 2^h = 2^0 = 1.

Inductive Step: Assume the result is true for every binary tree with height less than h.

Let T be a binary tree of height h > 0 with t terminal vertices. Suppose that the root of T has one child. Eliminate the root and the edge incident on the root. The remaining tree has height h − 1 and t terminal vertices. By the assumption,

t ≤ 2^{h−1} < 2^h

and case h is true.

Suppose the root of T has children v_1 and v_2, and the subtrees rooted at v_1 and v_2 have heights h_1 and h_2 and have t_1 and t_2 terminal vertices respectively. Then

h_1 ≤ h − 1 and h_2 ≤ h − 1,

and

t = t_1 + t_2 ≤ 2^{h_1} + 2^{h_2} ≤ 2^{h−1} + 2^{h−1} = 2^h.

Hence the inductive step is verified.

Since the Basis Step and Inductive Step have been verified, the Principle of Mathematical Induction tells us that the theorem is true.


Definition 9.5.8

A binary search tree is a binary tree T in which data are associated with the vertices. The data are arranged so that, for each vertex v in T, each data item in the left subtree of v is less than the data item in v, and each data item in the right subtree of v is greater than the data item in v.

Algorithm 9.5.10 in Johnsonbaugh gives a formal approach for constructing a binary search tree. A less formal approach is as follows.

• Start with an empty tree (no vertices or edges).

• Inspect the items in order.

• Place the first item in the root.

• Compare each following item in turn with the current vertex, beginning with the root.

• If item < vertex, move to the left child.

• If item > vertex, move to the right child.

• Continue to compare with each new vertex.

• If there is no child in that position, place the item there.

• Move to the next item and start over, comparing with the root.

Example 3. Build a binary search tree for items in order

o, n, p, d, u, j, t, l, m,

using lexicographic order.

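[Figure: the resulting binary search tree. The root is o, with left child n and right child p. The left child of n is d, and the right child of p is u. The right child of d is j, and the left child of u is t. The right child of j is l, and the right child of l is m.]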
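The informal construction can be sketched in Python and run on the items of Example 3. This is an illustration rather than Johnsonbaugh's Algorithm 9.5.10. The names Node, insert, build and inorder are hypothetical, and ignoring a repeated item is an assumption, since the steps above do not say what to do with duplicates.

```python
class Node:
    """A vertex of a binary search tree holding one data item."""

    def __init__(self, item):
        self.item = item
        self.left = None
        self.right = None


def insert(root, item):
    """Insert item by comparing downwards from the root; return the root."""
    if root is None:
        return Node(item)             # the first item becomes the root
    current = root
    while True:
        if item < current.item:
            if current.left is None:
                current.left = Node(item)
                return root
            current = current.left
        elif item > current.item:
            if current.right is None:
                current.right = Node(item)
                return root
            current = current.right
        else:
            return root               # assumption: duplicates are ignored


def build(items):
    root = None
    for item in items:
        root = insert(root, item)
    return root


def inorder(node):
    """Items in sorted order, used here only to check the tree."""
    if node is None:
        return []
    return inorder(node.left) + [node.item] + inorder(node.right)


root = build("onpdujtlm")
print(root.item, root.left.item, root.right.item)
print("".join(inorder(root)))
```

Reading the items back with an inorder walk gives them in lexicographic order, which is a quick check that the search-tree property holds.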


Example 4. Build a binary search tree for the words

“by nineteen ninety no Australian child will be living in poverty”.

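[Figure: the resulting binary search tree. The root is by, with left child Australian and right child nineteen. The right child of Australian is be. The left child of nineteen is child and its right child is ninety. The right child of child is living, and the left child of living is in. The right child of ninety is no, the right child of no is will, and the left child of will is poverty.]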

Searching a Binary Search Tree

The algorithm is as follows.

• Given a data item D.

• Begin at the root as the current vertex.

• If D < vertex, go to the left child.

• If D > vertex, go to the right child.

• If D = vertex, the data item is found.

• If the child to move to is missing, D is not in the tree.
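The search steps can be sketched in Python as follows, using the items o, n, p, d, u, j, t, l, m of Example 3 as data. The names Node, insert and search are hypothetical, and counting one step per vertex compared is a choice made for this illustration.

```python
class Node:
    """A vertex of a binary search tree holding one data item."""

    def __init__(self, item):
        self.item = item
        self.left = None
        self.right = None


def insert(node, item):
    """Recursive insertion; returns the root of the updated subtree."""
    if node is None:
        return Node(item)
    if item < node.item:
        node.left = insert(node.left, item)
    elif item > node.item:
        node.right = insert(node.right, item)
    return node


def search(root, d):
    """Return (found, steps), where steps counts the vertices compared with d."""
    current, steps = root, 0
    while current is not None:
        steps += 1
        if d < current.item:
            current = current.left     # go to the left child
        elif d > current.item:
            current = current.right    # go to the right child
        else:
            return True, steps         # D = vertex: the item is found
    return False, steps                # the child to move to is missing


root = None
for item in "onpdujtlm":
    root = insert(root, item)

print(search(root, "m"))
print(search(root, "k"))
```

Searching for m follows the path o, n, d, j, l, m. Searching for k follows o, n, d, j, l and then stops because l has no left child.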

Worst-case Search Time

The worst-case search is to search the longest path from the root when the item is not present.

Suppose that there are n internal vertices, or data items. The terminal vertices would correspond to missing children. So if the item is not in the tree, we need to check down to the appropriate terminal vertex.

For a full binary tree with n internal vertices, the number of terminal vertices is t = n + 1. We know that lg t ≤ h, where h is the height of the tree. Hence the worst-case search time is at least

⌈ lg t ⌉ = ⌈ lg (n + 1) ⌉,

and this bound is attained when the tree has minimum height.

For example, if a tree of minimum height contains a million items, then

⌈ lg 1,000,001 ⌉ = 20,

so that a search will find an item, or determine that it is not present, in at most 20 steps.

There are algorithms to minimize the height of a binary search tree.
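The quantity ⌈ lg (n + 1) ⌉ from the worst-case discussion can be evaluated exactly with integer arithmetic. The sketch below is an illustration; the function name is hypothetical. It uses the fact that for a positive integer n, ⌈ lg (n + 1) ⌉ equals the number of binary digits of n, which Python exposes as int.bit_length, so no floating-point logarithm is needed.

```python
def min_worst_case_steps(n):
    """ceil(lg(n + 1)) for n >= 1 data items, computed without floats.

    If 2**(k - 1) <= n <= 2**k - 1 then n has k binary digits and
    k - 1 < lg(n + 1) <= k, so the ceiling is k.
    """
    return n.bit_length()


for n in (1, 7, 8, 1_000_000):
    print(n, min_worst_case_steps(n))
```

For n = 1,000,000 the result is 20, since 2^19 = 524,288 ≤ 1,000,000 < 2^20 = 1,048,576.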

Tree Traversals Johnsonbaugh 9.6

Breadth-first search and depth-first search are ways to traverse a tree systematically so that each vertex is visited exactly once. This section considers three other tree-traversal methods, which are defined recursively. Johnsonbaugh gives formal algorithms for each of these; we will consider simpler formulations.

Preorder Traversal Recursive algorithm 9.6.1

Preorder the root: process the root, preorder the left child, preorder the right child.

Inorder Traversal Recursive algorithm 9.6.3

Inorder the root: inorder the left child, process the root, inorder the right child.

Postorder Traversal Recursive algorithm 9.6.5

Postorder the root: postorder the left child, postorder the right child, process the root.

Example 5. Consider the following tree.

[Figure: binary tree with root A on the vertices A, B, C, D, E, F, G, H, I, J.]

Preorder:
A B F
A B C D F G
A B C D E F G H
A B C D E F G H I J

Inorder:
B A F
C B D A F G
C B D E A F H G
C B D E A F I H J G

Postorder:
B F A
C D B G F A
C E D B H G F A
C E D B I J H G F A
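The three recursive definitions translate almost word for word into Python. In the sketch below, the dictionary TREE gives the (left, right) children of each vertex. Its shape is an assumption inferred from the traversals listed in Example 5, not read from the diagram, and the function names are hypothetical.

```python
# (left child, right child) of each internal vertex; None marks a missing
# child. The shape is inferred from the traversals listed in Example 5.
TREE = {
    "A": ("B", "F"),
    "B": ("C", "D"),
    "D": (None, "E"),
    "F": (None, "G"),
    "G": ("H", None),
    "H": ("I", "J"),
}


def kids(v):
    return TREE.get(v, (None, None))


def preorder(v):
    if v is None:
        return []
    left, right = kids(v)
    return [v] + preorder(left) + preorder(right)    # root, left, right


def inorder(v):
    if v is None:
        return []
    left, right = kids(v)
    return inorder(left) + [v] + inorder(right)      # left, root, right


def postorder(v):
    if v is None:
        return []
    left, right = kids(v)
    return postorder(left) + postorder(right) + [v]  # left, right, root


print(" ".join(preorder("A")))
print(" ".join(inorder("A")))
print(" ".join(postorder("A")))
```

With this tree, the three printed lines agree with the final line of each traversal in the example.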

Arithmetic Expressions

The operators +, −, × and ÷ (or +, −, ∗, /) operate on pairs of operands or expressions, where the operator appears between its operands. An example is

(A + B) × C − D ÷ E.

This is the infix form of an expression.

An arithmetic expression can be represented as a binary tree.

The terminal vertices correspond to the operands.

The internal vertices correspond to the operators.

An operator operates on its left and right subtrees.

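Example 6. The expression (A + B) × C − D ÷ E can be represented as the tree in the diagram below.

[Figure: binary tree with root −, whose left child is × and whose right child is ÷. The children of × are + and C, the children of + are A and B, and the children of ÷ are D and E.]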

For inorder traversal, parentheses are put around each operation. The parentheses dictate the order of the operations, and the hierarchy of the operators need not be specified. Some pairs of parentheses may not be necessary.

Inorder:
( × − ÷ )
( ( + × C ) − ( D ÷ E ) )
( ( ( A + B ) × C ) − ( D ÷ E ) )


The preorder traversal is as follows. This is known as the prefix form of the expression or Polish notation. No parentheses are required for unambiguous evaluation.

Preorder: − × ÷

− × + C ÷ D E

− × + A B C ÷ D E

The postorder traversal is as follows. This is known as the postfix form of the expression or reverse Polish notation. Again, no parentheses are required for unambiguous evaluation.

Postorder: × ÷ −

+ C × D E ÷ −

A B + C × D E ÷ −
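The three forms of the expression in Example 6 can be generated from the tree, and the postfix form can be evaluated with a stack, which illustrates why it needs no parentheses. The sketch below is an illustration only. The tuple encoding, the function names, the ASCII operators *, / and - in place of ×, ÷ and −, and the sample values for A to E are all assumptions made for this example.

```python
import operator

# Internal vertices are (operator, left, right); terminal vertices are operands.
EXPR = ("-", ("*", ("+", "A", "B"), "C"), ("/", "D", "E"))

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def prefix(t):
    if isinstance(t, str):
        return [t]
    op, left, right = t
    return [op] + prefix(left) + prefix(right)       # preorder


def infix(t):
    if isinstance(t, str):
        return t
    op, left, right = t
    return f"({infix(left)} {op} {infix(right)})"    # inorder, parenthesised


def postfix(t):
    if isinstance(t, str):
        return [t]
    op, left, right = t
    return postfix(left) + postfix(right) + [op]     # postorder


def eval_postfix(tokens, values):
    """Evaluate a postfix token list with a stack of intermediate results."""
    stack = []
    for token in tokens:
        if token in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(values[token])
    return stack.pop()


print(" ".join(prefix(EXPR)))
print(infix(EXPR))
print(" ".join(postfix(EXPR)))
print(eval_postfix(postfix(EXPR), {"A": 1, "B": 2, "C": 3, "D": 8, "E": 4}))
```

With the sample values, the postfix evaluation computes (1 + 2) × 3 − 8 ÷ 4 = 7.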
