Week 2

Introduction to Probability and Statistics

2nd Week (3/15)

1. Basic Probability

2. Conditional Probability

Definitions

A Variable is a characteristic that changes or varies over time and/or for different individuals or objects under consideration.

Experimental Units are the items or objects on which measurements are taken.

A Measurement results when a variable is actually measured on an experimental unit.

A Population is the WHOLE set of all possible measurements.

A Sample is a subset of the population.

Examples

• Light bulbs

–Variable=lifetime

–Experimental unit = bulb

–Typical measurements: 1503.1 hrs, 1010.5 hrs

Examples

• Opinion polls

–Variable = opinion

–Experimental unit = person

–Typical Measurements = Magic Johnson, someone else

• Hair color

– Variable = Hair color

– Experimental unit = Person

– Typical Measurements = Brown, black, blonde


Descriptive Statistics

• When we can enumerate the whole population, we use

• DESCRIPTIVE STATISTICS: Procedures used to summarize and describe the set of measurements.

Inferential Statistics

• When we cannot enumerate the whole population, we use

• INFERENTIAL STATISTICS: Procedures used to draw conclusions or inferences about the population from information contained in the sample.

Objective of Inferential Statistics

• To make inferences about a population from information contained in a sample.

• The statistician’s job is to find the best way to do this.

But our conclusions could be incorrect. Consider this internet opinion poll:

We’ll PAY CASH For Your Opinions! (as much as $50,000) Click Here and sign up FREE!

Who makes the best burgers?

Choice                                 Votes   Percent
McDonalds                              123     13%
Burger King                            384     39%
Wendy’s                                304     31%
All three have equally good burgers    72      7%
None of these have good burgers        98      10%

• We need a measure of reliability.

The Steps in Inferential Statistics

• Define the objective of the experiment and the population of interest

• Determine the design of the experiment and the sampling plan to be used

• Collect and analyze the data

• Make inferences about the population from information in the sample

• Determine the goodness or reliability of the inference.

Key Words

Experimental Unit

Population

Sample

Descriptive Statistics

Inferential Statistics

Mathematical Terms

Theorem
• A statement that has been proven to be true.

Axiom
• An assumption (often unproven) defining the structures about which we are reasoning.

Rules of inference
• Patterns of logically valid deductions from hypotheses to conclusions.

Conjecture
• A statement whose truth value has not been proven. (A conjecture may be widely believed to be true, regardless.)

Theory
• The set of all theorems that can be proven from a given set of axioms.

Basic Probability

• Random Experiments
- The results will vary from one performance of the experiment to the next even though most of the conditions are the same.

• Example
- If we toss a coin, the result of the experiment is that it will come up either “tails” or “heads”.
- Unless you are “Harvey Dent”….

Basic Probability

• Sample Spaces
- Sample space: A set S that consists of all possible outcomes of a random experiment
- Sample point: An outcome of a random experiment in a sample space

• Example
- If we toss a die, one sample space, or set of all possible outcomes, is given by {1, 2, 3, 4, 5, 6} while another is {odd, even}.
- It is clear, however, that the latter would not be adequate to determine, for example, whether an outcome is divisible by 3.

Toss a coin twice. What is the sample space?

S={(H, H), (H, T), (T, H), (T,T)}

Toss a die twice. What is the sample space?

(1)

(2)

You have a box containing six cards. You select two cards. What is the sample space if

(1) You do not return the card after the selection
(2) You return the card after the selection
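The sample spaces asked for above can be enumerated directly. The following is a minimal sketch, assuming the six cards are labeled 1 to 6 and that the order of the two draws matters; neither assumption is stated on the slide.

```python
from itertools import permutations, product

# Two coin tosses: every ordered pair of H/T.
two_coin_tosses = list(product("HT", repeat=2))

# Two die tosses: every ordered pair of faces 1..6.
two_die_tosses = list(product(range(1, 7), repeat=2))

# Six cards, labeled 1..6 here purely for illustration (an assumption).
cards = range(1, 7)
# (1) Without returning the first card: ordered pairs of distinct cards.
without_replacement = list(permutations(cards, 2))
# (2) Returning the first card: the same card may be drawn twice.
with_replacement = list(product(cards, repeat=2))

print(two_coin_tosses)
print(len(two_die_tosses), len(without_replacement), len(with_replacement))
```

If the order of the draws is ignored instead, `itertools.combinations` would give the corresponding unordered sample space for case (1).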

Basic Probability

• Graphical representation of a sample space

- If we toss a coin twice and use 0 to represent tails and 1 to represent heads, the sample space can be portrayed by points as follows

Basic Probability

• Event
- Event: A subset A of the sample space S (a set of possible outcomes)
- Elementary event: An event consisting of a single point of S
- Sure (certain) event: S itself
- Impossible event: An empty set

Basic Probability

• Application of Set Theory

Basic Probability

• The Concept of Probability

- CLASSICAL APPROACH: If an event can occur in h different ways out of a total number of n possible ways, all of which are equally likely, then the probability of the event is h/n

- FREQUENCY APPROACH (Empirical probability): If after n repetitions of an experiment, where n is very large, an event is observed to occur in h of these, then the probability of the event is h/n

- AXIOMATIC APPROACH: Since both the classical and frequency approaches have serious drawbacks, mathematicians have been led to an axiomatic approach to probability

Basic Probability

Tossing a coin

No. of Exp. H T H ratio (%) T ratio (%)
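The frequency approach can be illustrated by simulation. This is a rough sketch that fills in a table with the same columns using a pseudo-random fair coin; the function name `toss_coins` and the fixed seed are illustrative choices, not part of the lecture.

```python
import random


def toss_coins(n, seed=0):
    """Simulate n tosses of a fair coin and return (heads, tails)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n))
    return heads, n - heads


print(f"{'No. of Exp.':>12} {'H':>7} {'T':>7} {'H ratio (%)':>12} {'T ratio (%)':>12}")
for n in (10, 100, 1_000, 10_000, 100_000):
    h, t = toss_coins(n)
    print(f"{n:>12} {h:>7} {t:>7} {100 * h / n:>12.2f} {100 * t / n:>12.2f}")
```

As n grows, the H ratio should settle near 50%, which is the empirical probability h/n described above.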

Basic Probability

• How to mathematically express probability?

With a sample space S and an event A:
P: probability function
P(A): probability of the event A

Basic Probability

• AXIOMATIC APPROACH

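- Axiom 1: For every event A, P(A) ≥ 0
- Axiom 2: For the sure (certain) event S, P(S) = 1
- Axiom 3: For mutually exclusive events A1, A2, … , P(A1 ∪ A2 ∪ …) = P(A1) + P(A2) + …

If these three things are true, P(A) is defined as the probability for event A.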

Basic Probability

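• Exclusive?

Mutually exclusive

[Figure: Venn diagram example of a sample space S for two coin tosses, with events A, B, and C drawn over the outcomes (H,H), (T,T), (H,T), (T,H)]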

Independent ≠ mutually exclusive

• Events A and ~A are mutually exclusive, but they are NOT independent (whenever 0 < P(A) < 1).

• P(A&~A) = 0

• P(A)*P(~A) ≠ 0

Conceptually, once A has happened, ~A is impossible; thus, they are completely dependent.
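A small numeric check of the difference between mutually exclusive and independent events. The setting is an assumption chosen for illustration: a fair die with A = "even number".

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # fair die, assumed for illustration


def prob(event):
    """Classical probability: favorable outcomes over total outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))


A = {2, 4, 6}             # "even number"
not_A = sample_space - A  # complement ~A

p_joint = prob(A & not_A)           # P(A & ~A)
p_product = prob(A) * prob(not_A)   # P(A) * P(~A)

print(p_joint, p_product)
```

Independence would require `p_joint == p_product`; here the joint probability is zero while the product is not, so A and ~A are dependent.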

Basic Probability

• Theorems
- Theorem 1

- Theorem 2

- Theorem 3

Basic Probability

• Theorems
- Theorem 4

- Theorem 5

Basic Probability

• Assignment of Probabilities
- Assignment: if we assume equal probabilities for all simple events (A1, A2, … , An):

P(Ak) = 1/n, k = 1, 2, … , n

- and if A is any event made up of h such simple events, we have

P(A) = h/n

- which is equivalent to the classical approach to probability.

Basic Probability

• Fundamental Principle of Counting
- Combinatorial Analysis: When the number of sample points in a sample space is very large, direct counting becomes a practical impossibility. In this case, a combinatorial analysis is required.

- Principle of Counting: If one thing can be accomplished in n1 different ways and after this a second thing can be accomplished in n2 different ways, . . . , and finally a kth thing can be accomplished in nk different ways, then all k things can be accomplished in the specified order in n1n2…nk different ways.

Basic Probability

• Fundamental Principle of Counting
- Tree Diagram:

Using a probability tree

Mendel example: What’s the chance of having a heterozygote child (Dd) if both parents are heterozygote (Dd)?

Mother’s allele    Father’s allele    Child’s outcome
P(♀D) = .5         P(♂D) = .5         P(DD) = .5*.5 = .25
P(♀D) = .5         P(♂d) = .5         P(Dd) = .5*.5 = .25
P(♀d) = .5         P(♂D) = .5         P(dD) = .5*.5 = .25
P(♀d) = .5         P(♂d) = .5         P(dd) = .5*.5 = .25
                                      ______________ 1.0

Rule of thumb: in probability, “and” means multiply (for independent events), “or” means add (for mutually exclusive events)

So the chance of a heterozygote child is P(Dd) + P(dD) = .25 + .25 = .5
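The probability tree for the Mendel example can be written out in a few lines. This is a sketch under the slide's own assumptions: each parent passes on D or d with probability .5, independently of the other parent.

```python
from fractions import Fraction
from itertools import product

# Each heterozygote (Dd) parent passes on D or d with probability 1/2.
allele_probs = {"D": Fraction(1, 2), "d": Fraction(1, 2)}

# Walk every branch of the tree: mother's allele, then father's allele.
child = {}
for (m, p_m), (f, p_f) in product(allele_probs.items(), repeat=2):
    child[m + f] = p_m * p_f  # "and" -> multiply (independent choices)

# Heterozygote child: Dd or dD. "or" -> add (mutually exclusive branches).
p_heterozygote = child["Dd"] + child["dD"]

print(child)
print(p_heterozygote)
```

The four branch probabilities sum to 1, and the two heterozygote branches together give 1/2.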

Basic Probability

• Permutations
- Suppose that we are given n distinct objects and wish to arrange r of these objects in a line. Since there are n ways of choosing the 1st object, and after this is done, n - 1 ways of choosing the 2nd object, . . . , and finally n - r + 1 ways of choosing the rth object:

nPr = n(n - 1)(n - 2) … (n - r + 1) = n! / (n - r)!

- nPr : The number of permutations of n objects taken r at a time.

Basic Probability

• Permutations
- The number of different permutations of a set consisting of n objects of which n1 are of one type (i.e., indistinguishable from each other), n2 are of a second type, . . . , nk are of a kth type (n = n1 + n2 + … + nk):

n! / (n1! n2! … nk!)

• Example
- The number of different arrangements, or permutations, consisting of 3 letters each that can be formed from the 7 letters A, B, C, D, E, F, G is

7P3 = 7! / 4! = 7*6*5 = 210
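The permutation counts can be checked both by formula and by brute-force enumeration. A short sketch follows; the helper `multiset_permutations` and the word "MISSISSIPPI" are illustrative additions, not from the lecture.

```python
import math
from itertools import permutations

# 3-letter arrangements from the 7 letters A..G: formula vs. enumeration.
letters = "ABCDEFG"
by_formula = math.perm(7, 3)  # n! / (n - r)!  (Python 3.8+)
by_enumeration = sum(1 for _ in permutations(letters, 3))


def multiset_permutations(counts):
    """n! / (n1! n2! ... nk!) for objects grouped into indistinguishable types."""
    result = math.factorial(sum(counts))
    for c in counts:
        result //= math.factorial(c)
    return result


# Hypothetical example: letters of "MISSISSIPPI" (M=1, I=4, S=4, P=2).
mississippi = multiset_permutations([1, 4, 4, 2])

print(by_formula, by_enumeration, mississippi)
```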

Basic Probability

• Combinations
- Used when objects are selected or chosen without regard to order.
- nCr : The total number of combinations of r objects selected from n (also called the combinations of n things taken r at a time).

nCr = n! / (r! (n - r)!) = nPr / r!

Basic Probability

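• Combinations
- Binomial Coefficient: the numbers nCr are often called binomial coefficients because they arise in the binomial expansion

(x + y)^n = nC0 x^n + nC1 x^(n-1) y + nC2 x^(n-2) y^2 + … + nCn y^n

• Example
- The number of ways in which 3 cards can be chosen or selected from a total of 8 different cards is:

8C3 = 8! / (3! 5!) = (8*7*6) / (3*2*1) = 56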
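The card example and the link to the binomial expansion can be verified numerically. A minimal sketch, assuming the 8 cards are labeled 1 to 8; the function name `binomial_expansion` is illustrative.

```python
import math
from itertools import combinations

# Choosing 3 cards out of 8 distinct cards, order ignored.
cards = range(1, 9)  # labels 1..8 are an assumption for illustration
ways_by_formula = math.comb(8, 3)  # n! / (r! (n - r)!)  (Python 3.8+)
ways_by_enumeration = sum(1 for _ in combinations(cards, 3))


def binomial_expansion(x, y, n):
    """Evaluate (x + y)^n term by term using the binomial coefficients nCr."""
    return sum(math.comb(n, r) * x ** (n - r) * y ** r for r in range(n + 1))


print(ways_by_formula, ways_by_enumeration)
print(binomial_expansion(2, 3, 5), (2 + 3) ** 5)
```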

Basic Probability

• Stirling’s Approximation to n!
- An approximation formula for n! when n is large:

n! ~ sqrt(2πn) n^n e^(-n)

- Example

James Stirling (1692 ~ 1770) was a Scottish mathematician. The Stirling numbers and Stirling's approximation are named after him.
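To get a feel for how good Stirling's approximation is, the standard form sqrt(2πn)(n/e)^n can be compared with the exact factorial. This is a sketch; the function name `stirling` is an illustrative choice.

```python
import math


def stirling(n):
    """Stirling's approximation: sqrt(2*pi*n) * (n/e)**n."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n


for n in (1, 5, 10, 20, 50):
    exact = math.factorial(n)
    approx = stirling(n)
    print(f"n={n:>3}  exact={exact:.6e}  stirling={approx:.6e}  ratio={approx / exact:.5f}")
```

The ratio approaches 1 as n grows, with a relative error of roughly 1/(12n).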

Conditional Probability

• Definition and Basic Principle
- Conditional Probability, P(B|A): the probability of B given that A has occurred

P(B|A) = P(A ∩ B) / P(A), with P(A) > 0


• Example
- Find the probability that a single toss of a die will result in a number less than 4 if (a) no other information is given and (b) it is given that the toss resulted in an odd number.

(a) P(less than 4) = 3/6 = 1/2
(b) P(less than 4 | odd) = P(less than 4 and odd) / P(odd) = (2/6) / (3/6) = 2/3

The added knowledge that the toss results in an odd number raises the probability from 1/2 to 2/3
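The die example can be reproduced by restricting the sample space to the conditioning event. A minimal sketch, assuming a fair die; the helper `prob` is illustrative.

```python
from fractions import Fraction

sample_space = set(range(1, 7))  # fair die


def prob(event, given=None):
    """P(event | given): count outcomes inside the reduced sample space."""
    given = sample_space if given is None else given
    return Fraction(len(event & given), len(given))


less_than_4 = {1, 2, 3}
odd = {1, 3, 5}

p_a = prob(less_than_4)             # (a) no other information
p_b = prob(less_than_4, given=odd)  # (b) given the toss was odd

print(p_a, p_b)
```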


• The Multiplication Rule for Conditional Probability:

P(A ∩ B) = P(A) P(B|A)


• Theorems
- Theorem 1: For any three events A1, A2, A3, we have

P(A1 ∩ A2 ∩ A3) = P(A1) P(A2|A1) P(A3|A1 ∩ A2)

- In words, the probability that A1 and A2 and A3 all occur is equal to the probability that A1 occurs times the probability that A2 occurs given that A1 has occurred times the probability that A3 occurs given that both A1 and A2 have occurred.


• Theorems
- Theorem 2: If an event A must result in one of the mutually exclusive events A1, A2, … , An, then

P(A) = P(A1) P(A|A1) + P(A2) P(A|A2) + … + P(An) P(A|An)


• Additional theorems

For events A and B and C (P(C) > 0)

(1) P(∅ | C) = 0

(2) A, B : mutually exclusive ⇒ P(A ∪ B | C) = P(A | C) + P(B | C)

(3) P(Ac | C) = 1 - P(A | C)

(4) A ⊂ B ⇒ P(B-A | C) = P(B | C) - P(A | C), P(A | C) ≤ P(B | C)

(5) P(A ∪ B | C) = P(A | C) + P(B | C) - P(A ∩ B | C)

(6) P(A ∪ B | C) ≤ P(A | C) + P(B | C)


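• Independent Event:
- Independent Event: If P(B|A) = P(B), i.e., the probability of B occurring is not affected by the occurrence or non-occurrence of A, then A and B are independent events. This is equivalent to:

P(A ∩ B) = P(A) P(B)

- For three events A1, A2, A3 to be independent, they must be pairwise independent and also satisfy

P(A1 ∩ A2 ∩ A3) = P(A1) P(A2) P(A3)

- Conditional probability of independent event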


• Prior and Posterior Probability
- Prior Probability : The probability of an event before the result is known
- Posterior Probability : The probability of an event after the result is known
- The posterior probability can be smaller or larger than the prior probability, depending on the result observed.
- Computation of posterior probability in more than one stage

Practice problem

If HIV has a prevalence of 3% in San Francisco, and a particular HIV test has a false positive rate of .001 and a false negative rate of .01, what is the probability that a random person selected off the street will test positive?

Answer

Probability tree:

P(+) = .03
    P(test + | +) = .99      P(+, test +) = .0297
    P(test - | +) = .01      P(+, test -) = .0003
P(-) = .97
    P(test + | -) = .001     P(-, test +) = .00097
    P(test - | -) = .999     P(-, test -) = .96903
                             ______________ 1.0

P(+) = .03 is the marginal probability of carrying the virus.

P(test + | +) = .99 is a conditional probability: the probability of testing + given that a person is +

P(+, test +) = .0297 is the joint probability of being + and testing +

Marginal probability of testing positive: P(test +) = .0297 + .00097 = .03067

P(+ & test +) ≠ P(+)*P(test +)

.0297 ≠ .03*.03067 (= .00092)

Dependent!

Law of total probability

P(test +) = P(test + | HIV+) P(HIV+) + P(test + | HIV-) P(HIV-)

P(test +) = .99(.03) + .001(.97) = .03067

One of HIV+ and HIV- has to be true (mutually exclusive, collectively exhaustive). Their probabilities sum to 1.0.
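The law of total probability is a weighted sum over a partition. Here is a minimal sketch using the HIV numbers from the practice problem; the function name `total_probability` is an illustrative choice.

```python
def total_probability(conditionals, priors):
    """P(A) = sum of P(A|Bi) * P(Bi) over a partition B1..Bk."""
    # The Bi must be collectively exhaustive, so their probabilities sum to 1.
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("partition probabilities must sum to 1")
    return sum(c * p for c, p in zip(conditionals, priors))


# Partition: HIV+ (prevalence .03) and HIV- (.97).
# P(test+ | HIV+) = 1 - false negative rate = .99
# P(test+ | HIV-) = false positive rate = .001
p_test_positive = total_probability([0.99, 0.001], [0.03, 0.97])

print(round(p_test_positive, 5))
```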

Law of total probability

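• Formal Rule: Marginal probability for event A =

P(A) = P(A|B1)P(B1) + P(A|B2)P(B2) + … + P(A|Bk)P(Bk)

Where:

P(B1) + P(B2) + … + P(Bk) = 1.0 and P(Bi & Bj) = 0 for i ≠ j (mutually exclusive)

[Figure: sample space partitioned into B1, B2, B3, with event A overlapping the partition]

P(A) = (50%)(25%) + (0)(50%) + (50%)(25%) = 25%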

Example 2

• A 54-year old woman has an abnormal mammogram; what is the chance that she has breast cancer?

Example: Mammography

Probability tree:

P(BC+) = .003
    P(test + | BC+) = .90 (sensitivity)    P(+, test +) = .0027
    P(test - | BC+) = .10                  P(+, test -) = .0003
P(BC-) = .997
    P(test + | BC-) = .11                  P(-, test +) = .10967
    P(test - | BC-) = .89 (specificity)    P(-, test -) = .88733
                                           ______________ 1.0

P(BC+) and P(BC-) are the marginal probabilities of breast cancer (prevalence among all 54-year olds).

P(BC | test +) = .0027/(.0027+.10967) = 2.4%
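The mammography calculation follows the same pattern for any screening test: the posterior is the true-positive branch divided by the sum of both "test +" branches. A short sketch; the function name `posterior_given_positive` and its parameter names are illustrative.

```python
def posterior_given_positive(prevalence, sensitivity, false_positive_rate):
    """P(disease | test +) from the two 'test +' branches of the tree."""
    true_positive = prevalence * sensitivity                  # P(+, test +)
    false_positive = (1 - prevalence) * false_positive_rate   # P(-, test +)
    return true_positive / (true_positive + false_positive)


# Mammography numbers from the slide: prevalence .003, sensitivity .90,
# P(test + | BC-) = .11 (that is, 1 - specificity).
p_bc = posterior_given_positive(0.003, 0.90, 0.11)

print(f"{p_bc:.1%}")
```

Even with 90% sensitivity, the low prevalence keeps the posterior probability small, because false positives from the large healthy group dominate the positive results.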


• Bayes’ Theorem (theorem on the probability of causes):

- Suppose that A1, A2, … , An are mutually exclusive events whose union is the sample space S, i.e., one of the events must occur. Then if A is any event:

P(Ak|A) = P(Ak) P(A|Ak) / [P(A1) P(A|A1) + P(A2) P(A|A2) + … + P(An) P(A|An)]

Bayes’ Rule: derivation

• Definition: Let A and B be two events with P(B) ≠ 0. The conditional probability of A given B is:

P(A|B) = P(A&B) / P(B)

The idea: if we are given that the event B occurred, the relevant sample space is reduced to B {P(B)=1 because we know B is true} and conditional probability becomes a probability measure on B.

Bayes’ Rule: derivation

P(A|B) = P(A&B) / P(B)

can be re-arranged to:

P(A&B) = P(A|B)P(B)

and, since also:

P(B|A) = P(A&B) / P(A), so that P(A&B) = P(B|A)P(A)

we have:

P(A|B)P(B) = P(A&B) = P(B|A)P(A)

P(A|B)P(B) = P(B|A)P(A)

P(A|B) = P(B|A)P(A) / P(B)

Bayes’ Rule:

P(A|B) = P(B|A)P(A) / P(B)

OR

From the “Law of Total Probability”:

P(A|B) = P(B|A)P(A) / [P(B|A)P(A) + P(B|~A)P(~A)]

Bayes’ Rule:

• Why do we care?

• Why is Bayes’ Rule useful?

• It turns out that sometimes it is very useful to be able to “flip” conditional probabilities. That is, we may know the probability of A given B, but the probability of B given A may not be obvious. An example will help…


• Example
- Three different machines (M1, M2, M3) were used for producing a large batch of similar items (M1 – 20%, M2 – 30%, M3 – 50%)
- (a) Suppose that 1% from M1 are defective, 2% from M2 are defective, 3% from M3 are defective.
- (b) You picked one item, which was found to be defective.
- Question: What is the probability that this item was produced by machine M2?
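The slide leaves this question open. One way to work it out is to apply Bayes' theorem over the three machines, as in the sketch below; the dictionary names are illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Share of the batch produced by each machine (priors).
share = {"M1": Fraction(20, 100), "M2": Fraction(30, 100), "M3": Fraction(50, 100)}
# Defect rate of each machine: P(defective | machine).
defect_rate = {"M1": Fraction(1, 100), "M2": Fraction(2, 100), "M3": Fraction(3, 100)}

# Law of total probability: P(defective).
p_defective = sum(share[m] * defect_rate[m] for m in share)

# Bayes' theorem: P(machine | defective) for every machine.
posterior = {m: share[m] * defect_rate[m] / p_defective for m in share}

print(p_defective)
for machine, p in posterior.items():
    print(machine, p, float(p))
```

Under these numbers P(defective) = 23/1000 and P(M2 | defective) = 6/23, roughly 26%.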

In-Class Exercise

• If HIV has a prevalence of 3% in San Francisco, and a particular HIV test has a false positive rate of .001 and a false negative rate of .01, what is the probability that a random person who tests positive is actually infected (also known as “positive predictive value”)?

Answer: using probability tree

P(+) = .03
    P(test + | +) = .99      P(+, test +) = .0297
    P(test - | +) = .01      P(+, test -) = .0003
P(-) = .97
    P(test + | -) = .001     P(-, test +) = .00097
    P(test - | -) = .999     P(-, test -) = .96903
                             ______________ 1.0

A positive test places one on either of the two “test +” branches. But only the top branch also fulfills the event “true infection.” Therefore, the probability of being infected is the probability of being on the top branch given that you are on one of the two “test +” branches.

P(true + | test +) = P(test + & true +) / P(test +) = .0297 / (.0297 + .00097) = 96.8%

Answer: using Bayes’ rule

P(true + | test +) = P(test + | true +)P(true +) / [P(test + | true +)P(true +) + P(test + | true -)P(true -)]

= .99(.03) / [.99(.03) + .001(.97)] = 96.8%

Practice problem

An insurance company believes that drivers can be divided into two classes—those that are of high risk and those that are of low risk. Their statistics show that a high-risk driver will have an accident at some time within a year with probability .4, but this probability is only .1 for low risk drivers.

a) Assuming that 20% of the drivers are high-risk, what is the probability that a new policy holder will have an accident within a year of purchasing a policy?

b) If a new policy holder has an accident within a year of purchasing a policy, what is the probability that he is a high-risk type driver?

Answer to (a)

Assuming that 20% of the drivers are of high-risk, what is the probability that a new policy holder will have an accident within a year of purchasing a policy?

Use law of total probability: P(accident) = P(accident | high risk)*P(high risk) + P(accident | low risk)*P(low risk) = .40(.20) + .10(.80) = .08 + .08 = .16

Answer to (b)

If a new policy holder has an accident within a year of purchasing a policy, what is the probability that he is a high-risk type driver?

P(high risk | accident) = P(accident | high risk)*P(high risk) / P(accident) = .40(.20)/.16 = 50%

Or use tree:

P(high risk) = .20
    P(accident | HR) = .4        P(accident, high risk) = .08
    P(no accident | HR) = .6     P(no accident, high risk) = .12
P(low risk) = .80
    P(accident | LR) = .1        P(accident, low risk) = .08
    P(no accident | LR) = .9     P(no accident, low risk) = .72
                                 ______________ 1.0

P(high risk | accident) = .08/.16 = 50%
