Introduction to Probability and Statistics, 2nd Week (3/15)
1. Basic Probability
2. Conditional Probability
Definitions
• Variable: a characteristic that changes or varies over time and/or for different individuals or objects under consideration
• Experimental units: items or objects on which measurements are taken
• Measurement: the result obtained when a variable is actually measured on an experimental unit
• Population: the WHOLE set of all possible measurements
• Sample: a subset of the population
Examples
• Light bulbs
– Variable = lifetime
– Experimental unit = bulb
– Typical measurements = 1503.1 hrs, 1010.5 hrs
• Opinion polls
– Variable = opinion
– Experimental unit = person
– Typical measurements = Magic Johnson, someone else
• Hair color
– Variable = hair color
– Experimental unit = person
– Typical measurements = brown, black, blonde
Descriptive Statistics
• When we can enumerate the whole population, we use
• DESCRIPTIVE STATISTICS: Procedures used to summarize and describe the set of measurements.
Inferential Statistics
• When we cannot enumerate the whole population, we use
• INFERENTIAL STATISTICS: Procedures used to draw conclusions or inferences about the population from information contained in the sample.
Objective of Inferential Statistics
• To make inferences about a population from information contained in a sample.
• The statistician’s job is to find the best way to do this.
But, our conclusions could be incorrect… consider this internet opinion poll…
• We need a measure of reliability.
(Banner on the poll page: "We’ll PAY CASH For Your Opinions! (as much as $50,000) Click Here and sign up FREE!")
Who makes the best burgers?
Answer                               Votes   Percent
McDonalds                              123       13%
Burger King                            384       39%
Wendy’s                                304       31%
All three have equally good burgers     72        7%
None of these have good burgers         98       10%
The Steps in Inferential Statistics
• Define the objective of the experiment and the population of interest
• Determine the design of the experiment and the sampling plan to be used
• Collect and analyze the data
• Make inferences about the population from information in the sample
• Determine the goodness or reliability of the inference.
Key Words
Experimental Unit
Population
Sample
Descriptive Statistics
Inferential Statistics
Mathematical Terms
Theorem
• A statement that has been proven to be true.
Axiom
• An assumption (often unproven) defining the structures about which we are reasoning.
Rules of inference
• Patterns of logically valid deductions from hypotheses to conclusions.
Conjecture
• A statement whose truth value has not been proven. (A conjecture may be widely believed to be true, regardless.)
Theory
• The set of all theorems that can be proven from a given set of axioms.
Basic Probability
• Random Experiments
- The results will vary from one performance of the experiment to the next, even though most of the conditions are the same
• Example
- If we toss a coin, the result of the experiment is that it will come up either “tails” or “heads”
- Unless you are “Harvey Dent”….
Basic Probability
• Sample Spaces
- Sample space: A set S that consists of all possible outcomes of a random experiment
- Sample point: An outcome of a random experiment in a sample space
• Example
- If we toss a die, one sample space, or set of all possible outcomes, is given by {1, 2, 3, 4, 5, 6}, while another is {odd, even}.
- It is clear, however, that the latter would not be adequate to determine, for example, whether an outcome is divisible by 3.
Toss a coin twice. What is the sample space?
S={(H, H), (H, T), (T, H), (T,T)}
Toss a die twice. What is the sample space?
(1)
(2)
You have a box containing six cards. You select two cards. What is the sample space if
(1) You do not return the card after the selection
(2) You return the card after the selection
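The exercises above can be checked by enumeration. The following is a minimal sketch, not the slides' own solution: it assumes the six cards are labelled 1 to 6 (the labels are hypothetical) and that the order of selection is recorded unless stated otherwise.

```python
from itertools import combinations, permutations, product

# Two tosses of a six-sided die: ordered pairs (first toss, second toss).
die_twice = list(product(range(1, 7), repeat=2))

# Six cards, labelled 1..6 purely for illustration (an assumption).
cards = range(1, 7)

# (1) Card not returned, order recorded: ordered pairs of distinct cards.
without_replacement = list(permutations(cards, 2))

# (2) Card returned, order recorded: any ordered pair, repeats allowed.
with_replacement = list(product(cards, repeat=2))

# Variant of (1) if the order of selection is ignored: unordered pairs.
unordered_pairs = list(combinations(cards, 2))

print(len(die_twice))            # 36
print(len(without_replacement))  # 30
print(len(with_replacement))     # 36
print(len(unordered_pairs))      # 15
```

Whether (1) has 30 or 15 sample points depends on whether order matters, which the exercise leaves open.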
Basic Probability
• Graphical representation of a sample space
- If we toss a coin twice and use 0 to represent tails and 1 to represent heads, the sample space can be portrayed by points as follows
Basic Probability
• Event
- Event: A subset A of the sample space S (a set of possible outcomes)
- Elementary event: An event consisting of a single point of S
- Sure (certain) event: S itself
- Impossible event: The empty set
Basic Probability
• Application of Set Theory
Basic Probability
• The Concept of Probability
- CLASSICAL APPROACH: If an event can occur in h different ways out of a total number of n possible ways, all of which are equally likely, then the probability of the event is h/n
- FREQUENCY APPROACH (Empirical probability): If after n repetitions of an experiment, where n is very large, an event is observed to occur in h of these, then the probability of the event is h/n
- AXIOMATIC APPROACH: Since both the classical and frequency approaches have serious drawbacks, mathematicians have been led to an axiomatic approach to probability
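To make the first two approaches concrete, here is a small sketch. The helper names `classical_probability` and `relative_frequency` are illustrative, and the coin is assumed fair and simulated with a fixed seed so that runs are repeatable.

```python
import random
from fractions import Fraction

def classical_probability(event, sample_space):
    """Classical approach: h/n, assuming all outcomes are equally likely."""
    favourable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favourable), len(sample_space))

def relative_frequency(trials, seed=0):
    """Frequency approach: share of heads in `trials` simulated fair tosses."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    heads = sum(rng.random() < 0.5 for _ in range(trials))
    return heads / trials

# Classical: probability of an even number on one toss of a die.
p_even = classical_probability({2, 4, 6}, [1, 2, 3, 4, 5, 6])
print(p_even)  # 1/2

# Frequency: the ratio h/n settles near 0.5 as n grows.
for n in (10, 100, 10_000):
    print(n, relative_frequency(n))
```

The simulated ratios vary with the seed; only their tendency toward 0.5 for large n is the point.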
Basic Probability
Tossing a coin
No. of Exp. H T H ratio (%) T ratio (%)
Basic Probability
• How to mathematically express probability?
With a sample space S and an event A:
- P: probability function
- P(A): probability of the event A
Basic Probability
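• AXIOMATIC APPROACH
- Axiom 1: For every event A, P(A) ≥ 0
- Axiom 2: For the sure event S, P(S) = 1
- Axiom 3: For mutually exclusive events A1, A2, …, P(A1 ∪ A2 ∪ …) = P(A1) + P(A2) + …
If these three things are true, P(A) is defined as the probability of event A.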
Basic Probability
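• Exclusive? Mutually exclusive
[Figure: Venn diagram of the sample space S for two coin tosses, showing events A, B, and C over the outcomes (H,H), (T,T), (H,T), (T,H)]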
Example
Independent ≠ mutually exclusive
• Events A and ~A are mutually exclusive, but they are NOT independent.
• P(A & ~A) = 0
• P(A)*P(~A) ≠ 0 (whenever 0 < P(A) < 1)
Conceptually, once A has happened, ~A is impossible; thus, they are completely dependent.
Basic Probability
• Theorems
- Theorem 1
- Theorem 2
- Theorem 3
Basic Probability
• Theorems
- Theorem 4
- Theorem 5
Basic Probability
• Assignment of Probabilities
- Assignment: if we assume equal probabilities for all simple events (A1, A2, … , An):
P(Ak) = 1/n, k = 1, 2, … , n
- and if A is any event made up of h such simple events, we have
P(A) = h/n
- which is equivalent to the classical approach to probability.
Basic Probability
• Fundamental Principle of Counting
- Combinatorial Analysis: When the number of sample points in a sample space is very large, direct counting becomes a practical impossibility. In this case, a combinatorial analysis is required.
- Principle of Counting: If one thing can be accomplished in n1 different ways and after this a second thing can be accomplished in n2 different ways, . . . , and finally a kth thing can be accomplished in nk different ways, then all k things can be accomplished in the specified order in n1 · n2 · … · nk different ways.
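The principle of counting can be checked by brute-force enumeration on a small case. The wardrobe below (3 shirts, 2 trousers, 4 hats) is a made-up example, not from the slides.

```python
from itertools import product
from math import prod

# Hypothetical choices: n1 = 3 shirts, n2 = 2 trousers, n3 = 4 hats.
shirts = ["s1", "s2", "s3"]
trousers = ["t1", "t2"]
hats = ["h1", "h2", "h3", "h4"]

# Enumerate every (shirt, trousers, hat) combination directly.
outfits = list(product(shirts, trousers, hats))

# The principle predicts n1 * n2 * n3 without enumerating anything.
predicted = prod(len(choices) for choices in (shirts, trousers, hats))

print(len(outfits), predicted)  # 24 24
```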
Basic Probability
• Fundamental Principle of Counting
- Tree Diagram: using a probability tree
Mendel example: What’s the chance of having a heterozygote child (Dd) if both parents are heterozygote (Dd)?
Mother’s allele: P(♀ D) = .5, P(♀ d) = .5
Father’s allele (on each branch): P(♂ D) = .5, P(♂ d) = .5
Child’s outcome:
P(DD) = .5*.5 = .25
P(Dd) = .5*.5 = .25
P(dD) = .5*.5 = .25
P(dd) = .5*.5 = .25
The four outcomes sum to 1.0.
P(heterozygote) = P(Dd) + P(dD) = .25 + .25 = .5
Rule of thumb: in probability, “and” means multiply (for independent events), “or” means add (for mutually exclusive events)
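The tree for the Mendel example can be written out as a short sketch. It assumes each parent passes on D or d with probability 1/2, independently of the other parent; the dictionary names are illustrative.

```python
from fractions import Fraction
from itertools import product

half = Fraction(1, 2)
mother = {"D": half, "d": half}  # allele passed on by the mother
father = {"D": half, "d": half}  # allele passed on by the father

# "and" -> multiply along each branch (the two alleles are independent).
joint = {
    m + f: p_m * p_f
    for (m, p_m), (f, p_f) in product(mother.items(), father.items())
}

# "or" -> add the mutually exclusive branches Dd and dD.
p_heterozygote = joint["Dd"] + joint["dD"]

print(joint["DD"], joint["Dd"], joint["dD"], joint["dd"])  # 1/4 1/4 1/4 1/4
print(p_heterozygote)                                      # 1/2
```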
Basic Probability
• Permutations
- Suppose that we are given n distinct objects and wish to arrange r of these objects in a line. There are n ways of choosing the 1st object, and after this is done, n − 1 ways of choosing the 2nd object, . . . , and finally n − r + 1 ways of choosing the rth object, so
nPr = n(n − 1)(n − 2) … (n − r + 1) = n!/(n − r)!
- nPr: The number of permutations of n objects taken r at a time.
Basic Probability
• Permutations
- The number of different permutations of a set consisting of n objects of which n1 are of one type (i.e., indistinguishable from each other), n2 are of a second type, . . . , nk are of a kth type (n = n1 + n2 + … + nk):
n!/(n1! n2! … nk!)
• Example
- The number of different arrangements, or permutations, consisting of 3 letters each that can be formed from the 7 letters A, B, C, D, E, F, G is
7P3 = 7!/4! = 7 · 6 · 5 = 210
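Both permutation counts can be computed with Python's standard library. This is an illustrative sketch: `arrangements_with_repeats` is a hypothetical helper name, and the word MISSISSIPPI is an added example, not from the slides.

```python
from math import factorial, perm

# 3-letter arrangements from the 7 distinct letters A..G: 7 * 6 * 5.
print(perm(7, 3))  # 210

def arrangements_with_repeats(*counts):
    """n! / (n1! n2! ... nk!) for objects of k types, n = n1 + ... + nk."""
    result = factorial(sum(counts))
    for count in counts:
        result //= factorial(count)  # each division is exact
    return result

# Hypothetical illustration: MISSISSIPPI has 1 M, 4 I, 4 S, 2 P.
print(arrangements_with_repeats(1, 4, 4, 2))  # 34650
```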
Basic Probability
• Combinations
- Used when objects are selected or chosen without regard to order.
- nCr: The total number of combinations of r objects selected from n (also called the combinations of n things taken r at a time):
nCr = n!/(r!(n − r)!)
Basic Probability
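• Combinations
- Binomial Coefficient: the numbers nCr are often called binomial coefficients because they arise in the binomial expansion
(x + y)^n = x^n + nC1 x^(n−1) y + nC2 x^(n−2) y^2 + … + y^n
• Example
- The number of ways in which 3 cards can be chosen or selected from a total of 8 different cards is:
8C3 = 8!/(3! 5!) = (8 · 7 · 6)/(3 · 2 · 1) = 56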
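A quick numerical check of the combination count and of the binomial expansion, as a sketch. The values x = 2, y = 3, n = 4 are arbitrary choices made for illustration.

```python
from math import comb

# Ways to choose 3 cards from 8 different cards, ignoring order.
print(comb(8, 3))  # 56

# Binomial coefficients for n = 4.
n = 4
coefficients = [comb(n, r) for r in range(n + 1)]
print(coefficients)  # [1, 4, 6, 4, 1]

# Check (x + y)^n against the sum of nCr * x^(n-r) * y^r for one (x, y).
x, y = 2, 3
expansion = sum(comb(n, r) * x ** (n - r) * y ** r for r in range(n + 1))
print(expansion, (x + y) ** n)  # 625 625
```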
Basic Probability
• Stirling’s Approximation to n!
- An approximation formula for n!: n! ≈ sqrt(2πn) · n^n · e^(−n)
- Example
James Stirling (1692–1770) was a Scottish mathematician. The Stirling numbers and Stirling's approximation are named after him.
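To see how good the approximation is, the sketch below compares sqrt(2πn) · n^n · e^(−n) with the exact factorial for a few values of n. The function name `stirling` and the chosen values of n are illustrative.

```python
from math import e, factorial, pi, sqrt

def stirling(n):
    """Stirling's approximation to n!: sqrt(2*pi*n) * n**n * e**(-n)."""
    return sqrt(2 * pi * n) * n ** n * e ** (-n)

for n in (5, 10, 50):
    exact = factorial(n)
    ratio = stirling(n) / exact  # approaches 1 as n grows
    print(f"n={n:2d}  ratio approx/exact = {ratio:.5f}")
```

The ratio stays slightly below 1 and the relative error shrinks as n grows.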
Conditional Probability
• Definition and Basic Principle
- Conditional Probability, P(B|A): the probability of B given that A has occurred. For P(A) > 0:
P(B|A) = P(A ∩ B)/P(A)
Conditional Probability
• Example
- Find the probability that a single toss of a die will result in a number less than 4 if (a) no other information is given and (b) it is given that the toss resulted in an odd number.
- (a) P(less than 4) = 3/6 = 1/2
- (b) P(less than 4 | odd) = (2/6)/(3/6) = 2/3
The added knowledge that the toss results in an odd number raises the probability from 1/2 to 2/3.
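The die example can be reproduced by counting outcomes. This sketch assumes a fair six-sided die; `prob` and `cond_prob` are illustrative helper names.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # sample space of one toss, all outcomes equally likely

def prob(event):
    """P(event) = (outcomes in event) / (outcomes in S)."""
    return Fraction(len(event & S), len(S))

def cond_prob(b, a):
    """P(B|A) = P(A and B) / P(A), defined only when P(A) > 0."""
    return prob(a & b) / prob(a)

less_than_4 = {1, 2, 3}
odd = {1, 3, 5}

print(prob(less_than_4))            # 1/2  (a) no other information
print(cond_prob(less_than_4, odd))  # 2/3  (b) given the toss is odd
```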
Conditional Probability
• The Multiplication Rule for Conditional Probability: P(A ∩ B) = P(A) P(B|A)
Conditional Probability
• Theorems
- Theorem 1: For any three events A1, A2, A3, we have
P(A1 ∩ A2 ∩ A3) = P(A1) P(A2|A1) P(A3|A1 ∩ A2)
- In words, the probability that A1 and A2 and A3 all occur is equal to the probability that A1 occurs times the probability that A2 occurs given that A1 has occurred times the probability that A3 occurs given that both A1 and A2 have occurred.
Conditional Probability
• Theorems
- Theorem 2: If an event A must result in one of the mutually exclusive events A1, A2, … , An, then
P(A) = P(A1) P(A|A1) + P(A2) P(A|A2) + … + P(An) P(A|An)
Conditional Probability
• Additional theorems
For events A, B, and C (P(C) > 0)
(1) P(∅ | C) = 0
(2) A, B mutually exclusive ⇒ P(A ∪ B | C) = P(A | C) + P(B | C)
(3) P(A^c | C) = 1 − P(A | C)
(4) A ⊂ B ⇒ P(B − A | C) = P(B | C) − P(A | C), P(A | C) ≤ P(B | C)
(5) P(A ∪ B | C) = P(A | C) + P(B | C) − P(A ∩ B | C)
(6) P(A ∪ B | C) ≤ P(A | C) + P(B | C)
Conditional Probability
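• Independent Events
- Independent events: If P(B|A) = P(B), i.e., the probability of B occurring is not affected by the occurrence or non-occurrence of A, then A and B are independent:
P(A ∩ B) = P(A) P(B)
- For three events A1, A2, A3: P(Aj ∩ Ak) = P(Aj) P(Ak) for j ≠ k, and P(A1 ∩ A2 ∩ A3) = P(A1) P(A2) P(A3)
- Conditional probability of independent events: P(B|A) = P(A ∩ B)/P(A) = P(B)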
Conditional Probability
• Prior and Posterior Probability
- Prior Probability: The probability of an event before the result is known
- Posterior Probability: The probability of an event after the result is known
- The posterior probability may be larger or smaller than the prior probability, depending on the result observed.
- Computation of posterior probability in more than one stage
Practice problem
If HIV has a prevalence of 3% in San Francisco, and a particular HIV test has a false positive rate of .001 and a false negative rate of .01, what is the probability that a random person selected off the street will test positive?
Answer
Probability tree:
P(+) = .03 (marginal probability of carrying the virus)
  P(test + | +) = .99 (conditional probability: the probability of testing + given that a person is +): P(+, test +) = .0297 (joint probability of being + and testing +)
  P(test − | +) = .01: P(+, test −) = .0003
P(−) = .97
  P(test + | −) = .001: P(−, test +) = .00097
  P(test − | −) = .999: P(−, test −) = .96903
The four joint probabilities sum to 1.0.
P(test +) = .0297 + .00097 = .03067 (marginal probability of testing positive)
P(+ & test +) ≠ P(+)*P(test +)
.0297 ≠ .03*.03067 (= .00092)
Dependent!
Law of total probability
P(test +) = P(test + | HIV+) P(HIV+) + P(test + | HIV−) P(HIV−)
P(test +) = .99(.03) + .001(.97) = .03067
One of HIV+ and HIV− has to be true (mutually exclusive, collectively exhaustive). Their probabilities sum to 1.0.
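The law of total probability used here is a weighted sum over the branches of the tree. A minimal sketch, using the practice problem's numbers; `total_probability` is an illustrative name, not a library function.

```python
def total_probability(priors, likelihoods):
    """Sum of P(B_i) * P(A | B_i) over a partition B_1, ..., B_k."""
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("the B_i must be collectively exhaustive (sum to 1)")
    return sum(p_b * p_a_given_b for p_b, p_a_given_b in zip(priors, likelihoods))

# Partition: HIV+ (prevalence .03) and HIV- (.97).
# P(test + | HIV+) = 1 - false negative rate = .99
# P(test + | HIV-) = false positive rate = .001
p_test_positive = total_probability([0.03, 0.97], [0.99, 0.001])
print(round(p_test_positive, 5))  # 0.03067
```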
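Law of total probability
• Formal Rule: Marginal probability for event A =
P(A) = P(A|B1) P(B1) + P(A|B2) P(B2) + … + P(A|Bk) P(Bk)
Where:
P(B1) + P(B2) + … + P(Bk) = 1.0 and P(Bi & Bj) = 0 for i ≠ j (mutually exclusive)
[Figure: sample space partitioned into B1, B2, B3, with event A drawn across the partition]
P(A) = (50%)(25%) + (0)(50%) + (50%)(25%) = 25%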
Example 2
• A 54-year old woman has an abnormal mammogram; what is the chance that she has breast cancer?
Example: Mammography
Probability tree:
P(BC+) = .003
  P(test + | BC+) = .90 (sensitivity): P(+, test +) = .0027
  P(test − | BC+) = .10: P(+, test −) = .0003
P(BC−) = .997
  P(test + | BC−) = .11: P(−, test +) = .10967
  P(test − | BC−) = .89 (specificity): P(−, test −) = .88733
The four joint probabilities sum to 1.0.
P(BC+) and P(BC−) are the marginal probabilities of breast cancer (prevalence among all 54-year olds).
P(BC | test +) = .0027/(.0027 + .10967) = 2.4%
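The mammography calculation can be traced leaf by leaf. In this sketch, `tree_leaves` is an illustrative helper and the labels "D+", "D-", "T+", "T-" (disease and test status) are assumed shorthand, not notation from the slides.

```python
def tree_leaves(prevalence, sensitivity, false_positive_rate):
    """Joint probabilities at the four leaves of a two-stage probability tree."""
    return {
        ("D+", "T+"): prevalence * sensitivity,
        ("D+", "T-"): prevalence * (1 - sensitivity),
        ("D-", "T+"): (1 - prevalence) * false_positive_rate,
        ("D-", "T-"): (1 - prevalence) * (1 - false_positive_rate),
    }

leaves = tree_leaves(prevalence=0.003, sensitivity=0.90, false_positive_rate=0.11)

# Marginal probability of a positive test: add the two "T+" leaves.
p_test_pos = leaves[("D+", "T+")] + leaves[("D-", "T+")]

# Conditional probability of disease given a positive test.
ppv = leaves[("D+", "T+")] / p_test_pos

print(f"P(test +) = {p_test_pos:.5f}")  # P(test +) = 0.11237
print(f"P(BC | test +) = {ppv:.1%}")    # P(BC | test +) = 2.4%
```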
Conditional Probability
• Bayes’ Theorem (theorem on the probability of causes):
- Suppose that A1, A2, … , An are mutually exclusive events whose union is the sample space S, i.e., one of the events must occur. Then if A is any event:
P(Ak|A) = P(Ak) P(A|Ak) / [P(A1) P(A|A1) + P(A2) P(A|A2) + … + P(An) P(A|An)]
Bayes’ Rule: derivation
• Definition: Let A and B be two events with P(B) > 0. The conditional probability of A given B is:
P(A|B) = P(A & B)/P(B)
The idea: if we are given that the event B occurred, the relevant sample space is reduced to B {P(B)=1 because we know B is true} and conditional probability becomes a probability measure on B.
Bayes’ Rule: derivation
P(A|B) = P(A & B)/P(B)
can be re-arranged to:
P(A & B) = P(A|B) P(B)
and, since also:
P(B|A) = P(A & B)/P(A), so that P(A & B) = P(B|A) P(A)
it follows that:
P(A|B) P(B) = P(A & B) = P(B|A) P(A)
P(A|B) P(B) = P(B|A) P(A)
P(A|B) = P(B|A) P(A)/P(B)
Bayes’ Rule:
P(A|B) = P(B|A) P(A)/P(B)
OR, from the “Law of Total Probability”:
P(A|B) = P(B|A) P(A) / [P(B|A) P(A) + P(B|~A) P(~A)]
Bayes’ Rule:
• Why do we care??
• Why is Bayes’ Rule useful??
• It turns out that sometimes it is very useful to be able to “flip” conditional probabilities. That is, we may know the probability of A given B, but the probability of B given A may not be obvious. An example will help…
Conditional Probability
• Example
- Three different machines (M1, M2, M3) were used for producing a large batch of similar items (M1: 20%, M2: 30%, M3: 50%)
- (a) Suppose that 1% of the items from M1 are defective, 2% from M2, and 3% from M3.
- (b) You pick one item, which is found to be defective.
- Question: What is the probability that this item was produced by machine M2?
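One way to work the machine example is to apply Bayes' theorem over all three machines at once. The sketch below uses exact fractions; the function name `posterior` is illustrative.

```python
from fractions import Fraction

# Share of the batch produced by each machine: P(Mi).
prior = {"M1": Fraction(20, 100), "M2": Fraction(30, 100), "M3": Fraction(50, 100)}
# Defect rate of each machine: P(defective | Mi).
defect_rate = {"M1": Fraction(1, 100), "M2": Fraction(2, 100), "M3": Fraction(3, 100)}

def posterior(prior, likelihood):
    """Bayes' theorem: P(Mi | D) = P(Mi) P(D|Mi) / sum_j P(Mj) P(D|Mj)."""
    joint = {m: prior[m] * likelihood[m] for m in prior}
    evidence = sum(joint.values())  # P(D), by the law of total probability
    return {m: joint[m] / evidence for m in joint}

p_given_defective = posterior(prior, defect_rate)
print(p_given_defective["M2"])         # 6/23
print(float(p_given_defective["M2"]))  # about 0.26
```

Although M2 makes 30% of the items, it accounts for only about 26% of the defective ones, because M3 produces more items at a higher defect rate.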
In-Class Exercise
• If HIV has a prevalence of 3% in San Francisco, and a particular HIV test has a false positive rate of .001 and a false negative rate of .01, what is the probability that a random person who tests positive is actually infected (also known as “positive predictive value”)?
Answer: using probability tree
P(+) = .03
  P(test + | +) = .99: P(+, test +) = .0297
  P(test − | +) = .01: P(+, test −) = .0003
P(−) = .97
  P(test + | −) = .001: P(−, test +) = .00097
  P(test − | −) = .999: P(−, test −) = .96903
The four joint probabilities sum to 1.0.
A positive test places one on either of the two “test +” branches. But only the top branch also fulfills the event “true infection.” Therefore, the probability of being infected is the probability of being on the top branch given that you are on one of the two “test +” branches.
P(true + | test +) = P(test + & true +)/P(test +) = .0297/(.0297 + .00097) = 96.8%
Answer: using Bayes’ rule
P(true + | test +) = P(test + | true +) P(true +) / [P(test + | true +) P(true +) + P(test + | true −) P(true −)]
= .99(.03) / [.99(.03) + .001(.97)] = 96.8%
Practice problem
An insurance company believes that drivers can be divided into two classes—those that are of high risk and those that are of low risk. Their statistics show that a high-risk driver will have an accident at some time within a year with probability .4, but this probability is only .1 for low risk drivers.
a) Assuming that 20% of the drivers are high-risk, what is the probability that a new policy holder will have an accident within a year of purchasing a policy?
b) If a new policy holder has an accident within a year of purchasing a policy, what is the probability that he is a high-risk type driver?
Answer to (a)
Assuming that 20% of the drivers are of high-risk, what is the probability that a new policy holder will have an accident within a year of purchasing a policy?
Use law of total probability: P(accident) = P(accident | high risk)*P(high risk) + P(accident | low risk)*P(low risk) = .40(.20) + .10(.80) = .08 + .08 = .16
Answer to (b)
If a new policy holder has an accident within a year of purchasing a policy, what is the probability that he is a high-risk type driver?
P(high risk | accident) = P(accident | high risk)*P(high risk)/P(accident) = .40(.20)/.16 = 50%
Or use tree:
P(high risk) = .20
  P(accident | HR) = .4: P(accident, high risk) = .08
  P(no accident | HR) = .6: P(no accident, high risk) = .12
P(low risk) = .80
  P(accident | LR) = .1: P(accident, low risk) = .08
  P(no accident | LR) = .9: P(no accident, low risk) = .72
The four joint probabilities sum to 1.0.
P(high risk | accident) = .08/.16 = 50%