Introduction to Probability and Statistics, 6th Week (4/12): Special Probability Distributions (1)
Pafnuty Lvovich Chebyshev (1821–1894)
Chebyshev’s inequality guarantees that, in any data sample or probability distribution, "nearly all" values are close to the mean.
More precisely, no more than $1/k^2$ of the distribution’s values can be more than $k$ standard deviations away from the mean: $P(|X - \mu| \ge k\sigma) \le 1/k^2$.
The inequality has great utility because it applies to completely arbitrary distributions (unknown except for mean and variance); for example, it can be used to prove the weak law of large numbers.
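As a rough illustration of the bound, the sketch below compares the observed fraction of values lying more than k standard deviations from the mean with 1/k^2. The exponential sample, the seed, and the helper name `tail_fraction` are assumptions chosen only for this illustration; any data set would do.

```python
import random
import statistics

def tail_fraction(values, k):
    """Fraction of values lying more than k standard deviations from the mean."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation of the sample
    return sum(abs(v - mu) > k * sigma for v in values) / len(values)

rng = random.Random(0)  # fixed seed (arbitrary) so the run is reproducible
# A deliberately skewed sample: exponential values with rate 1 (an assumption).
sample = [rng.expovariate(1.0) for _ in range(10_000)]

for k in (1.5, 2, 3):
    observed = tail_fraction(sample, k)
    print(f"k={k}: observed {observed:.4f}, Chebyshev bound {1 / k**2:.4f}")
```

The observed fractions are typically far below the bound, which reflects that the inequality makes no assumption about the shape of the distribution.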
Chebyshev’s Inequality
Law of Large Numbers
The law of large numbers (LLN) is a theorem that describes the result of performing the same experiment a large number of times.
According to the law, the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed.
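A small simulation can make this concrete. The sketch below rolls a fair die repeatedly and records the running average at a few checkpoints; the die, the checkpoints, the seed, and the function name `running_mean_of_die` are illustrative assumptions, not part of the lecture.

```python
import random

def running_mean_of_die(n_rolls, checkpoints, seed=0):
    """Roll a fair die n_rolls times; return the running mean at each checkpoint."""
    rng = random.Random(seed)  # fixed seed (arbitrary) for reproducibility
    total = 0
    means = {}
    for i in range(1, n_rolls + 1):
        total += rng.randint(1, 6)
        if i in checkpoints:
            means[i] = total / i
    return means

means = running_mean_of_die(100_000, {10, 100, 1_000, 10_000, 100_000})
for n in sorted(means):
    print(f"n={n:>6}: average = {means[n]:.4f} (expected value 3.5)")
```

With more rolls the average tends to settle near the expected value 3.5, although individual checkpoints need not improve monotonically.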
Why is it important?
Other Measures of Central Tendency
Percentiles
Percentiles: A Practical Example
Other Measures of Dispersion
Skewness
Kurtosis
Skewness, Kurtosis, and Moment
Discrete Probability Distribution
Which probability distributions (PD) do we need to know to solve real-world problems?
Discrete Uniform Distribution
• Consider rolling a fair die.
• Each value of the random variable has the same probability → uniform distribution
• Probability mass function: $f(x) = 1/n$, for $x = 1, 2, \dots, n$
• Expectation: $E(X) = (n + 1)/2$
• Variance: $\mathrm{Var}(X) = (n^2 - 1)/12$
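• Example
Suppose that a box contains 45 balls numbered 1 to 45. We randomly select one ball, and X is its number. Find:
(1) the probability distribution of X
(2) the expectation and variance of X
(3) P(X > 40)
• Solution
(1) $f(x) = 1/45$, for $x = 1, 2, \dots, 45$
(2) $E(X) = (45 + 1)/2 = 23$, $\mathrm{Var}(X) = (45^2 - 1)/12 \approx 168.67$
(3) $P(X > 40) = 5/45 = 1/9$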
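The 45-ball example can be checked with exact arithmetic. This is a minimal sketch that assumes the balls are numbered 1 to 45 and each is equally likely; the variable names are illustrative.

```python
from fractions import Fraction

n = 45  # assumed: balls are numbered 1..45
pmf = {x: Fraction(1, n) for x in range(1, n + 1)}

mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
p_gt_40 = sum(p for x, p in pmf.items() if x > 40)

print("E(X)      =", mean)
print("Var(X)    =", variance, "=", float(variance))
print("P(X > 40) =", p_gt_40)
```

Using `Fraction` keeps every value exact, so the results can be compared directly with the closed-form expressions (n + 1)/2 and (n^2 - 1)/12.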
(Discrete) Binomial Distribution
Bernoulli experiment: Only two kinds of results are possible
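p = 0.85, q = 1 - p = 0.15
Binomial Distribution
The number of successes X in n independent Bernoulli trials follows the binomial distribution, X ∼ B(n, p), with $P(X = x) = \binom{n}{x} p^x q^{n-x}$ for $x = 0, 1, \dots, n$.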
Some Properties of the Binomial Distribution
p = 1/2: symmetric about μ = n/2 (symmetric binomial distribution)
p < 1/2: tail on the right
p > 1/2: tail on the left
Example
Two factories, S and L, produce smartphones, and each has a failure rate of 5%. If you buy 7 phones from S and 13 phones from L, what is the probability of getting at least one failed phone? What is the probability of getting exactly one failed phone? Assume that failures are independent.
Solution
Let X be the number of failed phones from S and Y the number from L:
X ∼ B(7, 0.05), Y ∼ B(13, 0.05), with X and Y independent, so X + Y ∼ B(20, 0.05).
At least one phone failed: $P(X + Y \ge 1) = 1 - 0.95^{20} \approx 0.6415$
Exactly one phone failed: $P(X + Y = 1) = \binom{20}{1}(0.05)(0.95)^{19} \approx 0.3774$
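The smartphone example can be verified numerically. The sketch below evaluates the B(20, 0.05) probabilities and cross-checks the "exactly one" case by combining the two factories separately; the helper name `binom_pmf` is an assumption made for this illustration.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

p_fail = 0.05
n_total = 7 + 13  # X + Y ~ B(20, 0.05) because X and Y are independent with equal p

p_at_least_one = 1 - binom_pmf(0, n_total, p_fail)
p_exactly_one = binom_pmf(1, n_total, p_fail)

# Cross-check: exactly one failure comes either from S or from L.
p_exactly_one_split = (binom_pmf(1, 7, p_fail) * binom_pmf(0, 13, p_fail)
                       + binom_pmf(0, 7, p_fail) * binom_pmf(1, 13, p_fail))

print(f"P(at least one failed) = {p_at_least_one:.4f}")
print(f"P(exactly one failed)  = {p_exactly_one:.4f}")
```

The split computation agrees with the pooled one, which is the additivity property of independent binomial variables with a common p.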
Criteria for a Binomial Probability Experiment
An experiment is said to be a binomial experiment provided that:
1. The experiment is performed a fixed number of times. Each repetition of the experiment is called a trial.
2. The trials are independent. This means the outcome of one trial will not affect the outcome of the other trials.
3. For each trial, there are two mutually exclusive outcomes, success or failure.
4. The probability of success is fixed for each trial of the experiment.
Notation Used in the Binomial Probability Distribution
• There are n independent trials of the experiment.
• Let p denote the probability of success, so that 1 – p is the probability of failure.
• Let x denote the number of successes in n independent trials of the experiment, so 0 ≤ x ≤ n.
EXAMPLE Identifying Binomial Experiments
Which of the following are binomial experiments?
(a) A player rolls a pair of fair dice 10 times. The number X of 7’s rolled is recorded.
(b) The 11 largest airlines had an on-time percentage of 84.7% in November, 2001 according to the Air Travel Consumer Report. In order to assess reasons for delays, an official with the FAA randomly selects flights until she finds 10 that were not on time. The number of flights X that need to be selected is recorded.
(c) In a class of 30 students, 55% are female. The instructor randomly selects 4 students. The number X of females selected is recorded.
EXAMPLE Constructing a Binomial Probability Distribution
According to the Air Travel Consumer Report, the 11 largest air carriers had an on-time percentage of 84.7% in November, 2001. Suppose that 4 flights are randomly selected from November, 2001 and the number of on-time flights X is recorded. Construct a probability distribution for the random variable X using a tree diagram.
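One way to mimic the tree diagram is to enumerate every path of four flights and add up the path probabilities by the number of on-time flights. The sketch assumes the four flights are independent with the same on-time probability 0.847; the variable names are illustrative.

```python
from itertools import product

p_on_time = 0.847
dist = {x: 0.0 for x in range(5)}  # X = number of on-time flights among 4

# Each path through the tree is a tuple of four outcomes (True = on time).
for path in product((True, False), repeat=4):
    prob = 1.0
    for on_time in path:
        prob *= p_on_time if on_time else 1 - p_on_time
    dist[sum(path)] += prob

for x, prob in dist.items():
    print(f"P(X = {x}) = {prob:.4f}")
```

The 16 paths collapse into 5 values of X, and the totals match the binomial formula with n = 4 and p = 0.847.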
(Discrete) Multinomial Distribution
(Discrete) Geometric Distribution
Repeat Bernoulli trials until the first success; X is the number of trials needed.
Slot Machine:
How many times do I have to play before I hit the jackpot?
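The slot-machine question can be sketched with the geometric probability function P(X = x) = (1 - p)^(x - 1) p and its mean 1/p. The jackpot probability of 0.01 below is a made-up value for illustration only, and `geom_pmf` is an illustrative helper name.

```python
def geom_pmf(x, p):
    """P(X = x): first success occurs on trial x (x = 1, 2, ...)."""
    return (1 - p) ** (x - 1) * p

p_jackpot = 0.01  # hypothetical jackpot probability per play (assumption)

expected_plays = 1 / p_jackpot              # E(X) = 1/p
p_within_100 = 1 - (1 - p_jackpot) ** 100   # P(X <= 100)
p_within_100_sum = sum(geom_pmf(x, p_jackpot) for x in range(1, 101))

print(f"Expected number of plays: {expected_plays:.0f}")
print(f"P(jackpot within 100 plays) = {p_within_100:.4f}")
```

Even after playing the expected number of times, the chance of having hit the jackpot is only about 63% under this assumed p.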
(Discrete) Negative Binomial Distribution
Repeat Bernoulli trials until the r-th success; X is the number of trials needed.
Crane Game:
How many times do I have to play to win three dolls?
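For the crane-game question, a sketch using the negative binomial probability function P(X = x) = C(x - 1, r - 1) p^r (1 - p)^(x - r), with mean r/p, might look as follows. The win probability of 0.2 per play is a made-up value, and `negbin_pmf` is an illustrative helper name.

```python
from math import comb

def negbin_pmf(x, r, p):
    """P(X = x): the r-th success occurs on trial x (x = r, r + 1, ...)."""
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)

r_dolls = 3
p_win = 0.2  # hypothetical probability of winning a doll per play (assumption)

expected_plays = r_dolls / p_win  # E(X) = r/p

# Numerical check: the tail beyond 500 plays is negligible for these values.
total_prob = sum(negbin_pmf(x, r_dolls, p_win) for x in range(r_dolls, 500))
mean_numeric = sum(x * negbin_pmf(x, r_dolls, p_win) for x in range(r_dolls, 500))

print(f"Expected number of plays for {r_dolls} dolls: {expected_plays:.1f}")
print(f"P(exactly 3 plays) = {negbin_pmf(3, r_dolls, p_win):.4f}")
```

Setting r = 1 reduces this to the geometric distribution of the previous slide.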
(Discrete) Hypergeometric Distribution
It is similar to the binomial distribution; the difference is the method of sampling.
Binomial experiment: sampling with replacement (e.g. Russian roulette). Each trial has the same probability.
Hypergeometric experiment: sampling without replacement (e.g. normal shooting). Each trial may have a different probability.
[Figure: a population of N items, from which n items are drawn]
A box contains N balls, of which r are white (r < N). Suppose that we randomly select n balls from the box. How many of the selected balls are white (X)?
Assumption: sampling without replacement
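Example
A box contains 50 chips in total, 4 of which are out of order (failed chips). If you select 5 chips, find:
(1) the probability distribution of the number of failed chips among the selected chips
(2) the probability of having one or two failed chips
(3) the mathematical expectation and variance
Solution
(1) Random variable: X = number of failed chips among the 5 selected, with N = 50, r = 4, n = 5: $P(X = x) = \binom{4}{x}\binom{46}{5-x} / \binom{50}{5}$, for $x = 0, 1, \dots, 4$
(2) $P(X = 1) + P(X = 2) \approx 0.3081 + 0.0430 = 0.3511$
(3) $E(X) = n r / N = 0.4$, $\mathrm{Var}(X) = n \frac{r}{N}\left(1 - \frac{r}{N}\right)\frac{N - n}{N - 1} \approx 0.338$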
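The chip example can be worked through numerically with the hypergeometric probability function P(X = x) = C(r, x) C(N - r, n - x) / C(N, n). This is a sketch; the helper name `hypergeom_pmf` is an assumption made for illustration.

```python
from math import comb

def hypergeom_pmf(x, N, r, n):
    """P(X = x): x marked items in a sample of n drawn without replacement
    from N items, r of which are marked."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

N, r, n = 50, 4, 5  # 50 chips, 4 failed, 5 selected
pmf = {x: hypergeom_pmf(x, N, r, n) for x in range(min(r, n) + 1)}

p_one_or_two = pmf[1] + pmf[2]
mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

for x, p in pmf.items():
    print(f"P(X = {x}) = {p:.5f}")
print(f"P(1 <= X <= 2) = {p_one_or_two:.4f}")
print(f"E(X) = {mean:.4f}, Var(X) = {variance:.4f}")
```

The computed mean and variance agree with the closed forms nr/N and n(r/N)(1 - r/N)(N - n)/(N - 1).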
Multivariate Hypergeometric Distribution
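X1, X2, X3: joint probability function
Example
In a box, there are 3 red balls, 2 blue balls, and 5 yellow balls. You select 4 balls. Let X, Y, and Z be the numbers of red, blue, and yellow balls selected. Find:
(1) the joint probability function of X, Y, and Z
(2) the probability of selecting 1 red ball, 1 blue ball, and 2 yellow balls
Solution
(1) Joint probability function: $f(x, y, z) = \binom{3}{x}\binom{2}{y}\binom{5}{z} / \binom{10}{4}$, for $x + y + z = 4$
(2) $f(1, 1, 2) = (3 \cdot 2 \cdot 10)/210 = 2/7 \approx 0.2857$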
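The colored-ball example can be checked with exact fractions. The sketch below assumes X, Y, and Z count the red, blue, and yellow balls among the 4 drawn without replacement; `joint_pmf` is an illustrative name.

```python
from fractions import Fraction
from math import comb

def joint_pmf(x, y, z, red=3, blue=2, yellow=5, draws=4):
    """P(X = x, Y = y, Z = z) when drawing without replacement."""
    if x + y + z != draws:
        return Fraction(0)
    ways = comb(red, x) * comb(blue, y) * comb(yellow, z)
    return Fraction(ways, comb(red + blue + yellow, draws))

p_112 = joint_pmf(1, 1, 2)

# Sanity check: the joint probabilities over all feasible (x, y, z) sum to 1.
total = sum(joint_pmf(x, y, 4 - x - y)
            for x in range(5) for y in range(5 - x))

print("P(1 red, 1 blue, 2 yellow) =", p_112, "=", float(p_112))
print("Sum over all outcomes      =", total)
```

`math.comb` returns 0 when more balls of a color are requested than exist, so infeasible outcomes contribute nothing to the sum.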
(Discrete) Poisson Distribution
- It describes an event that rarely happens.
- All events in a specific period are mutually independent.
- The probability that an event occurs is proportional to the length of the period.
- The probability that an event occurs twice is zero if the period is short.
It is often used as a model for the number of events (such as the number of telephone calls at a business, number of customers in waiting lines, number of defects in a given surface area, airplane arrivals, or the number of accidents at an intersection) in a specific time period.
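Probability function: $f(x) = e^{-\lambda} \lambda^x / x!$, for $x = 0, 1, 2, \dots$
If λ > 0, this satisfies the conditions for a probability function: $f(x) \ge 0$ and $\sum_{x=0}^{\infty} f(x) = e^{-\lambda} e^{\lambda} = 1$.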
Ex. 1. On an average Friday, a waitress gets no tip from 5 customers. Find the probability that she will get no tip from 7 customers this Friday.
The waitress averages 5 customers who leave no tip on Fridays: λ = 5. Random variable: the number of customers who leave her no tip this Friday. We are interested in P(X = 7).
Ex. 2. During a typical football game, a coach can expect 3.2 injuries. Find the probability that the team will have at most 1 injury in this game.
A coach can expect 3.2 injuries: λ = 3.2. Random variable: the number of injuries the team has in this game. We are interested in P(X ≤ 1) = P(X = 0) + P(X = 1).
Ex. 3. A small life insurance company has determined that on the average it receives 6 death claims per day. Find the probability that the company receives at least seven death claims on a randomly selected day.
P(x ≥ 7) = 1 - P(x ≤ 6) = 0.393697
Ex. 4. The number of traffic accidents that occurs on a particular stretch of road during a month follows a Poisson distribution with a mean of 9.4. Find the probability that less than two accidents will occur on this stretch of road during a randomly selected month.
P(x < 2) = P(x = 0) + P(x = 1) = 0.000860
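The four Poisson examples can be evaluated with a few lines of code. This is a sketch using the probability function e^(-λ) λ^x / x!; the helper names `poisson_pmf` and `poisson_cdf` are assumptions made for illustration.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

def poisson_cdf(x, lam):
    """P(X <= x), summed term by term."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

ex1 = poisson_pmf(7, 5)        # Ex. 1: P(X = 7),  lambda = 5
ex2 = poisson_cdf(1, 3.2)      # Ex. 2: P(X <= 1), lambda = 3.2
ex3 = 1 - poisson_cdf(6, 6)    # Ex. 3: P(X >= 7), lambda = 6
ex4 = poisson_cdf(1, 9.4)      # Ex. 4: P(X < 2),  lambda = 9.4

for name, value in (("Ex. 1", ex1), ("Ex. 2", ex2), ("Ex. 3", ex3), ("Ex. 4", ex4)):
    print(f"{name}: {value:.6f}")
```

The values for Ex. 3 and Ex. 4 reproduce the answers quoted above, 0.393697 and 0.000860.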
Characteristics of Poisson Distribution
E(X) increases with the parameter λ. The graph broadens as λ increases.
[Figures: probability mass function; cumulative distribution function]
Comparison of the Poisson distribution (black dots) and the binomial distribution with n=10 (red line), n=20 (blue line), n=1000 (green line). All distributions have a mean of 5. The horizontal axis shows the number of events k. Notice that as n gets larger, the Poisson distribution becomes an increasingly better approximation for the binomial distribution with the same mean
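The comparison in the figure can be reproduced numerically. The sketch below measures the largest pointwise gap between the B(n, 5/n) and Poisson(5) probability functions for k = 0, ..., 30; the range of k and the helper names are assumptions chosen for this illustration.

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p); zero when k exceeds n."""
    if k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

mean = 5.0
max_gap = {}
for n in (10, 20, 1000):
    p = mean / n  # keep the binomial mean n * p fixed at 5
    max_gap[n] = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, mean))
                     for k in range(31))

for n, gap in max_gap.items():
    print(f"n={n:>4}: largest |binomial - Poisson| = {gap:.5f}")
```

The gap shrinks as n grows with the mean held fixed, which is the sense in which the Poisson distribution approximates the binomial.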
Discrete Probability Distributions: Summary
• Uniform Distribution
• Binomial Distribution
• Multinomial Distribution
• Geometric Distribution
• Negative Binomial Distribution
• Hypergeometric Distribution
• Poisson Distribution
Continuous Probability Distributions
Which probability distributions (PD) do we need to know to solve real-world problems?
(Continuous) Uniform Distribution
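In an interval [a, b], f(x) is constant:
f(x): $f(x) = 1/(b - a)$ for $a \le x \le b$, and 0 otherwise
E(X): $(a + b)/2$
Var(X): $(b - a)^2/12$
F(x): $0$ for $x < a$; $(x - a)/(b - a)$ for $a \le x \le b$; $1$ for $x > b$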
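Example
If X ∼ U(0, 1) and Y = a + (b - a) X, find:
(1) the distribution function of Y
(2) the probability function of Y
(3) the expectation and variance of Y
(4) the centered value (median) of Y
Solution
(1) Since y = a + (b - a) x and 0 ≤ x ≤ 1, we have a ≤ y ≤ b (taking a < b), and $F_Y(y) = P(X \le (y - a)/(b - a)) = (y - a)/(b - a)$
(2) $f_Y(y) = F_Y'(y) = 1/(b - a)$ for $a \le y \le b$, so Y ∼ U(a, b)
(3) $E(Y) = (a + b)/2$, $\mathrm{Var}(Y) = (b - a)^2/12$
(4) $F_Y(y) = 1/2$ gives $y = (a + b)/2$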
(Continuous) Exponential Distribution
▶ Analysis of survival rate
▶ Period between first and second earthquakes
▶ Waiting time for events of Poisson distribution
For any positive λ, the probability density function is $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$.
Example
From a survey, the time X (in months) between traffic accidents has the density $f(x) = 3e^{-3x}$, $x \ge 0$.
(1) What is the probability of observing the second accident more than one month after the first accident?
(2) What is the probability of observing the second accident within 2 months?
(3) Supposing that a month is 30 days, what is the average number of days until the accident?
Solution
(1) $P(X > 1) = e^{-3} \approx 0.0498$
(2) $P(X \le 2) = 1 - e^{-6} \approx 0.9975$
(3) μ = 1/3 month, accordingly 10 days.
• Survival function: $S(x) = P(X > x) = 1 - F(x) = e^{-\lambda x}$
• Hazard rate (failure rate): $h(x) = f(x)/S(x) = \lambda$
Example
A patient was told that he can survive an average of 100 days. Suppose that the probability function is given by $f(x) = 0.01 e^{-x/100}$, $x \ge 0$.
(1) What is the probability that he dies within 150 days?
(2) What is the probability that he survives 200 days or more?
Solution
Since λ = 0.01, the distribution function and survival function are $F(x) = 1 - e^{-x/100}$ and $S(x) = e^{-x/100}$.
(1) Probability that the patient dies within 150 days: $P(X < 150) = F(150) = 1 - e^{-1.5} = 1 - 0.2231 = 0.7769$
(2) Probability that the patient survives 200 days or more: $P(X \ge 200) = S(200) = e^{-2.0} = 0.1353$
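Both exponential examples (the traffic accidents with λ = 3 per month and the patient with λ = 0.01 per day) reduce to evaluating F(x) = 1 - e^(-λx) and S(x) = e^(-λx). The sketch below does exactly that; the function names are illustrative assumptions.

```python
from math import exp

def exp_cdf(x, lam):
    """F(x) = P(X <= x) for an exponential distribution with rate lam."""
    return 1 - exp(-lam * x)

def exp_survival(x, lam):
    """S(x) = P(X > x) = 1 - F(x)."""
    return exp(-lam * x)

# Patient example: mean survival 100 days, so lam = 1/100 per day.
lam_patient = 1 / 100
p_die_within_150 = exp_cdf(150, lam_patient)
p_survive_200 = exp_survival(200, lam_patient)

# Traffic accident example: f(x) = 3 e^{-3x}, x in months.
lam_accident = 3
p_after_one_month = exp_survival(1, lam_accident)
p_within_two_months = exp_cdf(2, lam_accident)
mean_days = (1 / lam_accident) * 30  # mean 1/lam months, 30 days per month

print(f"P(dies within 150 days)   = {p_die_within_150:.4f}")
print(f"P(survives 200 days)      = {p_survive_200:.4f}")
print(f"P(next accident > 1 mo.)  = {p_after_one_month:.4f}")
print(f"P(next accident <= 2 mo.) = {p_within_two_months:.4f}")
print(f"Mean waiting time         = {mean_days:.0f} days")
```

For a continuous distribution P(X < x) and P(X ≤ x) coincide, so a single `exp_cdf` covers both phrasings.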
⊙ Relation with Poisson Process
(1) If events occur according to a Poisson process with rate λ, the waiting time T between neighboring events follows an exponential distribution with parameter λ.
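The relation can be illustrated in the reverse direction by simulation: generate exponential waiting times with rate λ, place the events on a time axis, and count how many fall in each unit interval. For a Poisson count, the mean and the variance should both be close to λ. The rate of 3, the number of intervals, the seed, and the function name are assumptions made for this sketch.

```python
import random
import statistics

def count_events_per_unit(rate, n_units, seed=0):
    """Simulate events with exponential gaps; return the count in each unit interval."""
    rng = random.Random(seed)  # fixed seed (arbitrary) for reproducibility
    counts = [0] * n_units
    t = rng.expovariate(rate)  # time of the first event
    while t < n_units:
        counts[int(t)] += 1     # the event falls in the interval [int(t), int(t) + 1)
        t += rng.expovariate(rate)
    return counts

rate = 3.0
counts = count_events_per_unit(rate, 10_000)
count_mean = statistics.fmean(counts)
count_var = statistics.pvariance(counts)

print(f"mean count per interval     = {count_mean:.3f} (rate = {rate})")
print(f"variance of count per unit  = {count_var:.3f} (rate = {rate})")
```

Mean and variance being nearly equal is a quick diagnostic for Poisson-like counts.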
(Continuous) Gamma Distribution
α: shape parameter, α > 0
β: scale parameter, β > 0
Probability density function: $f(x) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}} x^{\alpha - 1} e^{-x/\beta}$, for $x > 0$
When α = 1, Γ(1, β) = Exp(1/β).
⊙ Relation with Exponential Distribution
If X1, X2, …, Xn are independent and exponentially distributed with the same rate 1/β, the sum S = X1 + X2 + … + Xn follows a gamma distribution, Γ(n, β).
The exponential distribution is a special gamma distribution with α = 1.
Example
Suppose the time X (in months) until a traffic accident is observed in a region has the probability distribution $f(x) = 3e^{-3x}$, $0 < x < \infty$.
Estimate the probability that the time to observe the first two accidents falls between the first and second months. Assume that all accidents are independent.
Solution
X1: time until the first accident
X2: time between the first and second accidents
Xi ∼ Exp(3), i = 1, 2
S = X1 + X2: time until two accidents
S ∼ Γ(2, 1/3). Probability function for S: $f_S(s) = 9 s e^{-3s}$, $s > 0$
Answer: $P(1 < S < 2) = \int_1^2 9 s e^{-3s}\,ds = 4e^{-3} - 7e^{-6} \approx 0.1818$
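The two-accident probability can be checked by integrating the Γ(2, 1/3) density numerically over [1, 2] and comparing it with the closed form 4e^(-3) - 7e^(-6) obtained by integration by parts. The sketch uses a plain composite Simpson rule; `gamma_pdf` and `simpson` are illustrative helper names, and the shape/scale parameterization is the one used in this lecture.

```python
from math import exp, gamma

def gamma_pdf(x, alpha, beta):
    """Gamma density with shape alpha and scale beta, for x > 0."""
    return x ** (alpha - 1) * exp(-x / beta) / (gamma(alpha) * beta**alpha)

def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# S = X1 + X2 with Xi exponential of rate 3, so S ~ Gamma(alpha=2, beta=1/3).
p_numeric = simpson(lambda s: gamma_pdf(s, 2, 1 / 3), 1, 2)
p_closed = 4 * exp(-3) - 7 * exp(-6)

print(f"P(1 < S < 2), numeric     = {p_numeric:.6f}")
print(f"P(1 < S < 2), closed form = {p_closed:.6f}")
```

Agreement between the two values also confirms that the density 9s e^(-3s) is the α = 2, β = 1/3 case of the general gamma density.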
(Continuous) Chi Square Distribution
A special gamma distribution with α = r/2, β = 2, where r is the degrees of freedom
PD: $f(x) = \frac{1}{\Gamma(r/2)\, 2^{r/2}} x^{r/2 - 1} e^{-x/2}$, for $x > 0$
E(X): $\alpha\beta = r$
Var(X): $\alpha\beta^2 = 2r$
Example
A random variable X follows a chi-square distribution with 5 degrees of freedom. Calculate the critical value x0 that satisfies P(X < x0) = 0.95.
Solution
Since P(X < x0) = 0.95, P(X > x0) = 0.05.
From the table, find the point with d.f. = 5 and α = 0.05: x0 ≈ 11.07.
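Instead of a table lookup, the critical value can be approximated numerically. The sketch below integrates the chi-square density (a gamma density with α = r/2, β = 2) with Simpson's rule and then solves F(x0) = 0.95 by bisection. The helper names, the search interval [0, 100], and the restriction to r > 2 (so the density is finite at 0) are assumptions of this illustration.

```python
from math import exp, gamma

def chi2_pdf(x, r):
    """Chi-square density with r degrees of freedom (this sketch assumes r > 2)."""
    return x ** (r / 2 - 1) * exp(-x / 2) / (gamma(r / 2) * 2 ** (r / 2))

def chi2_cdf(x, r, n=2000):
    """P(X <= x) by composite Simpson integration of the density on [0, x]."""
    if x <= 0:
        return 0.0
    h = x / n
    total = chi2_pdf(0.0, r) + chi2_pdf(x, r)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * chi2_pdf(i * h, r)
    return total * h / 3

def chi2_quantile(prob, r, lo=0.0, hi=100.0):
    """Solve chi2_cdf(x0, r) = prob by bisection on [lo, hi]."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, r) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x0 = chi2_quantile(0.95, 5)
print(f"critical value for d.f. = 5, alpha = 0.05: {x0:.4f}")
```

The result should agree with the tabulated value to the precision that tables usually give.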
Why do we have to be bothered?