The Normal Distribution and Other Continuous Distributions
Dec 30, 2015
1
Various Continuous Probability Distributions
The Normal Distribution and Other Continuous Distributions
2
Probability Distributions
Discrete Probability Distributions: Binomial, Hypergeometric, Poisson
Continuous Probability Distributions: Normal, Uniform, Exponential
3
Continuous Probability Distributions
A continuous random variable is a random variable whose values are continuous, that is, it can take fractional values. Examples: thickness of a workpiece, time to complete a task, temperature, height.
The precision of the measured values depends on the capability of the measuring instrument.
4
The Normal Distribution
Bell shaped
Symmetrical
Mean, median, and mode are equal (Mean = Median = Mode)
Location is measured by the mean, μ
Spread is measured by the standard deviation, σ
The random variable has an infinite theoretical range: -∞ to +∞
[Figure: bell-shaped curve of f(X) against X, centered at μ with spread σ]
5
The Normal Distribution Shape
Changing μ shifts the distribution left or right.
Changing σ increases or decreases the spread and changes the height of the curve.
[Figure: f(X) against X, showing the effect of changing μ and σ]
6
The Normal Probability Density Function
The probability density function (pdf) of the normal distribution is
$$f(X) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left[(X-\mu)/\sigma\right]^{2}}$$
where
e = the mathematical constant approximated by 2.71828
π = the mathematical constant approximated by 3.14159
μ = the population mean
σ = the population standard deviation
X = any value of the continuous random variable
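As a rough illustration of the density formula on this slide, the sketch below evaluates f(X) directly in Python. The function name `normal_pdf` and the input values are assumptions chosen for illustration; the cross-check uses `statistics.NormalDist` from the Python standard library.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density f(X) for population mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z ** 2) / (math.sqrt(2 * math.pi) * sigma)

# Hypothetical inputs: compare the hand-written formula with the standard library.
hand = normal_pdf(8.6, mu=8.0, sigma=5.0)
library = NormalDist(mu=8.0, sigma=5.0).pdf(8.6)
print(round(hand, 6), round(library, 6))
```

Note that the density is a height of the curve, not a probability; probabilities come from areas under it.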
7
The Standardized Normal
Any normally distributed random variable (X) can be transformed into a random variable with the standardized normal distribution (Z).
8
Translation to the Standardized Normal Distribution
Convert X to Z by subtracting the mean of X and dividing by its standard deviation:
$$Z = \frac{X - \mu}{\sigma}$$
Z always has mean = 0 and standard deviation = 1.
9
The Standardized Normal Probability Density Function
The probability density function of the random variable Z is
$$f(Z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}Z^{2}}$$
where
e = the mathematical constant approximated by 2.71828
π = the mathematical constant approximated by 3.14159
Z = any value of the standardized normal random variable
10
The Standardized Normal Distribution
Also known as the "Z" distribution
Mean = 0
Standard deviation = 1
[Figure: f(Z) against Z, centered at 0 with standard deviation 1]
Values above the mean have positive Z-values; values below the mean have negative Z-values.
11
Example
If X is normally distributed with mean = 100 and standard deviation = 50, the Z value for X = 200 is
$$Z = \frac{X - \mu}{\sigma} = \frac{200 - 100}{50} = 2.0$$
This says that X = 200 is two standard deviations above the mean.
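The standardization in this example can be sketched in a few lines of Python. The helper name `z_score` is an assumption for illustration, not part of the original slides.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

# Values from the example: mean 100, standard deviation 50, X = 200.
z = z_score(200, mu=100, sigma=50)
print(z)
```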
12
Comparing X and Z Units
[Figure: one curve with two horizontal scales. The X scale (μ = 100, σ = 50) marks 100 and 200; the Z scale (μ = 0, σ = 1) marks 0 and 2.0]
Note that the distribution is the same; only the scale has changed. We can express the problem in original units (X) or in standardized units (Z).
13
Finding Probabilities for the Normal Distribution
Probability is measured by the area under the curve.
[Figure: f(X) against X with the area between a and b shaded]
$$P(a \le X \le b) = P(a < X < b)$$
(Note that the probability of any individual value is zero.)
14
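Probability as Area Under the Curve
The total area under the curve is 1.0, and the curve is symmetric, so half is above the mean and half is below.
$$P(-\infty < X < \mu) = 0.5 \qquad P(\mu < X < \infty) = 0.5$$
$$P(-\infty < X < \infty) = 1.0$$
[Figure: f(X) against X, with an area of 0.5 on each side of μ]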
15
Empirical Rules
What can we say about the distribution of values around the mean? There are some general rules:
μ ± 1σ encloses about 68% of X's (68.26%)
[Figure: f(X) against X with the area between μ - 1σ and μ + 1σ shaded]
16
The Empirical Rule (continued)
μ ± 2σ covers about 95% of X's (95.44%)
μ ± 3σ covers about 99.7% of X's (99.73%)
[Figure: two curves centered at μ, one shaded within 2σ of the mean and one shaded within 3σ of the mean]
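The empirical-rule percentages can be checked numerically. The sketch below assumes Python's `statistics.NormalDist` as a stand-in for the printed table; the helper name `coverage` is hypothetical. Because printed tables are rounded to four decimals, the last digit may differ slightly from table-based figures.

```python
from statistics import NormalDist

def coverage(k):
    """Probability that a normal value lies within k standard deviations of its mean."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"mu +/- {k} sigma covers {coverage(k):.4f}")
```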
17
The Standardized Normal Table
Probabilities can be found using the standardized normal table.
Example: P(Z < 2.00) = .9772
[Figure: standard normal curve with the area to the left of Z = 2.00 shaded (.9772)]
18
Using the Standardized Normal Table (continued)
The value within the table gives the probability from Z = -∞ up to the desired Z value.
The row shows the value of Z to the first decimal point.
The column gives the value of Z to the second decimal point.
For example, the entry in row 2.0 and column 0.00 is .9772, so P(Z < 2.00) = .9772.
Z     0.00    0.01    0.02   …
0.0
0.1
…
2.0   .9772
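A table entry can be reproduced with a cumulative distribution function. This is a minimal sketch assuming `statistics.NormalDist` from the Python standard library; the helper `table_entry` is a hypothetical name that mirrors the row and column lookup described on this slide.

```python
from statistics import NormalDist

def table_entry(row, column):
    """Cumulative probability P(Z < row + column), rounded like a printed table."""
    return round(NormalDist().cdf(row + column), 4)

print(table_entry(2.0, 0.00))  # row 2.0, column 0.00
print(table_entry(0.1, 0.02))  # row 0.1, column 0.02
```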
19
General Procedure for Finding Normal Probabilities
To find P(a < X < b) when X is normally distributed:
Draw the normal curve for the problem in terms of X.
Convert the X values to Z values.
Find the probability from the Standardized Normal Table.
20
Finding Normal Probabilities
Suppose X is normal with mean 8.0 and standard deviation 5.0
Find P(X < 8.6)
[Figure: normal curve on the X scale with mean 8.0; the area to the left of 8.6 is shaded]
21
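Finding Normal Probabilities (continued)
Suppose X is normal with mean 8.0 and standard deviation 5.0. Find P(X < 8.6).
$$Z = \frac{X - \mu}{\sigma} = \frac{8.6 - 8.0}{5.0} = 0.12$$
[Figure: the X scale (μ = 8, σ = 5) marks 8 and 8.6; the Z scale (μ = 0, σ = 1) marks 0 and 0.12]
$$P(X < 8.6) = P(Z < 0.12)$$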
22
Solution: Finding P(Z < 0.12)
Standardized Normal Probability Table (Portion)
Z     .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255
P(X < 8.6) = P(Z < 0.12) = .5478
[Figure: standard normal curve with the area to the left of Z = 0.12 shaded (.5478)]
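The standardize-then-look-up procedure can be sketched as follows. The sketch assumes `statistics.NormalDist` from the Python standard library in place of the printed table; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 8.0, 5.0   # mean and standard deviation from the example
x = 8.6

z = (x - mu) / sigma       # convert X to Z
p = NormalDist().cdf(z)    # cumulative area to the left of Z

print(round(z, 2), round(p, 4))
```

Calling `NormalDist(mu, sigma).cdf(x)` directly gives the same probability; the two-step form is shown only because it mirrors the table-based procedure.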
23
Upper Tail Probabilities
Suppose X is normal with mean 8.0 and standard deviation 5.0.
Now find P(X > 8.6).
[Figure: normal curve on the X scale with mean 8.0; the area to the right of 8.6 is shaded]
24
Upper Tail Probabilities (continued)
Now find P(X > 8.6):
P(X > 8.6) = P(Z > 0.12) = 1.0 - P(Z ≤ 0.12) = 1.0 - .5478 = .4522
[Figure: standard normal curve; the area to the left of Z = 0.12 is .5478, so the area to the right is 1.0 - .5478 = .4522]
25
Probability Between Two Values
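Suppose X is normal with mean 8.0 and standard deviation 5.0. Find P(8 < X < 8.6).
Calculate Z-values:
$$Z = \frac{X - \mu}{\sigma} = \frac{8 - 8}{5} = 0 \qquad Z = \frac{X - \mu}{\sigma} = \frac{8.6 - 8}{5} = 0.12$$
P(8 < X < 8.6) = P(0 < Z < 0.12)
[Figure: the X scale marks 8 and 8.6; the Z scale marks 0 and 0.12]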
26
Solution: Finding P(0 < Z < 0.12)
Standardized Normal Probability Table (Portion)
Z     .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255
P(8 < X < 8.6) = P(0 < Z < 0.12) = P(Z < 0.12) - P(Z ≤ 0) = .5478 - .5000 = .0478
[Figure: standard normal curve; the area to the left of 0 is .5000 and the area between 0 and 0.12 is .0478]
27
Probabilities in the Lower Tail
Suppose X is normal with mean 8.0 and standard deviation 5.0.
Now find P(7.4 < X < 8).
[Figure: normal curve on the X scale; the area between 7.4 and 8.0 is shaded]
28
Probabilities in the Lower Tail (continued)
Now find P(7.4 < X < 8):
P(7.4 < X < 8) = P(-0.12 < Z < 0) = P(Z < 0) - P(Z ≤ -0.12) = .5000 - .4522 = .0478
[Figure: the X scale marks 7.4 and 8.0; the Z scale marks -0.12 and 0; the area to the left of -0.12 is .4522 and the area between -0.12 and 0 is .0478]
The normal distribution is symmetric, so this probability is the same as P(0 < Z < 0.12).
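The upper-tail, interval, and lower-side probabilities for X with mean 8.0 and standard deviation 5.0 can be cross-checked with a short sketch. It assumes `statistics.NormalDist` from the Python standard library rather than the printed table, so results agree with the table values only to about four decimals.

```python
from statistics import NormalDist

X = NormalDist(mu=8.0, sigma=5.0)

upper_tail = 1.0 - X.cdf(8.6)          # P(X > 8.6)
between = X.cdf(8.6) - X.cdf(8.0)      # P(8 < X < 8.6)
lower_side = X.cdf(8.0) - X.cdf(7.4)   # P(7.4 < X < 8)

print(round(upper_tail, 4), round(between, 4), round(lower_side, 4))
```

The last two values agree because the curve is symmetric about the mean.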
29
Finding the X Value for a Known Probability
Steps to find the X value for a known probability:
1. Find the Z value for the known probability from the Z table.
2. Convert to X units using the formula:
$$X = \mu + Z\sigma$$
30
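Finding the X Value for a Known Probability (continued)
Example: Suppose X is normal with mean 8.0 and standard deviation 5.0. Find the X value such that about 20% of the values of X are less than it.
[Figure: normal curve with an area of .2000 in the lower tail; the unknown X value lies below 8.0 on the X scale, and the corresponding unknown Z value lies below 0 on the Z scale]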
31
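Find the Z Value for 20% in the Lower Tail
1. Find the Z value for the known probability.
Standardized Normal Probability Table (Portion)
Z      …   .03     .04     .05
-0.9   …   .1762   .1736   .1711
-0.8   …   .2033   .2005   .1977
-0.7   …   .2327   .2296   .2266
An area of 20% in the lower tail corresponds to a Z value of -0.84 (the closest table entry is .2005).
[Figure: normal curve with an area of .2000 in the lower tail; the Z scale marks -0.84 and 0, and the X scale marks the unknown X and 8.0]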
32
Finding the X Value
2. Convert to X units using the formula:
$$X = \mu + Z\sigma = 8.0 + (-0.84)(5.0) = 3.80$$
So 20% of the values from a distribution with mean 8.0 and standard deviation 5.0 are less than 3.80.
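The reverse lookup can be sketched with an inverse cumulative distribution function. This assumes `NormalDist.inv_cdf` from the Python standard library; since it does not round Z to two decimals as the printed table does, the result differs slightly from 3.80.

```python
from statistics import NormalDist

mu, sigma = 8.0, 5.0

z = NormalDist().inv_cdf(0.20)     # Z value with 20% of the area below it
x = mu + z * sigma                 # convert back to X units

x_table = mu + (-0.84) * sigma     # same step using the rounded table value

print(round(z, 4), round(x, 2), round(x_table, 2))
```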
33
Evaluating Normality
Not all continuous random variables are normally distributed. Before using the normal model in practice, it is important to evaluate how well the normal distribution describes the data of interest.
34
Evaluating Normality (continued)
Construct charts or graphs:
For small- or moderate-sized data sets, do the stem-and-leaf display and box-and-whisker plot look symmetric?
For large data sets, does the histogram or polygon appear bell-shaped?
Compute descriptive summary measures:
Do the mean, median, and mode have similar values?
Is the interquartile range approximately 1.33σ?
Is the range approximately 6σ?
35
Evaluating Normality (continued)
Observe the distribution of the data set:
Do approximately 2/3 of the observations lie within mean ± 1 standard deviation?
Do approximately 80% of the observations lie within mean ± 1.28 standard deviations?
Do approximately 95% of the observations lie within mean ± 2 standard deviations?
Evaluate the normal probability plot:
Is the normal probability plot approximately linear with positive slope?
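Some of these numeric checks can be sketched in code. The example below is only an illustration under stated assumptions: the data are simulated (5,000 hypothetical values drawn with a fixed seed), and the helper name `normality_summary` is invented here. Graphical checks such as the normal probability plot are not covered.

```python
from statistics import NormalDist, fmean, quantiles, stdev

def normality_summary(data):
    """Numeric checks; values near 2/3, 0.80, 0.95 and 1.33 are consistent with normality."""
    mean, sd = fmean(data), stdev(data)

    def share_within(k):
        # Fraction of observations within k standard deviations of the mean.
        return sum(abs(v - mean) <= k * sd for v in data) / len(data)

    q1, _, q3 = quantiles(data, n=4)   # quartile cut points
    return {
        "within_1_sd": share_within(1),
        "within_1.28_sd": share_within(1.28),
        "within_2_sd": share_within(2),
        "iqr_over_sd": (q3 - q1) / sd,
    }

# Hypothetical data set: simulated normal values, not data from the slides.
sample = NormalDist(mu=8.0, sigma=5.0).samples(5000, seed=1)
summary = normality_summary(sample)
for name, value in summary.items():
    print(name, round(value, 3))
```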
36
The Uniform Distribution
The uniform distribution is a probability distribution that has equal probabilities for all possible outcomes of the random variable
Also called a rectangular distribution
37
The Uniform Distribution (continued)
The continuous uniform distribution:
$$f(X) = \begin{cases} \dfrac{1}{b - a} & \text{if } a \le X \le b \\ 0 & \text{otherwise} \end{cases}$$
where
f(X) = value of the density function at any X value
a = minimum value of X
b = maximum value of X
38
Properties of the Uniform Distribution
The mean of a uniform distribution is
$$\mu = \frac{a + b}{2}$$
The standard deviation is
$$\sigma = \sqrt{\frac{(b - a)^{2}}{12}}$$
39
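Uniform Distribution Example
Example: Uniform probability distribution over the range 2 ≤ X ≤ 6:
$$f(X) = \frac{1}{6 - 2} = .25 \quad \text{for } 2 \le X \le 6$$
$$\mu = \frac{a + b}{2} = \frac{2 + 6}{2} = 4$$
$$\sigma = \sqrt{\frac{(b - a)^{2}}{12}} = \sqrt{\frac{(6 - 2)^{2}}{12}} = 1.1547$$
[Figure: f(X) against X; a rectangle of height .25 between X = 2 and X = 6]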
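The uniform density, mean, and standard deviation can be computed with a few small functions. This is a minimal sketch; the function names are assumptions chosen for illustration.

```python
import math

def uniform_pdf(x, a, b):
    """Density of the continuous uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_mean(a, b):
    return (a + b) / 2

def uniform_sd(a, b):
    return math.sqrt((b - a) ** 2 / 12)

# Values from the example: uniform over the range 2 to 6.
print(uniform_pdf(3, 2, 6), uniform_mean(2, 6), round(uniform_sd(2, 6), 4))
```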
40
The Exponential Distribution
Used to model the length of time between two occurrences of an event (the time between arrivals)
Examples: time between trucks arriving at a dock; time between customer transactions at an ATM; time between phone calls reaching the operators
41
The Exponential Distribution (continued)
Defined by a single parameter, λ (lambda), the mean number of arrivals per unit of time.
The probability that an arrival time is less than some specified time X is
$$P(\text{arrival time} < X) = 1 - e^{-\lambda X}$$
where
e = the mathematical constant approximated by 2.71828
λ = the population mean number of arrivals per unit
X = any value of the continuous variable where 0 < X < ∞
42
Exponential Distribution Example
Example: Customers arrive at the service counter at the rate of 15 per hour. What is the probability that the arrival time between consecutive customers is less than three minutes?
The mean number of arrivals per hour is 15, so λ = 15
Three minutes is .05 hours
$$P(\text{arrival time} < .05) = 1 - e^{-\lambda X} = 1 - e^{-(15)(.05)} = .5276$$
So there is a 52.76% probability that the arrival time between successive customers is less than three minutes
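The calculation in this example can be reproduced with a short sketch. The function name `prob_arrival_within` is an assumption for illustration; note that X must be expressed in the same time unit as λ (hours here).

```python
import math

def prob_arrival_within(x, lam):
    """P(arrival time < x) for an exponential distribution with rate lam."""
    return 1.0 - math.exp(-lam * x)

# 15 arrivals per hour; three minutes is 3/60 = 0.05 hours.
p = prob_arrival_within(3 / 60, lam=15)
print(round(p, 4))
```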
43
Sampling Distributions
Sampling Distributions of the Mean
Sampling Distributions of the Proportion
44
Sampling Distributions
A sampling distribution is a distribution of all of the possible values of a statistic for a given size sample selected from a population
45
Developing a Sampling Distribution
Assume there is a population …
Population size N = 4
Random variable, X, is the age of individuals
Values of X: 18, 20, 22, 24 (years), for individuals A, B, C, D
46
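Developing a Sampling Distribution (continued)
Summary measures for the population distribution:
$$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$$
$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^{2}}{N}} = 2.236$$
[Figure: uniform distribution; P(x) against x, with bars of equal height at x = 18 (A), 20 (B), 22 (C), and 24 (D)]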
47
Developing a Sampling Distribution (continued)
Now consider all possible samples of size n = 2.
16 possible samples (sampling with replacement):
1st Obs   2nd Observation
          18      20      22      24
18        18,18   18,20   18,22   18,24
20        20,18   20,20   20,22   20,24
22        22,18   22,20   22,22   22,24
24        24,18   24,20   24,22   24,24
16 sample means:
1st Obs   2nd Observation
          18   20   22   24
18        18   19   20   21
20        19   20   21   22
22        20   21   22   23
24        21   22   23   24
48
Developing a Sampling Distribution (continued)
Sampling Distribution of All Sample Means
Counting how often each value appears among the 16 sample means gives:
X̄      18      19     20      21    22      23     24
P(X̄)   .0625   .125   .1875   .25   .1875   .125   .0625
[Figure: sample means distribution; P(X̄) against X̄ for X̄ = 18 to 24 (no longer uniform)]
49
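Developing a Sampling Distribution (continued)
Summary measures of this sampling distribution (here N = 16 is the number of possible samples):
$$\mu_{\bar{X}} = \frac{\sum \bar{X}_i}{N} = \frac{18 + 19 + 21 + \cdots + 24}{16} = 21$$
$$\sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X}_i - \mu_{\bar{X}})^{2}}{N}} = \sqrt{\frac{(18 - 21)^{2} + (19 - 21)^{2} + \cdots + (24 - 21)^{2}}{16}} = 1.58$$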
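The sampling distribution for this small population can be built by brute-force enumeration. The sketch below assumes sampling with replacement, as in the slides, and uses only the Python standard library; the variable names are illustrative.

```python
from itertools import product
from math import sqrt
from statistics import fmean, pstdev

population = [18, 20, 22, 24]   # ages of individuals A, B, C, D
n = 2                           # sample size

# All ordered samples of size n drawn with replacement (4**2 = 16 samples).
samples = list(product(population, repeat=n))
sample_means = [fmean(s) for s in samples]

mu = fmean(population)             # population mean
sigma = pstdev(population)         # population standard deviation
mu_xbar = fmean(sample_means)      # mean of the sample means
sigma_xbar = pstdev(sample_means)  # standard deviation of the sample means

print(len(samples), mu, round(sigma, 3), mu_xbar, round(sigma_xbar, 2))
print(round(sigma / sqrt(n), 2))   # sigma / sqrt(n) for comparison
```

The mean of the sample means equals the population mean, and their standard deviation equals σ/√n.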
50
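Comparing the Population with its Sampling Distribution
Population (N = 4): μ = 21, σ = 2.236
Sample means distribution (n = 2): μ_X̄ = 21, σ_X̄ = 1.58
[Figure: left panel, population distribution P(X) against X for X = 18, 20, 22, 24 (A, B, C, D); right panel, sample means distribution P(X̄) against X̄ for X̄ = 18 to 24]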
51
Sampling Distributions of the Mean
52
Standard Error of the Mean
Different samples of the same size from the same population will yield different sample means.
A measure of the variability in the mean from sample to sample is given by the standard error of the mean:
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
Note that the standard error of the mean decreases as the sample size increases.
53
If the Population is Normal
If a population is normal with mean μ and standard deviation σ, the sampling distribution of X̄ is also normally distributed with
$$\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
(This assumes that sampling is with replacement or sampling is without replacement from an infinite population.)
54
Z-value for Sampling Distribution of the Mean
Z-value for the sampling distribution of X̄:
$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
where:
X̄ = sample mean
μ = population mean
σ = population standard deviation
n = sample size
55
Finite Population Correction
Apply the finite population correction if:
the sample is large relative to the population (n is greater than 5% of N), and
sampling is without replacement.
Then
$$Z = \frac{\bar{X} - \mu}{\dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}}$$
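The standard error and the finite population correction can be combined in one helper. This is a sketch under stated assumptions: the function names are hypothetical, passing `N` is taken to mean that sampling is without replacement from a population of that size, and the numbers in the usage lines are invented for illustration.

```python
from math import sqrt

def standard_error(sigma, n, N=None):
    """Standard error of the mean; applies the correction when n exceeds 5% of N."""
    se = sigma / sqrt(n)
    if N is not None and n > 0.05 * N:
        se *= sqrt((N - n) / (N - 1))   # finite population correction factor
    return se

def z_for_sample_mean(xbar, mu, sigma, n, N=None):
    """Z value for a sample mean xbar."""
    return (xbar - mu) / standard_error(sigma, n, N)

# Hypothetical numbers: sigma = 3 and n = 36, with and without a small population.
print(standard_error(3, 36))                   # no correction
print(round(standard_error(3, 36, N=100), 4))  # n is 36% of N, so corrected
```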
56
Sampling Distribution Properties
$$\mu_{\bar{X}} = \mu$$
(i.e., X̄ is unbiased)
[Figure: a normal population distribution centered at μ, shown above a normal sampling distribution of X̄ centered at μ_X̄, which has the same mean]
57
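Sampling Distribution Properties (continued)
For sampling with replacement:
As n increases, σ_X̄ decreases.
[Figure: sampling distributions of X̄ centered at μ; the curve for the larger sample size is narrower than the curve for the smaller sample size]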
58
If the Population is not Normal
We can apply the Central Limit Theorem:
Even if the population is not normal, sample means from the population will be approximately normal as long as the sample size is large enough.
Properties of the sampling distribution:
$$\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
59
Central Limit Theorem
As the sample size gets large enough (n↑), the sampling distribution of X̄ becomes almost normal regardless of the shape of the population.
60
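If the Population is not Normal (continued)
Sampling distribution properties:
Central tendency: $$\mu_{\bar{X}} = \mu$$
Variation: $$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$ (sampling with replacement)
[Figure: a non-normal population distribution with mean μ, shown above sampling distributions of X̄ centered at μ_X̄ that become normal as n increases; curves are labeled "smaller sample size" and "larger sample size"]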
61
How Large is Large Enough?
For most distributions, n > 30 will give a sampling distribution that is nearly normal
For fairly symmetric distributions, n > 15
For normal population distributions, the sampling distribution of the mean is always normally distributed
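A small simulation can illustrate the central limit theorem. Everything in this sketch is an assumption made for illustration: the population is exponential with rate 1 (so μ = 1 and σ = 1), the sample size is n = 30, and 2,000 samples are drawn with a fixed seed.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 30                   # size of each sample
num_samples = 2000       # number of simulated samples

def sample_mean(size):
    """Mean of one sample drawn from a skewed (exponential, rate 1) population."""
    return fmean(rng.expovariate(1.0) for _ in range(size))

means = [sample_mean(n) for _ in range(num_samples)]

print(round(fmean(means), 3))    # should be close to mu = 1
print(round(pstdev(means), 3))   # should be close to sigma / sqrt(n)
print(round(1 / sqrt(n), 3))
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the population itself is strongly skewed.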
62
Example
Suppose a population has mean μ = 8 and standard deviation σ = 3. Suppose a random sample of size n = 36 is selected.
What is the probability that the sample mean is between 7.8 and 8.2?
63
Example (continued)
Solution:
Even if the population is not normally distributed, the central limit theorem can be used (n > 30)
… so the sampling distribution of X̄ is approximately normal
… with mean μ_X̄ = 8
… and standard deviation
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{36}} = 0.5$$
64
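Example (continued)
Solution (continued):
$$P(7.8 < \bar{X} < 8.2) = P\left(\frac{7.8 - 8}{3/\sqrt{36}} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{8.2 - 8}{3/\sqrt{36}}\right) = P(-0.4 < Z < 0.4) = 0.3108$$
With a standard error of 0.5, the limits 7.8 and 8.2 lie 0.2/0.5 = 0.4 standard errors from the mean, so the table gives .1554 + .1554 = .3108.
[Figure: a population distribution of unknown shape (μ = 8) is sampled to give the sampling distribution of X̄ (μ_X̄ = 8, limits 7.8 and 8.2), which is standardized to the standard normal distribution (μ_Z = 0, limits -0.4 and 0.4)]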
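The probability for the sample mean can be computed directly as a check. The sketch assumes the sampling distribution of the mean is approximately normal (n = 36) and uses `statistics.NormalDist` from the Python standard library in place of the printed table.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8.0, 3.0, 36

se = sigma / sqrt(n)        # standard error of the mean: 3 / 6 = 0.5
z_low = (7.8 - mu) / se     # lower limit in Z units
z_high = (8.2 - mu) / se    # upper limit in Z units

std_normal = NormalDist()
p = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(round(z_low, 2), round(z_high, 2), round(p, 4))
```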
65
Sampling Distributions of the Proportion
66
Population Proportions, p
p = the proportion of the population having some characteristic
The sample proportion (ps) provides an estimate of p:
$$p_s = \frac{X}{n} = \frac{\text{number of items in the sample having the characteristic of interest}}{\text{sample size}}$$
0 ≤ ps ≤ 1
ps has a binomial distribution (assuming sampling with replacement from a finite population or without replacement from an infinite population)
67
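Sampling Distribution of p
Approximated by a normal distribution if:
$$np \ge 5 \quad \text{and} \quad n(1 - p) \ge 5$$
where
$$\mu_{p_s} = p \quad \text{and} \quad \sigma_{p_s} = \sqrt{\frac{p(1 - p)}{n}}$$
(where p = population proportion)
[Figure: sampling distribution; P(ps) against ps for ps from 0 to 1]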
68
Z-Value for Proportions
Standardize ps to a Z value with the formula:
$$Z = \frac{p_s - p}{\sigma_{p_s}} = \frac{p_s - p}{\sqrt{\dfrac{p(1 - p)}{n}}}$$
If sampling is without replacement and n is greater than 5% of the population size, then σ_ps must use the finite population correction factor:
$$\sigma_{p_s} = \sqrt{\frac{p(1 - p)}{n}} \sqrt{\frac{N - n}{N - 1}}$$
69
Example
If the true proportion of voters who support
Proposition A is p = .4, what is the probability
that a sample of size 200 yields a sample
proportion between .40 and .45?
i.e.: if p = .4 and n = 200, what is
P(.40 ≤ ps ≤ .45) ?
70
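Example (continued)
If p = .4 and n = 200, what is P(.40 ≤ ps ≤ .45)?
Find σ_ps:
$$\sigma_{p_s} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{.4(1 - .4)}{200}} = .03464$$
Convert to standard normal:
$$P(.40 \le p_s \le .45) = P\left(\frac{.40 - .40}{.03464} \le Z \le \frac{.45 - .40}{.03464}\right) = P(0 \le Z \le 1.44)$$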
71
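Example (continued)
If p = .4 and n = 200, what is P(.40 ≤ ps ≤ .45)?
Use the standard normal table: P(0 ≤ Z ≤ 1.44) = .4251
[Figure: the sampling distribution of ps, with the area between .40 and .45 shaded, is standardized to the standard normal distribution, with the area between Z = 0 and Z = 1.44 shaded (.4251)]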
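The proportion example can be checked numerically. The sketch below assumes the normal approximation is appropriate (it verifies the np ≥ 5 and n(1 − p) ≥ 5 conditions) and uses `statistics.NormalDist` rather than the printed table; because Z is not rounded to 1.44 here, the result differs from .4251 in the fourth decimal.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.40, 200

# Conditions for using the normal approximation.
approximation_ok = n * p >= 5 and n * (1 - p) >= 5

sigma_ps = sqrt(p * (1 - p) / n)     # standard error of the sample proportion
z_low = (0.40 - p) / sigma_ps
z_high = (0.45 - p) / sigma_ps

std_normal = NormalDist()
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(approximation_ok, round(sigma_ps, 5), round(z_high, 2), round(prob, 4))
```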