Statistics for Managers Using Microsoft Excel, 3rd Edition, Chapter 5: The Normal Distribution and Sampling Distributions
Dec 14, 2015
Chapter Topics
The normal distribution
The standardized normal distribution
Evaluating the normality assumption
The exponential distribution
Chapter Topics (continued)
Introduction to sampling distribution
Sampling distribution of the mean
Sampling distribution of the proportion
Sampling from finite population
Continuous Probability Distributions
Continuous random variable
  Values from interval of numbers
  Absence of gaps
Continuous probability distribution
  Distribution of continuous random variable
Most important continuous probability distribution
  The normal distribution
The Normal Distribution
“Bell shaped”
Symmetrical
Mean, median and mode are equal
Interquartile range equals 1.33$\sigma$
Random variable has infinite range
[Figure: bell-shaped density curve, f(X) against X, with the mean, median and mode at the center]
The Mathematical Model
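\[ f(X) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{X-\mu}{\sigma}\right)^2} \]
$f(X)$: density of random variable $X$
$\pi \approx 3.14159$; $e \approx 2.71828$
$\mu$: population mean
$\sigma$: population standard deviation
$X$: value of random variable $(-\infty < X < \infty)$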
Expectation
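\[
\begin{aligned}
E(X) &= \int_{-\infty}^{\infty} x\, \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2/(2\sigma^2)}\, dx \\
&= \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} (x-\mu)\, e^{-(x-\mu)^2/(2\sigma^2)}\, d(x-\mu) + \mu\, \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} e^{-(x-\mu)^2/(2\sigma^2)}\, dx \\
&= 0 + \mu = \mu
\end{aligned}
\]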
Variance
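\[
\begin{aligned}
E(X-\mu)^2 &= \int_{-\infty}^{\infty} (x-\mu)^2\, \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2/(2\sigma^2)}\, dx \\
&= \sigma^2\, \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} y^2\, e^{-y^2/2}\, dy \qquad \left(y = \frac{x-\mu}{\sigma}\right) \\
&= \sigma^2
\end{aligned}
\]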
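The two derivations above can be checked numerically. The sketch below integrates the density with a simple midpoint rule; the values $\mu = 5$ and $\sigma = 10$ and the helper names `normal_pdf` and `moments` are illustrative assumptions, not part of the slides.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal density f(x) with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2 * math.pi) * sigma)

def moments(mu, sigma, steps=20_000, width=10.0):
    """Midpoint-rule estimates of the total area, E(X) and E(X - mu)^2,
    integrating over mu - width*sigma .. mu + width*sigma."""
    lo = mu - width * sigma
    dx = 2 * width * sigma / steps
    area = mean = var = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx          # midpoint of the i-th slice
        w = normal_pdf(x, mu, sigma) * dx
        area += w
        mean += x * w
        var += (x - mu) ** 2 * w
    return area, mean, var

# Illustrative parameters (assumed for this check)
area, mean, var = moments(mu=5.0, sigma=10.0)
print(area, mean, var)
```

The three printed numbers should be close to 1, $\mu$ and $\sigma^2$ respectively.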
Many Normal Distributions
By varying the parameters $\mu$ and $\sigma$, we obtain different normal distributions
There are an infinite number of normal distributions
Finding Probabilities
Probability is the area under the curve!
$P(c \le X \le d) = ?$
[Figure: density curve f(X) with the area between X = c and X = d shaded]
Which Table to Use?
An infinite number of normal distributions means an infinite number of tables to look up!
Solution: The Cumulative Standardized Normal Distribution
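Cumulative Standardized Normal Distribution Table (Portion)
Z      .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255
Table entries are probabilities
Only One Table is Needed
$\mu_Z = 0$, $\sigma_Z = 1$
For Z = 0.12 (row 0.1, column .02) the cumulative probability is .5478
[Figure: standardized normal curve with the area to the left of Z = 0.12 shaded; shaded area exaggerated]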
Standardizing Example
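\[ Z = \frac{X - \mu}{\sigma} = \frac{6.2 - 5}{10} = 0.12 \]
Normal Distribution: $\mu = 5$, $\sigma = 10$, $X = 6.2$
Standardized Normal Distribution: $\mu_Z = 0$, $\sigma_Z = 1$, $Z = 0.12$
[Figure: the normal curve and the standardized normal curve side by side; shaded area exaggerated]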
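The standardizing step and the table lookup can be reproduced in a few lines. This is a minimal sketch that uses Python's standard-library `statistics.NormalDist` in place of the printed table; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 5.0, 10.0      # parameters of X from the example
x = 6.2

z = (x - mu) / sigma       # standardize: Z = (X - mu) / sigma
cum_prob = NormalDist().cdf(z)   # cumulative standardized normal probability

print(round(z, 2), round(cum_prob, 4))
```

Rounded to four places, the cumulative probability should agree with the table entry for Z = 0.12.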
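Example: $P(2.9 \le X \le 7.1) = .1664$
\[ Z = \frac{X - \mu}{\sigma} = \frac{2.9 - 5}{10} = -.21 \qquad Z = \frac{X - \mu}{\sigma} = \frac{7.1 - 5}{10} = .21 \]
Normal Distribution: $\mu = 5$, $\sigma = 10$, from $X = 2.9$ to $X = 7.1$
Standardized Normal Distribution: $\mu_Z = 0$, $\sigma_Z = 1$, from $Z = -0.21$ to $Z = 0.21$
Area on each side of the mean: .0832, so the total is .0832 + .0832 = .1664
[Figure: the normal curve and the standardized normal curve side by side; shaded area exaggerated]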
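Example: $P(2.9 \le X \le 7.1) = .1664$ (continued)
Cumulative Standardized Normal Distribution Table (Portion)
Z      .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255
$\mu_Z = 0$, $\sigma_Z = 1$
For Z = 0.21 (row 0.2, column .01) the cumulative probability is .5832
[Figure: standardized normal curve with the area to the left of Z = 0.21 shaded; shaded area exaggerated]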
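Example: $P(2.9 \le X \le 7.1) = .1664$ (continued)
Cumulative Standardized Normal Distribution Table (Portion)
Z       .00     .01     .02
-0.3   .3821   .3783   .3745
-0.2   .4207   .4168   .4129
-0.1   .4602   .4562   .4522
0.0    .5000   .4960   .4920
$\mu_Z = 0$, $\sigma_Z = 1$
For Z = -0.21 (row -0.2, column .01) the cumulative probability is .4168
$P(2.9 \le X \le 7.1) = .5832 - .4168 = .1664$
[Figure: standardized normal curve with the area to the left of Z = -0.21 shaded; shaded area exaggerated]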
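The interval probability in this example is the difference of two cumulative probabilities. A minimal sketch, assuming the standard-library `statistics.NormalDist` stands in for the table:

```python
from statistics import NormalDist

X = NormalDist(mu=5.0, sigma=10.0)   # the normal distribution from the example

p_upper = X.cdf(7.1)    # P(X <= 7.1), table value .5832
p_lower = X.cdf(2.9)    # P(X <= 2.9), table value .4168
p_between = p_upper - p_lower

print(round(p_upper, 4), round(p_lower, 4), round(p_between, 4))
```

The unrounded difference can differ from the table-based .1664 in the fourth decimal place, because the table values are themselves rounded before subtracting.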
Normal Distribution in PHStat
PHStat | Probability & Prob. Distributions | Normal …
Example in Excel spreadsheet
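Example: $P(X \ge 8) = .3821$
\[ Z = \frac{X - \mu}{\sigma} = \frac{8 - 5}{10} = .30 \]
Normal Distribution: $\mu = 5$, $\sigma = 10$, $X = 8$
Standardized Normal Distribution: $\mu_Z = 0$, $\sigma_Z = 1$, $Z = 0.30$
Area to the right of Z = 0.30: .3821
[Figure: the normal curve and the standardized normal curve side by side; shaded area exaggerated]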
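Example: $P(X \ge 8) = .3821$ (continued)
Cumulative Standardized Normal Distribution Table (Portion)
Z      .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255
$\mu_Z = 0$, $\sigma_Z = 1$
For Z = 0.30 (row 0.3, column .00) the cumulative probability is .6179
$P(X \ge 8) = 1 - .6179 = .3821$
[Figure: standardized normal curve with the area to the left of Z = 0.30 shaded; shaded area exaggerated]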
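An upper-tail probability such as the one in this example is one minus the cumulative probability. A minimal sketch, again assuming `statistics.NormalDist` in place of the table:

```python
from statistics import NormalDist

X = NormalDist(mu=5.0, sigma=10.0)   # the normal distribution from the example

p_at_most_8 = X.cdf(8.0)             # P(X <= 8), table value .6179
p_at_least_8 = 1.0 - p_at_most_8     # upper tail P(X >= 8)

print(round(p_at_most_8, 4), round(p_at_least_8, 4))
```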
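Finding Z Values for Known Probabilities
What is Z given that the area between 0 and Z is 0.1217, i.e., a cumulative probability of .5000 + .1217 = .6217?
Cumulative Standardized Normal Distribution Table (Portion)
Z      .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255
$\mu_Z = 0$, $\sigma_Z = 1$
The entry .6217 sits in row 0.3, column .01, so Z = .31
[Figure: standardized normal curve with cumulative area .6217 shaded; shaded area exaggerated]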
Recovering X Values for Known Probabilities
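\[ X = \mu + Z\sigma = 5 + (.30)(10) = 8 \]
Normal Distribution: $\mu = 5$, $\sigma = 10$, $X = ?$
Standardized Normal Distribution: $\mu_Z = 0$, $\sigma_Z = 1$, $Z = 0.30$
Areas: .1179 between the mean and the unknown value, .3821 to its right
[Figure: the normal curve and the standardized normal curve side by side]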
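Both reverse lookups (probability to Z, then Z to X) can be sketched with the inverse cumulative distribution function. `NormalDist.inv_cdf` is a standard-library call; using it in place of scanning the printed table is an assumption of this sketch.

```python
from statistics import NormalDist

mu, sigma = 5.0, 10.0

# Probability -> Z: which Z has cumulative probability .6217?
z = NormalDist().inv_cdf(0.6217)

# Z -> X: recover X from a known Z value, X = mu + Z * sigma
x_from_z = mu + 0.30 * sigma

# Or go directly from a cumulative probability to X
x_direct = NormalDist(mu, sigma).inv_cdf(0.6179)

print(round(z, 2), x_from_z, round(x_direct, 2))
```

The direct route lands very slightly off 8 because .6179 is a rounded table entry.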
Assessing Normality
Not all continuous random variables are normally distributed
It is important to evaluate how well the data set is approximated by a normal distribution
Assessing Normality (continued)
Construct charts
  For small- or moderate-sized data sets, do the stem-and-leaf display and box-and-whisker plot look symmetric?
  For large data sets, does the histogram or polygon appear bell-shaped?
Compute descriptive summary measures
  Do the mean, median and mode have similar values?
  Is the interquartile range approximately 1.33$\sigma$?
  Is the range approximately 6$\sigma$?
Assessing Normality (continued)
Observe the distribution of the data set
  Do approximately 2/3 of the observations lie between mean $\pm$ 1 standard deviation?
  Do approximately 4/5 of the observations lie between mean $\pm$ 1.28 standard deviations?
  Do approximately 19/20 of the observations lie between mean $\pm$ 2 standard deviations?
Evaluate normal probability plot
  Do the points lie on or close to a straight line with positive slope?
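The benchmark fractions in this checklist come from the normal distribution itself. The sketch below computes the theoretical values so they can be compared with the rules of thumb; the helper name `within` is an illustrative assumption.

```python
from statistics import NormalDist

Z = NormalDist()   # standardized normal: mean 0, standard deviation 1

def within(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    return Z.cdf(k) - Z.cdf(-k)

share_1sd = within(1.0)     # compare with "approximately 2/3"
share_128 = within(1.28)    # compare with "approximately 4/5"
share_2sd = within(2.0)     # compare with "approximately 19/20"

# Interquartile range of a normal distribution, in units of sigma
iqr_in_sigmas = Z.inv_cdf(0.75) - Z.inv_cdf(0.25)

print(round(share_1sd, 4), round(share_128, 4), round(share_2sd, 4), round(iqr_in_sigmas, 3))
```

The three shares come out near 0.68, 0.80 and 0.95. The computed interquartile range is about 1.35 standard deviations, close to the 1.33 rule of thumb used on the slides.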
Assessing Normality (continued)
Normal probability plot
  Arrange data into ordered array
  Find corresponding standardized normal quantile values
  Plot the pairs of points with observed data values on the vertical axis and the standardized normal quantile values on the horizontal axis
  Evaluate the plot for evidence of linearity
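The steps above can be sketched as follows. The data set is hypothetical, and the plotting position i/(n+1) used to pick the quantiles is one common convention that is assumed here, not taken from the slides.

```python
from statistics import NormalDist

def normal_plot_points(data):
    """Return (standardized normal quantile, observed value) pairs
    for a normal probability plot."""
    ordered = sorted(data)                 # step 1: ordered array
    n = len(ordered)
    std = NormalDist()
    # step 2: quantile for the i-th smallest value, using the assumed
    # plotting position i / (n + 1)
    return [(std.inv_cdf(i / (n + 1)), x) for i, x in enumerate(ordered, start=1)]

# Hypothetical sample, for illustration only
sample = [61, 48, 75, 55, 66, 52, 70, 57, 63, 60]
points = normal_plot_points(sample)

# step 3: each pair is (horizontal Z quantile, vertical observed value)
for z, x in points:
    print(f"{z:6.2f}  {x}")
```

Plotting these pairs and judging whether they fall near a straight line is the final step; any charting tool can be used for that.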
Assessing Normality (continued)
Normal Probability Plot for Normal Distribution
Look for Straight Line!
[Figure: normal probability plot with X (30 to 90) on the vertical axis and Z (-2 to 2) on the horizontal axis]
Normal Probability Plot
[Figure: four normal probability plots, one each for Left-Skewed, Right-Skewed, Rectangular and U-Shaped distributions; each plots X (30 to 90) on the vertical axis against Z (-2 to 2) on the horizontal axis]
Exponential Distributions
\[ P(\text{arrival time} < X) = 1 - e^{-\lambda X} \]
$X$: any value of continuous random variable
$\lambda$: the population average number of arrivals per unit of time
$1/\lambda$: average time between arrivals
$e \approx 2.71828$
e.g.: Drivers arriving at a toll bridge; customers arriving at an ATM
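Exponential Distributions (continued)
Describes time or distance between events
  Used for queues
Density function
\[ f(x) = \lambda e^{-\lambda x}, \quad x \ge 0 \]
Parameters
  Mean $1/\lambda$, standard deviation $1/\lambda$
[Figure: exponential density curves f(X) against X for $\lambda = 0.5$ and $\lambda = 2.0$]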
Example
e.g.: Customers arrive at the checkout line of a supermarket at the rate of 30 per hour. What is the probability that the time between consecutive arrivals is greater than five minutes?
$\lambda = 30$, $X = 5/60$ hours
\[
\begin{aligned}
P(\text{arrival time} > X) &= 1 - P(\text{arrival time} \le X) \\
&= 1 - \left(1 - e^{-30(5/60)}\right) \\
&= .0821
\end{aligned}
\]
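The supermarket example reduces to a single exponential term. A minimal sketch, with an illustrative function name that is not from the slides:

```python
import math

def prob_arrival_time_greater(rate, x):
    """P(arrival time > x) for an exponential distribution with the given
    arrival rate; rate and x must use the same time unit."""
    return math.exp(-rate * x)

rate_per_hour = 30          # 30 customers per hour
x_hours = 5 / 60            # five minutes expressed in hours

p = prob_arrival_time_greater(rate_per_hour, x_hours)
print(round(p, 4))
```

Keeping the rate and the time in the same unit (hours here) is the step most easily missed when working the example by hand.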
Exponential Distribution in PHStat
PHStat | Probability & Prob. Distributions | Exponential
Example in Excel spreadsheet
Why Study Sampling Distributions
Sample statistics are used to estimate population parameters
  e.g.: $\bar{X} = 50$ estimates the population mean $\mu$
Problems: different samples provide different estimates
  Large samples give better estimates; large samples cost more
  How good is the estimate?
Approach to solution: theoretical basis is sampling distribution
Sampling Distribution
Theoretical probability distribution of a sample statistic
Sample statistic is a random variable
  Sample mean, sample proportion
Results from taking all possible samples of the same size
Developing Sampling Distributions
Assume there is a population …
  Population size N=4
  Random variable, X, is age of individuals
  Values of X: 18, 20, 22, 24 measured in years
[Figure: four individuals labeled A, B, C, D]
Developing Sampling Distributions (continued)
Summary Measures for the Population Distribution
\[ \mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21 \]
\[ \sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} = 2.236 \]
[Figure: Uniform Distribution, P(X) against X for A (18), B (20), C (22), D (24)]
Developing Sampling Distributions (continued)
All Possible Samples of Size n=2
16 Samples Taken with Replacement
1st Obs    2nd Observation
             18      20      22      24
18         18,18   18,20   18,22   18,24
20         20,18   20,20   20,22   20,24
22         22,18   22,20   22,22   22,24
24         24,18   24,20   24,22   24,24
16 Sample Means
1st Obs    2nd Observation
             18      20      22      24
18           18      19      20      21
20           19      20      21      22
22           20      21      22      23
24           21      22      23      24
Developing Sampling Distributions (continued)
Sampling Distribution of All Sample Means
16 Sample Means
1st Obs    2nd Observation
             18      20      22      24
18           18      19      20      21
20           19      20      21      22
22           20      21      22      23
24           21      22      23      24
[Figure: Sample Means Distribution, P($\bar{X}$) against $\bar{X}$ for the values 18 to 24]
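Developing Sampling Distributions (continued)
Summary Measures of Sampling Distribution
\[ \mu_{\bar{X}} = \frac{\sum_{i=1}^{N} \bar{X}_i}{N} = \frac{18 + 19 + 19 + \cdots + 24}{16} = 21 \]
\[ \sigma_{\bar{X}} = \sqrt{\frac{\sum_{i=1}^{N} (\bar{X}_i - \mu_{\bar{X}})^2}{N}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58 \]
Here N = 16 is the number of possible samples.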
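The enumeration of all 16 samples and the two sets of summary measures can be reproduced directly. A minimal sketch using only the standard library; the variable names are illustrative.

```python
from itertools import product
from statistics import mean, pstdev

population = [18, 20, 22, 24]          # ages of individuals A, B, C, D

# All possible samples of size n = 2, taken with replacement
samples = list(product(population, repeat=2))
sample_means = [mean(s) for s in samples]

# Summary measures of the population (divide by N, so pstdev)
mu = mean(population)
sigma = pstdev(population)

# Summary measures of the sampling distribution of the mean
mu_xbar = mean(sample_means)
sigma_xbar = pstdev(sample_means)

print(len(samples), mu, round(sigma, 3), mu_xbar, round(sigma_xbar, 2))
```

`pstdev` is used rather than `stdev` because both formulas on the slides divide by the number of values, not by one less.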
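Comparing the Population with its Sampling Distribution
Population, N = 4: $\mu = 21$, $\sigma = 2.236$
[Figure: population distribution, P(X) against X for A (18), B (20), C (22), D (24)]
Sample Means Distribution, n = 2: $\mu_{\bar{X}} = 21$, $\sigma_{\bar{X}} = 1.58$
[Figure: sample means distribution, P($\bar{X}$) against $\bar{X}$ for the values 18 to 24]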
Properties of Summary Measures
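$\mu_{\bar{X}} = \mu$, i.e., $\bar{X}$ is unbiased
The standard error (standard deviation) of the sampling distribution, $\sigma_{\bar{X}}$, is less than the standard error of other unbiased estimators
For sampling with replacement: as n increases, $\sigma_{\bar{X}}$ decreases
\[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \]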
Unbiasedness
[Figure: sampling distributions P($\bar{X}$) of an unbiased and a biased estimator]
Less Variability
[Figure: Sampling Distribution of Median and Sampling Distribution of Mean, P($\bar{X}$) against $\bar{X}$]
Effect of Large Sample
[Figure: sampling distributions P($\bar{X}$) against $\bar{X}$ for a larger sample size and a smaller sample size]
When the Population is Normal
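Central Tendency: $\mu_{\bar{X}} = \mu$
Variation: $\sigma_{\bar{X}} = \sigma / \sqrt{n}$ (Sampling with Replacement)
Population Distribution: $\mu = 50$, $\sigma = 10$
Sampling Distributions: $\mu_{\bar{X}} = 50$
  n = 4: $\sigma_{\bar{X}} = 5$
  n = 16: $\sigma_{\bar{X}} = 2.5$
[Figure: normal population distribution above the sampling distributions for n = 4 and n = 16]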
When the Population is Not Normal
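Central Tendency: $\mu_{\bar{X}} = \mu$
Variation: $\sigma_{\bar{X}} = \sigma / \sqrt{n}$ (Sampling with Replacement)
Population Distribution: $\mu = 50$, $\sigma = 10$
Sampling Distributions: $\mu_{\bar{X}} = 50$
  n = 4: $\sigma_{\bar{X}} = 5$
  n = 30: $\sigma_{\bar{X}} = 1.8$
[Figure: non-normal population distribution above the sampling distributions for n = 4 and n = 30]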
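The standard errors quoted for the different sample sizes follow from $\sigma / \sqrt{n}$. A short sketch, with an illustrative helper name:

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean for sampling with replacement."""
    return sigma / math.sqrt(n)

sigma = 10.0                       # population standard deviation from the slides
errors = {n: standard_error(sigma, n) for n in (4, 16, 30)}

for n, se in errors.items():
    print(n, round(se, 2))
```

Quadrupling the sample size from 4 to 16 halves the standard error, which is the "as n increases, the standard error decreases" property in numbers.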
Central Limit Theorem
As sample size gets large enough, the sampling distribution of $\bar{X}$ becomes almost normal regardless of the shape of the population
[Figure: sampling distribution of $\bar{X}$ approaching a bell shape]
How Large is Large Enough?
For most distributions, n>30
For fairly symmetric distributions, n>15
For normal distribution, the sampling distribution of the mean is always normally distributed
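A small simulation can illustrate the theorem. The sketch below draws repeated samples from a strongly skewed population and looks at the resulting sample means; the choice of an exponential population with mean 1, the sample size of 30, the 2,000 repetitions and the seed are all assumptions made for illustration.

```python
import math
import random
from statistics import mean, pstdev

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` samples of size n from an exponential population
    (mean 1, standard deviation 1) and return the sample means."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

n = 30
means = simulate_sample_means(n=n, reps=2000)

center = mean(means)                 # should be near the population mean, 1
spread = pstdev(means)               # should be near 1 / sqrt(n)
expected_spread = 1 / math.sqrt(n)

# Share of sample means within one standard error of the population mean;
# for a normal sampling distribution this is roughly two thirds
share = sum(abs(m - 1.0) <= expected_spread for m in means) / len(means)

print(round(center, 3), round(spread, 3), round(expected_spread, 3), round(share, 3))
```

Even though the population is far from bell-shaped, the sample means cluster around the population mean with a spread close to $\sigma / \sqrt{n}$.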
Example:
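$\mu = 8$, $\sigma = 2$, $n = 25$
$P(7.8 < \bar{X} < 8.2) = ?$
\[
\begin{aligned}
P(7.8 < \bar{X} < 8.2) &= P\left(\frac{7.8 - 8}{2/\sqrt{25}} < \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} < \frac{8.2 - 8}{2/\sqrt{25}}\right) \\
&= P(-.5 < Z < .5) = .3830
\end{aligned}
\]
Sampling Distribution: $\mu_{\bar{X}} = 8$, $\sigma_{\bar{X}} = 2/\sqrt{25} = .4$, from $\bar{X} = 7.8$ to $\bar{X} = 8.2$
Standardized Normal Distribution: $\mu_Z = 0$, $\sigma_Z = 1$, from $Z = -0.5$ to $Z = 0.5$
Area on each side of the mean: .1915
[Figure: the sampling distribution and the standardized normal curve side by side]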
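The same calculation can be sketched in code by treating the sample mean as a normal variable whose standard deviation is the standard error. `statistics.NormalDist` is used here in place of the table.

```python
import math
from statistics import NormalDist

mu, sigma, n = 8.0, 2.0, 25

se = sigma / math.sqrt(n)            # standard error of the mean
xbar = NormalDist(mu, se)            # sampling distribution of the mean

p = xbar.cdf(8.2) - xbar.cdf(7.8)    # P(7.8 < sample mean < 8.2)

print(round(se, 2), round(p, 4))
```

The unrounded result can differ from the table-based .3830 in the fourth decimal place, since the table values are rounded before subtracting.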
Population Proportions $p$
Categorical variable
  e.g.: Gender, voted for Bush, college degree
Proportion of population having a characteristic
Sample proportion provides an estimate
\[ p_S = \frac{X}{n} = \frac{\text{number of successes}}{\text{sample size}} \]
If two outcomes, X has a binomial distribution
  Possess or do not possess characteristic
Sampling Distribution of Sample Proportion
Approximated by normal distribution when $np \ge 5$ and $n(1 - p) \ge 5$
Mean: $\mu_{p_S} = p$
Standard error: $\sigma_{p_S} = \sqrt{\dfrac{p(1 - p)}{n}}$
$p$ = population proportion
[Figure: Sampling Distribution, P($p_S$) against $p_S$ from 0 to 1]
Standardizing Sampling Distribution of Proportion
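\[ Z = \frac{p_S - \mu_{p_S}}{\sigma_{p_S}} = \frac{p_S - p}{\sqrt{\dfrac{p(1 - p)}{n}}} \]
Sampling Distribution: mean $\mu_{p_S}$, standard error $\sigma_{p_S}$
Standardized Normal Distribution: $\mu_Z = 0$, $\sigma_Z = 1$
[Figure: the sampling distribution of $p_S$ and the standardized normal curve side by side]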
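Example: $n = 200$, $p = .4$, $P(p_S < .43) = ?$
\[
P(p_S < .43) = P\left(\frac{p_S - \mu_{p_S}}{\sigma_{p_S}} < \frac{.43 - .4}{\sqrt{\dfrac{.4(1 - .4)}{200}}}\right) = P(Z < .87) = .8078
\]
Sampling Distribution: mean $\mu_{p_S}$, standard error $\sigma_{p_S}$, $p_S = .43$
Standardized Normal Distribution: $\mu_Z = 0$, $\sigma_Z = 1$, $Z = .87$
[Figure: the sampling distribution of $p_S$ and the standardized normal curve side by side]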
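The proportion example can be sketched as below. Rounding Z to two decimals before the lookup mimics reading the printed table; the unrounded result is computed as well for comparison. Variable names are illustrative.

```python
import math
from statistics import NormalDist

n, p = 200, 0.4
cutoff = 0.43

# Check the rule of thumb for the normal approximation
approximation_ok = n * p >= 5 and n * (1 - p) >= 5

se = math.sqrt(p * (1 - p) / n)      # standard error of the sample proportion
z = (cutoff - p) / se                # standardized value of the cutoff

z_rounded = round(z, 2)              # two decimals, as in a table lookup
prob_table = NormalDist().cdf(z_rounded)
prob_exact = NormalDist().cdf(z)     # same calculation without rounding Z

print(approximation_ok, z_rounded, round(prob_table, 4), round(prob_exact, 4))
```

The two probabilities differ slightly because the table lookup rounds Z up to .87 before reading the cumulative area.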
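Sampling from Finite Population
Modify standard error if sample size (n) is large relative to population size (N)
  $n > .05N$ or $n/N > .05$
  Use finite population correction factor (fpc)
Standard error with FPC
\[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} \]
\[ \sigma_{p_S} = \sqrt{\frac{p(1 - p)}{n}} \sqrt{\frac{N - n}{N - 1}} \]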
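The correction can be sketched as two small functions. The function names and the numbers in the usage lines (a population of 1,000, a sample of 100, $\sigma = 10$, $p = .4$) are hypothetical and chosen only to show the effect.

```python
import math

def fpc(N, n):
    """Finite population correction factor sqrt((N - n) / (N - 1))."""
    return math.sqrt((N - n) / (N - 1))

def se_mean_fpc(sigma, n, N):
    """Standard error of the mean with the finite population correction."""
    return sigma / math.sqrt(n) * fpc(N, n)

def se_proportion_fpc(p, n, N):
    """Standard error of the sample proportion with the correction."""
    return math.sqrt(p * (1 - p) / n) * fpc(N, n)

# Hypothetical values: n / N = 0.10 exceeds .05, so the correction applies
N, n = 1000, 100
se_mean = se_mean_fpc(sigma=10.0, n=n, N=N)
se_prop = se_proportion_fpc(p=0.4, n=n, N=N)

print(round(fpc(N, n), 4), round(se_mean, 4), round(se_prop, 4))
```

Because the factor is below 1 whenever n is greater than 1, the corrected standard error is always smaller than the uncorrected one, and the difference fades as N grows relative to n.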