Sampling (probability and statistics)

Jun 04, 2018

cuami
  • 8/13/2019 Sampling( probability and statistics)

    FUNDAMENTAL SAMPLING DISTRIBUTIONS AND DATA DESCRIPTION

    20th November

    Previous lecture: Sampling, its methods and errors

    Today's lecture:

    1. Sample mean, sample variance, sample standard deviation

    2. Data display and graphical methods:

    Box and whisker plot or boxplot

    Quantile plot

    Detection of deviation from normality

    Normal quantile quantile plot

    Normal Probability Plotting

    3. Sampling Distributions

    Sampling distribution of Means

    Central limit Theorem

    Sampling Distribution of the Difference between two averages

    4. T-Distribution

    5. F-distribution

    WHAT IS SAMPLING DISTRIBUTION?

    Sampling is defined as the process of selecting a number of observations (subjects) from all the observations (subjects) in a particular group or population.

    A sampling distribution is defined as the frequency distribution of a statistic over many samples.

    When the statistic is the sample mean, it is the distribution of sample means and is called the sampling distribution of the mean.

    FEATURES OF SAMPLING DISTRIBUTION

    The 4 features of a sampling distribution include:

    1) The statistic of interest (proportion, SD, or mean)

    2) Random selection of the sample

    3) Size of the random sample (very important)

    4) The characteristics of the population being sampled.

  • 8/13/2019 Sampling( probability and statistics)

    2/13

    1. SAMPLE MEAN, SAMPLE VARIANCE, SAMPLE STANDARD DEVIATION

    The sample mean $\bar{x}$ from a group of observations is an estimate of the population mean $\mu$. Given a sample of size n, consider n independent random variables X1, X2, ..., Xn, each corresponding to one randomly selected observation. Each of these variables has the distribution of the population, with mean $\mu$ and standard deviation $\sigma$.

    By the properties of means and variances of random variables, the mean and variance of the sample mean are the following:

    $\mu_{\bar{x}} = \mu, \qquad \sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$
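The properties referred to here are linearity of expectation and the additivity of variance for independent variables. As a brief sketch (writing $\bar{X} = \frac{1}{n}\sum X_i$ for the sample mean, with $\mu$ and $\sigma$ the population mean and standard deviation), the two results can be derived as follows:

```latex
\begin{align*}
E(\bar{X}) &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
            = \frac{1}{n}\sum_{i=1}^{n} E(X_i)
            = \frac{1}{n}\, n\mu = \mu \\
\operatorname{Var}(\bar{X}) &= \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i)
            = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}
\end{align*}
```

The second line relies on the observations being independent; taking the square root gives the standard deviation $\sigma/\sqrt{n}$.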

    EXAMPLE

    Although the mean of the distribution of $\bar{x}$ is identical to the mean of the population distribution, the variance is much smaller for large sample sizes.

    For example, suppose the random variable X records a randomly selected student's score on a national test, where the population distribution for the score is normal with mean 70 and standard deviation 5 (N(70, 5)). Given a simple random sample (SRS) of 200 students, the distribution of the sample mean score has mean 70 and standard deviation 5/sqrt(200) = 5/14.14 = 0.35.
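The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the helper name `standard_error` is an illustrative choice, not something from the lecture.

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sample mean of n independent observations."""
    return sigma / math.sqrt(n)

# Test-score example: population N(70, 5), simple random sample of 200 students.
se_200 = standard_error(5, 200)
print(f"standard error for n=200: {se_200:.2f}")

# The spread of the sample mean shrinks as the sample size grows.
for n in (1, 25, 100, 200):
    print(n, round(standard_error(5, n), 3))
```

The loop shows how quickly the spread of the sample mean falls: quadrupling the sample size halves the standard error.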

    CHARACTERISTICS OF SAMPLING DISTRIBUTION

    Central Limit Theorem: When random samples of size n are taken from a population, the distribution of sample means approaches the normal distribution as n increases.

    As a rule of thumb, when the sample size is 30 or more, the sampling distribution of the mean is treated as approximately normal. Because the approximation improves as the sample size grows, the size of the random sample is a very important feature of a sampling distribution.

    CENTRAL LIMIT THEOREM

    Given:

    1. The random variable x has a distribution (which may or may not be normal) with mean $\mu$ and standard deviation $\sigma$.

    2. Samples all of the same size n are randomly selected from the population of x values.

    Conclusions:

    1. The distribution of sample means $\bar{x}$ will, as the sample size increases, approach a normal distribution.

    2. The mean of the sample means will be the population mean $\mu$.

    3. The standard deviation of the sample means will approach $\sigma/\sqrt{n}$.
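These conclusions can be illustrated with a small simulation. The sketch below assumes an exponential population with mean 1 and standard deviation 1 (an arbitrary, clearly non-normal choice that is not part of the lecture), draws many samples of size n = 40, and compares the mean and spread of the sample means with $\mu$ and $\sigma/\sqrt{n}$.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_means(n, num_samples):
    """Means of num_samples random samples of size n from an exponential population."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]

n = 40
means = sample_means(n, 5000)

mean_of_means = statistics.fmean(means)   # should be close to mu = 1
sd_of_means = statistics.stdev(means)     # should be close to sigma / sqrt(n)
expected_sd = 1.0 / math.sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"sd of sample means:   {sd_of_means:.3f} (theory: {expected_sd:.3f})")
```

A histogram of `means` would look roughly bell-shaped even though the population itself is strongly skewed.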

    PRACTICE QS.

    A1. The mean of the sampling distribution of the mean is the mean of the population from which the scores were sampled, in this case 14.

    A2. The population has a mean of 30 and a standard deviation of 6. The sample size of your sampling distribution is N=9. What is the variance of the sampling distribution of the mean? It is the population variance divided by N: 36/9 = 4.

    A3. The standard error is the standard deviation of the population divided by the square root of N. In this case, 12/4 = 3.

    A4. According to the central limit theorem, regardless of the shape of the parent population, the sampling distribution of the mean approaches a normal distribution as N increases. In this case, a sample size of 30 is sufficiently large to cause the sampling distribution of the mean to look about normal.

    A5. This problem is asking about the sampling distribution of the mean: Mean = 75, SD = 10/sqrt(25) = 10/5 = 2, Skew = about 0 because the central limit theorem states that the sampling distribution of the mean would be about normal with a large enough N.

    PRACTICAL RULES COMMONLY USED:

    1. For samples of size n larger than 30, the distribution of the sample means can be approximated reasonably well by a normal distribution. The approximation gets better as the sample size n becomes larger.

    2. If the original population is itself normally distributed, then the sample means will be normally distributed for any sample size n (not just the values of n larger than 30).

    NOTATION

    the mean of the sample means: $\mu_{\bar{x}} = \mu$

    the standard deviation of the sample means (often called the standard error of the mean): $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

    Figure 5-19: Distribution of 200 digits from Social Security Numbers (last 4 digits from 50 students)

    Figure 5-20: Distribution of 50 sample means for 50 students

    As the sample size increases, the sampling distribution of sample means approaches a normal distribution.

    EXAMPLE 1

    Given that the population of men has normally distributed weights with a mean of 172 lb and a standard deviation of 29 lb:

    a) If one man is randomly selected, find the probability that his weight is greater than 167 lb.

    b) If 12 different men are randomly selected, find the probability that their mean weight is greater than 167 lb.

    EXAMPLE 1: SOLUTION

    a) One man randomly selected (population mean 172 lb, standard deviation 29 lb):

    $z = \frac{167 - 172}{29} = -0.17$

    $P(x > 167) = P(z > -0.17) = 0.5675$

    b) 12 different men randomly selected, so the standard deviation of the sample mean is $29/\sqrt{12}$:

    $z = \frac{167 - 172}{29/\sqrt{12}} = -0.60$

    $P(\bar{x} > 167) = P(z > -0.60) = 0.7257$

    It is much easier for an individual to deviate from the mean than it is for a group of 12 to deviate from the mean.
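Both parts of the weight example can be reproduced with Python's standard library. The sketch below uses `statistics.NormalDist` (available from Python 3.8); the variable names are illustrative. The results differ slightly from the slide values because the slides round z to two decimals before using a table.

```python
import math
from statistics import NormalDist

mu, sigma = 172, 29  # population mean and standard deviation, in lb

# a) One randomly selected man: use the population distribution directly.
p_single = 1 - NormalDist(mu, sigma).cdf(167)

# b) Mean of 12 men: same centre, but spread sigma / sqrt(12).
p_mean12 = 1 - NormalDist(mu, sigma / math.sqrt(12)).cdf(167)

print(f"P(x > 167)     = {p_single:.4f}")
print(f"P(x-bar > 167) = {p_mean12:.4f}")
```

The second probability is larger because the sample mean is less variable than a single observation, so it is less likely to fall below 167 lb.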

    NON-NORMAL POPULATIONS

    What can we say about the shape of the sampling distribution of $\bar{x}$ when the population from which the sample is selected is not normal?

    [Figure: histogram "Baseball Salaries"; horizontal axis: Salary ($1,000's); vertical axis: Frequency]

    THE IMPORTANCE OF THE CENTRAL LIMIT THEOREM

    When we select simple random samples of size n, the sample means we find will vary from sample to sample. We can model the distribution of these sample means with a probability model that is

    $N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$

    HOW LARGE SHOULD N BE?

    For the purpose of applying the central limit theorem, we will consider a sample size to be large when n > 30.

    SUMMARY

    Population: mean $\mu$; stand. dev. $\sigma$; shape of population dist. is unknown; value of $\mu$ is unknown; select random sample of size n.

    Sampling distribution of $\bar{x}$: mean $\mu$; stand. dev. $\sigma/\sqrt{n}$; always true.

    By the Central Limit Theorem: the shape of the sampling distribution is approx. normal, that is

    $\bar{x} \sim N(\mu, \sigma/\sqrt{n})$

    EXAMPLE

    A random sample of n=64 observations is drawn from a population with mean $\mu$=15 and standard deviation $\sigma$=4.

    a. $E(\bar{X}) = \mu = 15$; $SD(\bar{X}) = \frac{SD(X)}{\sqrt{n}} = \frac{4}{8} = .5$

    b. The shape of the sampling distribution model for $\bar{x}$ is approx. normal (by the CLT) with mean $E(\bar{X}) = 15$ and $SD(\bar{X}) = .5$. The answer depends on the sample size since $SD(\bar{X}) = \frac{SD(X)}{\sqrt{n}}$.

    GRAPHICALLY

    Shape of population

    dist. not known

    EXAMPLE (CONT.)

    c. $\bar{x} = 15.5$; $z = \frac{\bar{x} - \mu}{SD(\bar{X})} = \frac{15.5 - 15}{.5} = \frac{.5}{.5} = 1$

    This means that $\bar{x}$=15.5 is one standard deviation above the mean $E(\bar{X}) = 15$.

    EXAMPLE 2

    The probability distribution of 6-month

    incomes of account executives has mean

    $20,000 and standard deviation $5,000.

    a) A single executive's income is $20,000. Can it be said that this executive's income exceeds 50% of all account executive incomes?

    ANSWER No. P(X

    EXAMPLE 3

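    A sample of size n=16 is drawn from a normally distributed population with mean E(x)=20 and SD(x)=8.

    $X \sim N(20, 8); \quad \bar{X} \sim N\left(20, \frac{8}{\sqrt{16}}\right) = N(20, 2)$

    a) $P(\bar{X} > 24) = P\left(\frac{\bar{X} - 20}{2} > \frac{24 - 20}{2}\right) = P(z > 2) = 1 - .9772 = .0228$

    b) $P(16 \le \bar{X} \le 24) = P\left(\frac{16 - 20}{2} \le z \le \frac{24 - 20}{2}\right) = P(-2 \le z \le 2) = .9772 - .0228 = .9544$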
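Parts a and b can be checked numerically. The sketch below builds the sampling distribution of the mean as a `statistics.NormalDist` object and reads both probabilities off its cumulative distribution function; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 20, 8, 16

# Sampling distribution of the mean: N(20, 8 / sqrt(16)) = N(20, 2).
x_bar_dist = NormalDist(mu, sigma / math.sqrt(n))

p_a = 1 - x_bar_dist.cdf(24)                  # P(X-bar > 24)
p_b = x_bar_dist.cdf(24) - x_bar_dist.cdf(16) # P(16 <= X-bar <= 24)

print(f"a) {p_a:.4f}")
print(f"b) {p_b:.4f}")
```

Because the population is given as normal, this calculation is exact for n = 16 rather than an approximation.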

    EXAMPLE 3 (CONT.)

    c. Do we need the Central Limit Theorem to

    solve part a or part b?

    NO. We are given that the population is

    normal, so the sampling distribution of the

    mean will also be normal for any sample size n.

    The CLT is not needed.

    EXAMPLE 4

    Battery life X ~ N(20, 10). Guarantee: avg. battery life in a case of 24 exceeds 16 hrs. Find the probability that a randomly selected case meets the guarantee.

    $E(\bar{x}) = 20; \quad SD(\bar{x}) = \frac{10}{\sqrt{24}} = 2.04; \quad \bar{X} \sim N(20, 2.04)$

    $P(\bar{X} \ge 16) = P\left(\frac{\bar{X} - 20}{2.04} \ge \frac{16 - 20}{2.04}\right) = P(z \ge -1.96) = 1 - .0250 = .9750$
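A Monte Carlo check is another way to see what this probability means. The sketch below simulates many cases of 24 batteries with lifetimes drawn from N(20, 10) and counts how often the case average exceeds 16 hours. The function name, seed, and number of trials are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

def case_meets_guarantee(case_size=24, mu=20.0, sigma=10.0, threshold=16.0):
    """Simulate one case of batteries and test its average life against the guarantee."""
    lives = [random.gauss(mu, sigma) for _ in range(case_size)]
    return statistics.fmean(lives) > threshold

trials = 20000
hits = sum(case_meets_guarantee() for _ in range(trials))
estimate = hits / trials

print(f"estimated P(case meets guarantee) = {estimate:.3f}")
```

The estimate should land near the theoretical value of .9750, with small run-to-run variation if the seed is changed.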

    EXAMPLE 5

    Cans of salmon are supposed to have a net weight of 6 oz. The canner says that the net weight is a random variable with mean $\mu$=6.05 oz. and stand. dev. $\sigma$=.18 oz. Suppose you take a random sample of 36 cans and calculate the sample mean weight to be 5.97 oz.

    Find the probability that the mean weight of the sample is less than or equal to 5.97 oz.

    POPULATION X: AMOUNT OF SALMON IN A CAN

    E(X) = 6.05 oz, SD(X) = .18 oz

    $\bar{X}$ sampling dist.: $E(\bar{x}) = 6.05$, $SD(\bar{x}) = .18/6 = .03$

    By the CLT, the $\bar{X}$ sampling dist. is approx. normal.

    $P(\bar{X} \le 5.97) = P(z \le [5.97 - 6.05]/.03) = P(z \le -.08/.03) = P(z \le -2.67) = .0038$

    How could you use this answer?

    Suppose you work for a consumer watchdog group. If you sampled the weights of 36 cans and obtained a sample mean $\bar{x}$ = 5.97 oz., what would you think? Since $P(\bar{x} \le 5.97) = .0038$, either

    you observed a rare event (recall: 5.97 oz. is 2.67 stand. dev. below the mean) and the mean fill E(x) is in fact 6.05 oz. (the value claimed by the canner), or

    the true mean fill is less than 6.05 oz. (the canner is lying).

    EXAMPLE 6

    X: weekly income. E(x) = 600, SD(x) = 100

    n = 25; $\bar{X}$ sampling dist.: $E(\bar{x}) = 600$, $SD(\bar{x}) = 100/5 = 20$

    $P(\bar{X} \le 550) = P(z \le [550 - 600]/20) = P(z \le -50/20) = P(z \le -2.50) = .0062$

    Suspicious of claim that average is $600; evidence is that average income is less.
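The salmon and weekly-income examples follow the same pattern: standardize the observed sample mean using $\sigma/\sqrt{n}$ and take the left-tail probability. A minimal sketch of that pattern is below; the helper name `prob_sample_mean_at_most` is hypothetical, and the normal model for the sample mean is the CLT approximation used in the examples.

```python
import math
from statistics import NormalDist

def prob_sample_mean_at_most(x_bar, mu, sigma, n):
    """Left-tail probability P(X-bar <= x_bar) under the normal model N(mu, sigma / sqrt(n))."""
    se = sigma / math.sqrt(n)  # standard error of the mean
    return NormalDist(mu, se).cdf(x_bar)

# Salmon cans: claimed mean 6.05 oz, sd .18 oz, n = 36, observed mean 5.97 oz.
p_salmon = prob_sample_mean_at_most(5.97, 6.05, 0.18, 36)

# Weekly income: claimed mean 600, sd 100, n = 25, observed mean 550.
p_income = prob_sample_mean_at_most(550, 600, 100, 25)

print(f"salmon: {p_salmon:.4f}")
print(f"income: {p_income:.4f}")
```

In both cases the probability is well under 1%, which is why the observed sample mean casts doubt on the claimed population mean.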

    EXAMPLE 7

    12% of students at NCSU are left-handed. What is the probability that in a sample of 50 students, the sample proportion that are left-handed is less than 11%?

    $E(\hat{p}) = p = .12; \quad SD(\hat{p}) = \sqrt{\frac{.12 \times .88}{50}} = .046$

    By the CLT, $\hat{p} \sim N(.12, .046)$

    $P(\hat{p} < .11) = P\left(\frac{\hat{p} - .12}{.046} < \frac{.11 - .12}{.046}\right) = P(z < -.22) = .4129$
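The same normal model can be evaluated in code for the sample proportion. This sketch computes the standard deviation of the proportion as $\sqrt{p(1-p)/n}$ and then the left-tail probability. Variable names are illustrative, and the result differs slightly from .4129 because the slide rounds z to -.22 before using a table.

```python
import math
from statistics import NormalDist

p, n = 0.12, 50  # population proportion and sample size

# Standard deviation of the sample proportion.
se_p = math.sqrt(p * (1 - p) / n)

# Normal approximation to P(p-hat < .11).
prob = NormalDist(p, se_p).cdf(0.11)

print(f"SD(p-hat)      = {se_p:.3f}")
print(f"P(p-hat < .11) = {prob:.4f}")
```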