
Biostats Notes

Nov 18, 2015

This file includes basic medical school biostats notes.
  • 1

    Biostatistics

    SGU

    July 2014

    2

    Data is Everywhere

Research Literature: "Hypothesis: Surgeon-directed institutional peer review, associated with positive physician feedback, can decrease the morbidity and mortality rates associated with carotid endarterectomy. Results: Stroke rate decreased from 3.8% (1993-1994) to 0% (1997-1998). The mortality rate decreased from 2.8% (1993-1994) to 0% (1997-1998). (Average) length of stay decreased from 4.7 days (1993-1994) to 2.6 days (1997-1998). The (average) total cost decreased from $13,344 (1993-1994) to $9,548 (1997-1998)."

    Archives of Surgery, August 2000

    Popular Press

"For the first time, an influential doctors group is recommending that some children as young as 8 be given cholesterol-fighting drugs to ward off future heart problems... With one-third of U.S. children overweight and about 17 percent obese, the new recommendations are important, said Dr. Jennifer Li, a Duke University children's heart specialist."

    cnn.com, July 8, 2008

    3

    Data provides Information

Good Data Can Be Analyzed and Summarized to Provide Useful Information

Bad Data Can Be Analyzed and Summarized to Provide Incorrect/Harmful/Non-informative Information

    4

    Steps in Research Project

Planning → Design → Data Collection → Data Analysis → Presentation → Interpretation

  • 5

    Biostatistics

    Design of Studies

Sample size
Selection of study participants
Role of randomization

Data Collection

Variability

Important patterns in data are obscured by variability. Distinguish real patterns from random variation.

    Inference

    Draw general conclusions from limited data e.g. survey

    Summarize

What summary measures will best convey the results? How do we convey uncertainty in the results?

    Interpretation

What do the results mean in terms of practice, the program, the population, etc.?

    6

    1954 Salk Polio Vaccine Trial

    School Children

Vaccinated: n = 200,745

Placebo: n = 201,229

    Polio Cases

Vaccine: 82
Placebo: 162

Reference: Meier P, "The Biggest Public Health Experiment Ever: The 1954 Field Trial of the Salk Poliomyelitis Vaccine," in Statistics: A Guide to the Unknown, 1972.

    7

    Design: Features of the Polio Trial

Comparison Group
Randomized
Placebo Controls
Double Blind

Objective: The groups should be equivalent except for the factor (vaccine) being investigated.

    Question: Could the results be due to chance?

    8

There were almost twice as many polio cases in the placebo group compared to the vaccine group.

COULD WE GET SUCH A GREAT IMBALANCE BY CHANCE?

    Polio Cases

Vaccine: 82 out of 200,745
Placebo: 162 out of 201,229

    p-value=?

Statistical methods tell us how to make these probability calculations.
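As a sketch of how such a probability calculation works, here is a two-proportion z-test on the trial counts using only the standard library. (The slides go on to name Fisher's exact and chi-square tests; this normal-approximation test is one common alternative and reaches the same conclusion.)

```python
import math

# Polio trial results from the slide above
cases_vax, n_vax = 82, 200_745
cases_pla, n_pla = 162, 201_229

p1, p2 = cases_vax / n_vax, cases_pla / n_pla
p_pool = (cases_vax + cases_pla) / (n_vax + n_pla)  # pooled proportion under H0

# Standard error of the difference in proportions under H0
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_vax + 1 / n_pla))
z = (p2 - p1) / se

# Two-sided p-value from the normal approximation
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
print(z, p_value)
```

The z-statistic comes out above 5, so an imbalance this large would essentially never arise by chance alone.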

  • 9

    Types of Data

    1 Binary (dichotomous) data

Polio: Yes/No
Cure: Yes/No
Gender: Male/Female

    2 Categorical data

Race/ethnicity (nominal: no ordering)
Country of birth (nominal: no ordering)
Degree of agreement (ordinal: ordering)

3 Continuous data (finer measurements)

Blood pressure
Weight
Height
Age

    4 Time to Event data

    Time in remission

    10

    There are Different Statistical Methods for Different Types of Data

Binary Data: To compare the number of polio cases in the 2 treatment arms of the Salk polio vaccine trial, you could use

Fisher's Exact Test
Chi-Square Test

Continuous Data: To compare blood pressure in a clinical trial evaluating 2 blood pressure lowering medications, you could use

2-sample t-Test
Wilcoxon Rank Sum (nonparametric) Test

    11

Sample Mean (X̄)

Add up the data, then divide by the sample size (n). The sample size n is the number of observations (pieces of data).

    Example: n = 5 Systolic Blood Pressures (mmHg)

    X1 = 120

    X2 = 80

    X3 = 90

    X4 = 110

    X5 = 95

X̄ = (120 + 80 + 90 + 110 + 95) / 5 = 99 mmHg
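The calculation can be checked with Python's standard library:

```python
from statistics import mean

pressures = [120, 80, 90, 110, 95]  # n = 5 systolic blood pressures (mmHg)

# Add up the data, then divide by the sample size n
x_bar = sum(pressures) / len(pressures)

print(x_bar)            # 99.0
print(mean(pressures))  # same result via the statistics module
```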

    12

Notes on Sample Mean (X̄)

    1 Formula

X̄ = (Σᵢ₌₁ⁿ Xᵢ) / n

    2 Also called sample average or arithmetic mean

3 Sensitive to extreme values: one data point could make a great change in the sample mean

4 Why is it called the sample mean? To distinguish it from the population mean

  • 13

    Population Versus Sample

    Population The entire group about which you want information

Blood pressures of all 18-year-old male college students in the U.S.

Sample: A part of the population from which we actually collect information; used to draw conclusions about the whole population

Sample of blood pressures from n = 5 18-year-old male college students in the U.S.

    14

    Population Versus Sample

The sample mean X̄ is not the population mean μ

Population: population mean μ
Sample: sample mean X̄

We don't know the population mean μ, but we would like to. We draw a sample from the population and calculate the sample mean X̄. How close is X̄ to μ? Statistical theory will tell us how close X̄ is to μ.

    15

    STATISTICAL INFERENCE IS THE PROCESS OFTRYING TO DRAW CONCLUSIONS ABOUT THE

    POPULATION FROM THE SAMPLE

    We will return to this later

    16

    Sample Median

    The median is the middle number

    80 90 95 110 120

    Median

1 Not sensitive to extreme values. If 120 became 200:

Median: no change
Mean: big change (becomes 115)
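This sensitivity is easy to verify:

```python
from statistics import mean, median

data = [80, 90, 95, 110, 120]
print(mean(data), median(data))  # 99, 95

# Change the largest value from 120 to 200
data_extreme = [80, 90, 95, 110, 200]
print(mean(data_extreme))    # jumps to 115
print(median(data_extreme))  # still 95
```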

  • 17

    Sample Median

2 If the sample size is an even number, average the two middle numbers

    80 90 95 110 120 125

    Median

(95 + 110) / 2 = 102.5 mmHg
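With an even sample size, statistics.median performs exactly this averaging of the two middle values:

```python
from statistics import median

pressures = [80, 90, 95, 110, 120, 125]  # n = 6 (even)
print(median(pressures))  # (95 + 110) / 2 = 102.5
```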

    18

    19

    Describing Variability

How Can We Describe the Spread of the Distribution?

Minimum and Maximum
Range = Max - Min
SAMPLE STANDARD DEVIATION (s or SD)


    20

    Describing Variability

The sample variance is the average of the squared deviations about the sample mean:

s² = Σᵢ₌₁ⁿ (Xᵢ - X̄)² / (n - 1)

Sample variance: s². The sample standard deviation (s or SD) is the square root of s².

Why n - 1? Stay tuned.

  • 21

    Calculating s

Example: n = 5 Systolic Blood Pressures (mmHg)

    X1 = 120

    X2 = 80

    X3 = 90

    X4 = 110

    X5 = 95

Sample Mean X̄ = 99 (mmHg)
Sample Variance s² = 255
Sample Standard Deviation (SD) s = 15.97 (mmHg)
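These values can be reproduced directly from the formula, or with the statistics module (which also divides by n - 1):

```python
from statistics import mean, stdev, variance

x = [120, 80, 90, 110, 95]
x_bar = mean(x)  # 99

# Sample variance: sum of squared deviations, divided by n - 1
s2 = sum((xi - x_bar) ** 2 for xi in x) / (len(x) - 1)

print(s2)                  # 255.0
print(variance(x))         # same
print(round(stdev(x), 2))  # 15.97
```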

    22

    Notes on s

    1 The bigger s is, the more variability there is

    2 s measures the spread about the mean

3 s can equal 0 only if there is no spread, i.e., all n observations have the same value

4 The units of s are the same as the units of the data (e.g. mmHg)

    5 Often abbreviated SD

    6 s is the best estimate of the population standard deviation

    Interpretation

Most of the population will be within about 2 standard deviations (s) of the mean X̄

For a normally (Gaussian) distributed population, "most" is about 95%

    23

More Notes about SD: Why do we divide by n - 1 instead of n?

We want to replace X̄ with μ in the formula for s²:

s² = Σ(Xᵢ - X̄)² / (n - 1)

Because we don't know μ, we use X̄.

But Σ(Xᵢ - X̄)² tends to be smaller than Σ(Xᵢ - μ)². So, to compensate, we divide by a smaller number: n - 1 instead of n.

n - 1 is called the degrees of freedom of the variance. Why?

The sum of the deviations is zero, so the last deviation can be found once we know the other n - 1. Only n - 1 of the squared deviations can vary freely.

The term degrees of freedom arises in other statistics. It is not always n - 1, but it is in this case.
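A small simulation (not from the slides) illustrates why this compensation works: dividing the sum of squared deviations by n systematically underestimates the population variance, while dividing by n - 1 is about right on average. Here samples of size n = 5 are drawn from a population with σ² = 1:

```python
import random

random.seed(0)
n, reps = 5, 20_000
avg_div_n, avg_div_n1 = 0.0, 0.0

for _ in range(reps):
    x = [random.gauss(0, 1) for _ in range(n)]  # population variance is 1
    x_bar = sum(x) / n
    ss = sum((xi - x_bar) ** 2 for xi in x)     # sum of squared deviations
    avg_div_n += ss / n / reps                  # dividing by n
    avg_div_n1 += ss / (n - 1) / reps           # dividing by n - 1

print(avg_div_n)   # about 0.8 -> biased low (expected value is (n-1)/n)
print(avg_div_n1)  # about 1.0 -> unbiased
```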

    24

  • 25

    Other Measures of Variation

Standard deviation (SD or s)
Minimum and maximum observation
Range = Max - Min

    What Happens To These as Sample Size Increases?

    . Tend to increase?

    . Tend to decrease?

    . Remain about the same?

    26

    Continuous Variables: Histograms

Means and medians do not tell the whole story:

Differences in spread (variability)
Differences in shape of the distribution

Histograms are a way of displaying the distribution of a set of data by charting the number (or percentage) of observations whose values fall within pre-defined numerical ranges.

    27

    How to Make a Histogram

Table 20: Resident Population by Age and State (2000)

State Percent | State Percent | State Percent
Alabama 13.0 | Louisiana 11.6 | Ohio 13.3
Alaska 5.7 | Maine 14.4 | Oklahoma 13.2
Arizona 13.0 | Maryland 11.3 | Oregon 12.8
Arkansas 14.0 | Massachusetts 13.5 | Pennsylvania 15.6
California 10.6 | Michigan 12.3 | Rhode Island 14.5
Colorado 9.7 | Minnesota 12.1 | South Carolina 12.1
Connecticut 13.8 | Mississippi 12.1 | South Dakota 14.3
Delaware 13.0 | Missouri 13.5 | Tennessee 12.4
Florida 17.6 | Montana 13.4 | Texas 9.9
Georgia 9.6 | Nebraska 13.6 | Utah 8.5
Hawaii 13.3 | Nevada 11.0 | Vermont 12.7
Idaho 11.3 | New Hampshire 12.0 | Virginia 11.2
Illinois 12.1 | New Jersey 13.2 | Washington 11.2
Indiana 12.4 | New Mexico 11.7 | West Virginia 15.3
Iowa 14.9 | New York 12.9 | Wisconsin 13.1
Kansas 13.3 | North Carolina 12.0 | Wyoming 11.7
Kentucky 12.5 | North Dakota 14.7 |

Source: Statistical Abstract of the United States, 2001. www.census.gov/prod/2002pubs/01statab/stat-ab01.html

28

Divide the range into (equal) intervals and count the number in each.

Count the observations in each class. Here are the counts:

Class Count | Class Count | Class Count
4.1 to 5.0: 0 | 9.1 to 10.0: 3 | 14.1 to 15.0: 5
5.1 to 6.0: 1 | 10.1 to 11.0: 2 | 15.1 to 16.0: 2
6.1 to 7.0: 0 | 11.1 to 12.0: 9 | 16.1 to 17.0: 0
7.1 to 8.0: 0 | 12.1 to 13.0: 14 | 17.1 to 18.0: 1
8.1 to 9.0: 1 | 13.1 to 14.0: 12 | 18.1 to 19.0: 0
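These counts can be reproduced from Table 20. Since a class such as "12.1 to 13.0" holds the values x with 12.0 < x ≤ 13.0, the class of each value is simply its ceiling:

```python
import math
from collections import Counter

# Percent of residents over 65, all 50 states (Table 20 above)
pct = [13.0, 11.6, 13.3, 5.7, 14.4, 13.2, 13.0, 11.3, 12.8, 14.0,
       13.5, 15.6, 10.6, 12.3, 14.5, 9.7, 12.1, 12.1, 13.8, 12.1,
       14.3, 13.0, 13.5, 12.4, 17.6, 13.4, 9.9, 9.6, 13.6, 8.5,
       13.3, 11.0, 12.7, 11.3, 12.0, 11.2, 12.1, 13.2, 11.2, 12.4,
       11.7, 15.3, 14.9, 12.9, 13.1, 13.3, 12.0, 11.7, 12.5, 14.7]

# ceil(x) identifies the class: e.g. ceil(12.4) = 13 -> class "12.1 to 13.0"
counts = Counter(math.ceil(p) for p in pct)

print(counts[13])  # 14 states in "12.1 to 13.0"
print(counts[14])  # 12 states in "13.1 to 14.0"
```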

    www.census.gov/prod/2002pubs/01statab/stat-ab01.html

  • 29

    How to Make a Simple Histogram

Divide the range of data into intervals (bins) of equal width. Count the number of observations in each class. Draw the histogram. Label the scales.

[Histogram: Number of states (y-axis, 0 to 14) by percent of residents over 65 (x-axis, 4 to 20)]

    30

    Pictures of Data: Histograms

[Three histograms of the same population of men's systolic blood pressures (mm Hg, 80 to 180), drawn with bin widths of 20 mm Hg, 5 mm Hg, and 1 mm Hg]

    31

    How many intervals (bins) should you have in a histogram?

There is no perfect answer to this. It depends on the sample size n.

Rough guideline: # intervals ≈ √n

n | Number of Intervals
10 | about 3
50 | about 7
100 | about 10
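The guideline is simply the rounded square root of n:

```python
import math

# Rough guideline: number of intervals is about sqrt(n)
for n in (10, 50, 100):
    print(n, "->", round(math.sqrt(n)), "intervals")
```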

Histogram applet at http://www.stat.sc.edu/~west/javahtml/Histogram.html

    32

    Other Types of Histograms

[Three displays of IgM concentrations (g/l) in 324 children: a Frequency Histogram, a Relative Frequency Histogram (%), and a Relative Frequency Polygon (%)]

http://www.stat.sc.edu/~west/javahtml/Histogram.html

  • 33

    Stem and Leaf Plot

    9 | 79

    10 | 1166788999

    11 | 0112333444555667777889

    12 | 00111111233445555566777777889

    13 | 0011123333456667788999

    14 | 0111224446

    15 | 003

    16 | 05
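A stem-and-leaf display like the one above splits each value into a stem (the leading digits) and a leaf (the last digit). A minimal builder for integer data, using hypothetical values rather than the data plotted above:

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (value // 10) to its sorted string of leaves (value % 10)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return {stem: "".join(str(leaf) for leaf in leaves)
            for stem, leaves in sorted(stems.items())}

plot = stem_and_leaf([97, 99, 101, 106, 110, 121, 134])
for stem, leaves in plot.items():
    print(f"{stem} | {leaves}")   # e.g. "9 | 79", "10 | 16", ...
```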

    34

    Boxplots

[Boxplot on a scale of 100 to 160, annotated with: Sample Median, 25th Percentile, 75th Percentile, Largest Non-Outlier, Smallest Non-Outlier, Outlier]

    35

    Shapes of the Distribution

[Two distributions, labeled 1988 and 1976]

Source: Silbergeld, Annual Rev. Public Health, 1997.

    Many Distributions are Not Symmetric

    36

    Shapes of the Distribution

Symmetrical and bell-shaped
Positively skewed or skewed to the right
Negatively skewed or skewed to the left
Bimodal
Reverse J-shaped
Uniform

  • 37

    Distribution Characteristics

    Mode Median Mean

    38

    Note on Shapes of Distributions

Right Skewed (positively skewed): long right tail; Mean > Median; e.g. hospital stays

Left Skewed (negatively skewed): long left tail; Mean < Median; e.g. humidity (can't get over 100%)

Symmetric: right and left sides are mirror images; the left tail looks like the right tail; Mean ≈ Median ≈ Mode

Outlier: An individual observation that falls outside the overall pattern of the graph.

    39

Mean = Balancing Point

    40

    The Histogram and the Probability Density

[Three panels over systolic blood pressure (mm Hg, 80 to 180): a histogram from a medium sample, a histogram from a large sample, and the probability density of the entire population]

  • 41

    The Probability Density

The probability density is a smooth idealized curve that shows the shape of the distribution in the population.

This is generally a theoretical distribution that we can never see; we can only estimate it from the distribution presented by a representative (random) sample from the population.

Areas in an interval under the curve represent the percent of the population in the interval.

    42

    What is the most well-known Distribution?

    The Normal (Gaussian) Distribution

[Histogram of Serum Albumin (g/l, 25 to 50) with frequency on the vertical axis, showing a bell shape]

    43

    The Normal (Gaussian) Distribution

Symmetric, bell-shaped, Mean = Median = Mode

    Applet at http://stat-www.berkeley.edu/~stark/Java/Html/StandardNormal.htm

    44

    The Normal Distribution

    There are lots of normal distributions:

You can tell which normal distribution you have by knowing the mean and standard deviation:

Mean (μ) is the center
Standard deviation (σ) measures the spread (variability)

    http://stat-www.berkeley.edu/~stark/Java/Html/StandardNormal.htm

  • 45

    The Normal Distribution

Areas under a normal curve represent the proportion of total values described by the curve that fall in that range:

[Standard normal curve over -3 to 3; the shaded area is approximately 29% of the total area under the curve]

    46

    The 68-95-99.7 Rule

    In any normal distribution, approximately:

. 68% of the observations fall within one standard deviation of the mean.

. 95% of the observations fall within two standard deviations of the mean.

. 99.7% of the observations fall within three standard deviations of the mean.

    *more precisely, 1.96

    47

    Distributions of Heights in Females Age 18-24

Approximately normal
Mean 65 inches
Standard deviation 2.5 inches

[Normal curve of heights (57.5 to 72.5 inches) with the middle 68%, 95%, and 99.7% regions shaded]

The rule says that if a population is normally distributed then approximately 68% of the population will be within 1 SD of X̄. It doesn't guarantee that exactly 68% of your sample of data will fall within 1 SD of X̄. Why? The rule works better if the sample size is big.

    48

  • 49

Standard Normal Distribution: μ = 0 and σ = 1

[Figure comparing normal curves, including μ = 0, σ = 2]

    50

Standard Normal Scores: Z-Scores

    How many standard deviations from the population mean are you?

Standard Score (Z) = (observation - population mean) / standard deviation

    A standard score of

Z = 1 means the observation lies one SD above the mean
Z = 2 means the observation lies two SDs above the mean
Z = -1 means the observation lies one SD below the mean
Z = -2 means the observation lies two SDs below the mean

    51

    Z -Scores

Example: Female Heights, mean = 65, s = 2.5 inches

    1 Height = 72.5 inches

Z = (72.5 - 65) / 2.5 = +3.0

    2 Height = 60 inches

Z = (60 - 65) / 2.5 = -2.0
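Both slide calculations in one small helper:

```python
def z_score(x, mu, sd):
    """How many standard deviations x lies from the mean mu."""
    return (x - mu) / sd

# Female heights: mean 65 inches, SD 2.5 inches
print(z_score(72.5, 65, 2.5))  # +3.0
print(z_score(60, 65, 2.5))    # -2.0
```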

    52

What's the usefulness of standard normal scores?

1 It tells you how many SDs an observation is from the mean.

2 Thus, it is a way of quickly assessing how unusual an observation is.

    Suppose the mean height is 65 inches, and s = 2.5

    Is 72.5 inches unusually tall? If we know Z = 3.0, does that help us?

  • 53

    Assuming the population has a normal distribution:

Fraction of Population that is:

Z | Within Z SDs of the mean | More than Z SDs above the mean | More than Z SDs below the mean | More than Z SDs above or below the mean
0.5 | 38.29% | 30.85% | 30.85% | 61.71%
1.0 | 68.27% | 15.87% | 15.87% | 31.73%
1.5 | 86.64% | 6.68% | 6.68% | 13.36%
2.0 | 95.45% | 2.28% | 2.28% | 4.55%
2.5 | 98.76% | 0.62% | 0.62% | 1.24%
3.0 | 99.73% | 0.13% | 0.13% | 0.27%
3.5 | 99.95% | 0.02% | 0.02% | 0.05%
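Each entry in this table follows from the normal curve. For a standard normal, the fraction within z SDs of the mean is erf(z/√2), which math.erf provides; the tail columns are the remaining area split across the two sides:

```python
import math

def frac_within(z):
    """Fraction of a normal population within z SDs of the mean."""
    return math.erf(z / math.sqrt(2))

for z in (0.5, 1.0, 2.0, 3.0):
    within = 100 * frac_within(z)
    one_tail = (100 - within) / 2  # more than z SDs above (or below)
    print(f"{z}: {within:.2f}% within, {one_tail:.2f}% per tail")
```

The same quantity 1 - erf(z/√2) is the two-sided tail area, which is exactly what the p-value tables later in these notes list (e.g. z = 1.96 gives 0.05).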

    54

Normal Probability Applet at www-stat.stanford.edu/~naras/jsm/FindProbability.html

    55

    Problems

    Suppose the population is normally distributed:

1 If you have a standard score of Z = 2, what % of the population would have scores greater than you?

2 If you have a standard score of Z = 2, what % of the population would have scores less than you?

    56

3 If you have a standard score of Z = 3, what % of the population would have scores greater than you?

4 If you have a standard score of Z = 1.5, what % of the population would have scores less than you?

    www-stat.stanford.edu/~naras/jsm/FindProbability.html

  • 57

5 Suppose we call "unusual" observations those that are either at least 2 SD above the mean or at least 2 SD below the mean. What % are unusual?

In other words, what % of the observations will have a standard score either Z > +2.0 or Z < -2.0? What % would have |Z| > 2?

6 What % of the observations would have |Z| > 1.0 (i.e., more than 1 SD away from the mean)?

    58

7 What % of the observations would have |Z| > 3.0?

8 What percent of observations would have |Z| > 1.15?

The above results will turn out to be very important later in our discussion of p-values.

    59

    Normal Distribution

z P | z P | z P | z P | z P
0.00 1.0000 | 0.30 0.7642 | 0.60 0.5485 | 0.90 0.3681 | 1.20 0.2301
0.01 0.9920 | 0.31 0.7566 | 0.61 0.5419 | 0.91 0.3628 | 1.21 0.2263
0.02 0.9840 | 0.32 0.7490 | 0.62 0.5353 | 0.92 0.3576 | 1.22 0.2225
0.03 0.9761 | 0.33 0.7414 | 0.63 0.5287 | 0.93 0.3524 | 1.23 0.2187
0.04 0.9681 | 0.34 0.7339 | 0.64 0.5222 | 0.94 0.3472 | 1.24 0.2150
0.05 0.9601 | 0.35 0.7263 | 0.65 0.5157 | 0.95 0.3421 | 1.25 0.2113
0.06 0.9522 | 0.36 0.7188 | 0.66 0.5093 | 0.96 0.3371 | 1.26 0.2077
0.07 0.9442 | 0.37 0.7114 | 0.67 0.5029 | 0.97 0.3320 | 1.27 0.2041
0.08 0.9362 | 0.38 0.7039 | 0.68 0.4965 | 0.98 0.3271 | 1.28 0.2005
0.09 0.9283 | 0.39 0.6965 | 0.69 0.4902 | 0.99 0.3222 | 1.29 0.1971
0.10 0.9203 | 0.40 0.6892 | 0.70 0.4839 | 1.00 0.3173 | 1.30 0.1936
0.11 0.9124 | 0.41 0.6818 | 0.71 0.4777 | 1.01 0.3125 | 1.31 0.1902
0.12 0.9045 | 0.42 0.6745 | 0.72 0.4715 | 1.02 0.3077 | 1.32 0.1868
0.13 0.8966 | 0.43 0.6672 | 0.73 0.4654 | 1.03 0.3030 | 1.33 0.1835
0.14 0.8887 | 0.44 0.6599 | 0.74 0.4593 | 1.04 0.2983 | 1.34 0.1802
0.15 0.8808 | 0.45 0.6527 | 0.75 0.4533 | 1.05 0.2937 | 1.35 0.1770
0.16 0.8729 | 0.46 0.6455 | 0.76 0.4473 | 1.06 0.2891 | 1.36 0.1738
0.17 0.8650 | 0.47 0.6384 | 0.77 0.4413 | 1.07 0.2846 | 1.37 0.1707
0.18 0.8572 | 0.48 0.6312 | 0.78 0.4354 | 1.08 0.2801 | 1.38 0.1676
0.19 0.8493 | 0.49 0.6241 | 0.79 0.4295 | 1.09 0.2757 | 1.39 0.1645
0.20 0.8415 | 0.50 0.6171 | 0.80 0.4237 | 1.10 0.2713 | 1.40 0.1615
0.21 0.8337 | 0.51 0.6101 | 0.81 0.4179 | 1.11 0.2670 | 1.41 0.1585
0.22 0.8259 | 0.52 0.6031 | 0.82 0.4122 | 1.12 0.2627 | 1.42 0.1556
0.23 0.8181 | 0.53 0.5961 | 0.83 0.4065 | 1.13 0.2585 | 1.43 0.1527
0.24 0.8103 | 0.54 0.5892 | 0.84 0.4009 | 1.14 0.2543 | 1.44 0.1499
0.25 0.8026 | 0.55 0.5823 | 0.85 0.3953 | 1.15 0.2501 | 1.45 0.1471
0.26 0.7949 | 0.56 0.5755 | 0.86 0.3898 | 1.16 0.2460 | 1.46 0.1443
0.27 0.7872 | 0.57 0.5687 | 0.87 0.3843 | 1.17 0.2420 | 1.47 0.1416
0.28 0.7795 | 0.58 0.5619 | 0.88 0.3789 | 1.18 0.2380 | 1.48 0.1389
0.29 0.7718 | 0.59 0.5552 | 0.89 0.3735 | 1.19 0.2340 | 1.49 0.1362

Tabulated values are the proportion of the standard Normal distribution outside the range ±z, where z is a standard Normal deviate; these are also called two-sided p-values.

    60

    Normal Distribution

z P | z P | z P | z P | z P
1.50 0.1336 | 1.80 0.0719 | 2.10 0.0357 | 2.40 0.0164 | 2.70 0.0069
1.51 0.1310 | 1.81 0.0703 | 2.11 0.0349 | 2.41 0.0160 | 2.71 0.0067
1.52 0.1285 | 1.82 0.0688 | 2.12 0.0340 | 2.42 0.0155 | 2.72 0.0065
1.53 0.1260 | 1.83 0.0672 | 2.13 0.0332 | 2.43 0.0151 | 2.73 0.0063
1.54 0.1236 | 1.84 0.0658 | 2.14 0.0324 | 2.44 0.0147 | 2.74 0.0061
1.55 0.1211 | 1.85 0.0643 | 2.15 0.0316 | 2.45 0.0143 | 2.75 0.0060
1.56 0.1188 | 1.86 0.0629 | 2.16 0.0308 | 2.46 0.0139 | 2.76 0.0058
1.57 0.1164 | 1.87 0.0615 | 2.17 0.0300 | 2.47 0.0135 | 2.77 0.0056
1.58 0.1141 | 1.88 0.0601 | 2.18 0.0293 | 2.48 0.0131 | 2.78 0.0054
1.59 0.1118 | 1.89 0.0588 | 2.19 0.0285 | 2.49 0.0128 | 2.79 0.0053
1.60 0.1096 | 1.90 0.0574 | 2.20 0.0278 | 2.50 0.0124 | 2.80 0.0051
1.61 0.1074 | 1.91 0.0561 | 2.21 0.0271 | 2.51 0.0121 | 2.81 0.0050
1.62 0.1052 | 1.92 0.0549 | 2.22 0.0264 | 2.52 0.0117 | 2.82 0.0048
1.63 0.1031 | 1.93 0.0536 | 2.23 0.0257 | 2.53 0.0114 | 2.83 0.0047
1.64 0.1010 | 1.94 0.0524 | 2.24 0.0251 | 2.54 0.0111 | 2.84 0.0045
1.65 0.0989 | 1.95 0.0512 | 2.25 0.0244 | 2.55 0.0108 | 2.85 0.0044
1.66 0.0969 | 1.96 0.0500 | 2.26 0.0238 | 2.56 0.0105 | 2.86 0.0042
1.67 0.0949 | 1.97 0.0488 | 2.27 0.0232 | 2.57 0.0102 | 2.87 0.0041
1.68 0.0930 | 1.98 0.0477 | 2.28 0.0226 | 2.58 0.0099 | 2.88 0.0040
1.69 0.0910 | 1.99 0.0466 | 2.29 0.0220 | 2.59 0.0096 | 2.89 0.0039
1.70 0.0891 | 2.00 0.0455 | 2.30 0.0214 | 2.60 0.0093 | 2.90 0.0037
1.71 0.0873 | 2.01 0.0444 | 2.31 0.0209 | 2.61 0.0091 | 2.91 0.0036
1.72 0.0854 | 2.02 0.0434 | 2.32 0.0203 | 2.62 0.0088 | 2.92 0.0035
1.73 0.0836 | 2.03 0.0424 | 2.33 0.0198 | 2.63 0.0085 | 2.93 0.0034
1.74 0.0819 | 2.04 0.0414 | 2.34 0.0193 | 2.64 0.0083 | 2.94 0.0033
1.75 0.0801 | 2.05 0.0404 | 2.35 0.0188 | 2.65 0.0080 | 2.95 0.0032
1.76 0.0784 | 2.06 0.0394 | 2.36 0.0183 | 2.66 0.0078 | 2.96 0.0031
1.77 0.0767 | 2.07 0.0385 | 2.37 0.0178 | 2.67 0.0076 | 2.97 0.0030
1.78 0.0751 | 2.08 0.0375 | 2.38 0.0173 | 2.68 0.0074 | 2.98 0.0029
1.79 0.0735 | 2.09 0.0366 | 2.39 0.0168 | 2.69 0.0071 | 2.99 0.0028

  • 61

    Is every variable normally distributed?

    Absolutely not.

Then why do we spend so much time studying the normal distribution?

    1 Some variables are normally distributed.

2 A bigger reason is the Central Limit Theorem (next lecture)

    62

    Population versus Sample

    The population of interest could be

All women between ages 30 and 40
All patients with a particular disease

The sample is a small number of individuals from the population. The sample is a subset of the population.

    63

    Population versus Sample

Sample mean X̄ versus population mean (μ)

e.g. mean blood pressure: We know the sample mean X̄ (e.g., X̄ = 99 mmHg). We don't know the population mean μ, but we would like to.

    Sample proportion versus population proportion

e.g. proportion of individuals with health insurance: We know the sample proportion (e.g. 80%). We don't know the population proportion.

Key Question: How close is the sample mean (or proportion) to the population mean (or proportion)?

    64

    Population versus Sample

A parameter: A number that describes the population. A parameter is a fixed number, but in practice we do not know its value.

Examples: population mean, population proportion

A statistic: A number that describes a sample of data. A statistic can be calculated. We often use a statistic to estimate an unknown parameter.

Examples: sample mean, sample proportion

  • 65

    Sources of Error

Errors from Biased Sampling: The study systematically favors certain outcomes

    . Voluntary response

    . Non-response

    . Convenience sampling

Solution: Random sampling

    Errors from (Random) Sampling

    . Caused by chance occurrence

. Get a bad sample because of bad luck (by "bad" I mean not representative)

    . Can be controlled by taking a larger sample

Using mathematical statistics, we can figure out how much potential error there is from random sampling (the standard error)

    66

    Some Examples of Potentially Biased Sampling

Example: Blood pressure study of women age 30-40

Volunteers: non-random; selection bias
Family members: non-random; not independent
Telephone survey, random digit dial: random or non-random sample?

Example: Clinic population, 100 consecutive patients

Random or non-random sample? Convenience samples are sometimes assumed to be random.

    67

    Example: Literary Digest poll of 1936 presidential election

Election result: 62% voted for Roosevelt
Digest prediction: 43% voted for Roosevelt

    Problem: Sampling Bias

    Selection Bias

Mail questionnaire to 10 million people. Sources: telephone books, clubs. Poor people are unlikely to have a telephone (only 25% had telephones).

    Non Response Bias

Only about 20% responded (2.4 million). Responders were different from non-responders.

    68

    Bottom Line

When a selection procedure is biased, taking a larger sample does not help.

. This just repeats the mistake on a larger scale

Non-respondents can be very different from respondents.

. When there is a high non-response rate, look out for non-response bias

  • 69

    Random Sample

When a sample is randomly selected from a population, it is called a random sample.

In a simple random sample, each individual in the population has an equal chance of being chosen for the sample.

Random sampling helps control systematic bias. But even with random sampling, there is still sampling variability or error.

    70

    Sampling Variability

If we repeatedly choose samples from the same population, a statistic will take different values in different samples.

IDEA: If the statistic does not change much when you repeat the study (you get the same answer each time), then it is fairly reliable (not a lot of variability).

    71

    Example

Estimate the proportion of persons in a population who have health insurance.

Choose a sample of size n = 1373.

    Sample 1

n = 1373, p = 1100/1373 = .8012

Is the sample proportion reliable? If we took another sample of another 1373 persons, would the answer bounce around a lot?

    72

    Sample 1

p = 1100/1373 = .8012

Sample 2

p = 1090/1373 = .7939

Sample 3: p = .8347

Sample 4: p = .7786

    and so on

  • 73

    The Sampling Distribution

[Histogram of 1000 sample proportions with health insurance (0.76 to 0.84), from samples of size n = 1373]

    74

    The spread of the sampling distribution depends on the sample size

[Two histograms of sample proportions with health insurance (0.70 to 0.90): one from samples of size n = 300 and one from samples of size n = 1000; the n = 1000 histogram is much narrower]

    75

Let's explore this...

    76

    Population distribution of Health Insurance

[Bar chart: percentage of the population without and with health insurance]

p = .80

  • 77

Let's do an experiment...

Take 500 separate random samples from this population of patients, each with n = 20 patients.

For each of the 500 samples, we will plot the health insurance status and record the sample proportion.

Ready, set, go...

    78

Sample 1

[Bar chart of the sample's health insurance status]

p = 0.9

Sample 2

[Bar chart of the sample's health insurance status]

p = 0.6

    79

So we did this 500 times... Let's look at a histogram of the 500 sample means, each based on a sample of size 20.

[Histogram of the 500 sample proportions, 0.0 to 1.0]

p = 0.8, s = 0.11
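The whole experiment is easy to re-run in code (a sketch with its own random draws, so the numbers will not exactly match the slide): draw 500 samples from a population with p = 0.8 and compare the spread of the sample proportions for n = 20 versus n = 100:

```python
import random
from statistics import mean, stdev

random.seed(1)

def sample_proportion(n, p=0.8):
    """One sample of size n from a population where a fraction p is insured."""
    return sum(random.random() < p for _ in range(n)) / n

props_20 = [sample_proportion(20) for _ in range(500)]
props_100 = [sample_proportion(100) for _ in range(500)]

print(mean(props_20), stdev(props_20))    # centered near 0.8, wide spread
print(mean(props_100), stdev(props_100))  # centered near 0.8, much narrower
```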

    80

Let's do ANOTHER experiment...

Take 500 separate random samples from this population of patients, each with n = 50 patients.

For each of the 500 samples, we will plot the health insurance status and record the sample proportion.

Ready, set, go...

  • 81

Sample 1

[Bar chart of the sample's health insurance status]

p = 0.8

Sample 2

[Bar chart of the sample's health insurance status]

p = 0.7

    82

So we did this 500 times... Let's look at a histogram of the 500 sample means, each based on a sample of size 50.

[Histogram of the 500 sample proportions, 0.0 to 1.0]

p = 0.8, s = 0.06

    83

Let's do ANOTHER experiment...

Take 500 separate random samples from this population of patients, each with n = 100 patients.

For each of the 500 samples, we will plot the health insurance status and record the sample proportion.

Ready, set, go...

    84

Sample 1

[Bar chart of the sample's health insurance status]

p = 0.76

Sample 2

[Bar chart of the sample's health insurance status]

p = 0.83

  • 85

So we did this 500 times... Let's look at a histogram of the 500 sample means, each based on a sample of size 100.

[Histogram of the 500 sample proportions, 0.0 to 1.0]

p = 0.8, s = 0.04

    86

Let's Review

[Population bar chart of health insurance status, and histograms of the sample proportions for n = 20, n = 50, and n = 100]

Population: p = .8
n = 20: p = 0.799, sp = 0.11
n = 50: p = 0.803, sp = 0.06
n = 100: p = 0.798, sp = 0.04

    87

Population distribution of blood pressures

[Density curve: percentage of men in the population by systolic blood pressure (mm Hg, 80 to 160)]

μ = 125 mm Hg, σ = 14 mm Hg

    88

Let's do an experiment...

Take 500 separate random samples from this population of men, each with n = 20 subjects.

For each of the 500 samples, we will plot a histogram of the sample BP values and record the sample mean and sample standard deviation.

Ready, set, go...

89

Sample 1: [Histogram of 20 BP values] X̄ = 125.17, s = 12.36

Sample 2: [Histogram of 20 BP values] X̄ = 124.3, s = 11.65

    90

So we did this 500 times... Let's look at a histogram of the 500 sample means, each based on a sample of size 20

[Histogram of the 500 sample means (each from n = 20): mean = 125, sX̄ = 3.07]

    91

Let's do ANOTHER experiment...

Take 500 separate random samples from this population of men, each with n = 50 subjects

For each of the 500 samples, we will:
- plot a histogram of the sample BP values
- record the sample mean and sample standard deviation

Ready, set, go...

    92

Sample 1: [Histogram of 50 BP values] X̄ = 124.98, s = 14.05

Sample 2: [Histogram of 50 BP values] X̄ = 126.72, s = 13.64

93

So we did this 500 times... Let's look at a histogram of the 500 sample means, each based on a sample of size 50

[Histogram of the 500 sample means (each from n = 50): mean = 125.01, sX̄ = 1.93]

    94

Let's do ANOTHER experiment...

Take 500 separate random samples from this population of men, each with n = 100 subjects

For each of the 500 samples, we will:
- plot a histogram of the sample BP values
- record the sample mean and sample standard deviation

Ready, set, go...

    95

Sample 1: [Histogram of 100 BP values] X̄ = 127.32, s = 14.93

Sample 2: [Histogram of 100 BP values] X̄ = 125.06, s = 13.15

    96

So we did this 500 times... Let's look at a histogram of the 500 sample means, each based on a sample of size 100

[Histogram of the 500 sample means (each from n = 100): mean = 124.93, sX̄ = 1.41]

97

Let's Review

[Histogram of the population and histograms of the 500 sample means for n = 20, 50, and 100]

Population:  μ = 125, σ = 14
n = 20:   mean of sample means = 124.997, sX̄ = 3.07
n = 50:   mean of sample means = 125.015, sX̄ = 1.93
n = 100:  mean of sample means = 124.934, sX̄ = 1.41

    98

    Population distribution of hospital length of stay

[Histogram: Length of Stay (in days) vs Percentage]

μ = 4 days, σ = 3 days

    99

Let's do an experiment...

Take 500 separate random samples from this population of hospital admissions, each with n = 16 patients

For each of the 500 samples, we will:
- plot a histogram of the sample LOS values
- record the sample mean and sample standard deviation

Ready, set, go...

    100

Sample 1: [Histogram of 16 LOS values] X̄ = 4.7, s = 2.88

Sample 2: [Histogram of 16 LOS values] X̄ = 5.01, s = 2.73

101

So we did this 500 times... Let's look at a histogram of the 500 sample means, each based on a sample of size 16

[Histogram of the 500 sample means (each from n = 16): mean = 4.08, sX̄ = 0.74]

    102

Let's do ANOTHER experiment...

Take 500 separate random samples from this population of hospital admissions, each with n = 64 patients

For each of the 500 samples, we will:
- plot a histogram of the sample LOS values
- record the sample mean and sample standard deviation

Ready, set, go...

    103

Sample 1: [Histogram of 64 LOS values] X̄ = 4.26, s = 2.72

Sample 2: [Histogram of 64 LOS values] X̄ = 4.08, s = 2.45

    104

So we did this 500 times... Let's look at a histogram of the 500 sample means, each based on a sample of size 64

[Histogram of the 500 sample means (each from n = 64): mean = 4.1, sX̄ = 0.37]

105

Let's do ANOTHER experiment...

Take 500 separate random samples from this population of hospital admissions, each with n = 256 patients

For each of the 500 samples, we will:
- plot a histogram of the sample LOS values
- record the sample mean and sample standard deviation

Ready, set, go...

    106

Sample 1: [Histogram of 256 LOS values] X̄ = 4.48, s = 3.32

Sample 2: [Histogram of 256 LOS values] X̄ = 4.29, s = 2.76

    107

So we did this 500 times... Let's look at a histogram of the 500 sample means, each based on a sample of size 256

[Histogram of the 500 sample means (each from n = 256): mean = 4.1, sX̄ = 0.19]

    108

Let's Review

[Histogram of the population and histograms of the 500 sample means for n = 16, 64, and 256]

Population:  μ = 4, σ = 3
n = 16:   mean of sample means = 4.081, sX̄ = 0.74
n = 64:   mean of sample means = 4.104, sX̄ = 0.37
n = 256:  mean of sample means = 4.1,   sX̄ = 0.19
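The length-of-stay experiment can be sketched in code as well. The slides give only the population mean (4 days) and SD (3 days), so a right-skewed gamma population with those moments is assumed here purely for illustration; the point is that the SD of the sample means tracks σ/√n even for a skewed population.

```python
# Sampling distribution of the mean for a skewed "length of stay" population.
# Assumed population (illustration only): gamma with mean 4 days, SD 3 days.
import math
import random
import statistics

random.seed(2)

def sample_mean_sd(n, reps=500):
    """SD of `reps` sample means, each computed from n simulated LOS values."""
    means = [statistics.mean(random.gammavariate(16 / 9, 9 / 4) for _ in range(n))
             for _ in range(reps)]
    return statistics.stdev(means)

for n in (16, 64, 256):
    # Empirical spread of the sample means vs the theoretical sigma/sqrt(n).
    print(n, round(sample_mean_sd(n), 2), "theory:", round(3 / math.sqrt(n), 2))
```

For n = 16, 64, 256 the theoretical values 0.75, 0.375, 0.1875 match the slides' 0.74, 0.37, 0.19.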

109

Variation in sample mean values is tied to the size of each sample, NOT the number of samples

[Box plots of sample means from 500 vs 5000 simulations, for n = 16, 64, and 256: the spread depends on n, not on the number of simulations]

    110

    The Sampling Distribution

The sampling distribution of a sample statistic refers to what the distribution of the statistic would look like if we chose a large number of samples from the same population

www.ruf.rice.edu/~lane/stat_sim/sampling_dist/index.html

    111

    Sampling Distribution of a Sample Mean

The sampling distribution of a sample mean is a theoretical probability distribution. It describes the distribution of all sample means from all possible random samples of the same size taken from a population.

    112

In real research it is impossible to estimate the sampling distribution of a sample mean by actually taking multiple random samples from the same population

no research would ever happen if a study needed to be repeated multiple times to understand this sampling behavior

Simulations are useful to illustrate a concept, but not to highlight a practical approach!

Luckily, there is some mathematical machinery that generalizes some of the patterns we saw in the simulation results


113

    Amazing Result

Mathematical statisticians have figured out how to predict what the sampling distribution will look like without actually repeating the study numerous times and having to choose a sample each time

Often, the sampling distribution will look normal

[Histogram of sample proportions with health insurance, for samples of size n = 1373: approximately normal]

114

    The Big Idea

It's not practical to keep repeating a study to evaluate sampling variability and to determine the sampling distribution. Mathematical statisticians have figured out how to calculate it without doing multiple studies.

The sampling distribution of a statistic is often normally distributed.

This mathematical result comes from the CENTRAL LIMIT THEOREM. For the theorem to work, it requires the sample size (n) to be large (usually n > 60 suffices).

Statisticians have derived formulas to calculate the standard deviation of the sampling distribution; it's called the standard error of the statistic.

    115

    Central Limit Theorem

If the sample size is large, the distribution of sample means approximates a normal distribution:

[Histograms of the mean value (1 to 6) for One Die, Two Dice, and Five Dice: as the number of dice grows, the distribution of the mean looks more and more normal]
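The dice illustration above can be run as a quick simulation. This is an illustrative sketch: it computes the mean of k fair dice many times and shows that the distribution of the mean stays centered at 3.5 while its spread shrinks (and its shape becomes more normal) as k grows.

```python
# Central Limit Theorem with dice: distribution of the mean of k dice.
import random
import statistics

random.seed(3)

def dice_means(k, reps=2000):
    """Means of k fair dice, repeated `reps` times."""
    return [statistics.mean(random.randint(1, 6) for _ in range(k))
            for _ in range(reps)]

for k in (1, 2, 5):
    m = dice_means(k)
    # Center stays near 3.5; spread shrinks roughly like 1/sqrt(k).
    print(k, round(statistics.mean(m), 2), round(statistics.stdev(m), 2))
```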

    116

    Illustration of the Central Limit Theorem

[Histograms: a skewed population, and sample means based on n = 16, n = 32, and n = 64: the distribution of the means becomes narrower and more normal as n grows]

117

    Why is the normal distribution so important in the study of statistics?

It's not because things in nature are always normally distributed (although sometimes they are)

It's because of the Central Limit Theorem: the sampling distribution of statistics (like a sample mean) often follows a normal distribution if the sample sizes are large

    118

    Why is the sampling distribution so important?

If a sampling distribution has a lot of variability (i.e. has a big standard error), then if you took another sample it's likely you would get a very different result

About 95% of the time, the sample mean (or proportion) will be within 2 standard errors of the population mean (or proportion)

This tells us how close the sample statistic should be to the population parameter

    119

    Standard Errors (SE)

- Measures the precision of your sample statistic
- A small SE means the statistic is more precise
- The SE is the standard deviation of the sampling distribution of the statistic

Mathematical statisticians have come up with formulas for the standard error. There are different formulae for:

- the standard error of the mean (SEM)
- the standard error of a proportion

These formulae always involve the sample size n. As the sample size gets bigger, the standard error gets smaller.

    120

The standard deviation IS NOT the standard error of a statistic

Standard deviation measures the variability among individual observations.

Standard error measures the precision of a statistic, such as the sample mean or proportion, that is calculated from a number (n) of different observations. The sample mean and sample proportion are trying to estimate the population mean or population proportion.

121

Standard Error of the Mean (SEM)

    This is a measure of the precision of the sample mean:

SEM = s / √n

Example

Measure systolic blood pressure on a random sample of 100 students:
Sample size n = 100
Sample mean X̄ = 123.4 mmHg
Sample SD s = 14.0 mmHg

SEM = 14 / √100 = 1.4 mmHg
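The SEM calculation from the example can be written out directly:

```python
# Standard error of the mean: SEM = s / sqrt(n).
import math

def sem(s, n):
    """Precision of the sample mean, from sample SD s and sample size n."""
    return s / math.sqrt(n)

# The blood-pressure example: n = 100 students, s = 14.0 mmHg.
print(sem(14.0, 100))  # 1.4 (mmHg)
```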

    122

    Notes on SEM

1 The smaller the SEM is, the more precise X̄ is.

2 SEM depends on n and s.

3 SEM gets smaller if:
- s gets smaller
- n gets bigger

    123

    Question:

How close to the population mean (μ) is the sample mean (X̄)?

    ANSWER

The standard error of the sample mean tells us that 95% of the time the population mean will lie within about 2 standard errors of the sample mean:

X̄ ± 2·SEM

123.4 ± 2 × 1.4

123.4 ± 2.8

    Why is this true? Because of the Central Limit Theorem

    INTERPRETATION

We are 95% confident that the sample mean is within 2.8 mmHg of the population mean. The 95% error bound is 2.8.

    124

    95% Confidence Interval for Population Mean

X̄ ± 2·SEM

(More accurately: X̄ ± 1.96·SEM)

The CI gives the range of plausible values for μ. Example: blood pressure, with n = 100, X̄ = 123.4 mmHg, s = 14. The 95% CI is

123.4 ± 2 × 1.4
123.4 ± 2.8

Ways to write a confidence interval:

- 120.6 to 126.2
- (120.6, 126.2)
- (120.6 – 126.2)

We are highly confident that the population mean falls in the range 120.6 to 126.2.
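The large-sample CI above is a two-line computation. The helper below uses the quick "2 SEM" multiplier from the slide; passing 1.96 gives the slightly more exact interval.

```python
# Large-sample confidence interval for a population mean: xbar +/- mult * SEM.
import math

def ci_mean(xbar, s, n, mult=2.0):
    """(lower, upper) CI for the population mean."""
    half = mult * s / math.sqrt(n)
    return xbar - half, xbar + half

# Blood-pressure example: n = 100, xbar = 123.4 mmHg, s = 14.
lo, hi = ci_mean(123.4, 14.0, 100)
print(round(lo, 1), round(hi, 1))  # 120.6 126.2
```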

125

    Notes on Confidence Intervals

1 Interpretation: plausible values for the population mean, with high confidence

2 Are all CIs 95%? No.

- It is the most commonly used level
- A 99% CI is wider
- A 90% CI is narrower
- To be more confident you need a bigger interval

For a 99% CI you need 2.576 SEM
For a 95% CI you need 2 SEM (actually it's 1.96 SEM)
For a 90% CI you need 1.645 SEM
Where do these come from?

    126

    Notes on Confidence Intervals

3 The length of the CI decreases when:
- n increases
- s decreases
- the level of confidence decreases (e.g. 90% or 80% vs 95%)

4 A confidence interval only accounts for random sampling error, not other systematic sources of error or bias

Examples

BP measurement is always +5 too high
Only those with high BP agree to participate (non-response bias)

    127

    Notes on Confidence Intervals

5 Technical interpretation: the CI works (includes μ) 95% of the time

6 Confidence Interval Applet
www.stat.sc.edu/~west/javahtml/ConfidenceInterval.html

    128

    Underlying Assumptions for a 95% CI for the Population Mean

X̄ ± 2·SEM

X̄ ± 2·s/√n

Random sample of the population (important!)

Sample size n of at least 60 to use 2·SEM (the Central Limit Theorem requires large n)


129

    What if the sample size is smaller than 60?

There needs to be a small correction in the formula: the multiplier in X̄ ± 2·SEM needs to be slightly bigger.

How much bigger than 2 depends on the sample size. Computers or statistical tables refer to the degrees of freedom = n − 1. One looks up the correct number in a t-table or t-distribution with n − 1 degrees of freedom. You can think of degrees of freedom like a corrected sample size. In this case it's n − 1 because we had to estimate one parameter by X̄. But it's not always n − 1.

X̄ ± t·SEM

X̄ ± t·s/√n

    130

    Value of t.95 used for 95% Confidence Interval for Mean

df   t        df   t
1    12.706   12   2.179
2    4.303    13   2.160
3    3.182    14   2.145
4    2.776    15   2.131
5    2.571    20   2.086
6    2.447    25   2.060
7    2.365    30   2.042
8    2.306    40   2.021
9    2.262    60   2.000
10   2.228    120  1.980
11   2.201    ∞    1.960

    Notes

- Most people use t = 2 once n gets above 60 or so
- Sometimes people use 1.96 when n gets bigger (> 120)
- The value of t depends on the level of confidence and the sample size

    131

Student's t-Distribution

df    0.2     0.1     0.05     0.02     0.01     0.001
1     3.078   6.314   12.706   31.821   63.657   636.619
2     1.886   2.920   4.303    6.965    9.925    31.599
3     1.638   2.353   3.182    4.541    5.841    12.924
4     1.533   2.132   2.776    3.747    4.604    8.610
5     1.476   2.015   2.571    3.365    4.032    6.869
6     1.440   1.943   2.447    3.143    3.707    5.959
7     1.415   1.895   2.365    2.998    3.499    5.408
8     1.397   1.860   2.306    2.896    3.355    5.041
9     1.383   1.833   2.262    2.821    3.250    4.781
10    1.372   1.812   2.228    2.764    3.169    4.587
11    1.363   1.796   2.201    2.718    3.106    4.437
12    1.356   1.782   2.179    2.681    3.055    4.318
13    1.350   1.771   2.160    2.650    3.012    4.221
14    1.345   1.761   2.145    2.624    2.977    4.140
15    1.341   1.753   2.131    2.602    2.947    4.073
16    1.337   1.746   2.120    2.583    2.921    4.015
17    1.333   1.740   2.110    2.567    2.898    3.965
18    1.330   1.734   2.101    2.552    2.878    3.922
19    1.328   1.729   2.093    2.539    2.861    3.883
20    1.325   1.725   2.086    2.528    2.845    3.850
21    1.323   1.721   2.080    2.518    2.831    3.819
22    1.321   1.717   2.074    2.508    2.819    3.792
23    1.319   1.714   2.069    2.500    2.807    3.768
24    1.318   1.711   2.064    2.492    2.797    3.745
25    1.316   1.708   2.060    2.485    2.787    3.725
26    1.315   1.706   2.056    2.479    2.779    3.707
27    1.314   1.703   2.052    2.473    2.771    3.690
28    1.313   1.701   2.048    2.467    2.763    3.674
29    1.311   1.699   2.045    2.462    2.756    3.659
30    1.310   1.697   2.042    2.457    2.750    3.646

Tabulated values correspond to a given two-tailed p-value for different degrees of freedom.

    132

Student's t-Distribution

df    0.2     0.1     0.05    0.02    0.01    0.001
31    1.309   1.696   2.040   2.453   2.744   3.633
32    1.309   1.694   2.037   2.449   2.738   3.622
33    1.308   1.692   2.035   2.445   2.733   3.611
34    1.307   1.691   2.032   2.441   2.728   3.601
35    1.306   1.690   2.030   2.438   2.724   3.591
36    1.306   1.688   2.028   2.434   2.719   3.582
37    1.305   1.687   2.026   2.431   2.715   3.574
38    1.304   1.686   2.024   2.429   2.712   3.566
39    1.304   1.685   2.023   2.426   2.708   3.558
40    1.303   1.684   2.021   2.423   2.704   3.551
41    1.303   1.683   2.020   2.421   2.701   3.544
42    1.302   1.682   2.018   2.418   2.698   3.538
43    1.302   1.681   2.017   2.416   2.695   3.532
44    1.301   1.680   2.015   2.414   2.692   3.526
45    1.301   1.679   2.014   2.412   2.690   3.520
46    1.300   1.679   2.013   2.410   2.687   3.515
47    1.300   1.678   2.012   2.408   2.685   3.510
48    1.299   1.677   2.011   2.407   2.682   3.505
49    1.299   1.677   2.010   2.405   2.680   3.500
50    1.299   1.676   2.009   2.403   2.678   3.496
51    1.298   1.675   2.008   2.402   2.676   3.492
52    1.298   1.675   2.007   2.400   2.674   3.488
53    1.298   1.674   2.006   2.399   2.672   3.484
54    1.297   1.674   2.005   2.397   2.670   3.480
55    1.297   1.673   2.004   2.396   2.668   3.476
56    1.297   1.673   2.003   2.395   2.667   3.473
57    1.297   1.672   2.002   2.394   2.665   3.470
58    1.296   1.672   2.002   2.392   2.663   3.466
59    1.296   1.671   2.001   2.391   2.662   3.463
60    1.296   1.671   2.000   2.390   2.660   3.460
∞     1.282   1.645   1.960   2.326   2.576   3.291

133

    t-distribution Applets

http://www.stat.sc.edu/~west/applets/tdemo.html

http://www.econtools.com/jevons/java/Graphics2D/tDist.html

    134

    Example: Blood pressure

n = 5, X̄ = 99 mmHg, s = 15.97

The 95% CI is X̄ ± 2.776 × SEM

99 ± 2.776 × 7.142

99 ± 19.83

The 95% CI for mean blood pressure is

(79.17, 118.83)

(79.17 – 118.83)

Rounding off is okay too: (79, 119)
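The small-sample example above can be checked in a few lines. The t multiplier (2.776 for df = 4) is taken from the table on the previous slides rather than computed from a t-distribution CDF.

```python
# Small-sample CI for a mean: xbar +/- t * s/sqrt(n), t from the table (df = n-1).
import math

def t_ci_mean(xbar, s, n, t):
    """(lower, upper) CI for the population mean with a t multiplier."""
    half = t * s / math.sqrt(n)
    return xbar - half, xbar + half

# Blood-pressure example: n = 5, xbar = 99, s = 15.97; t = 2.776 for df = 4.
lo, hi = t_ci_mean(99, 15.97, 5, 2.776)
print(round(lo, 2), round(hi, 2))  # 79.17 118.83
```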

    135

    Confusion between SD and SEM

Standard deviation (s): measures spread in the data

Standard error (s/√n): measures the precision of the sample mean

    The standard error of the sample mean depends on the sample size.

    Does the standard deviation depend on the sample size too?

    136

    PROPORTIONS (p)

- Proportion of individuals with health insurance
- Proportion of patients who became infected
- Proportion of patients who are cured
- Proportion of individuals who are hypertensive
- Proportion of individuals positive on a blood test
- Proportion of adverse drug reactions
- Proportion of premature infants who survive

On each individual in the study, we record a binary outcome (Yes/No; Success/Failure) rather than a continuous measurement


137

    Proportions

How accurate an estimate of the population proportion is the sample proportion?

    What is the standard error of a proportion?

    138

    Example

n = 200 patients
X = 90 adverse drug reactions

The estimated proportion who experience an adverse drug reaction is

p = 90/200 = .45, or 45%

NOTES

- There is uncertainty about this rate because it involved only n = 200 patients
- If we had studied a much larger number of patients, would we have gotten a much different answer?
- The sample proportion is p = .45
- But it is not the true rate of adverse drug reactions in the population

    139

    The Sampling Distribution of a Proportion

[Histogram: sampling distribution of the sample proportion, centered near .45]

The standard error of a sample proportion is

SE(p) = √( p(1 − p) / n )

    140

    95% CI for a Proportion

p ± 1.96·SE(p)

p ± 1.96·√( p(1 − p) / n )

p is the sample proportion
n is the sample size

Example

n = 200 patients
X = 90 adverse drug reactions

p = 90/200 = .45

.45 ± 1.96 × √(.45 × .55 / 200)
.45 ± 1.96 × 0.035
.45 ± 0.07

The 95% confidence interval is (.38, .52).
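The adverse-drug-reaction CI above, computed with the normal-approximation formula from this slide:

```python
# 95% CI for a proportion: p +/- z * sqrt(p(1-p)/n).
import math

def prop_ci(x, n, z=1.96):
    """(lower, upper) CI for a population proportion, from x successes in n."""
    p = x / n
    se = math.sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se

# n = 200 patients, 90 adverse drug reactions.
lo, hi = prop_ci(90, 200)
print(round(lo, 2), round(hi, 2))  # 0.38 0.52
```

As the next slide notes, this approximation is only trustworthy when both the success and failure counts are at least 5; otherwise exact binomial methods are needed.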

141

    Interpreting a 95% CI for a Proportion

- Plausible range of values for the population proportion
- Highly confident that the population proportion is in the interval
- The method works 95% of the time

    142

    Notes on 95% CI for Proportions

1 Random (or representative) sample
Suppose the 200 patients were sicker?
Suppose the 200 patients were consecutive?

2 The confidence interval does not address your definition of drug reaction and whether that's a good or bad definition. It accounts only for sampling variation.

3 Can also have CIs with different levels of confidence

    143

4 Sometimes 1.96·SE(p) is called the 95% Error Bound or the Margin of Error

5 The formula for a 95% CI is ONLY APPROXIMATE. It works well if the number of failures (drug reactions) and successes (non-reactions) are both at least 5. Otherwise, you need to use a computer to perform something called exact binomial calculations. You do NOT use the t-correction for small sample sizes like we did for sample means; we use exact binomial calculations.

    144

    Example

Study of survival of premature infants.

- All premature babies born at Johns Hopkins during a 3-year period (Allen, et al. NEJM, 1993)
- n = 39 infants born at 25 weeks gestation
- 31 survived 6 months

p = 31/39 = 0.79

95% CI: (.63, .91) (based on exact binomial calculations)

    Source: Motulsky, Intuitive Biostatistics

145

Are confidence intervals needed even though all infants were studied?

Are the 39 infants a sample? It seems like it's the whole population.

"It makes sense to calculate a CI when the sample is representative of a larger population about which you wish to make inferences. It is reasonable to think that these data from several years at one hospital are representative of data from other years at other hospitals, at least at big-city university hospitals in the United States."

    146

    Comparison of 2 Groups

Are the Population Means Different? (Continuous Data)

Two Situations

1 Paired Design
- Before-after data
- Twin data

2 Two Independent Sample Design
- e.g. school children assigned to Treatment A or Treatment B

    147

    Paired Design

Before → After

Why Pairing?

- Controls extraneous noise
- Everyone acts as their own control

    148

    Example: Blood pressure and Oral Contraceptive Use

Subjects: Ten non-pregnant, pre-menopausal women 16-49 years old who were beginning a regimen of oral contraceptives (OC)

Methods: Measure blood pressure prior to starting OC use, and three months after consistent OC use

Goal: Identify any changes in average blood pressure associated with OC use in such women

    Rosner, Fundamentals of Biostatistics, (2005).

149

Example: Blood Pressure and Oral Contraceptive Use

             BP Before OC   BP After OC   Difference (After - Before)
 1.          115            128            13
 2.          112            115             3
 3.          107            106            -1
 4.          119            128             9
 5.          115            122             7
 6.          138            145             7
 7.          126            132             6
 8.          105            109             4
 9.          104            102            -2
10.          115            117             2

sample mean  115.6          120.4           4.8

- The sample average of the differences is 4.8.
- The sample standard deviation (s) of the differences is s = 4.57.

    150

    Calculate a 95% CI for the Expected Change in Blood Pressure

95% CI for the population mean BP change:

X̄ ± t(.95, df = 9) × SEM

4.8 ± 2.262 × 4.57/√10
4.8 ± 2.262 × 1.445

1.53 mm Hg to 8.07 mm Hg

    Notes

1 Where does 2.262 come from? See the t-distribution with 9 degrees of freedom.

2 The BP change could be due to factors other than oral contraceptives. A control group of comparable women who were not taking oral contraceptives would strengthen this study.
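The whole CI calculation can be reproduced from the raw table. The t value 2.262 (df = 9) comes from the t-table earlier in these notes.

```python
# Paired 95% CI for the mean BP change, from the raw before/after values.
import math
import statistics

before = [115, 112, 107, 119, 115, 138, 126, 105, 104, 115]
after = [128, 115, 106, 128, 122, 145, 132, 109, 102, 117]
diffs = [a - b for a, b in zip(after, before)]

xbar = statistics.mean(diffs)    # 4.8
s = statistics.stdev(diffs)      # about 4.57
sem = s / math.sqrt(len(diffs))  # about 1.445
t = 2.262                        # t table, df = 9
print(round(xbar - t * sem, 2), round(xbar + t * sem, 2))  # 1.53 8.07
```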

    151

3 The number 0 is NOT in the confidence interval (1.53, 8.07)

[Number line: the interval from 1.53 to 8.07 excludes 0]

Because 0 is not in the interval, this suggests there is a significant change in BP over time

There is a significant increase in blood pressure

    152

Hypothesis Testing, Significance Testing, and p-values

Want to draw a conclusion about a population parameter:

In a population of women who use oral contraceptives, is the average (expected) change in blood pressure (After - Before) 0 or not?

Sometimes statisticians use the term "expected" for the population average.

μ is the expected (population) mean change in blood pressure

Choose between two competing possibilities using a single imperfect (paired) sample:

Null hypothesis H0: μ = 0
Alternative hypothesis H1: μ ≠ 0

We reject H0 if the sample mean is far away from 0.

153

    The Hypotheses

We set up mutually exclusive, exhaustive possibilities for the truth:

The null hypothesis H0 typically represents the hypothesis that there is no effect or difference. It represents current beliefs or the state of knowledge. For example, there is no effect of oral contraceptives on blood pressure:

H0: μ = 0

The alternative hypothesis H1 typically represents what you are trying to prove. For example, oral contraceptives affect blood pressure:

H1: μ ≠ 0

    154

    Do we have sufficient evidence to reject H0 and claim H1 is true?

- If X̄ is close to zero, it is consistent with H0
- If X̄ is far from zero, it is consistent with H1

How do we decide if X̄ = 4.8 is more consistent with H0 or H1?

    155

    The p-value

What is the probability of observing an extreme sample mean like 4.8 mm Hg if the null hypothesis (H0: μ = 0) were true?

The answer is called the p-value

- If that probability (p-value) is small, it suggests the observed result was unlikely if H0 is true. This would provide evidence against H0.
- If that probability (p-value) is large, it suggests the observed result is quite probable if H0 is true. This would provide evidence for H0.

156
http://xkcd.com/892/

157

    How are p-values calculated?

1 First, measure the distance between the sample mean and what you would expect the sample mean to be if H0: μ = 0 were true:

t = (sample mean − 0) / SEM

t = 4.8 / (4.57/√10) = 4.8 / 1.45 = 3.31

- The value t = 3.31 is called the test statistic
- We observed a sample mean that was 3.31 standard errors of the mean (SEM) away from what we would have expected the mean to be if OC had no effect (i.e., μ = 0)

    158

The t-statistic is analogous to the Z-score on pages 37-44:

Z = (observation − mean) / SD

t = (X̄ − 0) / SEM

Z → t:
- observation → sample mean
- standard deviation → standard error
- mean → 0, because we are calculating p-values under the scenario that H0: μ = 0

    159

    How are p-values calculated?

2 Next, calculate the probability of getting a test statistic as or more extreme than what you observed (t = 3.31) if H0 were true:

- This p-value comes from the normal distribution.
- How unusual is it to get a standard normal score as extreme as ±3.31? Not likely at all (p < .01)

[Standard normal curve with the tails beyond -3.31 and 3.31 shaded]

    160

    How are p-values calculated?

- If the sample size is small (n < 60), a small t-correction must be made
- Instead of a normal distribution, a t-distribution is used with n − 1 degrees of freedom
- The p-value gets a little larger
- Use the t table.
- This procedure is called a paired t-test with n − 1 degrees of freedom. In the oral contraceptive example, we performed a paired t-test with 9 degrees of freedom.

161

    Interpreting the p-value

The p-value in the blood pressure/OC example is .0089

Interpretation: If the true before-OC/after-OC blood pressure difference is 0 amongst all women taking OCs, then the chance of seeing a mean difference as extreme or more extreme than 4.8 in a sample of 10 women is .0089

    162

    Using the p-value to make a decision

1 p-values are probabilities (numbers between 0 and 1). Small p-values are measures of evidence against H0, in favor of H1.

2 The p-value is the probability of obtaining a result as or more extreme than you did by chance alone, assuming the null hypothesis H0 is true.

3 If the p-value is small, either
(a) a very rare event occurred and H0 is true, OR
(b) H0 is false

    163

    Using the p-value to make a decision

The p-value in the blood pressure/OC example is .0089

- This p-value is small
- So there is a small probability of observing our data (or something more extreme) if H0 is true
- We reject H0

    164

    Using the p-value to make a decision

4 The p-value is a continuum of evidence. Guidelines?

- p = .10: suggestive
- p = .05: "magical" cutoff
- p = .01: strong evidence

5 How precise should p-values be?

- 2 decimal places suffice (p = .07)
- Sometimes 3 decimal places if p < .01 (e.g. p = .007)
- If the p-value is really small, p < .001 is fine
- If the p-value is really big, p > .20 is fine

165

Blood Pressure-OC Example: Summary

Methods: The changes in blood pressure after oral contraceptive use were calculated for 10 women. A paired t-test was used to determine if there was a significant change in blood pressure, and a 95% confidence interval was calculated for the mean blood pressure change (after - before).

Results: Blood pressure measurements increased on average 4.8 mm Hg, with standard deviation 4.57. The 95% confidence interval for the mean change was 1.5 to 8.1. There was evidence that blood pressure measurements after oral contraceptive use were significantly higher than before oral contraceptive use (p = .0089).

Discussion: A limitation of this study is that there was no comparison group of women who did not use oral contraceptives. We do not know if blood pressures may have risen even without oral contraceptive usage.

    166

Summary: Paired t-test

1 Designate null and alternative hypotheses

2 Collect data
- Compute the change for each paired set of observations
- Compute X̄d, the sample mean of the paired differences
- Compute sd, the sample standard deviation of the differences

3 Calculate the test statistic

t = (X̄d − 0) / SEM = (X̄d − 0) / (sd/√n)

4 Compare t to a t-distribution to get a p-value
- If p is small, reject H0
- If p is large, fail to reject H0
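The steps above can be sketched as code. Rather than an exact p-value (which needs a t CDF), this sketch brackets the p-value against the df = 9 row of the t-table, the same way one would by hand. Note the exact t is 3.32; the slides' 3.31 comes from rounding the SEM to 1.45 first.

```python
# Paired t-test: t = (mean difference - 0) / SEM, df = n - 1.
import math
import statistics

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for a paired t-test."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    sem = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / sem, n - 1

# The oral contraceptive data.
before = [115, 112, 107, 119, 115, 138, 126, 105, 104, 115]
after = [128, 115, 106, 128, 122, 145, 132, 109, 102, 117]
t, df = paired_t(before, after)
# From the t table, df = 9: 3.250 is the two-sided .01 cutoff, so p < .01.
print(round(t, 2), df)  # 3.32 9
```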

    167

    Two Types of Errors

Type I error: Claim H1 is true when in fact H0 is true

Type II error: Do not claim H1 is true when in fact H1 is true

The probability of making a Type I error is called the α-level.
The probability of making a Type II error is called the β-level.
The probability of NOT making a Type II error is called the power.

    168

The p-value and the α-level

Some people will only call a p-value significant if it is less than some cutoff (e.g., .05). This cutoff is called the α-level.

The α-level is the probability of a Type I error. It is the probability of falsely rejecting H0.

Statistically significant: the p-value is less than a preset threshold value, α.

Do Not Say

- "The result is statistically significant"
- "The result is statistically significant at α = .05"
- "The result is significant (p < .05)"

Instead, Give the p-value and Interpret

- "The result is statistically significant (p = .009)"

169

    One-Sided versus Two-Sided p-values

Two-sided p-value (p = .009): probability of a result as or more extreme than observed (either X̄ < −4.8 or X̄ > 4.8)

One-sided p-value (p = .0045): probability of a more extreme positive result than observed (X̄ > 4.8)

You never know what direction study results will go... In this course, we will use two-sided p-values exclusively. This is what is typically done in the scientific/medical literature.

    170

    Connection between CIs and HTs

The CI gives plausible values for the population parameter: "data, take me to the truth"

Hypothesis testing postulates two choices for the population parameter: "here are two possibilities for the truth; data, help me choose one"

    171

    Connection between CIs and HTs

If 0 is not in the 95% CI, then we reject H0 that μ = 0 at level α = .05 (the p-value < .05)

[Number line: the interval from 1.53 to 8.07 excludes 0]

Why?

- The CI starts at X̄d and captures 2 standard errors in either direction
- If 0 is not in the 95% CI, then X̄d is more than 2 standard errors from 0 (either above or below)
- So the distance (t) will be > 2 or < −2, and the resulting p-value < .05

    172

    Connection between CIs and HTs

In this BP-OC example, the 95% confidence interval tells us that the p-value is less than .05, but it doesn't tell us that it is p = .009

The confidence interval and the p-value are complementary. You can't get an exact p-value from just looking at a confidence interval

I like to report both

173

    More on the p-value

STATISTICAL SIGNIFICANCE DOES NOT IMPLY CAUSATION

Blood Pressure Example

- There could be other factors that could explain the change in blood pressure.
- A significant p-value only rules out random sampling as the explanation.

Need a Comparison Group

- Self-selected (may be okay)
- Randomized (better)

    174

    More on the p-value

STATISTICAL SIGNIFICANCE IS NOT THE SAME AS SCIENTIFIC SIGNIFICANCE

Example: Blood Pressure and Oral Contraceptives

n = 100,000
X̄ = .03 mmHg
s = 4.57
p-value = .04

A big n can sometimes produce a small p-value even though the magnitude of the effect is very small (not scientifically significant)

Supplement with a CI: the 95% CI is .002 to .058 mmHg

    175

    The Language of Hypothesis (Significance) Testing

Suppose the p-value is p = .40. How might this result be described?

Not statistically significant
Do not reject H0

    Can we also say?

Accept H0
Claim H0 is true

Statisticians much prefer the double negative: "Do not reject H0"

    176

    More on the p-value

NOT REJECTING H0 IS NOT THE SAME AS ACCEPTING H0

    Example: Blood Pressure and Oral Contraceptives

    n = 5

X̄ = 5.0 mmHg

    s = 4.57

    p-value = .07

We cannot reject H0 at significance level α = .05. Are we convinced there is no effect of OC on BP?

Maybe we should have taken a bigger sample. Interesting trend, but not proven beyond a reasonable doubt

Look at the confidence interval: 95% CI (−.67, 10.7)

    Innocent until proven guilty

  • 177

    Comparing Two Independent Groups

Controlled Trial in Peru of Bismuth Subsalicylate (Pepto Bismol) in Infants with Diarrheal Disease

Infants randomized: Controls n = 84, Treatment n = 85

                           Control    Tx
n                               84    85
Mean stool output (ml/kg)      260   182
Standard deviation (s)         254   197

    Scientific Question: Is there a treatment effect?

    178

    Note

    The data are not paired. There are different infants in each group.

2 independent groups. How do we calculate:

Confidence interval for the difference
p-value to determine if the difference between the two groups is significant: 2-sample (unpaired) t-test

    179

    95% CI for the Difference in Means of Two Independent (Unpaired) Groups

    Generic CI formula:

estimate ± 1.96 SE

(X̄1 − X̄2) ± 1.96 SE(X̄1 − X̄2)

SE(X̄1 − X̄2) = standard error of the difference of 2 sample means

The standard error of the difference for two independent samples is calculated differently than we did for paired designs.

Statisticians have developed formulae for the standard error of the difference. These formulae depend on the sample sizes in both groups and the standard deviations in both groups.

    180

    The SE of the Difference in Sample Means

Principle: Variation from independent sources can be added

Variance(X̄1 − X̄2) = (SE(X̄1))² + (SE(X̄2))²

SE(X̄1 − X̄2) = √[(SE(X̄1))² + (SE(X̄2))²]

The formula depends on n1, n2, s1, s2. There are other slightly different equations for SE(X̄1 − X̄2), but they all give similar answers

  • 181

    The SE of the Difference in Sample Mean

                           Control    Tx
n                               84    85
Mean stool output (ml/kg)      260   182
Standard deviation (s)         254   197

SE(X̄1 − X̄2) = √[(254/√84)² + (197/√85)²] = √(27.71² + 21.37²) = 34.94
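As a sketch, the same standard error can be computed in Python from the summary statistics (exact arithmetic gives ≈ 34.99; the slides round intermediate steps and report 34.94):

```python
import math

n1, s1 = 84, 254   # control group: n and standard deviation
n2, s2 = 85, 197   # treatment (Pepto Bismol) group

se1 = s1 / math.sqrt(n1)               # ~27.71
se2 = s2 / math.sqrt(n2)               # ~21.37
se_diff = math.sqrt(se1**2 + se2**2)   # add variances from independent sources
print(round(se_diff, 2))               # ~34.99 (slides: 34.94)
```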

    182

    Example: Pepto Bismol RCT

95% CI for the Difference in Means

78 ± 1.96 SE(X̄1 − X̄2)

78 ± 1.96 × 34.94

78 ± 68.48

9 to 147

    Note

The confidence interval does not include 0. Thus, p < .05

    183

    Hypothesis Test to Compare Two Independent Groups

    Two-Sample (Unpaired) t-test

    Are the expected stool outputs equal in the two groups?

H0 : μ1 = μ2

H1 : μ1 ≠ μ2

t = (difference in means − 0) / (SE of the difference)

t = (260 − 182) / 34.94 = 78 / 34.94 = 2.23
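Since both groups have more than 60 infants, a normal approximation for the p-value is reasonable; a minimal Python sketch of the whole test:

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

se_diff = math.sqrt(254**2 / 84 + 197**2 / 85)
t = (260 - 182) / se_diff           # ~2.23
p = 2 * (1 - norm_cdf(abs(t)))      # two-sided, large-sample approximation
print(round(t, 2), round(p, 3))     # ~2.23, p ~0.026
```

This agrees with the p = .03 reported in the summary slide; small differences come only from rounding.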

    184

    Notes on the 2-sample t-test

    1 This is a 2-sample (unpaired) t-test

    2 The value t = 2.23 is the test statistic

3 We calculate a p-value, which is the probability of obtaining a test statistic as extreme as we did if H0 were true.

(figure: t distribution with the tails beyond −2.23 and +2.23 shaded)

4 How is the probability computed? If sample sizes are large (both greater than 60), a normal distribution is used.

5 If sample sizes are small, a t correction is required (a t distribution is used with n1 + n2 − 2 degrees of freedom; that is, the degrees of freedom is the total sample size from both groups minus 2). An assumption that is also required is that both populations are approximately normally distributed. (Results can be highly influenced by wild observations or outliers.)

  • 185

Diarrhea / Pepto Bismol Summary

    Question Is there a difference in mean stool output between the twotreatment groups?

Methods The stool output was calculated for 84 infants randomized to placebo and 85 infants randomized to Pepto Bismol. A 95% confidence interval was calculated for the difference in mean stool output between the two groups, and a two-sample t-test was used to determine if there was a significant difference between the two groups.

Result The mean stool outputs in the treated and control groups were 182 and 260 ml/kg respectively. The control group stool output was significantly higher than the treated group (p = .03). The control group was 78 ml/kg higher than the treated group (95% confidence interval 9 to 147 ml/kg).

    186

    Nonparametric Alternative to the 2-sample t-test

    Mann-Whitney-Wilcoxon Rank Sum Test

Objective Assess whether the two populations are different
Advantages Does not assume the populations are normally distributed. The two-sample t-test requires that assumption with small sample sizes

Uses only ranks; does not need precise numerical outcomes

Not sensitive to outliers
Disadvantages of the Nonparametric Test

Nonparametric methods are often less sensitive (powerful) for finding true differences because they throw away information (they use only ranks)

Need full data set, not just summary statistics

    187

    Example: Health Intervention Study

Evaluate an intervention to educate high school students about health and lifestyle

Y = Posttest − Pretest Score

    Randomize

    Intervention (I) 5 0 7 2 19

    Control (C) 6 -5 -6 1 4

Only 5 individuals in each sample

With such a small sample, we need to be sure score improvements are normally distributed if we want to use the t-test: a BIG assumption

Alternative: Mann-Whitney-Wilcoxon nonparametric test!

    188

Rank the pooled data:

Order: −6  −5   0   1   2   4   5   6   7  19
Rank:   1   2   3   4   5   6   7   8   9  10
Group:  C   C   I   C   I   C   I   C   I   I

Find the average rank in the 2 groups:

Intervention group average rank: (3 + 5 + 7 + 9 + 10)/5 = 6.8

Control group average rank: (1 + 2 + 4 + 6 + 8)/5 = 4.2

p-value calculations:

Statisticians have developed formulae and tables to determine the probability of observing such an extreme discrepancy (6.8 vs 4.2) by chance alone. That's the p-value. In the example, p = .22.
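The ranking step is easy to check in Python; there are no ties in this data, so the ranks are simply 1 through 10:

```python
intervention = [5, 0, 7, 2, 19]
control = [6, -5, -6, 1, 4]

pooled = sorted(intervention + control)
rank = {v: i + 1 for i, v in enumerate(pooled)}  # rank 1 = smallest value

avg_i = sum(rank[v] for v in intervention) / len(intervention)
avg_c = sum(rank[v] for v in control) / len(control)
print(avg_i, avg_c)   # 6.8 4.2
```

The p-value itself comes from the rank-sum tables mentioned above, not from these averages alone.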

  • 189

Health Intervention Study Summary

    Question Is there a difference in test score change between theintervention and control groups?

Design 10 high school students were randomized to either receive a two-month health and lifestyle education program or no program. Each student was administered a test regarding health and lifestyle issues prior to randomization and after the two-month period.

Statistics Differences in the two test scores (after − before) were computed for each student. Mean and median test score changes were computed for each study group. A Mann-Whitney rank sum test was used to determine if there was a statistically significant difference in test score change between the intervention and control groups at the end of the two-month study period.

Result The median score change was four points higher in the intervention group than in the control group. The difference in test score improvements between the intervention and control groups was not statistically significant (p = .17)

    190

    Note

In the health intervention study, the p-value was .22.

No significant difference in test scores between the intervention and control group (p = .22)

The two-sample t-test would give a different answer (p = .11). Different statistical procedures can give different p-values.

If the largest observation, 19, were changed to 100, the p-value based on the Mann-Whitney test would not change, but the two-sample t-test p-value would.

    191

    The t-test or the nonparametric test?

Statisticians will not always agree, but there are some guidelines:

Use the nonparametric test if the sample size is small and the distribution looks skewed. You might also do a t-test and compare.

Use the nonparametric test if only ranks are available.

Otherwise, use the t-test

    192

    Example: Exposure of Young Infants to Environmental Tobacco Smoke

Objective This study examined the degree to which breast-feeding and cigarette smoking by mothers and smoking by other household members contributed to the exposure of infants to the products of tobacco smoke (urinary cotinine level).

Method We report median values and interquartile ranges for each group. Comparisons between groups are made with the Wilcoxon rank sum test because the distributions of urine cotinine values are positively skewed.

    Source: Mascola et al., AJPH, 1998, 88:893-895.

  • 193

Extension of the 2-sample t-test: Analysis of Variance (One-Way ANOVA)

The t-test compares two populations. Analysis of variance is a generalization of the two-sample t-test to compare three or more populations

The test statistic from the ANOVA calculations is called the F-test

A p-value is then calculated: are there any differences among the populations?

An alternative strategy is to perform lots of two-sample t-tests (pairwise)

That could be a lot of statistical testing! Instead, perform an ANOVA

No significant differences: Stop. No further analysis necessary
Significant differences: Do two-sample t-tests to find them

    194

    Example: Pulmonary Disease

Goal: Does passive smoking have a measurable effect on pulmonary health?

Methods: Measure mid-expiratory flow (FEF) in liters per second (amount of air expelled per second) in six smoking groups.

    1 Nonsmokers (NS)

    2 Passive Smokers (PS)

    3 Noninhaling Smokers (NI)

    4 Light Smokers (LS)

    5 Moderate Smokers (MS)

    6 Heavy Smokers (HS)

White and Froeb. Small-Airways Dysfunction in Non-Smokers Chronically Exposed to Tobacco Smoke, NEJM 302:13 (1980)

    195

    One strategy is to perform lots of two-sample t-tests...

Group number   Group name   Mean FEF (L/s)   sd FEF (L/s)     n
1              NS           3.78             0.79           200
2              PS           3.30             0.77           200
3              NI           3.32             0.86            50
4              LS           3.23             0.78           200
5              MS           2.73             0.81           200
6              HS           2.59             0.82           200

In this example, there would be 15 pairwise comparisons... It would be nice to have one catch-all test which would tell you whether there were any differences amongst the six groups

    196

(Figure: mean FEF ± 2 SE for each group, NS, PS, NI, LS, MS, HS, plotted on an FEF (L/s) axis running from 2.5 to 3.7)

Based on a one-way analysis of variance, there are significant differences in pulmonary function among these groups (p < .001).

Pairwise two-sample t-tests show very significant differences between nonsmokers and all other groups.

There were no significant differences between passive smokers, noninhalers and light smokers; or between moderate and heavy smokers.

  • 197

Smoking and FEF Summary

Subjects A sample of over 3,000 persons was classified into one of six smoking categorizations based on responses to smoking-related questions

Methods 200 men were randomly selected from each of five smoking classification groups (non-smokers, passive smokers, light smokers, moderate smokers, and heavy smokers), as well as 50 men classified as non-inhaling smokers, for a study designed to analyze the relationship between smoking and respiratory function

    198

Smoking and FEF Summary

Statistics ANOVA was used to test for any differences in FEF levels amongst the six groups of men

Individual group comparisons were performed with a series of two-sample t-tests, and 95% confidence intervals were constructed for the mean difference in FEF between each combination of groups

Results Analysis of variance showed statistically significant (p < .001) differences in FEF between the six groups of smokers.

Non-smokers had the highest mean FEF value, 3.78 L/s, and this was statistically significantly larger than the five other smoking-classification groups

The mean FEF value for non-smokers was 1.19 L/s higher than the mean FEF for heavy smokers (95% CI 1.03 to 1.35 L/s), the largest mean difference between any two smoking groups

    199

What's the rationale behind analysis of variance?

H0 : μ1 = μ2 = ... = μk

H1 : at least one mean is different

The variation in the sample means between groups is compared to the variation within a group.

(Figure: the same plot of mean FEF ± 2 SE by smoking group, repeated)

If the between-group variation is a lot bigger than the within-group variation, that suggests there are some differences among the populations.

http://www.ruf.rice.edu/~lane/stat_sim/one_way/index.html

200

Overuse of Hypothesis Tests: Bad Statistics!!

Age     n    Sample Mean
< 20    97   17.8
≥ 20    88   24.6

  • 201

    Comparing Two Proportions

Study: Clinical trial of AZT to prevent maternal-infant transmission of HIV.

Randomize:

AZT, n = 121 → 9 infected infants

Placebo, n = 127 → 31 infected infants

Connor et al. New England J. of Medicine 331:1173-1180 (1994)

    202

    Notes on Design

Random assignment of Tx: helps ensure the 2 groups are comparable; patient & physician could not request a particular Tx

Double blind: patient & physician did not know the Tx assignment

Definition of infection: two positive cultures (infant > 32 weeks)

    203

    HIV Transmission Rates

AZT: 9/121 = .074 (7.4%)
Placebo: 31/127 = .244 (24.4%)

    Note

These are NOT the true population parameters for the transmission rates. There is sampling variability

    204

    HIV Transmission Rates

    95% confidence intervals

AZT: 95% CI .03 to .14
Placebo: 95% CI .17 to .32

    1 Is the difference significant, or can it be explained by chance?

2 As the CIs do not overlap, this suggests significance. But what's the p-value?

3 Note: if the CIs did overlap, it would still be possible to get a p < .05.

    Want a direct method for testing 2 independent proportions

  • 205

Display the Data in a 2 × 2 Table (2 rows and 2 columns)

                        AZT   Placebo   Total
HIV transmission  Yes     9        31      40
(infected)        No    112        96     208
Total                   121       127     248

    206

    Hypothesis Testing

H0 : p1 = p2

H1 : p1 ≠ p2

p1 = Proportion infected on AZT
p2 = Proportion infected on placebo

1 Fisher's Exact Test

2 (Pearson's) Chi-Square Test (χ²)

    207

Fisher's Exact Test

As with all hypothesis tests, start by assuming H0 is true: AZT is not effective

Imagine putting 40 red balls (the infected) and 208 blue balls (the non-infected) in a jar. Shake it up.

Now choose 121 balls; that's the AZT group. The remaining balls are the placebo group.

We can calculate the probability of getting 9 or fewer red balls among the 121. That is the one-sided p-value.

The two-sided p-value is just about (but not exactly) twice the one-sided p-value. It accounts for the probability of getting either extremely few red balls or a lot of red balls in the AZT group.

The p-value is the probability of obtaining a result as or more extreme (more imbalance) than you did by chance alone.

    208

Notes on Fisher's Exact Test

Calculations are difficult
Always appropriate to test equality of two proportions
Computers are usually used
Exact p-value (no approximations); no minimum sample size requirements
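The ball-drawing argument is a hypergeometric calculation, so the one-sided p-value for the AZT table can be sketched directly in Python with exact binomial coefficients:

```python
from math import comb

# 40 infected ("red") infants among 248 total; 121 of them fall in the AZT arm
N, K, n = 248, 40, 121
observed = 9   # infected infants observed in the AZT group

def hypergeom_pmf(k):
    """P(exactly k red balls among the n drawn), under H0."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

p_one_sided = sum(hypergeom_pmf(k) for k in range(observed + 1))
print(p_one_sided)   # far below .001, matching the reported p < .001
```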

  • 209

HIV-AZT Summary

Study We conducted a randomized, double-blind, placebo-controlled trial of the efficacy and safety of zidovudine (AZT) in reducing the risk of maternal-infant HIV transmission

Methods HIV transmission rates for both the placebo and AZT groups were calculated as the ratio of HIV-infected infants (based on cultures at 32 weeks) divided by the total number of infants, and 95% confidence intervals were calculated. The transmission rates for the two groups were compared by Fisher's Exact Test.

Results The maternal-infant HIV transmission rate for the AZT group was 7.4% (95% CI 3.5% to 13.7%)

The maternal-infant HIV transmission rate for the placebo group was 24.4% (95% CI 17.2% to 32.8%)

AZT significantly reduced the rate of HIV transmission compared to placebo (p < .001)

    210

    The Chi-Square Approximate Method

Works for big sample sizes. If all 4 numbers in the 2 × 2 table are 5 or more, it is okay

The only advantage of this method over Fisher's Exact Test is you don't need a computer to do it.

    211

    The Chi-Square Approximate Method

Looks at discrepancies between observed and expected.

O = observed

E = expected = (row total × column total) / grand total

"Expected" refers to the values for the cell counts that would be expected if the null hypothesis is true

    212

    The Chi-Square Approximate Method

1 Calculate expected counts assuming H0 is true

2 Calculate a test statistic to measure the difference between what we observe and what we expect

Test Statistic: χ² = Σ over the 4 cells of (O − E)² / E

3 Use a chi-square table with 1 degree of freedom to get a p-value

How likely is it to get such a big discrepancy between the observed and expected?

  • 213

χ² Distribution with 1 Degree of Freedom

(figure: χ² density with 1 degree of freedom, marking the value 3.84)

    214

Performing the χ² Test for a 2×2 Table

              AZT   Placebo   Total
HIV   Yes       9        31      40
      No      112        96     208
Total         121       127     248

Observed = 9

Expected = (121 × 40) / 248 = 19.52

    215

Performing the χ² Test for a 2×2 Table

Observed:                        Expected (same margins, cells to be filled in):

         AZT   Placebo                     AZT   Placebo
HIV Yes    9        31    40     HIV Yes     ?         ?    40
    No   112        96   208         No      ?         ?   208
         121       127   248               121       127   248

    216

Performing the χ² Test for a 2×2 Table

Observed:                        Expected:

         AZT   Placebo                      AZT      Placebo
HIV Yes    9        31    40     HIV Yes   19.52      20.48    40
    No   112        96   208         No   101.48     106.52   208
         121       127   248              121        127      248

χ² = (9 − 19.52)²/19.52 + (112 − 101.48)²/101.48 + (31 − 20.48)²/20.48 + (96 − 106.52)²/106.52 = 13.19

The p-value is about p = .0003

It is NOT a coincidence that the square of Z on page 141 is almost the χ²; one is nearly the square of the other: √13.19 ≈ 3.63
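A short Python sketch reproduces the whole χ² calculation from the observed table:

```python
obs = [[9, 31],     # HIV transmission: yes  (columns: AZT, placebo)
       [112, 96]]   # HIV transmission: no

row_totals = [sum(row) for row in obs]
col_totals = [sum(col) for col in zip(*obs)]
grand = sum(row_totals)

chi2 = 0.0
for i in range(2):
    for j in range(2):
        expected = row_totals[i] * col_totals[j] / grand
        chi2 += (obs[i][j] - expected) ** 2 / expected

print(round(chi2, 2))   # 13.19
```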

  • 217

χ² Distribution with 1 Degree of Freedom

This table assumes that you have one degree of freedom, the case when analyzing a 2×2 table:

 χ²      P     χ²      P     χ²      P     χ²      P     χ²      P     χ²      P
0.0  1.0000   2.5  0.1138   5.0  0.0253   7.5  0.0062  10.0  0.0016  12.5  0.0004
0.1  0.7518   2.6  0.1069   5.1  0.0239   7.6  0.0058  10.1  0.0015  12.6  0.0004
0.2  0.6547   2.7  0.1003   5.2  0.0226   7.7  0.0055  10.2  0.0014  12.7  0.0004
0.3  0.5839   2.8  0.0943   5.3  0.0213   7.8  0.0052  10.3  0.0013  12.8  0.0003
0.4  0.5271   2.9  0.0886   5.4  0.0201   7.9  0.0049  10.4  0.0013  12.9  0.0003
0.5  0.4795   3.0  0.0833   5.5  0.0190   8.0  0.0047  10.5  0.0012  13.0  0.0003
0.6  0.4386   3.1  0.0783   5.6  0.0180   8.1  0.0044  10.6  0.0011  13.1  0.0003
0.7  0.4028   3.2  0.0736   5.7  0.0170   8.2  0.0042  10.7  0.0011  13.2  0.0003
0.8  0.3711   3.3  0.0693   5.8  0.0160   8.3  0.0040  10.8  0.0010  13.3  0.0003
0.9  0.3428   3.4  0.0652   5.9  0.0151   8.4  0.0038  10.9  0.0010  13.4  0.0003
1.0  0.3173   3.5  0.0614   6.0  0.0143   8.5  0.0036  11.0  0.0009  13.5  0.0002
1.1  0.2943   3.6  0.0578   6.1  0.0135   8.6  0.0034  11.1  0.0009  13.6  0.0002
1.2  0.2733   3.7  0.0544   6.2  0.0128   8.7  0.0032  11.2  0.0008  13.7  0.0002
1.3  0.2542   3.8  0.0513   6.3  0.0121   8.8  0.0030  11.3  0.0008  13.8  0.0002
1.4  0.2367   3.9  0.0483   6.4  0.0114   8.9  0.0029  11.4  0.0007  13.9  0.0002
1.5  0.2207   4.0  0.0455   6.5  0.0108   9.0  0.0027  11.5  0.0007  14.0  0.0002
1.6  0.2059   4.1  0.0429   6.6  0.0102   9.1  0.0026  11.6  0.0007  14.1  0.0002
1.7  0.1923   4.2  0.0404   6.7  0.0096   9.2  0.0024  11.7  0.0006  14.2  0.0002
1.8  0.1797   4.3  0.0381   6.8  0.0091   9.3  0.0023  11.8  0.0006  14.3  0.0002
1.9  0.1681   4.4  0.0359   6.9  0.0086   9.4  0.0022  11.9  0.0006  14.4  0.0001
2.0  0.1573   4.5  0.0339   7.0  0.0082   9.5  0.0021  12.0  0.0005  14.5  0.0001
2.1  0.1473   4.6  0.0320   7.1  0.0077   9.6  0.0019  12.1  0.0005  14.6  0.0001
2.2  0.1380   4.7  0.0302   7.2  0.0073   9.7  0.0018  12.2  0.0005  14.7  0.0001
2.3  0.1294   4.8  0.0285   7.3  0.0069   9.8  0.0017  12.3  0.0005  14.8  0.0001
2.4  0.1213   4.9  0.0269   7.4  0.0065   9.9  0.0017  12.4  0.0004  14.9  0.0001

    218

Chi-Square for Associations in r × c Tables

    219

    Summary of Methods for Comparing Proportions

Fisher's Exact Test: Always works, with large or small sample sizes. Highly computational; need a computer

χ²-Test: Works with larger sample sizes. Calculations easy to do. One of the most popular statistical methods in the scientific literature. Extends to larger tables

    220

    Note on p-Values and Sample Size

Will the p-value change if we have smaller sample sizes but the proportions remain about the same? Suppose our sample size were about 1/4 the original:

                      AZT   Placebo   Total
HIV            Yes      2         8      10
transmission   No      28        24      52
Total                  30        32      62

AZT: 2/30 = 6.7%
Placebo: 8/32 = 25%

p = .083

  • 221

    Note

The p-value depends not only on the observed difference between the proportions, but also on the sample sizes

If the sample sizes that the two proportions were based on were bigger, the p-value would get smaller

    222

    Relative Risk

    Ratio of Proportions:

    Relative risk = p1/p2

    AZT Example

1 The risk of HIV with AZT relative to placebo: Relative risk = p1/p2 = .074/.244 = .30

2 The risk of HIV with placebo relative to AZT: Relative risk = p2/p1 = 3.29. The risk of HIV transmission with placebo is more than 3 times higher compared to AZT
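Both directions of the relative risk are one-line computations (the slight difference from the slide's 3.29 comes from rounding the proportions before dividing):

```python
p_azt = 9 / 121        # ~.074
p_placebo = 31 / 127   # ~.244

rr_azt_vs_placebo = p_azt / p_placebo
rr_placebo_vs_azt = p_placebo / p_azt
print(f"{rr_azt_vs_placebo:.2f}")   # 0.30
print(f"{rr_placebo_vs_azt:.2f}")   # 3.28 (slides round to 3.29)
```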

    223

3 You can test H0: Relative Risk = 1 versus H1: Relative Risk ≠ 1 using any of the methods for comparing proportions (χ², Fisher's)

    224

    The Relative Risk versus the p-Value

The relative risk tells you the magnitude of the disease-exposure association.

The p-value (calculated using either Fisher's exact test or the χ² statistic) tells you if the observed result can be explained by chance.

A big relative risk does not necessarily mean that the p-value is small.

The p-value depends both on the magnitude of the relative risk and on the sample size.

  • 225

    Describing the Association Between Two Continuous Variables

    1 Scatter plot

    2 Correlation coefficient

    3 Simple linear regression

    226

Association between body weight (X) and plasma volume (Y)

    Body Weight (kg)   Plasma Volume (l)
1               58.0                2.75
2               70.0                2.86
3               74.0                3.37
4               63.5                2.76
5               62.0                2.62
6               70.5                3.49
7               71.0                3.05
8               66.0                3.12

    227

    The Scatter Plot

(Figure: scatter diagram of plasma volume (l) against body weight (kg), showing the linear regression line)

228

    The Correlation Coefficient

Measures the direction and magnitude of the linear association between X and Y

The correlation coefficient is between −1 and +1

(figure: example scatters with r = −1, r = 0, r = 1, r = −.7, and r = .7)

  • 229

    Examples of the Correlation Coefficient

(figure panels: Perfect Positive, Uncorrelated, Weak Positive, Weak Negative)

    230

    Properties of Correlation Coefficient

1 Corr(X, Y) = r

2 −1 ≤ r ≤ 1

r = 1: Perfect positive association
r > 0: Positive association
r = 0: No association
r < 0: Negative association
r = −1: Perfect negative association

3 Closer to −1 and 1: stronger relationship

4 Sign: direction of association

5 r = 0: no linear association
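For the body weight / plasma volume data shown earlier, r can be computed from first principles in a few lines of Python; this reproduces the r = .76 quoted on the scatter-plot slide:

```python
import math

weight = [58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0]   # kg
volume = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]   # litres

n = len(weight)
mean_x = sum(weight) / n
mean_y = sum(volume) / n

# Sums of squares and cross-products about the means
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weight, volume))
sxx = sum((x - mean_x) ** 2 for x in weight)
syy = sum((y - mean_y) ** 2 for y in volume)

r = sxy / math.sqrt(sxx * syy)
print(round(r, 2))   # 0.76
```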

    231

    Correlation Slider:noppa5.pc.helsinki.fi/koe/corr/cor7.html

    Correlation Guessing Game:http://istics.net/stat/Correlations/

    232

    Correlation measures linear association

    r = 0

    A strong relationship along a curve for which r = 0


  • 233

Four Scatterplots, all with r = .7

Anscombe's Data

    234

NOTES AND CAVEATS ON THE CORRELATION COEFFICIENT

Measures only linear relationships
Other kinds of relationships are also important
Look at and graph the data
Sensitive to outliers
X values are measured, not controlled by the experimental design. That is, X and Y are random

Example where r is appropriate:
X = height, Y = weight

Example where r is not appropriate:
Clinical study at different doses
X = dose of drug, Y = response

    235

Body Weight and Blood Plasma

(Figure: scatter plot of plasma volume (l) against body weight (kg); r = .76)

    236

    How Close Do the Points Fall to the Line?

This is measured by the correlation coefficient. But what line? That is determined by regression

  • 237

    Simple Linear Regression

Y is the dependent variable. X is the independent variable (also called the predictor, regressor, or covariate)

We try to predict Y from X. Called simple because there is only one independent variable, X

If there are several independent variables, it's called multiple linear regression

    238

    Simple Linear Regression

    Fit a straight line to the data

(Figure: scatter plot with the fitted straight line)