SAS Programming in Clinical Trials
Chapter 3. SAS STAT
Dongfeng Li
Autumn 2010
Chapter Contents
- Some statistics background;
- Descriptive statistics: concepts and programs;
- Comparing means and proportions;
- Analysis of variance.
- Students should master the basic concepts, descriptive statistics measures and graphs, basic hypothesis testing, and basic analysis of variance.
Section Contents
- Review of statistics concepts:
  - Distribution, discrete distribution, continuous distribution, PDF, CDF, quantile. Normal distribution.
  - Mean, median, variance, standard deviation, sampling distribution.
  - MLE, standard error.
  - Hypothesis tests, two types of errors, p-value.
Distribution
- Random variable:
  - Discrete, such as sex, patient/control, age group.
  - Continuous, such as weight, blood pressure.
- Distribution: describes the relative chance of the variable taking each value.
- For a discrete variable $X$, use $P(X = x_i)$, where $\{x_i\}$ is the set of possible values of $X$. This is called the probability mass function (PMF).
- For a continuous variable $X$, use the probability density function (PDF) $f(x)$, where $P(X \in (x - \varepsilon, x + \varepsilon)) \approx f(x)(2\varepsilon)$ for small $\varepsilon$.
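The two descriptions can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the age-group probabilities and the N(120, 15^2) blood pressure model are made-up assumptions. It confirms that a PMF sums to 1 and that the probability of a small interval is close to the density times the interval width.

```python
from statistics import NormalDist

# Hypothetical PMF for a discrete variable (age group); values are illustrative.
pmf = {"<40": 0.3, "40-60": 0.5, ">60": 0.2}
total = sum(pmf.values())                    # a PMF must sum to 1

# Continuous variable: assumed systolic blood pressure ~ N(120, 15^2).
bp = NormalDist(mu=120, sigma=15)
x, eps = 130.0, 0.1
exact = bp.cdf(x + eps) - bp.cdf(x - eps)    # P(X in (x - eps, x + eps))
approx = bp.pdf(x) * 2 * eps                 # f(x) * (2 * eps)
print(total, exact, approx)
```

The agreement between `exact` and `approx` improves as `eps` shrinks, which is the sense in which the density describes relative chance.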
CDF
- Cumulative distribution function (CDF) $F(x)$:
$$F(x) = P(X \le x)$$
$$P(X \in (a, b]) = F(b) - F(a)$$
- For a discrete distribution,
$$F(x) = \sum_{x_i \le x} P(X = x_i)$$
- For a continuous distribution,
$$F(x) = \int_{-\infty}^{x} f(t)\,dt$$
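As an illustration of both cases, here is a small Python sketch under assumed inputs: the adverse-event PMF is hypothetical, and the continuous case uses the standard normal. It builds a discrete CDF by summation and checks $F(b) - F(a)$ against a numerical integral of the PDF.

```python
from statistics import NormalDist

# Discrete case: F(x) is the sum of the PMF over all values x_i <= x.
# Hypothetical PMF for the number of adverse events per patient.
pmf = {0: 0.5, 1: 0.3, 2: 0.15, 3: 0.05}

def cdf_discrete(x):
    return sum(p for xi, p in pmf.items() if xi <= x)

# Continuous case: standard normal, using the library CDF.
z = NormalDist()
a, b = -1.0, 1.0
prob_ab = z.cdf(b) - z.cdf(a)    # P(X in (a, b]) = F(b) - F(a)

# Cross-check with a midpoint Riemann sum of the PDF over (a, b].
m = 10_000
h = (b - a) / m
riemann = sum(z.pdf(a + (i + 0.5) * h) for i in range(m)) * h
print(cdf_discrete(1), prob_ab, riemann)
```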
Quantile function
- Quantile function: the inverse of the CDF,
$$q(p) = F^{-1}(p), \quad p \in (0, 1),$$
  if $F(x)$ is a one-to-one mapping.
- Generally, $q(p) = x_p$ where
$$P(X \le x_p) \ge p, \quad P(X \ge x_p) \ge 1 - p$$
  ($x_p$ can be non-unique.)
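A short Python sketch of both definitions follows. The fair-die example is an assumption chosen to show non-uniqueness; the helper `is_quantile` is hypothetical and simply checks the two inequalities of the general definition.

```python
from statistics import NormalDist

# Continuous, strictly increasing CDF: q(p) is the inverse of F.
z = NormalDist()
q975 = z.inv_cdf(0.975)          # about 1.96
round_trip = z.cdf(q975)         # F(q(p)) should recover p

# Discrete case with a hypothetical fair die: check the general definition
# P(X <= x_p) >= p and P(X >= x_p) >= 1 - p for every candidate value.
values = [1, 2, 3, 4, 5, 6]
pmf = {v: 1 / 6 for v in values}

def is_quantile(x, p, tol=1e-12):
    left = sum(pr for v, pr in pmf.items() if v <= x)    # P(X <= x)
    right = sum(pr for v, pr in pmf.items() if v >= x)   # P(X >= x)
    return left >= p - tol and right >= 1 - p - tol

medians = [v for v in values if is_quantile(v, 0.5)]
print(q975, round_trip, medians)
```

For the die, both 3 and 4 satisfy the definition with $p = 0.5$, so the median is not unique.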
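The normal distribution
- Standard normal distribution $N(0,1)$, PDF
$$\varphi(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}$$
- Normal distribution $N(\mu, \sigma^2)$, PDF
$$f(x) = \frac{1}{\sigma}\,\varphi\!\left(\frac{x - \mu}{\sigma}\right) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{(x - \mu)^2}{2\sigma^2}\right\}$$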
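The relation between the two densities, $f(x) = \varphi((x - \mu)/\sigma)/\sigma$, can be verified with a few lines of Python. This is a sketch with assumed parameter values; `phi` and `normal_pdf` are illustrative helper names, compared against the standard library's `NormalDist.pdf`.

```python
import math
from statistics import NormalDist

def phi(x):
    """Standard normal PDF."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def normal_pdf(x, mu, sigma):
    """PDF of N(mu, sigma^2), written through the standard normal PDF."""
    return phi((x - mu) / sigma) / sigma

# Illustrative (assumed) parameters.
mu, sigma, x = 120.0, 15.0, 130.0
by_hand = normal_pdf(x, mu, sigma)
by_library = NormalDist(mu, sigma).pdf(x)
print(by_hand, by_library)
```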
Numerical characteristics
- A PDF is a curve with an infinite number of points.
- We can use a few numbers to describe the key features of a distribution.
- First, measures of location.
  - The mean $EX$.
  - The median, $x_{0.5} = q(0.5)$, where
$$P(X \le x_{0.5}) \ge 0.5, \quad P(X \ge x_{0.5}) \ge 0.5$$
    (can be non-unique).
- Measures of variability.
  - The standard deviation $\sigma_X = \sqrt{E(X - EX)^2}$.
  - Interquartile range $q(0.75) - q(0.25)$.
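These measures can be computed directly from a distribution. The sketch below is illustrative only: the PMF and the N(120, 15^2) model are assumptions, not data from the course. It evaluates the mean and standard deviation of a discrete distribution and the median and interquartile range of a normal one.

```python
import math
from statistics import NormalDist

# Discrete distribution given by a hypothetical PMF.
pmf = {0: 0.5, 1: 0.3, 2: 0.15, 3: 0.05}
mean = sum(x * p for x, p in pmf.items())                 # EX
var = sum((x - mean) ** 2 * p for x, p in pmf.items())    # E(X - EX)^2
sd = math.sqrt(var)                                       # sigma_X

# Continuous distribution: assumed N(120, 15^2).
d = NormalDist(120, 15)
median = d.inv_cdf(0.5)                      # q(0.5)
iqr = d.inv_cdf(0.75) - d.inv_cdf(0.25)      # q(0.75) - q(0.25)
print(mean, sd, median, iqr)
```

For a normal distribution the interquartile range is about 1.349 times $\sigma$, which the last value reflects.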
- Shape characteristics. Skewness:
$$E\left(\frac{X - EX}{\sigma_X}\right)^3$$
- Symmetric, left-skewed, right-skewed density.
- Shape characteristics. Kurtosis:
$$E\left(\frac{X - EX}{\sigma_X}\right)^4 - 3$$
- Heavy tail problem.
- Multimodal distribution. E.g., the weights of 10-year-olds and 20-year-olds mixed together.
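To make the two shape formulas concrete, here is a minimal Python sketch that applies them to small discrete distributions. The helper `shape` and the two-point distributions are assumptions chosen because their skewness and kurtosis can be checked by hand.

```python
import math

def shape(pmf):
    """Return (skewness, excess kurtosis) of a discrete distribution."""
    mean = sum(x * p for x, p in pmf.items())
    sd = math.sqrt(sum((x - mean) ** 2 * p for x, p in pmf.items()))
    skew = sum(((x - mean) / sd) ** 3 * p for x, p in pmf.items())
    kurt = sum(((x - mean) / sd) ** 4 * p for x, p in pmf.items()) - 3
    return skew, kurt

symmetric = {0: 0.5, 1: 0.5}        # symmetric two-point distribution
right_skewed = {0: 0.8, 1: 0.2}     # long tail to the right
skew_sym, kurt_sym = shape(symmetric)
skew_right, kurt_right = shape(right_skewed)
print(skew_sym, kurt_sym, skew_right, kurt_right)
```

The symmetric case has skewness 0, while the second distribution has positive skewness (1.5), matching the "right-skewed" label.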
Population
- A population is modeled by a random variable or a distribution.
- Population parameters: unknown numbers that determine the distribution. E.g., for an $N(\mu, \sigma^2)$ population, $(\mu, \sigma^2)$ are the two unknown population parameters.
Sample
- A sample (typically) is $n$ observations drawn independently from the population, $X_1, X_2, \dots, X_n$.
- A sample can be regarded as $n$ numbers, or as $n$ iid random variables.
- Statistic: a value calculated from the sample to describe the distribution of the population. It can be regarded as a random variable. Examples: the sample mean, the sample standard deviation. A statistic used to estimate an unknown population parameter is called an estimator, and its observed value an estimate.
- Sampling distribution: the distribution of an estimator or other statistic. E.g., if $X_1, \dots, X_n$ is a sample from $N(\mu, \sigma^2)$, then $\bar{X} \sim N(\mu, \sigma^2/n)$.
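The sampling distribution of the sample mean can be explored by simulation. The following Python sketch uses assumed values ($\mu = 50$, $\sigma = 10$, $n = 25$) and a fixed seed; it draws many samples and checks that the sample means center on $\mu$ with standard deviation near $\sigma/\sqrt{n}$.

```python
import random
import statistics

random.seed(2010)                           # fixed seed so the run is repeatable
mu, sigma, n, reps = 50.0, 10.0, 25, 5000   # assumed values

# Draw many samples of size n and record each sample mean.
means = [
    statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
    for _ in range(reps)
]
center = statistics.fmean(means)    # should be close to mu
spread = statistics.stdev(means)    # should be close to sigma / sqrt(n) = 2
print(center, spread)
```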
MLE (Maximum Likelihood Estimate)
- MLE is a commonly used parameter estimation method. For a random sample $X_1, X_2, \dots, X_n$ from population $X$, if the PDF or PMF of $X$ is $f(x; \beta)$, then
$$\hat\beta = \arg\max_{\beta} L(\beta) = \arg\max_{\beta} \prod_{i=1}^{n} f(X_i; \beta)$$
  is called the MLE of the unknown parameter $\beta$.
- Under some conditions, when the sample size $n \to \infty$, $\hat\beta$ has a limiting (approximate) distribution $N(\beta, \sigma^2_{\hat\beta})$.
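As a concrete instance of the arg max, here is a Python sketch that maximizes a Bernoulli log-likelihood by grid search. The response data are made up, and the grid search is a deliberately crude stand-in for the numerical optimizers that statistical software would use; the result is compared with the known closed-form MLE, the sample proportion.

```python
import math

# Hypothetical binary outcomes (1 = responder), made up for illustration.
data = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1]

def log_likelihood(p, xs):
    """Log of prod f(x_i; p) for the Bernoulli PMF f(x; p) = p^x (1 - p)^(1 - x)."""
    return sum(math.log(p) if x == 1 else math.log(1 - p) for x in xs)

# Crude grid search over p in (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=lambda p: log_likelihood(p, data))
closed_form = sum(data) / len(data)    # known Bernoulli MLE: the sample proportion
print(p_hat, closed_form)
```

Maximizing the log of the likelihood gives the same arg max as maximizing the product itself, and it avoids numerical underflow for larger samples.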
Standard Error
- If $\hat\beta = \psi(X_1, \dots, X_n)$ is an estimate of a population parameter $\beta$, it has a sampling distribution $F_{\hat\beta}(x)$.
- The standard deviation of the sampling distribution is $\sigma_{\hat\beta}$.
- An estimate of $\sigma_{\hat\beta}$ is called the standard error (SE) of $\hat\beta$.
- If $\hat\beta$ has a limiting normal distribution, then $\hat\beta$ is approximately distributed $N(\beta, \mathrm{SE}(\hat\beta)^2)$.
- SE measures the precision of estimation.
- SE can be used to construct (approximate) confidence intervals.
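For the sample mean, the standard error is $S/\sqrt{n}$, and the limiting normal distribution suggests an approximate 95% interval of the estimate plus or minus about 1.96 SE. The Python sketch below uses made-up measurements; with a sample this small a t-based interval would usually be preferred, so treat the normal interval as an illustration of the idea only.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical measurements, made up for illustration.
x = [5.1, 4.8, 5.6, 5.0, 4.7, 5.3, 5.2, 4.9, 5.4, 5.0]
n = len(x)
mean = statistics.mean(x)
se = statistics.stdev(x) / math.sqrt(n)    # SE of the sample mean, S / sqrt(n)

# Approximate 95% interval from the limiting normal distribution.
z = NormalDist().inv_cdf(0.975)
ci = (mean - z * se, mean + z * se)
print(mean, se, ci)
```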
Hypothesis Tests
- To test the null hypothesis $H_0$ against the alternative hypothesis $H_a$ on the population:
- Given a sample $X_1, X_2, \dots, X_n$ from population $F(x; \theta)$, construct a statistic $\xi$ whose distribution does not depend on $\theta$, but whose value can indicate the possible choice of $H_0$ or $H_a$.
- Traditionally, we first choose a significance level $\alpha$, then find a rejection region $W$ such that $\sup_{H_0} P(\xi \in W) \le \alpha$. We reject $H_0$ when $\xi \in W$.
Errors and P-Values
- Two types of possible errors in a hypothesis test:
  - Type I error: $H_0$ true but rejected. The error rate is at most the significance level $\alpha$.
  - Type II error: $H_0$ false but accepted. The error rate can be as large as $1 - \alpha$.
- To reduce type II error:
  - Construct theoretically “good” tests.
  - Don’t choose $\alpha$ too small.
  - Choose a big enough sample size $n$.
- p-value: the minimum $\alpha$ we can use if we want to reject $H_0$, after the test statistic value is known.
- The smaller the p-value is, the more confidence we have when we reject $H_0$. Reject $H_0$ if and only if the p-value is less than $\alpha$.
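Both error rates can be estimated by simulation. The sketch below is an illustration under simplifying assumptions that are not from the slides: a one-sided z-test of $H_0: \mu \le \mu_0$ with $\sigma$ treated as known, $n = 20$, and a true mean of $\mu_0 + 0.5$ for the alternative. The function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(3)
Z = NormalDist()
alpha, mu0, sigma, n = 0.05, 0.0, 1.0, 20   # assumed settings

def p_value(sample):
    """One-sided p-value for H0: mu <= mu0, with sigma treated as known."""
    z = (fmean(sample) - mu0) / (sigma / math.sqrt(n))
    return 1 - Z.cdf(z)

def rejection_rate(true_mu, reps=4000):
    """Fraction of simulated samples with p-value below alpha."""
    hits = 0
    for _ in range(reps):
        sample = [random.gauss(true_mu, sigma) for _ in range(n)]
        hits += p_value(sample) < alpha
    return hits / reps

type1 = rejection_rate(mu0)             # H0 true: should be near alpha
type2 = 1 - rejection_rate(mu0 + 0.5)   # H0 false: chance of not rejecting
print(type1, type2)
```

Rerunning with a larger `n` lowers `type2` while `type1` stays near `alpha`, which is the sample-size effect listed above.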
Example hypothesis test
- $X \sim N(\mu, \sigma^2)$. The sample is $X_1, \dots, X_n$.
- $H_0: \mu \le \mu_0 \longleftrightarrow H_a: \mu > \mu_0$.
- Test statistic
  $$T = \frac{\bar{X} - \mu_0}{\mathrm{SE}(\bar{X})}$$
  where $\mathrm{SE}(\bar{X}) = S/\sqrt{n}$ and $S$ is the sample standard deviation.
- Rejection area: $W = \{T > \lambda\}$, $\lambda = F^{-1}(1-\alpha; n-1)$, where $F^{-1}(p; n)$ is the quantile function of the $t(n)$ distribution.
- The p-value is $1 - F(T; n-1)$, where $F(x; n)$ is the CDF of the $t(n)$ distribution.
- The larger $T$ is, the smaller the p-value is, and the more confidence we have when we reject $H_0$.
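The quantities in this example can be computed directly with the SAS `QUANTILE` and `CDF` functions. The following is a minimal sketch working from summary statistics; the numbers and the data set name `ttest_by_hand` are hypothetical and serve only to illustrate the formulas.

```sas
/* One-sided, one-sample t test from summary statistics.
   All numbers below are hypothetical, chosen only for illustration. */
data ttest_by_hand;
  n     = 25;      /* sample size (hypothetical) */
  xbar  = 52.3;    /* sample mean (hypothetical) */
  s     = 6.1;     /* sample standard deviation (hypothetical) */
  mu0   = 50;      /* bound in H0: mu <= mu0 (hypothetical) */
  alpha = 0.05;    /* significance level */

  se     = s / sqrt(n);                       /* SE(X-bar) = S / sqrt(n) */
  t      = (xbar - mu0) / se;                 /* test statistic T */
  lambda = quantile('T', 1 - alpha, n - 1);   /* critical value, t(n-1) quantile */
  pvalue = 1 - cdf('T', t, n - 1);            /* p-value 1 - F(T; n-1) */
  reject = (t > lambda);                      /* 1 if T is in the rejection area */
run;

proc print data=ttest_by_hand noobs;
run;
```

With raw data rather than summary statistics, PROC TTEST with the `H0=` and `SIDES=U` options should carry out the same upper-tailed test.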
- Measurement level of variables.
- Descriptive statistics for nominal variables.
- Descriptive statistics for interval variables.
- Histograms, boxplots, QQ plots, probability plots, stem-leaf plots.
- Using PROC FREQ, PROC MEANS, PROC
- A probability plot is the same plot as a QQ plot, except that the x axis is labeled with $\Phi(x_i)$ instead of $x_i$.
- Probability plots are preferable for graphical estimation of percentiles, whereas Q-Q plots are preferable for graphical estimation of distribution parameters.
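Both plots can be requested from PROC UNIVARIATE, which makes the difference in axis labeling easy to see. This is a minimal sketch; the data set `SASHELP.CLASS` and the variable `height` are chosen here only as a convenient example and are not part of the original material.

```sas
/* Normal Q-Q plot and normal probability plot of the same variable.
   SASHELP.CLASS and HEIGHT are used only as a convenient example. */
proc univariate data=sashelp.class noprint;
  var height;
  qqplot   height / normal(mu=est sigma=est);  /* axis in normal quantiles */
  probplot height / normal(mu=est sigma=est);  /* axis in percentiles */
run;
```

The `normal(mu=est sigma=est)` option adds a reference line based on the estimated mean and standard deviation, so the points should fall near the line if the data are roughly normal.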
- The stem-leaf plot is a text-based plot which displays information like the histogram, but with detail on each data value. Each “leaf” corresponds to one data value.
- PROC UNIVARIATE has an option PLOT, which generates a text-based stem-leaf plot, boxplot and normal probability plot.
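A minimal sketch of the PLOT option follows, again using `SASHELP.CLASS` and `height` purely as an assumed example. ODS Graphics is switched off here on the assumption that, when it is enabled, recent SAS releases may replace the text-based plots with graphical panels.

```sas
ods graphics off;   /* ask for the traditional text-based plots */

proc univariate data=sashelp.class plot;
  var height;       /* stem-leaf plot, boxplot and normal probability plot */
run;
```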
- PROC MEANS and PROC SUMMARY are used to produce summary statistics.
- They can display overall summary statistics and classified summary statistics. PROC MEANS displays results by default; PROC SUMMARY needs the PRINT option to display results.
- They can output data sets with the summary statistics. PROC SUMMARY is designed to do this, although PROC MEANS can do the same.
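The three points above can be sketched as follows. The data set `SASHELP.CLASS`, its variables, and the output names `class_stats`, `mean_height`, `mean_weight`, `sd_height` and `sd_weight` are assumptions made for illustration.

```sas
/* PROC MEANS prints by default; CLASS gives statistics per level of SEX. */
proc means data=sashelp.class n mean std min max;
  class sex;
  var height weight;
run;

/* PROC SUMMARY prints only when the PRINT option is given. */
proc summary data=sashelp.class print n mean std;
  class sex;
  var height weight;
run;

/* Writing the statistics to a data set; NWAY keeps only the per-class rows. */
proc summary data=sashelp.class nway;
  class sex;
  var height weight;
  output out=class_stats mean=mean_height mean_weight
                         std=sd_height sd_weight;
run;

proc print data=class_stats noobs;
run;
```

Dropping the CLASS statement from the first step would give the overall statistics instead of the classified ones.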
- The female mean score is 110.8 and the male mean score is 115.6, so males score higher. Is the difference significant?
- For the two-sided test, using the normal approximation for the test statistic, the p-value is 0.5978: there is no significant difference between the GPA scores of female and male students.
- For the one-sided test, we can only test $H_a$: males score higher. The p-value is 0.2989: males are not significantly better than females regarding GPA scores.
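The one-sided p-value is half the two-sided one because the normal approximation is symmetric and the observed difference points in the direction of $H_a$. A minimal sketch of that rule, where the flag `same_direction` and the data set name `one_sided` are hypothetical:

```sas
data one_sided;
  p_two = 0.5978;       /* two-sided p-value quoted above */
  same_direction = 1;   /* 1: observed difference agrees with Ha */
  /* Symmetric test statistic: halve the two-sided p-value when the data
     point toward Ha, otherwise take the complementary tail. */
  if same_direction then p_one = p_two / 2;
  else p_one = 1 - p_two / 2;
  put p_one=;
run;
```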
Paired Comparison
- Comparing two measurements of the same subject, instead of comparing the same measurement of two groups of subjects, is a different problem from the two-sample test. It is called a paired comparison, and we use the paired t-test to solve it.
- Examples: comparing the blood pressure of the same subject before and after treatment with some drug; in a fitness program, comparing the heart rate at entrance and at the end of the program.
Paired T-test
- Let $X$ be the “before” measurement and $Y$ the “after” measurement, both belonging to the same subject. To compare the means of $X$ and $Y$, let $Z = X - Y$ and simply compare $\mu_Z$ with 0.
- Program solution 1: use PROC TTEST with the PAIRED statement. This needs the assumption that $Z$ is normal.
- Program solution 2: first compute $Z$, then do the one-sample test $H_0: \mu_Z = 0 \longleftrightarrow H_a: \mu_Z \ne 0$ using PROC UNIVARIATE. This also gives the signed rank test and the sign test.
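Program solution 2 can be sketched as below. The data set name `paired`, the variable names and the six value pairs are hypothetical and only show the mechanics.

```sas
data paired;
  input x y @@;   /* x = before, y = after (hypothetical values) */
  z = x - y;      /* within-subject difference Z = X - Y */
  datalines;
142 138  150 151  128 121  135 130  160 152  147 149
;

/* MU0=0 sets the null value for the tests of location on Z. */
proc univariate data=paired mu0=0;
  var z;
run;
```

The "Tests for Location" table in the output should list Student's t, the sign test and the signed rank test side by side, which is convenient when the normality of $Z$ is in doubt.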
Example: Paired T-Test Using PROC TTEST
A stimulus is being examined to determine its effect on systolic blood pressure. Twelve men participate in the study. Their systolic blood pressure is measured both before and after the stimulus is applied. Program:
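A minimal sketch of such a program, assuming a data set named `pressure` with variables `SBPbefore` and `SBPafter`. These names and the twelve value pairs are illustrative placeholders, not the study data.

```sas
data pressure;
  input SBPbefore SBPafter @@;   /* illustrative values, not the study data */
  datalines;
120 128  124 131  130 131  118 127  140 132  128 125
140 141  135 137  126 118  130 132  126 129  127 135
;

proc ttest data=pressure;
  paired SBPbefore*SBPafter;   /* tests mean(SBPbefore - SBPafter) = 0 */
run;
```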
One-Way ANOVA
- Two-sample t-test → one-way ANOVA:
  - Response $Y$, group $C$ (factor).
  - Two-sample t-test: compare the mean of $Y$ in the two groups of $C$.
  - One-way ANOVA: compare the mean of $Y$ in more than two groups of $C$.
- Questions:
  - Is there any significant difference among the means of $Y$ in the different groups? Equivalently, does factor $C$ have a significant effect on the mean level of $Y$?
  - Which pairs of groups have a significant mean difference? This is the multiple comparison problem.
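Both questions can be addressed in a single PROC GLM step. The sketch below assumes a data set `oneway` with a character factor `c` and a response `y`; the three groups and their values are hypothetical.

```sas
data oneway;
  input c $ y @@;   /* hypothetical groups and responses */
  datalines;
A 12.1  A 11.4  A 13.0  A 12.6
B 14.2  B 15.1  B 13.8  B 14.9
C 11.0  C 10.2  C 11.7  C 10.9
;

proc glm data=oneway;
  class c;
  model y = c;       /* overall F test: are all group means equal? */
  means c / tukey;   /* pairwise comparisons with Tukey adjustment */
run;
quit;
```

The F test in the ANOVA table answers the first question, and the Tukey grouping from the MEANS statement is one common way to answer the second. Other adjustments could be used instead.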
Example: The Comparison of Veneer Brands
- Compare WEAR of 5 BRANDS. Each brand has 4 samples.
- Use PROC ANOVA or PROC GLM. PROC GLM should be used for unbalanced designs.
- Theoretical prerequisites: Independence, normal
- $\gamma_{i,j}$ is called an interaction effect, with $\sum_i \gamma_{i,j} = 0$ and $\sum_j \gamma_{i,j} = 0$.
- Use PROC ANOVA or PROC GLM. For unbalanced designs, use PROC GLM.
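A two-way analysis with an interaction term might look like the following. This is a sketch on simulated data: the data set name `twoway`, the factor levels, the cell means and the seed are all hypothetical, and the model is assumed to be the usual layout with main effects for A and B plus the interaction $\gamma_{i,j}$.

```sas
data twoway;
  call streaminit(2010);   /* fixed seed so the run is repeatable */
  do a = 1 to 2;
    do b = 1 to 3;
      do rep = 1 to 4;     /* 4 replicates per cell: balanced design */
        /* hypothetical cell mean; the (a = b) bump makes the
           effects of A and B non-additive */
        y = 10 + 2*a - b + 1.5*(a = b) + rand('NORMAL', 0, 1);
        output;
      end;
    end;
  end;
run;

proc glm data=twoway;
  class a b;
  model y = a b a*b;   /* main effects of A and B plus the A*B interaction */
run;
quit;
```

Because the simulated design is balanced, PROC ANOVA with the same CLASS and MODEL statements would also be appropriate here.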
Example of Interactions
- Example with only main effects: $n = m = 2$, $r \ge 2$, $\mu = 10$, $\alpha_1 = 4$, $\alpha_2 = -4$, $\beta_1 = 2$, $\beta_2 = -2$. If the interaction effect does not exist ($\gamma_{i,j} \equiv 0$), then
the interaction increases the expectation when A and B are both 1 or both 2, and decreases the expectation otherwise.
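The cell expectations for these parameter values can be worked out by hand. The calculation assumes the usual two-way model $\mathrm{E}\,Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{i,j}$, which the surviving text implies but does not state. The interaction values in the second table ($\gamma_{1,1} = \gamma_{2,2} = 1$, $\gamma_{1,2} = \gamma_{2,1} = -1$) are hypothetical, chosen only so that they satisfy the sum constraints and match the described pattern.

```latex
% Main effects only (gamma_{i,j} = 0): E Y_{ijk} = mu + alpha_i + beta_j
\[
\begin{array}{c|cc}
      & B = 1          & B = 2          \\ \hline
A = 1 & 10 + 4 + 2 = 16 & 10 + 4 - 2 = 12 \\
A = 2 & 10 - 4 + 2 = 8  & 10 - 4 - 2 = 4
\end{array}
\]

% With the hypothetical interaction gamma_{1,1} = gamma_{2,2} = 1,
% gamma_{1,2} = gamma_{2,1} = -1
\[
\begin{array}{c|cc}
      & B = 1      & B = 2      \\ \hline
A = 1 & 16 + 1 = 17 & 12 - 1 = 11 \\
A = 2 & 8 - 1 = 7   & 4 + 1 = 5
\end{array}
\]
```

Without interaction, moving from $B = 1$ to $B = 2$ lowers the expectation by 4 at both levels of A. With the hypothetical interaction the drop is 6 when $A = 1$ but only 2 when $A = 2$, so the effect of B now depends on A.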
Two-way ANOVA Example
- To study the effects of different production factors on the strength of some rubber product, consider 3 levels of factor A and 4 levels of factor B, a complete experiment with 3 × 4 = 12 combinations, each repeated 2 times, so we have n = 24 experiments.
- Data:
    data rubber;
      do a=1 to 3; do b=1 to 4; do r=1 to 2;
        input stren @@;
        output;