
PubH 7405: REGRESSION ANALYSIS

SLR: INFERENCES, Part II

We cover the topic of inference in two sessions; the first session focused on inferences concerning the slope and the intercept, and this session continues with estimating the mean response, and more. Applications concerning the slope and the intercept are based on the following four (4) theorems.

SAMPLING DISTRIBUTION OF SLOPE

Theorem 1A: Under the "Normal Error Regression Model"

$$Y = \beta_0 + \beta_1 x + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2)$$

the sampling distribution of the estimated slope $b_1$ is Normal with mean and variance:

$$E(b_1) = \beta_1, \qquad \sigma^2(b_1) = \frac{\sigma^2}{\sum_i (x_i - \bar{x})^2}$$

IMPLICATION

$$\frac{b_1 - \beta_1}{\sigma(b_1)} \sim N(0,1), \qquad \frac{(n-2)\,s^2(b_1)}{\sigma^2(b_1)} = \frac{SSE}{\sigma^2} \sim \chi^2 \text{ with } df = n-2$$

$$\frac{b_1 - \beta_1}{s(b_1)} = \frac{b_1 - \beta_1}{\sigma(b_1)} \div \sqrt{\frac{s^2(b_1)}{\sigma^2(b_1)}}$$

Theorem 1B: $\dfrac{b_1 - \beta_1}{s(b_1)}$ is distributed as "t" with $(n-2)$ degrees of freedom.

CONFIDENCE INTERVALS

Theorem 1B: $\dfrac{b_1 - \beta_1}{s(b_1)}$ is distributed as "t" with $(n-2)$ degrees of freedom.

$t(1-\alpha/2;\, n-2)$ is the $100(1-\alpha/2)$ percentile of the "t" distribution with $(n-2)$ degrees of freedom.

The $(1-\alpha)100\%$ Confidence Interval for $\beta_1$ is:

$$b_1 \pm t(1-\alpha/2;\, n-2)\, s(b_1)$$

SAMPLING DISTRIBUTION OF INTERCEPT

Theorem 2A: Under the "Normal Error Regression Model"

$$Y = \beta_0 + \beta_1 x + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2)$$

the sampling distribution of the estimated intercept $b_0$ is Normal with mean and variance:

$$E(b_0) = \beta_0, \qquad \sigma^2(b_0) = \sigma^2 \left[ \frac{1}{n} + \frac{\bar{x}^2}{\sum_i (x_i - \bar{x})^2} \right]$$

IMPLICATION

$$\frac{b_0 - \beta_0}{\sigma(b_0)} \sim N(0,1), \qquad \frac{(n-2)\,s^2(b_0)}{\sigma^2(b_0)} = \frac{SSE}{\sigma^2} \sim \chi^2 \text{ with } df = n-2$$

$$\frac{b_0 - \beta_0}{s(b_0)} = \frac{b_0 - \beta_0}{\sigma(b_0)} \div \sqrt{\frac{s^2(b_0)}{\sigma^2(b_0)}}$$

Theorem 2B: $\dfrac{b_0 - \beta_0}{s(b_0)}$ is distributed as "t" with $(n-2)$ degrees of freedom.

CONFIDENCE INTERVALS

Theorem 2B: $\dfrac{b_0 - \beta_0}{s(b_0)}$ is distributed as "t" with $(n-2)$ degrees of freedom.

$t(1-\alpha/2;\, n-2)$ is the $100(1-\alpha/2)$ percentile of the "t" distribution with $(n-2)$ degrees of freedom.

The $(1-\alpha)100\%$ Confidence Interval for $\beta_0$ is:

$$b_0 \pm t(1-\alpha/2;\, n-2)\, s(b_0)$$
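As a concrete companion to Theorems 1B and 2B, here is a minimal Python sketch (our own illustration, assuming numpy and scipy are available) that computes b0, b1, their standard errors, and both confidence intervals from raw (x, y) data:

```python
import numpy as np
from scipy import stats

def slope_intercept_ci(x, y, alpha=0.05):
    """(1-alpha)100% confidence intervals for the slope b1 and
    intercept b0 of a simple linear regression (Theorems 1B and 2B)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    xbar = x.mean()
    Sxx = np.sum((x - xbar) ** 2)                    # SS of X
    b1 = np.sum((x - xbar) * (y - y.mean())) / Sxx   # estimated slope
    b0 = y.mean() - b1 * xbar                        # estimated intercept
    mse = np.sum((y - b0 - b1 * x) ** 2) / (n - 2)   # estimate of sigma^2
    s_b1 = np.sqrt(mse / Sxx)                        # s(b1)
    s_b0 = np.sqrt(mse * (1 / n + xbar ** 2 / Sxx))  # s(b0)
    t = stats.t.ppf(1 - alpha / 2, df=n - 2)         # t(1-alpha/2; n-2)
    return (b1 - t * s_b1, b1 + t * s_b1), (b0 - t * s_b0, b0 + t * s_b0)
```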

The Mean Response: $E(Y \mid X = x) = \beta_0 + \beta_1 x$

A common objective in regression analysis is to estimate the mean response. For example: (1) we may want to know the average blood pressure for women at a certain age and how to estimate it using the relationship between SBP and Age; and (2) in a study of the relationship between level of pay (salary, X) and worker productivity (Y), the mean productivity at high, medium, and low levels of pay may be of particular interest for any company.

POINT ESTIMATE

The Mean Response: $E(Y \mid X = x) = \beta_0 + \beta_1 x$

Let X = xh denote the level of X for which we wish to estimate the mean response, i.e. E(Y|X=xh); this xh may be a value that occurred in the sample, or it may be some other value of the predictor variable within the scope of the model. The point estimate of the mean response is:

$$\text{Point Estimate:} \quad \hat{Y}_h = \hat{E}(Y \mid X = x_h) = b_0 + b_1 x_h$$

SAMPLING DISTRIBUTION

Theorem 3A: Under the "Normal Error Regression Model"

$$Y = \beta_0 + \beta_1 x + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2)$$

the sampling distribution of the estimated Mean Response $\hat{Y}_h$ is Normal with mean and variance:

$$E(\hat{Y}_h) = E(Y \mid X = x_h) = \beta_0 + \beta_1 x_h, \qquad \sigma^2(\hat{Y}_h) = \sigma^2 \left[ \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_i (x_i - \bar{x})^2} \right]$$

Writing $b_1 = \sum_i k_i y_i$ (with weights $k_i = (x_i - \bar{x}) / \sum_j (x_j - \bar{x})^2$) and $b_0 = \frac{1}{n}\sum_i y_i - b_1 \bar{x}$, we have:

$$\hat{Y}_h = b_0 + b_1 x_h = \sum_i \left[ \frac{1}{n} + k_i (x_h - \bar{x}) \right] y_i$$

The sampling distribution of Ŷh is "normal" because this estimated mean response, like the intercept and the slope, is a linear combination of the observations yi, and the distribution of each observation is normal under the "normal error regression model".
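The linear-combination identity can be verified numerically; a small Python sketch (our own, with made-up numbers, not data from the text):

```python
import numpy as np

# Hypothetical data, used only to check the identity numerically.
x = np.array([2.0, 4.0, 5.0, 7.0, 9.0])
y = np.array([3.1, 4.9, 6.2, 7.8, 10.1])
xh = 6.0

n, xbar = len(x), x.mean()
Sxx = np.sum((x - xbar) ** 2)
k = (x - xbar) / Sxx                            # weights with b1 = sum(k_i*y_i)
b1 = np.sum(k * y)
b0 = y.mean() - b1 * xbar

direct = b0 + b1 * xh                           # Yhat_h computed directly
linear = np.sum((1 / n + k * (xh - xbar)) * y)  # Yhat_h as a linear combination
assert np.isclose(direct, linear)               # the two agree
```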

The estimated mean response is unbiased because the estimated intercept and estimated slope are both unbiased:

$$E(\hat{Y}_h) = E(b_0 + b_1 x_h) = E(b_0) + E(b_1)\,x_h = \beta_0 + \beta_1 x_h = E(Y \mid X = x_h)$$

$$\hat{Y}_h = \sum_i \left[ \frac{1}{n} + k_i (x_h - \bar{x}) \right] y_i$$

$$\operatorname{Var}(\hat{Y}_h) = \sigma^2 \sum_i \left[ \frac{1}{n} + k_i (x_h - \bar{x}) \right]^2 = \sigma^2 \sum_i \left[ \frac{1}{n^2} + \frac{2}{n} k_i (x_h - \bar{x}) + k_i^2 (x_h - \bar{x})^2 \right]$$

Since $\sum_i k_i = 0$ and $\sum_i k_i^2 = 1 / \sum_i (x_i - \bar{x})^2$:

$$\operatorname{Var}(\hat{Y}_h) = \sigma^2 \left[ \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_i (x_i - \bar{x})^2} \right]$$

$$\operatorname{Var}(\hat{Y}_h) = \sigma^2 \left[ \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_i (x_i - \bar{x})^2} \right] \quad\Longrightarrow\quad s^2(\hat{Y}_h) = MSE \left[ \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_i (x_i - \bar{x})^2} \right]$$

Taking square root to get Standard Error

$$SE(\hat{Y}_h) = \sqrt{MSE \left[ \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_i (x_i - \bar{x})^2} \right]}$$

Implication:

Our estimates are less precise toward the ends of the range of X, that is, the farther $x_h$ is from $\bar{x}$.

MORE ON SAMPLING DISTRIBUTION

$$\frac{\hat{Y}_h - E(\hat{Y}_h)}{\sigma(\hat{Y}_h)} \sim N(0,1), \qquad \frac{(n-2)\,s^2(\hat{Y}_h)}{\sigma^2(\hat{Y}_h)} = \frac{SSE}{\sigma^2} \sim \chi^2 \text{ with } df = n-2$$

$$\frac{\hat{Y}_h - E(\hat{Y}_h)}{s(\hat{Y}_h)} = \frac{\hat{Y}_h - E(\hat{Y}_h)}{\sigma(\hat{Y}_h)} \div \sqrt{\frac{s^2(\hat{Y}_h)}{\sigma^2(\hat{Y}_h)}}$$

Theorem 3B: $\dfrac{\hat{Y}_h - E(\hat{Y}_h)}{s(\hat{Y}_h)}$ is distributed as "t" with $(n-2)$ degrees of freedom.

CONFIDENCE INTERVALS

Theorem 3B: $\dfrac{\hat{Y}_h - E(\hat{Y}_h)}{s(\hat{Y}_h)}$ is distributed as "t" with $(n-2)$ degrees of freedom.

$t(1-\alpha/2;\, n-2)$ is the $100(1-\alpha/2)$ percentile of the "t" distribution with $(n-2)$ degrees of freedom.

The $(1-\alpha)100\%$ Confidence Interval for the Mean Response $E(Y \mid X = x_h)$ is:

$$\hat{Y}_h \pm t(1-\alpha/2;\, n-2)\, s(\hat{Y}_h)$$

x (oz)   y (%)
112       63
111       66
107       72
119       52
 92       75
 80      118
 81      120
 84      114
118       42
106       72
103       90
 94       91

EXAMPLE #1: Birth weight data: Intercept = 256.972; Slope = -1.737; MSE = 75.982; Mean of X = 100.58; SS of X = 2,156.913. For children with birth weight xh = 95 ounces, the point estimate and 95% Confidence Interval for the mean growth between 70 and 100 days (as % of birth weight) are:

$$\hat{Y}_h = 256.972 + (-1.737)(95) = 91.757\%$$

$$s^2(\hat{Y}_h) = (75.982)\left[ \frac{1}{12} + \frac{(95 - 100.58)^2}{2{,}156.913} \right] = 7.429$$

$$91.76 \pm 2.228\sqrt{7.429} = (85.69\%,\ 97.83\%)$$

where $2.228 = t(0.975;\, 10)$.
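The Example #1 numbers can be reproduced directly from the data table above; a minimal Python sketch (assuming numpy and scipy are available):

```python
import numpy as np
from scipy import stats

# Birth weight data from Example #1.
x = np.array([112, 111, 107, 119, 92, 80, 81, 84, 118, 106, 103, 94], float)
y = np.array([63, 66, 72, 52, 75, 118, 120, 114, 42, 72, 90, 91], float)
xh, n = 95.0, len(x)

xbar = x.mean()                                  # 100.58
Sxx = np.sum((x - xbar) ** 2)                    # 2,156.913
b1 = np.sum((x - xbar) * (y - y.mean())) / Sxx   # -1.737
b0 = y.mean() - b1 * xbar                        # 256.972
mse = np.sum((y - b0 - b1 * x) ** 2) / (n - 2)   # 75.982

yhat = b0 + b1 * xh                              # 91.757
s2 = mse * (1 / n + (xh - xbar) ** 2 / Sxx)      # 7.429
t = stats.t.ppf(0.975, df=n - 2)                 # 2.228
print(yhat - t * np.sqrt(s2), yhat + t * np.sqrt(s2))  # ~ (85.69, 97.83)
```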

EXAMPLE #2: Age and SBP

Age (x)   SBP (y)
42        130
46        115
42        148
71        100
80        156
74        162
70        151
80        156
85        162
72        158
64        155
81        160
41        125
61        150
75        165

Intercept = 99.958; Slope = 0.705; MSE = 278.554; Mean of X = 65.6; SS of X = 3,403.6. For women aged xh = 60 years, the point estimate and 95% Confidence Interval for the mean SBP are:

$$\hat{Y}_h = 99.958 + (0.705)(60) = 142.26$$

$$s^2(\hat{Y}_h) = (278.554)\left[ \frac{1}{15} + \frac{(60 - 65.6)^2}{3{,}403.6} \right] = 21.137$$

$$142.3 \pm 2.160\sqrt{21.137} = (132.4,\ 152.2)$$

where $2.160 = t(0.975;\, 13)$.

EXAMPLE #3: Toluca Company Data

LotSize   WorkHours
 80       399
 30       121
 50       221
 90       376
 70       361
 60       224
120       546
 80       352
100       353
 50       157
 40       160
 70       252
 90       389
 20       113
110       435
100       420
 30       212
 50       268
 90       377
110       421
 30       273
 90       468
 40       244
 80       342
 70       323

Intercept = 62.366; Slope = 3.570; MSE = 2,384; Mean of X = 70.0; SS of X = 19,800. For lots of size xh = 65 units, the point estimate and 90% Confidence Interval for the mean work hours are:

$$\hat{Y}_h = 62.37 + (3.57)(65) = 294.4$$

$$s^2(\hat{Y}_h) = (2{,}384)\left[ \frac{1}{25} + \frac{(65 - 70.0)^2}{19{,}800} \right] = 98.47$$

$$294.4 \pm 1.714\sqrt{98.47} = (277.4,\ 311.4)$$

where $1.714 = t(0.95;\, 23)$.

In regression analysis, besides estimating the mean response, sometimes one may want to estimate a new individual response. For example: (1) in addition to estimating the average blood pressure for women at a certain age using the relationship between SBP and Age, we may be interested in estimating the SBP of a particular woman/patient at that age; and (2) in a study of the relationship between pay (salary, X) and worker productivity (Y), the interest may focus on the productivity of a particular worker.

POINT ESTIMATE Let X = xh denote the level of X under investigation, at which the mean response is E(Y|X=xh). Let Yh(new) be the value of the new individual response of interest. This new observation of Y to be predicted is often viewed as the result of a new trial, independent of the trials on which the regression line is based. The point estimate is the same as that of the mean response:

$$E(Y \mid X = x_h) = \beta_0 + \beta_1 x_h$$

$$\hat{Y}_{h(new)} = \hat{Y}_h = b_0 + b_1 x_h$$

Same as the mean response.

VARIANCE The point estimates of the mean response and of an individual response are the same, but the variances are different. In estimating an individual response, there are two layers of variation: (a) variation in the "position of the distribution" (that is, of the mean response), and (b) variation within that distribution (that is, from the individual response to the mean response).

Theorem 4A: Under the "Normal Error Regression Model", the sampling distribution of $\hat{Y}_{h(new)}$ is normal, with variance:

$$\operatorname{Var}(\hat{Y}_{h(new)}) = \sigma^2 + \operatorname{Var}(\hat{Y}_h) = \sigma^2 \left[ 1 + \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_i (x_i - \bar{x})^2} \right]$$

$$s^2(\hat{Y}_{h(new)}) = MSE \left[ 1 + \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_i (x_i - \bar{x})^2} \right]$$

Taking square root to get Standard Error

MORE ON SAMPLING DISTRIBUTION

Theorem 4B: $\dfrac{Y_{h(new)} - \hat{Y}_h}{s(\hat{Y}_{h(new)})}$ is distributed as "t" with $(n-2)$ degrees of freedom.

Inferences on a new individual response are based on the following results:

$$SE(\hat{Y}_{h(new)}) = \sqrt{MSE \left[ 1 + \frac{1}{n} + \frac{(x_h - \bar{x})^2}{\sum_i (x_i - \bar{x})^2} \right]}$$

Again:

Our estimates are less precise toward the ends
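To make the contrast with the mean-response interval concrete, here is a minimal Python sketch (our own, assuming scipy; the summary statistics are taken from the Toluca Company example above) computing both the 90% confidence interval for the mean response and the 90% prediction interval for a new response at xh = 65:

```python
import numpy as np
from scipy import stats

# Summary statistics from the Toluca Company example.
b0, b1, mse = 62.366, 3.570, 2384.0
n, xbar, Sxx = 25, 70.0, 19800.0
xh, alpha = 65.0, 0.10

yhat = b0 + b1 * xh                            # point estimate, ~294.4
t = stats.t.ppf(1 - alpha / 2, df=n - 2)       # t(0.95; 23) = 1.714
lev = 1 / n + (xh - xbar) ** 2 / Sxx

se_mean = np.sqrt(mse * lev)                   # SE of the mean response
se_new = np.sqrt(mse * (1 + lev))              # SE of a new response: extra "1 +"
print(yhat - t * se_mean, yhat + t * se_mean)  # ~ (277.4, 311.4)
print(yhat - t * se_new, yhat + t * se_new)    # wider prediction interval
```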

Theorem 5: Under the Normal Error Regression Model

$$Y = \beta_0 + \beta_1 x + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2)$$

the residuals $\{e_i\}$ are a sample with mean zero and $\hat{\sigma}^2 = MSE$:

$$\frac{SSE}{\sigma^2} \text{ is distributed as } \chi^2 \text{ with } df = n-2, \qquad E(MSE) = \sigma^2$$

THE TEST FOR INDEPENDENCE

The Mean Response: $E(Y \mid X = x) = \beta_0 + \beta_1 x$; under $H_0$: $\beta_1 = 0$, the mean response does not depend on $x$.

$H_0$: $\beta_1 = 0$; "t" test at $(n-2)$ degrees of freedom:

$$t = \frac{b_1}{s(b_1)}$$

which is identical to the test using "r":

$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$

The method we use most often is this "Test for Independence", which we now approach in a different way: ANOVA.

COMPONENTS OF VARIATION
• The variation in Y is conventionally measured in terms of the deviations $(Y_i - \bar{Y})$; the total variation, denoted by SST, is the sum of squared deviations: $SST = \sum (Y_i - \bar{Y})^2$. For example, SST = 0 when all observations are the same; SST is the numerator of the sample variance of Y: the greater the SST, the greater the variation among the Y-values.

• In regression analysis, the variation in Y is decomposed into two components: $(Y_i - \bar{Y}) = (Y_i - \hat{Y}_i) + (\hat{Y}_i - \bar{Y})$

DECOMPOSITION OF SST
• In the decomposition: $(Y_i - \bar{Y}) = (Y_i - \hat{Y}_i) + (\hat{Y}_i - \bar{Y})$
• The first term (RHS) reflects the variation around the regression line, the part that cannot be explained by the regression itself, with the sum of squared errors $SSE = \sum (Y_i - \hat{Y}_i)^2$.
• The difference between the above two sums of squares, $SSR = SST - SSE = \sum (\hat{Y}_i - \bar{Y})^2$, is called the regression sum of squares; SSR may be considered a measure of the variation in Y associated with or explained by the regression line.

Regression helps to "improve" the estimate of Y: from $\bar{Y}$ (without any information) to $\hat{Y}$ (with information provided by knowing X).

$$SST = \sum (Y_i - \bar{Y})^2 = \sum [(Y_i - \hat{Y}_i) + (\hat{Y}_i - \bar{Y})]^2$$

$$= \sum (Y_i - \hat{Y}_i)^2 + \sum (\hat{Y}_i - \bar{Y})^2 + 2\sum (Y_i - \hat{Y}_i)(\hat{Y}_i - \bar{Y}) = SSE + SSR + 2\sum e_i (\hat{Y}_i - \bar{Y})$$

The cross-product term vanishes because $\sum e_i = 0$ and $\sum e_i \hat{Y}_i = 0$, so:

$$SST = SSE + SSR$$
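The decomposition is easy to check numerically; a Python sketch (our own, with made-up numbers):

```python
import numpy as np

# Hypothetical data, used only to verify SST = SSE + SSR.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.7, 10.3, 11.8])

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
yhat = b0 + b1 * x

sst = np.sum((y - y.mean()) ** 2)     # total variation
sse = np.sum((y - yhat) ** 2)         # unexplained by the regression
ssr = np.sum((yhat - y.mean()) ** 2)  # explained by the regression
assert np.isclose(sst, sse + ssr)     # the cross-product term vanishes
print(ssr / sst)                      # r^2, the coefficient of determination
```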

ANALYSIS OF VARIANCE
• SST measures the "total variation" in the sample (of values of the dependent variable) with (n-1) degrees of freedom, where n is the sample size. It is decomposed into: SST = SSE + SSR.
• (1) SSE measures the variation that cannot be explained by the regression, with (n-2) degrees of freedom, and
• (2) SSR measures the variation in Y associated with or explained by the regression line, with 1 degree of freedom (representing the slope).

$$SSR = \sum (\hat{Y}_i - \bar{Y})^2 = \sum [(b_0 + b_1 x_i) - (b_0 + b_1 \bar{x})]^2 = b_1^2 \sum (x_i - \bar{x})^2$$

since $\hat{Y}_i = b_0 + b_1 x_i$ and $\bar{Y} = b_0 + b_1 \bar{x}$. Using $E(b_1^2) = \operatorname{Var}(b_1) + \{E(b_1)\}^2$:

$$E(MSR) = E(SSR) = \sigma^2 + \beta_1^2 \sum (x_i - \bar{x})^2$$

$$\operatorname{Var}(X) = E(X^2) - \{E(X)\}^2 \quad\Longleftrightarrow\quad E(X^2) = \operatorname{Var}(X) + \{E(X)\}^2$$

"ANOVA" TABLE
• The breakdowns of the total sum of squares and its associated degrees of freedom are displayed in the form of an "analysis of variance table" (ANOVA table) for regression analysis as follows:

Source of Variation   SS    df    MS    F Statistic   p-value
Regression            SSR   1     MSR   MSR/MSE
Error                 SSE   n-2   MSE
Total                 SST   n-1

• Recall: MSE, the “error mean square”, serves as an estimate of the constant variance σ2 as stipulated by the regression model.

$$E(MSE) = \sigma^2, \qquad E(MSR) = \sigma^2 + \beta_1^2 \sum (x_i - \bar{x})^2$$

Under the Null Hypothesis H0: β1 = 0, E(MSE) = E(MSR) so that F=MSR/MSE is expected to be near 1.0

Theorem 6: F is distributed, under H0, as F(1,n-2) following a theorem by Cochran.

THE F-TEST

The test statistic F for the above analysis of variance approach compares MSR and MSE; a value near 1 supports the null hypothesis of independence. In fact, we have F = t², where t is the test statistic for testing whether or not β1 = 0; the F-test is equivalent to the two-sided t-test when referred to the F-table in Appendix B (Table B.4) with (1, n-2) degrees of freedom.
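The F = t² equivalence is easy to see numerically; a minimal Python sketch (our own, assuming scipy, with made-up data):

```python
import numpy as np
from scipy import stats

# Hypothetical data for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.3, 2.9, 4.1, 4.8, 6.2, 6.8, 8.1, 8.7])
n = len(x)

Sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
yhat = b0 + b1 * x

ssr = np.sum((yhat - y.mean()) ** 2)   # 1 df
sse = np.sum((y - yhat) ** 2)          # n-2 df
msr, mse = ssr, sse / (n - 2)

F = msr / mse
t = b1 / np.sqrt(mse / Sxx)            # t statistic for H0: beta1 = 0
assert np.isclose(F, t ** 2)           # F-test = two-sided t-test
print(F, stats.f.sf(F, 1, n - 2))      # F and its p-value from F(1, n-2)
```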

THE TEST FOR INDEPENDENCE

The Null Hypothesis: $H_0$: $\beta_1 = 0$. Two identical choices:

(1) "t" test at $(n-2)$ degrees of freedom:

$$t = \frac{b_1}{s(b_1)}$$

which is identical to the test using "r":

$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$

(2) "F" test at $(1, n-2)$ degrees of freedom:

$$F = \frac{MSR}{MSE}$$

COEFFICIENT OF DETERMINATION
• We can express the coefficient of determination (the square of the coefficient of correlation r) as:

$$r^2 = \frac{SSR}{SST}$$

• That is the portion of total variation attributable to regression; regression helps to "improve" the estimate of Y from $\bar{Y}$ (without any information) to $\hat{Y}$ (with information provided by knowing X), reducing the total variation by $(100)(r^2)\%$.

EXAMPLE #1: Birth Weight Data

SUMMARY OUTPUT

Regression Statistics
R Square       0.89546
Observations   12

ANOVA
             df    SS      MS      F       Significance F
Regression    1    6508    6508    85.66   3.21622E-06
Residual     10    759.8   75.98
Total        11    7268

x (oz)   y (%)
112       63
111       66
107       72
119       52
 92       75
 80      118
 81      120
 84      114
118       42
106       72
103       90
 94       91

EXAMPLE #2: AGE & SBP

Age (x)   SBP (y)
42        130
46        115
42        148
71        100
80        156
74        162
70        151
80        156
85        162
72        158
64        155
81        160
41        125
61        150
75        165

SUMMARY OUTPUT

Regression Statistics
R Square       0.3183
Observations   15

ANOVA
             df    SS     MS     F       Significance F
Regression    1    1691   1691   6.071   0.028453563
Residual     13    3621   278.6
Total        14    5312

EXAMPLE #3: Toluca Company Data

LotSize   WorkHours
 80       399
 30       121
 50       221
 90       376
 70       361
 60       224
120       546
 80       352
100       353
 50       157
 40       160
 70       252
 90       389
 20       113
110       435
100       420
 30       212
 50       268
 90       377
110       421
 30       273
 90       468
 40       244
 80       342
 70       323


Normal Error Regression Model:

$$Y = \beta_0 + \beta_1 x + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2)$$

The normal regression model assumes that the X values are known constants; we do not impose any distribution on the x-values.

In many cases, this is not true; for example, if we study the relationship between "height of a person" and "weight of a person", a sample of persons is taken but both measurements are random. Rather than a regression model, one should consider a "correlation model"; the most widely used is the "Bivariate Normal Distribution" with density:

$$f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)} \left[ \left(\frac{x-\mu_x}{\sigma_x}\right)^2 - 2\rho\left(\frac{x-\mu_x}{\sigma_x}\right)\left(\frac{y-\mu_y}{\sigma_y}\right) + \left(\frac{y-\mu_y}{\sigma_y}\right)^2 \right] \right\}$$

$$\sigma_{xy} = \operatorname{Cov}(X, Y) = E[(X-\mu_x)(Y-\mu_y)], \qquad \rho = \frac{\sigma_{xy}}{\sigma_x\sigma_y}$$

σxy is the Covariance and ρ is the Coefficient of Correlation between the two random variables X and Y; ρ is estimated by the (sample) Coefficient of Correlation r.

CORRELATION MODEL "Correlation Data" are often cross-sectional or observational. Instead of a regression model, one should consider a "correlation model"; the most widely used is the "Bivariate Normal Distribution" with the density shown above.

The Coefficient of Correlation ρ between the two random variables X and Y is estimated by the (sample) Coefficient of Correlation r, but the sampling distribution of r is far from normal. Confidence intervals for ρ are obtained by first applying "Fisher's z transformation"; the distribution of z is approximately normal if the sample size is not too small.
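A minimal Python sketch of this route to a confidence interval for ρ (our own, assuming numpy and scipy; z = arctanh(r) with approximate standard error 1/sqrt(n-3) is the usual form of Fisher's transformation):

```python
import numpy as np
from scipy import stats

def rho_ci(r, n, alpha=0.05):
    """Approximate (1-alpha)100% CI for rho via Fisher's z transformation."""
    z = np.arctanh(r)                  # z = 0.5 * ln((1 + r) / (1 - r))
    se = 1.0 / np.sqrt(n - 3)          # approximate standard error of z
    zc = stats.norm.ppf(1 - alpha / 2)
    return np.tanh(z - zc * se), np.tanh(z + zc * se)  # back to rho scale

# Example #2 above had r^2 = 0.3183 with n = 15, so r is about 0.564:
print(rho_ci(0.564, 15))
```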

CONDITIONAL DISTRIBUTION

Theorem: For the Bivariate Normal Distribution given above, the conditional distribution of Y for any given X = x is normal, with mean $\beta_0 + \beta_1 x$ and standard deviation $\sigma_{y|x}$, where:

$$\beta_1 = \rho\,\frac{\sigma_y}{\sigma_x}, \qquad \beta_0 = \mu_y - \beta_1\mu_x = \mu_y - \rho\,\frac{\sigma_y}{\sigma_x}\,\mu_x, \qquad \sigma^2_{y|x} = \sigma^2_y\,(1 - \rho^2)$$

Again, since Var(Y|X) = (1 - ρ²)Var(Y), ρ is both a measure of linear association and a measure of "variance reduction" (in Y, associated with knowledge of X); that is why we call r², an estimate of ρ², the "coefficient of determination".
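The variance-reduction reading can be checked by simulation; a Python sketch (our own, numpy only, with arbitrary parameter values):

```python
import numpy as np

rng = np.random.default_rng(7)
rho, sig_x, sig_y = 0.8, 2.0, 5.0   # arbitrary illustrative values
n = 200_000

# Draw (X, Y) from a bivariate normal with correlation rho.
x = rng.normal(0.0, sig_x, n)
noise = rng.normal(0.0, sig_y * np.sqrt(1 - rho ** 2), n)
y = rho * (sig_y / sig_x) * x + noise

# Var(Y | X) should equal (1 - rho^2) * Var(Y).
resid_var = np.var(y - rho * (sig_y / sig_x) * x)
print(resid_var, (1 - rho ** 2) * np.var(y))  # the two agree up to noise
```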

Readings & Exercises
• Readings: A thorough reading of the text's sections 2.4-2.5 (pp. 52-61), 2.7 (pp. 63-71), and 2.11 (pp. 78-82) is highly recommended.
• Exercises: The following exercises are good for practice, all from chapter 2 of the text: 2.13, 2.23, 2.24, 2.28, and 2.29.

Due As Homework
#9.1 Refer to dataset "Cigarettes", Y = Cotinine & X = CPD:
a) Obtain the 95% confidence interval for the mean Cotinine level for subjects who consumed X = 30 cigarettes per day and give your interpretation.
b) Obtain the 95% confidence interval for the Cotinine level of a subject who consumed 30 cigarettes per day; why is the result different from (a)?
c) Plot the residuals against X. What would be your conclusion about their possible linear relationship? What would be the average residual?
d) Set up the ANOVA table and test whether or not a linear association exists between Cotinine and CPD.

#9.2 Answer the four questions of Exercise 9.1 using dataset "Vital Capacity" with X = Age and Y = (100)(Vital Capacity); use X = 35 years for questions (a) and (b).
