Creating Graphs on Saturn

Post on 31-Jan-2016


GOPTIONS DEVICE=png HTITLE=2 HTEXT=1.5 GSFMODE=replace;
PROC REG DATA=agebp;
  MODEL sbp = age;
  PLOT sbp*age;
RUN;

This will create the file sasgraph.png.

1. Transfer the file to your PC (binary mode)
2. Open Word
3. Choose Insert picture from file

/* LP (LINEPRINTER) requests a line-printer (text) plot instead of a graphics file */
PROC REG DATA=agebp LP;
  MODEL sbp = age;
  PLOT sbp*age;
RUN;

Multiple Linear Regression

• More than 1 independent variable

– See how combinations of several variables are associated with and can predict the dependent variable. How much of the total variability can be explained?

– Control for confounding (interested in the effect of one variable but want to “adjust” for another variable)

– Explore interactions

PROC REG DATA=datasetname;
  MODEL depvar = x1;
  MODEL depvar = x1 x2;
  MODEL depvar = x1 x2 x3;
RUN;

Questions Explored Using Multiple Regression

• How much of the variation in test scores among school districts can be explained by several district characteristics?

• Is calcium intake related to BP independent of age?

• Is the relationship between age and BP the same for men and women?

Reminder

• Y variable is continuous and is normally distributed for each combination of X’s with the same variability

• X variables can be continuous or indicator variables and do not need to be normally distributed

2 Factors

1. $Y = \beta_0 + \beta_1 X_1$

2. $Y = \beta_0 + \beta_2 X_2$

3. $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$

• Do you get the same slope in models 1 and 3?

Control for confounding

Both SLR models (one for each cohort) are significant, but the overall model is not significant (negative confounding).

Multiple Regression Equation

The equation that describes how the mean value of y is related to x1, x2, . . . , xp:

$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$

$\beta_0$ = mean of y when all x variables are equal to 0

$\beta_i$ = change in mean y corresponding to a 1-unit change in $x_i$, holding all other predictors fixed

Implied: the impact of x1 is the same at every value of x2, x3, … xp

Multiple Regression Model

The equation that describes how the dependent variable y is related to the independent variables x1, x2, . . . , xp and an error term is called the multiple regression model.

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon$

$\varepsilon$ reflects how individuals deviate from others with the same values of the x’s

Estimated Multiple Regression Equation

The estimated multiple regression equation is:

$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$

$b_i$ estimates $\beta_i$

$\hat{y}$ is the estimated (or predicted) value for a set of x’s

Estimation

Least Squares Criterion

$\min \sum_i (y_i - \hat{y}_i)^2$

Find the best multidimensional plane

Computation of Coefficient Values

The formulas for the regression coefficients b0, b1, b2, . . . , bp involve the use of matrix algebra. We will use SAS to perform the calculations.
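For reference, the matrix algebra behind the least squares criterion can be written compactly. This is the standard textbook form rather than anything shown on the slides; it assumes X is the n × (p + 1) matrix whose first column is all ones and whose remaining columns hold x1, . . . , xp, and that $X^{\top}X$ is invertible.

```latex
\mathbf{b} =
\begin{pmatrix} b_0 \\ b_1 \\ \vdots \\ b_p \end{pmatrix}
= (X^{\top} X)^{-1} X^{\top} \mathbf{y},
\qquad
\hat{\mathbf{y}} = X \mathbf{b}
```

PROC REG carries out this calculation, so the coefficients never need to be computed by hand.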

Testing for Significance: Global Test

Hypotheses

$H_0: \beta_1 = \beta_2 = \dots = \beta_p = 0$

$H_a$: One or more of the parameters is not equal to zero.

Test Statistic

$F = \mathrm{MSR} / \mathrm{MSE}$

Rejection Rule

Reject $H_0$ if $F > F_\alpha$

where $F_\alpha$ is based on an F distribution with p d.f. in the numerator and n - p - 1 d.f. in the denominator.
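As a rough illustration of the rejection rule, the critical value $F_\alpha$ can be obtained from the SAS FINV function. The values alpha = 0.05, p = 2 and n = 30 are assumptions chosen for illustration, and the variable names are hypothetical.

```sas
/* Sketch: critical value for the global F test.             */
/* alpha, p and n are illustrative assumptions, not from SAS */
/* output.                                                    */
DATA _NULL_;
  alpha = 0.05;                 /* assumed significance level */
  p = 2;                        /* number of predictors       */
  n = 30;                       /* number of observations     */
  f_crit = FINV(1 - alpha, p, n - p - 1);  /* upper-alpha quantile */
  PUT f_crit=;                  /* written to the SAS log     */
RUN;
```

The global null hypothesis is rejected when the F Value printed by PROC REG exceeds f_crit. Equivalently, reject when the printed Pr > F is below alpha.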

Testing for Significance: Individual β’s

Hypotheses

$H_0: \beta_i = 0$

$H_a: \beta_i \neq 0$

Test Statistic

$t = \dfrac{b_i}{s_{b_i}}$

Rejection Rule

Reject $H_0$ for small or large $t$

Meaning: Is $X_i$ related to Y after taking into account all other variables in the model?

Possibilities

X1 is related to Y alone but after adjusting for X2, then X1 is no longer related to Y

X1 is not related to Y alone but after adjusting for X2, then X1 is related to Y

Relation of X1 with Y gets stronger after adjusting for X2

Relation of X1 with Y gets weaker after adjusting for X2

Pulmonary Function Example

• Dependent Variable: Forced Expired Volume (FEV1.0)

• Independent Variables:

– Age of person

– Smoking status of person

• Questions:

– Is age related to FEV independent of smoking status?

– Is smoking status related to FEV independent of age?

– How much of the variability in FEV is explained by age and smoking combined?

Model for FEV Example

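$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$

X1 = smoking status (1=smoker, 0=nonsmoker)

X2 = age

Smokers: $\mathrm{FEV} = \beta_0 + \beta_1 + \beta_2\,\mathrm{age}$

Non-smokers: $\mathrm{FEV} = \beta_0 + \beta_2\,\mathrm{age}$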

Interpretation of Parameters

Smokers: $\mathrm{FEV} = \beta_0 + \beta_1 + \beta_2\,\mathrm{age}$

Non-smokers: $\mathrm{FEV} = \beta_0 + \beta_2\,\mathrm{age}$

$\beta_1$ is the effect of smoking for fixed levels of age

$\beta_2$ is the effect of age pooled over smokers and non-smokers

This model assumes the relation of age to FEV is the same for smokers and non-smokers
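A short worked step may make the interpretation of the smoking coefficient concrete. Subtracting the non-smoker equation from the smoker equation at the same age leaves only the smoking coefficient, which is why the model implies two parallel lines:

```latex
\begin{aligned}
\text{Smokers:}\quad     & \mathrm{FEV} = \beta_0 + \beta_1 + \beta_2\,\mathrm{age} \\
\text{Non-smokers:}\quad & \mathrm{FEV} = \beta_0 + \beta_2\,\mathrm{age} \\
\text{Difference:}\quad  & (\beta_0 + \beta_1 + \beta_2\,\mathrm{age}) - (\beta_0 + \beta_2\,\mathrm{age}) = \beta_1
\end{aligned}
```

The difference does not depend on age, which is exactly the equal-slopes assumption stated above.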

DATA fev;
  INFILE DATALINES;
  INPUT age smk fev;
  DATALINES;
28 1 4.0
30 1 3.9
30 1 3.7
31 1 3.6
54 0 2.9

More data

PROC MEANS;
  VAR fev;
  CLASS smk;
RUN;

The MEANS Procedure

Analysis Variable : fev

smk   N Obs    N        Mean      Std Dev      Minimum      Maximum

0        15   15   3.6000000    0.4208834    2.9000000    4.3000000

1        15   15   3.2933333    0.5257195    2.2000000    4.0000000

PROC CORR DATA=fev;
RUN;

Pearson Correlation Coefficients, N = 30
Prob > |r| under H0: Rho=0

            age         smk         fev

age     1.00000    -0.12788    -0.73024
                     0.5007      <.0001

smk    -0.12788     1.00000    -0.31620
         0.5007                  0.0887

fev    -0.73024    -0.31620     1.00000
         <.0001      0.0887

PROC REG;
  MODEL fev = age smk;
RUN;

Dependent Variable: fev

Analysis of Variance

                              Sum of        Mean
Source                DF     Squares      Square    F Value    Pr > F

Model (SSR)            2     4.96510     2.48255      32.08    <.0001
Error (SSE)           27     2.08957     0.07739
Corrected Total (SST) 29     7.05467

Root MSE          0.27819    R-Square    0.7038
Dependent Mean    3.44667
Coeff Var         8.07136

F Value: tests $H_0: \beta_1 = 0;\ \beta_2 = 0$

R-Square: proportion of variance explained by both variables
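The entries in the analysis of variance table are tied together by simple arithmetic, and a small DATA step can confirm this. The sketch below types in the sums of squares from the listing above; the variable names are hypothetical and the step only re-derives quantities that PROC REG already printed.

```sas
/* Sketch: re-derive F, R-Square and Root MSE from the ANOVA table. */
/* Input numbers are copied from the PROC REG listing.              */
DATA _NULL_;
  ssr = 4.96510;                 /* model sum of squares            */
  sse = 2.08957;                 /* error sum of squares            */
  sst = 7.05467;                 /* corrected total sum of squares  */
  p = 2;                         /* predictors: age, smk            */
  n = 30;                        /* observations                    */
  msr = ssr / p;                 /* mean square for the model       */
  mse = sse / (n - p - 1);       /* mean square error, 27 d.f.      */
  f = msr / mse;                 /* global F statistic              */
  rsq = ssr / sst;               /* proportion of variance explained */
  root_mse = SQRT(mse);
  pval = 1 - PROBF(f, p, n - p - 1);  /* upper-tail F probability   */
  PUT f= rsq= root_mse= pval=;
RUN;
```

Up to rounding, f, rsq and root_mse should agree with the printed F Value (32.08), R-Square (0.7038) and Root MSE (0.27819).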

PROC REG;
  MODEL fev = age smk;
  MODEL fev = age;
  MODEL fev = smk;
RUN;

Parameter Estimates

Model: fev = age smk (R2 = .7038)

                   Parameter    Standard
Variable     DF     Estimate       Error    t Value    Pr > |t|

Intercept     1      5.58114     0.27653      20.18      <.0001
age           1     -0.04702     0.00634      -7.42      <.0001
smk           1     -0.40384     0.10242      -3.94      0.0005

Model: fev = age (R2 = .5333)

Intercept     1      5.24787     0.32456      16.17      <.0001
age           1     -0.04382     0.00775      -5.66      <.0001

Model: fev = smk (R2 = .1000)

Intercept     1      3.60000     0.12295      29.28      <.0001
smk           1     -0.30667     0.17388      -1.76      0.0887
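Each t Value in the parameter estimates is the estimate divided by its standard error, and the two-sided p-value comes from a t distribution with n - p - 1 = 27 d.f. for the two-predictor model. The following sketch checks this for age and smk using the printed estimates; the variable names are hypothetical.

```sas
/* Sketch: re-derive t statistics and two-sided p-values for the */
/* model fev = age smk. Estimates and standard errors are copied */
/* from the PROC REG listing.                                    */
DATA _NULL_;
  df = 27;                                   /* error d.f.: 30 - 2 - 1 */
  b_age = -0.04702;  se_age = 0.00634;
  b_smk = -0.40384;  se_smk = 0.10242;
  t_age = b_age / se_age;                    /* estimate / std error   */
  t_smk = b_smk / se_smk;
  p_age = 2 * (1 - PROBT(ABS(t_age), df));   /* two-sided p-value      */
  p_smk = 2 * (1 - PROBT(ABS(t_smk), df));
  PUT t_age= t_smk= p_age= p_smk=;
RUN;
```

The results should match the printed t Values (-7.42 and -3.94) up to rounding. Note also that the smoking coefficient moves from -0.30667 unadjusted to -0.40384 after adjusting for age, an instance of the relation getting stronger after adjustment.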

PROC REG;
  MODEL fev = age smk;
PROC REG;
  MODEL fev = age;
  WHERE smk = 0;
PROC REG;
  MODEL fev = age;
  WHERE smk = 1;
RUN;

All subjects (fev = age smk)

                   Parameter    Standard
Variable     DF     Estimate       Error    t Value    Pr > |t|

Intercept     1      5.58114     0.27653      20.18      <.0001
age           1     -0.04702     0.00634      -7.42      <.0001
smk           1     -0.40384     0.10242      -3.94      0.0005

Non-smokers (smk = 0)

                   Parameter    Standard
Variable     DF     Estimate       Error    t Value    Pr > |t|

Intercept     1      5.24764     0.38050      13.79      <.0001
age           1     -0.03911     0.00887      -4.41      0.0007

Smokers (smk = 1)

Intercept     1      5.50002     0.36163      15.21      <.0001
age           1     -0.05508     0.00885      -6.22      <.0001
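The separate fits give different age slopes for non-smokers (-0.03911) and smokers (-0.05508), which bears on the equal-slopes assumption of the combined model. One common way to explore this in a single model is to add a product term. The sketch below is an assumption about how that might be set up, not something shown on the slides; the data set name fev2 and the variable agesmk are hypothetical.

```sas
/* Sketch: allow the age slope to differ by smoking status.    */
/* fev2 and agesmk are hypothetical names used for illustration. */
DATA fev2;
  SET fev;
  agesmk = age * smk;       /* product (interaction) term      */
RUN;

PROC REG DATA=fev2;
  MODEL fev = age smk agesmk;
RUN;
QUIT;
```

Because this model fits a separate line for each group, its coefficients should reproduce the two stratified fits: the age coefficient is the non-smoker slope, and the agesmk coefficient is the difference in slopes, -0.05508 - (-0.03911) = -0.01597. The t test on agesmk then asks whether that difference is larger than chance would explain.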
