SAS Macro Coding for Jackknife Repeated Replication

Jan 04, 2016

Page 1: SAS Macro Coding for Jackknife Repeated Replication

SI Workshop: July 15, 2005 1

SAS Macro Coding for Jackknife Repeated Replication

• Jackknife Repeated Replication (JRR) is well suited to macro coding because the SAS macro language supports iterative, flexible code

• This presentation will demonstrate how to use a general JRR macro to correctly calculate variance estimates for means and regression coefficients (logistic and OLS models)

Analysis of Complex Sample Survey Data

• Data from complex sample surveys must be analyzed using techniques which adjust for the clustering of the sample design

• The standard procedures in SAS, SPSS, and Stata assume a simple random sample and therefore do not calculate variances and standard errors correctly for complex sample data

Analysis of Complex Survey Data

• SAS and Stata offer survey and svy procedures which use the Taylor Series Linearization approach

• JRR, a widely used replication approach, offers an alternative to the Taylor Series method

• JRR is flexible and can be adapted to many different types of statistics such as means, regression coefficients, and other statistics of interest

Visual Representation of JRR process

• JRR systematically removes a small portion of the sample, and the statistics of interest are computed repeatedly, once for each sub-sample

• In this example, the cases with str=42 and secu=2 are deleted (their weight is set to zero) and the weights of the cases with str=42 and secu=1 are doubled

• This process is repeated for each stratum until the entire dataset is covered
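
The doubling and deletion described above can be written compactly. This is a sketch in notation that does not appear in the slides: w_i is the full-sample weight of case i, w_i^(h) is its weight in the replicate formed for stratum h, theta-hat is the full-sample estimate, theta-hat_(h) is the estimate from replicate h, and H is the number of strata (42 in the example data). It assumes two SECUs per stratum, as the macro code does.

```latex
\[
w_i^{(h)} =
\begin{cases}
  2\,w_i & \text{if } \mathrm{str}_i = h \text{ and } \mathrm{secu}_i = 1,\\
  0      & \text{if } \mathrm{str}_i = h \text{ and } \mathrm{secu}_i = 2,\\
  w_i    & \text{otherwise,}
\end{cases}
\qquad h = 1,\dots,H
\]
\[
\widehat{\operatorname{var}}_{\mathrm{JRR}}(\hat\theta)
  = \sum_{h=1}^{H} \bigl(\hat\theta_{(h)} - \hat\theta\bigr)^{2},
\qquad
\operatorname{se}_{\mathrm{JRR}}(\hat\theta)
  = \sqrt{\widehat{\operatorname{var}}_{\mathrm{JRR}}(\hat\theta)}
\]
```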

SAS JRR Macro: Logistic Regression

* Logistic Regression Jackknife for Analysis of Complex Survey Data ;
* Pat Berglund, July 2003 for Summer Institute Workshop ;

libname d 'd:\sumclass' ;
options compress=yes nofmterr symbolgen ;
options macrogen mprint ;

* create outer jackknife macro with parameters ;
* Parameters to fill in: ;
*   ncluster = number of strata (the macro builds one replicate per str value), in the NCS I dataset this is 42 ;
*   weight   = case weight ;
*   depend   = dependent variable for the logistic model ;
*   preds    = predictor variables entered with a space between each one ;
*   indata   = input dataset ;

%macro jacklogods(ncluster,weight,depend,preds,indata);

  * section 1: jackknife using strata and secu variables to do 42 jackknife selections ;
  * each iteration of the do loop selects one stratum, doubles the weight for strata=x and secu=1, and sets the weight for strata=x and secu=2 to zero ;
  * all other combinations stay the same ;

  %let nclust=%eval(&ncluster) ;

  data one ;
    set &indata ;
    %macro wgtcal ;
      %do i=1 %to &nclust ;
        pwt&i=&weight ;
        if str=&i and secu=1 then pwt&i=pwt&i*2 ;
        if str=&i and secu=2 then pwt&i=0 ;
      %end ;
    %mend wgtcal ;
    %wgtcal ;
  run ;

  * section 2: run base model/statistic of interest for entire sample using full weight ;
  %macro base ;
    ods output parameterestimates=parms (keep=variable estimate) ;
    ods listing close ;
    proc logistic des data=one ;
      model &depend=&preds ;
      weight &weight ;
    run ;
    ods listing ;
    proc print data=parms ;
    run ;
    proc sort data=parms ;
      by variable ;
    run ;
  %mend base ;
  %base ;

  * Section 3: Run Replicate Models ;
  * replicate models, one for each stratum, using the weights developed in jackknife section 1 ;
  * save statistic of interest for use with variance estimation ;
  %macro reps ;
    %do j=1 %to &nclust ;
      ods output parameterestimates=parms&j
        (keep=estimate variable rename=(estimate=estimate&j)) ;
      ods listing close ;
      proc logistic des data=one ;
        model &depend=&preds ;
        weight pwt&j ;
      run ;
      proc sort data=parms&j ;
        by variable ;
      run ;
    %end ;
  %mend reps ;
  %reps ;

  * Section 4: Merge Base and Replicate files together for calculation of statistics of interest ;
  data rep ;
    merge parms
      %do k=1 %to &nclust ;
        parms&k
      %end ;
    ;
    by variable ;
  run ;
  proc print data=rep ;
  run ;

  * Section 5: Calculate complex design corrected variance and standard errors ;
  * variance = sum of the squared differences between the base statistic and the replicate statistics ;
  * standard error = square root of the sum of the squared differences (variance) ;
  * Odds Ratio = exponent of the coefficient ;
  * Confidence Intervals = OR +- 1.96 * corrected standard error ;
  ods listing ;
  data calculate ;
    set rep ;
    %macro it ;
      %do j=1 %to &nclust ;
        sqdiff&j=(estimate-estimate&j)**2 ;
      %end ;
      sumdiff=sum(of sqdiff1-sqdiff&nclust) ;
      stderr=sqrt(sumdiff) ;
      or=exp(estimate) ;
      lowor=or-(1.96*stderr) ;
      upor=or+(1.96*stderr) ;
    %mend it ;
    %it ;
  run ;

  proc print data=calculate ;
    var variable estimate stderr or lowor upor ;
  run ;
%mend jacklogods ;

%jacklogods(42,p2wtv3,deplt1,sexf,d.ncsdxdm3) ;

* comparison with SRS logistic regression ;
proc logistic des data=d.ncsdxdm3 ;
  weight p2wtv3 ;
  model deplt1=sexf ;
run ;

* comparison with SAS surveylogistic ;
proc surveylogistic data=d.ncsdxdm3 ;
  strata str ;
  cluster secu ;
  weight p2wtv3 ;
  model deplt1 (event='1') =sexf ;
run ;

Results from Logistic JRR

Design Corrected Results:

Variable   Estimate   stderr     or        lowor     upor
SEXF       0.7434     0.088842   2.10315   1.92902   2.27728
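
As a check on the arithmetic, the printed limits can be reproduced from the estimate and standard error. The first line follows the formula in the macro (OR plus or minus 1.96 times stderr). The second line is an alternative that the slides do not use: it forms the interval on the log-odds scale, which is the scale on which stderr is computed, and then exponentiates the limits. Values are rounded.

```latex
\[
\mathrm{OR} \pm 1.96\,\mathrm{se}
  = 2.10315 \pm 1.96 \times 0.088842
  = 2.10315 \pm 0.17413
  \;\Rightarrow\; (1.92902,\ 2.27728)
\]
\[
\exp\bigl(0.7434 \pm 1.96 \times 0.088842\bigr)
  = \bigl(e^{0.5693},\ e^{0.9175}\bigr)
  \approx (1.767,\ 2.503)
\]
```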

SRS Results

Analysis of Maximum Likelihood Estimates

Parameter   DF   Estimate   Std. Error   Wald Chi-Square   Pr > ChiSq
SEXF         1     0.7434       0.0724          105.3802       <.0001

SAS Surveylogistic Results

Parameter   DF   Estimate   Std. Error   Wald Chi-Square   Pr > ChiSq
Intercept    1     2.0084       0.0776          669.6525       <.0001
SEXF         1    -0.7434       0.0889           70.0103       <.0001

Odds Ratio Estimates

Effect   Point Estimate   95% Wald Confidence Limits
SEXF              0.475          0.399         0.566
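
A short comparison of the three logistic runs, using only the numbers printed on the slides. The JRR standard error (0.088842) agrees with the Surveylogistic value (0.0889), and both exceed the SRS value (0.0724). The squared ratio below is a rough design-effect-style measure, a label added here rather than one used in the slides. The Surveylogistic coefficient has the opposite sign to the JRR and SRS coefficient, and its odds ratio is the reciprocal. That pattern is what one would expect if the two runs modeled opposite outcome categories, although the slides do not say why the sign differs.

```latex
\[
\frac{\mathrm{se}_{\mathrm{JRR}}}{\mathrm{se}_{\mathrm{SRS}}}
  = \frac{0.088842}{0.0724} \approx 1.227,
\qquad
\left(\frac{\mathrm{se}_{\mathrm{JRR}}}{\mathrm{se}_{\mathrm{SRS}}}\right)^{2}
  \approx 1.506
\]
\[
e^{-0.7434} \approx 0.475 \approx \frac{1}{2.103}
\]
```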

Another approach: Linear Regression

%macro jackgenmod(ncluster,weight,depend,preds,indata) ;
  %let nclust=%eval(&ncluster) ;

  data one ;
    set &indata ;
    %macro wgtcal ;
      %do i=1 %to &nclust ;
        pwt&i=&weight ;
        if str=&i and secu=1 then pwt&i=pwt&i*2 ;
        if str=&i and secu=2 then pwt&i=0 ;
      %end ;
    %mend wgtcal ;
    %wgtcal ;
  run ;

Base Model for OLS

  %macro base ;
    ods output parameterestimates=parms (keep=variable estimate) ;
    title "Example of Proc Reg without design correction" ;
    proc reg data=one ;
      model &depend=&preds ;
      weight &weight ;
    run ;
    quit ;
    proc sort data=parms ;
      by variable ;
    run ;
  %mend base ;
  %base ;

Replicate Models

  %macro reps ;
    %do j=1 %to &nclust ;
      ods output parameterestimates=parms&j
        (keep=estimate variable rename=(estimate=estimate&j)) ;
      ods listing close ;
      proc reg data=one ;
        model &depend=&preds ;
        weight pwt&j ;
      run ;
      quit ;
      proc sort data=parms&j ;
        by variable ;
      run ;
    %end ;
  %mend reps ;
  %reps ;

Merge Replicate Datasets with Base Dataset

  data rep ;
    merge parms
      %do k=1 %to &nclust ;
        parms&k
      %end ;
    ;
    by variable ;
  run ;
  proc print data=rep ;
  run ;
  ods listing ;

Calculate Corrected Standard Errors from Distribution of Replicate Coefficients

  data calculate ;
    set rep ;
    %macro it ;
      %do j=1 %to &nclust ;
        sqdiff&j=(estimate-estimate&j)**2 ;
      %end ;
      sumdiff=sum(of sqdiff1-sqdiff&nclust) ;
      stderr=sqrt(sumdiff) ;
    %mend it ;
    %it ;
  run ;

Code to Print Results from JRR and Execute Outer Macro

  proc print data=calculate ;
    title "Results from JRR for OLS regression" ;
    var variable estimate stderr ;
  run ;
%mend jackgenmod ;

%jackgenmod(42,p2wtv3,incpers,sexf ag25 ag35 ag45,d.ncsdxdm3) ;

Proc SurveyReg Code

proc surveyreg data=d.ncsdxdm3 ;
  title "Example of Proc SurveyReg" ;
  strata str ;
  cluster secu ;
  weight p2wtv3 ;
  model incpers=sexf ag25 ag35 ag45 ;
run ;

Parameter Estimates from OLS SRS Regression

Parameter Estimates

Variable    DF   Parameter Estimate   Std. Error   t Value
Intercept    1                11077    485.53334     22.81
SEXF         1               -12096    434.45468    -27.84
AG25         1                15227    586.69609     25.95
AG35         1                22194    600.60265     36.95
AG45         1                21404    683.46087     31.32

JRR Results

Results from JRR for OLS regression

Obs   Variable    Estimate    stderr
  1   Intercept      11077    529.49
  2   AG25           15227    698.83
  3   AG35           22194   1026.29
  4   AG45           21404   1055.67
  5   SEXF          -12096    689.31

Proc SurveyReg Results

Estimated Regression Coefficients

Parameter     Estimate    Standard Error   t Value   Pr > |t|
Intercept    11077.003         532.95062     20.78     <.0001
SEXF        -12095.819         690.29149    -17.52     <.0001
AG25         15227.170         698.54031     21.80     <.0001
AG35         22194.355        1017.50689     21.81     <.0001
AG45         21403.763        1062.42802     20.15     <.0001
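
To make the comparison across the three OLS outputs concrete, the ratios below use the SEXF and Intercept standard errors printed on the slides (SRS Proc Reg, JRR, and Proc SurveyReg). This is an illustrative calculation added here, not part of the original presentation. The JRR and SurveyReg standard errors are nearly identical, while the SRS standard errors are noticeably smaller for SEXF and only slightly smaller for the intercept.

```latex
\[
\text{SEXF:}\quad
\frac{\mathrm{se}_{\mathrm{JRR}}}{\mathrm{se}_{\mathrm{SRS}}}
  = \frac{689.31}{434.45468} \approx 1.587,
\qquad
\frac{\mathrm{se}_{\mathrm{JRR}}}{\mathrm{se}_{\mathrm{SurveyReg}}}
  = \frac{689.31}{690.29149} \approx 0.999
\]
\[
\text{Intercept:}\quad
\frac{\mathrm{se}_{\mathrm{JRR}}}{\mathrm{se}_{\mathrm{SRS}}}
  = \frac{529.49}{485.53334} \approx 1.091
\]
```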

Conclusions

• JRR is a flexible and convenient alternative to canned software procedures and programs

• Any statistic or procedure can be used within the JRR structure, provided it makes statistical sense

• SAS macro coding keeps the syntax parsimonious and is well suited to repetitive yet flexible code
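
The first slide mentions means, but the macros shown cover only regression coefficients. As an illustration of the second conclusion, here is a minimal sketch of the same JRR steps applied to a weighted mean. The macro name jackmean, its parameter names, and the work dataset names (base, rep_in, rep1 to repN, jrr_mean) are hypothetical. The sketch assumes the input dataset has str numbered 1 to nstrata and secu coded 1 or 2, as in the regression macros, and it has not been run against the workshop data.

```sas
/* Hypothetical sketch: JRR standard error of a weighted mean.          */
/* Assumes str runs from 1 to nstrata and secu is coded 1 or 2.         */
%macro jackmean(nstrata, weight, var, indata);
  %local h;

  /* Full-sample weighted mean */
  proc means data=&indata noprint;
    var &var;
    weight &weight;
    output out=base mean=estimate;
  run;

  /* One replicate per stratum: double secu=1, zero out secu=2 */
  %do h=1 %to &nstrata;
    data rep_in;
      set &indata;
      pwt=&weight;
      if str=&h and secu=1 then pwt=pwt*2;
      if str=&h and secu=2 then pwt=0;
    run;

    proc means data=rep_in noprint;
      var &var;
      weight pwt;
      output out=rep&h mean=estimate&h;
    run;
  %end;

  /* One-row merge (no BY needed), then sum of squared differences */
  data jrr_mean;
    merge base(keep=estimate)
      %do h=1 %to &nstrata;
        rep&h(keep=estimate&h)
      %end;
    ;
    sumdiff=0;
    %do h=1 %to &nstrata;
      sumdiff=sumdiff+(estimate-estimate&h)**2;
    %end;
    stderr=sqrt(sumdiff);
    keep estimate stderr;
  run;

  proc print data=jrr_mean;
  run;
%mend jackmean;
```

With the dataset and variable names used in the slides, a call might look like %jackmean(42, p2wtv3, incpers, d.ncsdxdm3). Unlike the regression macros, this sketch rebuilds a single replicate weight inside the loop instead of storing all 42 weights on one dataset. That choice trades extra data step passes for a narrower dataset.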