The GzLM and SAS
Or why it’s a necessary evil to learn code!
Keith Lewis
Department of Biology
Memorial University, St. John’s, Canada
GzLM Presentation, BIOL 7932, 4-Oct-07
Jan 03, 2016
Variables, Links, and Models (Introduction to Categorical Data Analysis, A. Agresti 1996)
R.V. = response variable; E.V. = explanatory variable

R.V.       E.V.          Error      Link       Model
Ratio      Ratio         Normal     Identity   Linear Reg.
Ratio      Categorical   Normal     Identity   ANOVA
Ratio      Mixed         Normal     Identity   ANCOVA
Poisson    Mixed         Poisson    Log (ln)   Log-linear
Poisson    Ratio         Poisson    Identity   Poisson Reg.
Binomial   Mixed         Binomial   Logit      Logistic Reg.
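The three links in the table can be written out directly. The sketch below is illustrative Python, not SAS, and the function names are made up for this example. Each link g maps the mean mu to the linear predictor eta, and the inverse link maps it back.

```python
import math

# Link functions g(mu) = eta, as named in the table above.
def identity(mu):
    return mu

def log_link(mu):
    return math.log(mu)               # requires mu > 0 (e.g. a Poisson mean)

def logit(mu):
    return math.log(mu / (1.0 - mu))  # requires 0 < mu < 1 (a binomial proportion)

# Inverse links: recover the mean from the linear predictor eta.
def inv_identity(eta):
    return eta

def inv_log(eta):
    return math.exp(eta)

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical lookup table pairing each link with its inverse.
LINKS = {
    "identity": (identity, inv_identity),
    "log": (log_link, inv_log),
    "logit": (logit, inv_logit),
}

if __name__ == "__main__":
    mu = 0.25  # an arbitrary mean that is valid for all three links
    for name, (g, g_inv) in LINKS.items():
        eta = g(mu)
        print(f"{name:8s} eta = {eta: .4f}  back-transformed mu = {g_inv(eta):.4f}")
```

The model is linear on the eta scale. Fitted values are reported on the mu scale after applying the inverse link.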
SAS procs: the basics
• Data [dataset];
• Infile [filename];
• input [variables];
• proc [glm (or genmod)];
• model [model];
• run;
SAS PROC GLM – Lin. Reg.
• Data nest97;
• infile 'e:\testdata\97exp1.prn';
• input lake treat type pred n;
• proc glm;
• model pred = lake treat type;
• run;
SAS PROC GLM - ANOVA
• Data nest97;
• infile 'e:\testdata\97exp1.prn';
• input lake treat type pred n;
• proc glm;
• class lake treat type;
• model pred = lake treat type;
• run;
SAS PROC GLM - ANOVA (character variables read with $)
• Data nest97;
• infile 'e:\testdata\97exp1.prn';
• input lake $ treat $ type $ pred n;
• proc glm;
• class lake treat type;
• model pred = lake treat type;
• run;
SAS PROC GLM - ANCOVA
• Data nest97;
• infile 'e:\testdata\97exp1.prn';
• input lake treat type pred n;
• proc glm;
• class treat type;
• model pred = lake treat type;
• run;
SAS PROC GENMOD – Log-Linear
• Data nest97;
• infile 'e:\testdata\97exp1.prn';
• input lake treat type pred n;
• proc genmod;
• class lake treat type;
• model pred = lake treat type / dist=poisson link=log type1 type3;
• run;
SAS PROC GENMOD – Logistic Regression
• Data nest97;
• infile 'e:\testdata\97exp1.prn';
• input lake treat type pred n;
• proc genmod;
• class lake treat type;
• model pred/n = lake treat type / dist=binomial link=logit type1 type3;
• run;
A full example
data an_01;
infile 'C:\Documents and Settings\Micro-Tech Customer\My Documents\MyWork\thesis\SAS\ch4\An_2000a.csv' firstobs=2 delimiter = ',';
input park $ site $ grid $ nest $ dp vt;
proc genmod;
class park site grid nest;
model dp = park|grid|nest / dist=bin link=logit type1 type3;
/*make obstats out=keith noprint;*/
title 'Schmidts model, 2000 with contrasts';
lsmeans park grid nest;
contrast 'bird v control' nest 1 -1 0;
contrast 'control v large' nest 0 1 -1;
estimate 'control v large' nest 0 1 -1;
/* coefficients match the contrast of the same name (the slide had 1 1 0) */
estimate 'bird v control' nest 1 -1 0;
estimate 'bF v bS' park 1 -1;
estimate 'con v food' grid 1 -1;
run;
Deviance and G-tests
• GzLMs are fitted by maximum likelihood estimation (MLE)
• D = -2 ln[(likelihood of current model) / (likelihood of saturated model)]
• G = D(model w/o variable) - D(model w/ variable)
• G is analogous to the F-test for the GLM
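Written out as equations, the two definitions look like this. The notation is an assumption made for this sketch, not something from the slides: L is a maximized likelihood, and the subscripts 0 and 1 mark the models without and with the variable being tested.

```latex
\begin{align}
D &= -2 \ln\!\left( \frac{L_{\text{current}}}{L_{\text{saturated}}} \right) \\
G &= D_{0} - D_{1}
   = -2 \ln\!\left( \frac{L_{0}}{L_{1}} \right)
\end{align}
```

The saturated likelihood cancels in the difference, so G depends only on the two fitted models. Under the null hypothesis G is approximately chi-square distributed, with degrees of freedom equal to the number of parameters added by the variable.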
GENMOD output
LR Statistics For Type 1 Analysis

Source            Deviance   DF   Chi-Square   Pr > ChiSq
Intercept         321.4338
park              319.7385    1      1.70        0.1929
grid              314.1447    1      5.59        0.0180
park*grid         313.5346    1      0.61        0.4348
nest              310.1887    2      3.35        0.1877
park*nest         310.1033    2      0.09        0.9582
grid*nest         306.9164    2      3.19        0.2032
park*grid*nest    306.3648    2      0.55        0.7590

For park: 321.4338 - 319.7385 = 1.70, so Chi-Square = 1.70, df = 1, p = 0.1929
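The Type 1 arithmetic can be checked outside SAS. The sketch below is illustrative Python, not part of the original talk, and its function names are hypothetical. It takes the deviances from the Type 1 table and differences them in sequence. The chi-square tail probability uses the exact closed forms for 1 and 2 degrees of freedom, the only cases that occur in the table.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability, closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")

# Deviances and degrees of freedom copied from the Type 1 table.
INTERCEPT_DEVIANCE = 321.4338
TYPE1_TERMS = [
    ("park", 319.7385, 1),
    ("grid", 314.1447, 1),
    ("park*grid", 313.5346, 1),
    ("nest", 310.1887, 2),
    ("park*nest", 310.1033, 2),
    ("grid*nest", 306.9164, 2),
    ("park*grid*nest", 306.3648, 2),
]

def sequential_lr(start_deviance, terms):
    """G for each term = deviance before adding it minus deviance after adding it."""
    rows = []
    previous = start_deviance
    for source, deviance, df in terms:
        g = previous - deviance
        rows.append((source, g, df, chi2_sf(g, df)))
        previous = deviance
    return rows

results = sequential_lr(INTERCEPT_DEVIANCE, TYPE1_TERMS)

if __name__ == "__main__":
    for source, g, df, p in results:
        print(f"{source:15s} G = {g:5.2f}  df = {df}  p = {p:.4f}")
```

Because the input deviances are rounded to four decimals, the p-values should agree with the GENMOD output to about three decimal places, not exactly.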
GENMOD output
LR Statistics For Type 3 Analysis

Source            DF   Chi-Square   Pr > ChiSq
park               1      2.62        0.1052
grid               1      7.45        0.0064
park*grid          1      0.81        0.3672
nest               2      3.45        0.1783
park*nest          2      0.13        0.9391
grid*nest          2      3.37        0.1853
park*grid*nest     2      0.55        0.7590
Why we use GzLM: Same Data, Same Distribution

Data     Proc                                     Source   P-value
limpet   Glm (normal error, identity link)        Sp       .1942
                                                  Se       .0004
                                                  Sp*Se    .2966
limpet   Genmod (dist=normal, link=identity)      Sp       .1627
                                                  Se       .0001
                                                  Sp*Se    .2493

From Sokal and Rohlf 1995, Box 11.2
Why we use GzLM: Same Data, Different Distribution

Data      Proc                                    Source   P-value
Anest97   Glm (normal errors, identity link)      Lake     .0505
                                                  Treat    .3632
                                                  Type     .4915
                                                  TT*TY    .8619
Anest97   Genmod (dist=binom., link=logit)        Lake     .0001
                                                  Treat    .1229
                                                  Type     .2435
                                                  TT*TY    .8098

(K. Lewis, M.Sc. data) See Lewis 2005, Oikos
SAS v. R
• SAS
  – Powerful
  – Widely used
  – Learning curve
  – Expensive
• R
  – Powerful
  – “limited” use
  – Learning curve
  – Free
• Resources
  – Peter Earle
  – The web!!!!
References
• Criteria:
  – Readability
  – Examples with the software code!
• A. Agresti. 1996. Introduction to Categorical Data Analysis. Wiley & Sons, New York.
• Littell et al. 2002. SAS for Linear Models, 4th ed. SAS Institute Inc., Cary, NC.