Statistics One Lecture 23 Generalized Linear Model
Jun 29, 2015
Two segments
• Overview
• Examples
Lecture 23 ~ Segment 1
Generalized Linear Model Overview
Generalized Linear Model
• An extension of the General Linear Model that allows for non-normal distributions in the outcome variable and therefore also allows testing of non-linear relationships between a set of predictors and the outcome variable
Generalized Linear Model
• Generalized Linear Model: GLM*
• General Linear Model: GLM
General Linear Model (GLM)
• GLM is the mathematical framework used in many common statistical analyses, including multiple regression and ANOVA
Characteristics of GLM
• Linear: pairs of variables are assumed to have linear relations
• Additive: if one set of variables predicts another variable, the effects are thought to be additive
Characteristics of GLM
• BUT! This does not preclude testing non-linear or non-additive effects
Characteristics of GLM
• GLM can accommodate such tests, for example, by
• Transformation of variables – Transform so non-linear becomes linear
• Moderation analysis – Fake the GLM into testing non-additive effects
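The transformation idea can be illustrated with a minimal sketch. The data are hypothetical (an exact exponential curve with made-up constants), and the helper name `steps` is an assumption for illustration only. On the raw scale the change in Y per unit of X keeps growing; after taking ln(Y) the change per unit of X is constant, which is what a linear relation means.

```python
import math

# Hypothetical, noise-free data: Y = 2 * e^(0.5 * X)
xs = [0, 1, 2, 3, 4]
ys = [2.0 * math.exp(0.5 * x) for x in xs]


def steps(values):
    """Change in the outcome for each one-unit increase in X."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


raw_steps = steps(ys)                          # grows with X: non-linear
log_steps = steps([math.log(y) for y in ys])   # constant: linear after transform

print("raw steps:", [round(s, 3) for s in raw_steps])
print("log steps:", [round(s, 3) for s in log_steps])
```

Once ln(Y) is used as the outcome, the ordinary GLM applies unchanged.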
GLM example
• Simple regression
• Y = B0 + B1X1 + e
• Y = faculty salary
• X1 = years since PhD
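As a rough sketch of how B0 and B1 are obtained by least squares, the following uses invented numbers (the salary and years values are hypothetical, not data from the lecture). The slope is the sum of cross-products divided by the sum of squares of X1, and the intercept makes the line pass through the two means.

```python
# Hypothetical data: years since PhD and salary in thousands of dollars
years = [1, 3, 5, 7, 9]
salary = [62, 70, 75, 85, 88]

n = len(years)
mean_x = sum(years) / n
mean_y = sum(salary) / n

# B1 = sum of cross-products / sum of squares of X1
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, salary))
sxx = sum((x - mean_x) ** 2 for x in years)
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

# e = Y - (B0 + B1*X1), the prediction error for each case
residuals = [y - (b0 + b1 * x) for x, y in zip(years, salary)]

print("B0 =", b0, "B1 =", b1)
```

With an intercept in the model, the residuals sum to zero (up to rounding), which is a quick sanity check on the fit.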
GLM example
• Multiple regression
• Y = B0 + B1X1 + B2X2 + B3X3 + e
• Y = faculty salary
• X1 = years since PhD
• X2 = number of publications
• X3 = (years x pubs)
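To see why the product term X3 lets an additive model test a non-additive effect, here is a small sketch with made-up coefficients (B0 through B3 below are hypothetical, chosen only for illustration). The effect of one more year is B1 + B3 × pubs, so it changes with the number of publications.

```python
# Hypothetical coefficients, for illustration only
B0, B1, B2, B3 = 50.0, 2.0, 1.0, 0.1


def predict(years, pubs):
    """Predicted salary from Y = B0 + B1*X1 + B2*X2 + B3*X3, with X3 = years * pubs."""
    return B0 + B1 * years + B2 * pubs + B3 * (years * pubs)


def years_effect(pubs, years=10):
    """Change in predicted salary for one additional year, holding pubs fixed."""
    return predict(years + 1, pubs) - predict(years, pubs)


print("effect of one year at 0 pubs: ", years_effect(0))
print("effect of one year at 10 pubs:", years_effect(10))
```

If B3 were 0, both effects would be identical and the model would be purely additive.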
Generalized linear model (GLM*)
• Appropriate when simple transformations or product terms are not sufficient
Generalized linear model (GLM*)
• The “linear” model is allowed to generalize to other forms by adding a “link function”
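One way to picture a link function: it maps the expected value of the outcome (call it mu) onto the scale where the model is linear, and its inverse maps the linear predictor back. The sketch below lists three common links; the names `LINKS` and `round_trip` are assumptions made for this illustration, not part of any library.

```python
import math

# Each entry pairs a link g(mu) with its inverse.
# eta = g(mu) is the linear predictor B0 + sum(Bk * Xk).
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "logit": (lambda mu: math.log(mu / (1 - mu)),
              lambda eta: 1 / (1 + math.exp(-eta))),
    "log": (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
}


def round_trip(name, mu):
    """Apply the link and then its inverse; the result should equal mu."""
    link, inverse = LINKS[name]
    return inverse(link(mu))


for name in LINKS:
    print(name, round_trip(name, 0.3))
```

The identity link corresponds to the ordinary GLM, the logit link to binary logistic regression, and the log link to Poisson regression.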
Generalized linear model (GLM*)
• For example, in binary logistic regression, the logit function was the link function
Binary logistic regression
• ln(Ŷ / (1 - Ŷ)) = B0 + Σ(BkXk)
Ŷ = predicted value on the outcome variable Y (here, the predicted probability that Y = 1)
B0 = predicted value of ln(Ŷ / (1 - Ŷ)) when all X = 0
Xk = predictor variables
Bk = unstandardized regression coefficients
(Y – Ŷ) = residual (prediction error)
k = the number of predictor variables
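Solving the logistic equation for Ŷ gives Ŷ = 1 / (1 + e^-(B0 + Σ(BkXk))). The sketch below uses one predictor and made-up coefficients (B0 and B1 are hypothetical) to show the conversion from the linear predictor to a probability, and why exp(B1) is read as an odds ratio.

```python
import math

# Hypothetical coefficients for a single predictor
B0, B1 = -2.0, 0.8


def predicted_probability(x):
    """Invert the logit: Y-hat = 1 / (1 + exp(-(B0 + B1*x)))."""
    return 1 / (1 + math.exp(-(B0 + B1 * x)))


def odds(x):
    """Odds of the outcome, Y-hat / (1 - Y-hat)."""
    p = predicted_probability(x)
    return p / (1 - p)


# A one-unit increase in X multiplies the odds by exp(B1), whatever X is.
odds_ratio = odds(3.0) / odds(2.0)

print("P(Y = 1 | X = 2):", predicted_probability(2.0))
print("odds ratio per unit of X:", odds_ratio)
```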
Segment summary
• GLM* is an extension of GLM that allows for non-normal distributions in the outcome variable and therefore also allows testing of non-linear relationships between a set of predictors and the outcome variable
Segment summary
• Appropriate when simple transformations or product terms are not sufficient
Segment summary
• The “linear” model is allowed to generalize to other forms by adding a “link function”
END SEGMENT
Lecture 23 ~ Segment 2
Generalized Linear Model Examples
GLM* Examples
• GLM* is an extension of GLM that allows for non-normal distributions in the outcome variable and therefore also allows testing of non-linear relationships between a set of predictors and the outcome variable
GLM* Examples
• Appropriate when simple transformations or product terms are not sufficient
GLM* Examples
• The “linear” model is allowed to generalize to other forms by adding a “link function”
GLM* Examples
• Binary logistic regression
Binary logistic regression
GLM* Examples
• In binary logistic regression, the logit function served as the link function
GLM* Examples
• ln(Ŷ / (1 - Ŷ)) = B0 + Σ(BkXk)
Ŷ = predicted value on the outcome variable Y (here, the predicted probability that Y = 1)
B0 = predicted value of ln(Ŷ / (1 - Ŷ)) when all X = 0
Xk = predictor variables
Bk = unstandardized regression coefficients
(Y – Ŷ) = residual (prediction error)
k = the number of predictor variables
GLM* Examples
• More than 2 categories on the outcome
– Multinomial logistic regression
• A - 1 logistic regression equations are formed
– Where A = number of groups
– One group serves as the reference group
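A minimal sketch of the A - 1 equations, assuming A = 3 groups and a single predictor: each non-reference group gets its own equation for the log-odds relative to the reference group, and the probabilities for all A groups follow from those A - 1 linear predictors. The coefficient values and function names are hypothetical.

```python
import math


def multinomial_probabilities(etas):
    """etas: the A-1 linear predictors, one per non-reference group.

    Returns [P(reference), P(group 1), ..., P(group A-1)].
    """
    exp_etas = [math.exp(eta) for eta in etas]
    denominator = 1.0 + sum(exp_etas)  # the 1.0 is the reference group
    return [1.0 / denominator] + [e / denominator for e in exp_etas]


# Hypothetical (B0, B1) pairs: group 1 vs reference, group 2 vs reference
COEFFICIENTS = [(-0.5, 0.3), (-1.0, 0.6)]


def predict(x):
    """Predicted probabilities for all three groups at predictor value x."""
    etas = [b0 + b1 * x for b0, b1 in COEFFICIENTS]
    return multinomial_probabilities(etas)


probs = predict(2.0)
print("reference, group 1, group 2:", [round(p, 3) for p in probs])
```

With A = 2 this reduces to binary logistic regression, since only one equation remains.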
GLM* Examples
• Another example is Poisson regression
• Poisson distributions are common with “count data”
– The number of events occurring in a fixed interval of time
GLM* Examples
• For example, the number of traffic accidents as a function of weather conditions
– Clear weather
– Rain
– Snow
Poisson regression
GLM* Examples
• In Poisson regression, the log function serves as the link function
• Note: this example also has a categorical predictor and would therefore also require dummy coding
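The traffic example can be sketched as follows, using invented daily accident counts (hypothetical data) and clear weather as the reference group. Weather is dummy coded into two 0/1 predictors, and the log link gives ln(expected count) = B0 + B1(rain) + B2(snow). Because the only predictor here is a single categorical variable, the maximum-likelihood fitted counts equal the group means, so the coefficients can be read off the means without an iterative fit; with continuous predictors, statistical software would estimate them instead.

```python
import math

# Hypothetical accident counts per day under each weather condition
ACCIDENTS = {
    "clear": [2, 1, 3, 2],
    "rain": [4, 5, 3, 4],
    "snow": [7, 6, 8, 7],
}


def dummy_code(weather):
    """Return (rain, snow) indicators; clear weather is the reference (0, 0)."""
    return (1 if weather == "rain" else 0, 1 if weather == "snow" else 0)


means = {w: sum(counts) / len(counts) for w, counts in ACCIDENTS.items()}

# Coefficients on the log scale, relative to the reference group
B0 = math.log(means["clear"])
B1 = math.log(means["rain"]) - B0
B2 = math.log(means["snow"]) - B0


def expected_count(weather):
    """Invert the log link: expected count = exp(B0 + B1*rain + B2*snow)."""
    rain, snow = dummy_code(weather)
    return math.exp(B0 + B1 * rain + B2 * snow)


for weather in ACCIDENTS:
    print(weather, round(expected_count(weather), 3))
```

Here exp(B1) and exp(B2) are rate ratios: how many times more accidents are expected in rain or snow than in clear weather.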
Segment summary
• GLM* is an extension of GLM that allows for non-normal distributions in the outcome variable and therefore also allows testing of non-linear relationships between a set of predictors and the outcome variable
Segment summary
• Appropriate when simple transformations or product terms are not sufficient
Segment summary
• The “linear” model is allowed to generalize to other forms by adding a “link function”
END SEGMENT
END LECTURE 23