
Moving further

Feb 24, 2016

Transcript
Page 1: Moving further

- Word counts

- Speech error counts

- Metaphor counts

- Active construction counts

Moving further: Categorical count data

Page 2: Moving further

Hissing Koreans

Winter & Grawunder (2012)

Page 3: Moving further

No. of Cases

Bentz & Winter (2013)

Page 4: Moving further
Page 5: Moving further
Page 6: Moving further

Poisson Model

Page 7: Moving further

Siméon Poisson

1898: Ladislaus Bortkiewicz

Army corps with few horses: few deaths, low variability

Army corps with lots of horses: many deaths, high variability

The Poisson Distribution
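The horse-kick pattern above (more deaths go hand in hand with more variability) is exactly what the Poisson distribution predicts: its mean and variance are both equal to the rate parameter. A quick Python sketch (the slides use R; the rate value here is an arbitrary illustration):

```python
import math

def poisson_pmf(k, lam):
    # P(Y = k) for a Poisson-distributed count with rate lam
    return lam**k * math.exp(-lam) / math.factorial(k)

lam = 2.0  # arbitrary rate for illustration
# Mean and variance of the distribution, computed from the pmf:
mean = sum(k * poisson_pmf(k, lam) for k in range(100))
var = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in range(100))
# Both come out equal to lam: for a Poisson count,
# a higher expected count automatically means higher variability.
```

With a larger rate, the expected count and the variance grow together, which is why corps with many horses show both many deaths and high variability.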

Page 8: Moving further
Page 9: Moving further

Poisson Regression = generalized linear model with Poisson error structure and log link function

Page 10: Moving further

The Poisson Model: log(E[Y]) = b0 + b1*X1 + b2*X2 (the log link function is applied to the predicted mean count, not to the predictors)

Page 11: Moving further

In R:

glmer(my_counts ~ my_predictors + (1|subject), data = mydataset, family = "poisson")

(in current versions of lme4, generalized models are fitted with glmer() rather than lmer())

Page 12: Moving further

Poisson model output is on the log scale: exponentiate the log values to get the predicted mean rate.
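A minimal Python sketch of that back-transformation, with made-up coefficients (not the slides' model):

```python
import math

# Hypothetical Poisson-model coefficients, on the log scale:
b0, b1 = 1.2, 0.5
x = 2.0

log_rate = b0 + b1 * x     # the model's prediction lives on the log scale
rate = math.exp(log_rate)  # exponentiate -> predicted mean count (rate)
```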

Page 13: Moving further

Poisson Model

Page 14: Moving further

- Focus vs. no-focus

- Yes vs. No

- Dative vs. genitive

- Correct vs. incorrect

Moving further: Binary categorical data

Page 15: Moving further

Bentz & Winter (2013)

Case yes vs. no ~ Percent L2 speakers

Page 16: Moving further
Page 17: Moving further
Page 18: Moving further
Page 19: Moving further
Page 20: Moving further

Logistic Regression = generalized linear model with binomial error structure and logit link function

Page 21: Moving further

The Logistic Model: p(Y) = logit⁻¹(b0 + b1*X1 + b2*X2)

Page 22: Moving further

In R:

glmer(binary_variable ~ my_predictors + (1|subject), data = mydataset, family = "binomial")

Page 23: Moving further

Probabilities and Odds

Probability of an event: p = (times the event occurs) / (all outcomes)

Odds of an event: (times the event occurs) / (times it does not occur) = p / (1 - p)

Page 24: Moving further

Intuition about Odds

N = 12 marbles: 2 blue, 10 of other colors.

What are the odds that I pick a blue marble?

Answer: 2/10 (blue vs. not blue)
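In Python, with the marble counts implied by the slide (2 blue out of 12, so 10 non-blue), the probability/odds contrast looks like this:

```python
blue, other = 2, 10        # the slide's N = 12 marbles: 2 blue, 10 non-blue
p = blue / (blue + other)  # probability: blue out of ALL marbles -> 2/12
odds = blue / other        # odds: blue vs. NOT blue -> 2/10
odds_from_p = p / (1 - p)  # the same odds, derived from the probability
```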

Page 25: Moving further

Log odds = logit function: logit(p) = log(p / (1 - p))

Page 26: Moving further

Representative values:

Probability   Odds    Log odds (= “logits”)
0.1           0.111   -2.197
0.2           0.25    -1.386
0.3           0.428   -0.847
0.4           0.667   -0.405
0.5           1        0
0.6           1.5      0.405
0.7           2.33     0.847
0.8           4        1.386
0.9           9        2.197
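The table's columns are related by odds = p / (1 - p) and logit(p) = log(odds); a short Python check of a few rows (the slides define the same functions in R):

```python
import math

def logit(p):
    # log odds: log(p / (1 - p))
    return math.log(p / (1 - p))

# Reproduce three rows of the table (probability, odds, log odds):
rows = [(p, round(p / (1 - p), 3), round(logit(p), 3))
        for p in (0.1, 0.5, 0.9)]
# -> (0.1, 0.111, -2.197), (0.5, 1.0, 0.0), (0.9, 9.0, 2.197)
```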

Page 27: Moving further

Snijders & Bosker (1999: 212)

Page 28: Moving further

Bentz & Winter (2013)

Page 29: Moving further

            Estimate Std. Error z value Pr(>|z|)
(Intercept)   1.4576     0.6831   2.134  0.03286
Percent.L2   -6.5728     2.0335  -3.232  0.00123

Case yes vs. no ~ Percent L2 speakers

Log odds when Percent.L2 = 0

Page 30: Moving further

Bentz & Winter (2013)

Page 31: Moving further

            Estimate Std. Error z value Pr(>|z|)
(Intercept)   1.4576     0.6831   2.134  0.03286
Percent.L2   -6.5728     2.0335  -3.232  0.00123

Case yes vs. no ~ Percent L2 speakers

How much the log odds decrease for each unit increase in Percent.L2 (= the slope)

Page 32: Moving further

Bentz & Winter (2013)

Page 33: Moving further

            Estimate Std. Error z value Pr(>|z|)
(Intercept)   1.4576     0.6831   2.134  0.03286
Percent.L2   -6.5728     2.0335  -3.232  0.00123

Case yes vs. no ~ Percent L2 speakers

Logits or “log odds”: exponentiate to get odds; transform by the inverse logit to get probabilities.

Page 34: Moving further

            Estimate Std. Error z value Pr(>|z|)
(Intercept)   1.4576     0.6831   2.134  0.03286
Percent.L2   -6.5728     2.0335  -3.232  0.00123

Case yes vs. no ~ Percent L2 speakers

Logits or “log odds”: exponentiate to get odds (exp(-6.5728)); transform by the inverse logit to get probabilities.

Page 35: Moving further

            Estimate Std. Error z value Pr(>|z|)
(Intercept)   1.4576     0.6831   2.134  0.03286
Percent.L2   -6.5728     2.0335  -3.232  0.00123

Case yes vs. no ~ Percent L2 speakers

Logits or “log odds”: exp(-6.5728) gives the odds, 0.001397878; transform by the inverse logit to get probabilities.

Page 36: Moving further

Odds > 1: numerator more likely = the event happens more often than not

Odds < 1: denominator more likely = the event is more likely not to happen
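Converting odds back to probabilities (p = odds / (1 + odds)) makes the two cases concrete; a small Python sketch with made-up odds values:

```python
odds_gt_1 = 4.0   # hypothetical odds > 1: event 4x as likely to happen as not
odds_lt_1 = 0.25  # hypothetical odds < 1: event 4x as likely NOT to happen

p_gt = odds_gt_1 / (1 + odds_gt_1)  # 0.8 -> happens more often than not
p_lt = odds_lt_1 / (1 + odds_lt_1)  # 0.2 -> more likely not to happen
```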

Page 37: Moving further

            Estimate Std. Error z value Pr(>|z|)
(Intercept)   1.4576     0.6831   2.134  0.03286
Percent.L2   -6.5728     2.0335  -3.232  0.00123

Case yes vs. no ~ Percent L2 speakers

Logits or “log odds”: exp(-6.5728) gives the odds, 0.001397878; transform by the inverse logit to get probabilities.

Page 38: Moving further

            Estimate Std. Error z value Pr(>|z|)
(Intercept)   1.4576     0.6831   2.134  0.03286
Percent.L2   -6.5728     2.0335  -3.232  0.00123

Case yes vs. no ~ Percent L2 speakers

Logits or “log odds”: logit.inv(1.4576) ≈ 0.81

Page 39: Moving further

Bentz & Winter (2013)

About 80% (makes sense)

Page 40: Moving further

            Estimate Std. Error z value Pr(>|z|)
(Intercept)   1.4576     0.6831   2.134  0.03286
Percent.L2   -6.5728     2.0335  -3.232  0.00123

Case yes vs. no ~ Percent L2 speakers

Logits or “log odds”:

logit.inv(1.4576) ≈ 0.81

logit.inv(1.4576 + -6.5728 * 0.3) ≈ 0.37
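The two predicted probabilities can be reproduced in Python with the same inverse-logit arithmetic (coefficients taken from the model output above):

```python
import math

def logit_inv(x):
    # inverse logit (the logistic function): maps log odds to probabilities
    return math.exp(x) / (1 + math.exp(x))

b0, b1 = 1.4576, -6.5728            # intercept and slope from the output above
p_at_0 = logit_inv(b0)              # predicted p(case) when Percent.L2 = 0
p_at_30 = logit_inv(b0 + b1 * 0.3)  # predicted p(case) when Percent.L2 = 0.3
# p_at_0 is about 0.81, p_at_30 about 0.37, matching the slide
```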

Page 41: Moving further

Bentz & Winter (2013)

Page 42: Moving further

logit(p) = log(p / (1 - p)) = logit function

logit⁻¹(x) = exp(x) / (1 + exp(x)) = inverse logit function

Page 43: Moving further

logit⁻¹(x) = exp(x) / (1 + exp(x)) = inverse logit function

This is the famous “logistic function”.

Page 44: Moving further

Inverse logit function

(transforms back to probabilities)

logit.inv = function(x) { exp(x) / (1 + exp(x)) }

(this defines the function in R)

Page 45: Moving further

General Linear Model

Generalized Linear Model

Generalized Linear Mixed Model

Page 46: Moving further

General Linear Model

Generalized Linear Model

Generalized Linear Mixed Model

Page 47: Moving further

General Linear Model

Generalized Linear Model

Generalized Linear Mixed Model

Page 48: Moving further

Generalized Linear Model

= “Generalizing” the General Linear Model to cases that don’t include continuous response variables (in particular categorical ones)

= Consists of two things: (1) an error distribution, (2) a link function

Page 49: Moving further

= “Generalizing” the General Linear Model to cases that don’t include continuous response variables (in particular categorical ones)

= Consists of two things: (1) an error distribution, (2) a link function

Logistic regression: binomial error distribution, logit link function

Poisson regression: Poisson error distribution, log link function

Page 50: Moving further

= “Generalizing” the General Linear Model to cases that don’t include continuous response variables (in particular categorical ones)

= Consists of two things: (1) an error distribution, (2) a link function

Logistic regression: binomial error distribution, logit link function

Poisson regression: Poisson error distribution, log link function

lm(response ~ predictor)

glm(response ~ predictor, family = "binomial")

glm(response ~ predictor, family = "poisson")

Page 51: Moving further

Categorical Data

Dichotomous/binary data: Logistic Regression

Count data: Poisson Regression

Page 52: Moving further

General structure

Linear Model: continuous ~ any type of variable

Logistic Regression: dichotomous ~ any type of variable

Poisson Regression: count ~ any type of variable

Page 53: Moving further

For the generalized linear mixed model…

… you use glmer() and specify the family:

lmer(…)
glmer(…, family = "poisson")
glmer(…, family = "binomial")

Page 54: Moving further

That’s it(for now)