Introduction to latent variable models - EIEF

Introduction to latent variable models

Lecture 2

Francesco Bartolucci
Department of Economics, Finance and Statistics

University of Perugia, IT

[email protected]

– Typeset by FoilTEX – 1


Outline

• Examples on the EM algorithm for finite mixture and latent class models

• Choice of the number of components/classes

• Computation of standard errors for the parameter estimates

• Item Response Theory models

• Dynamic versions of latent variable models for panel data

Examples on the EM algorithm for finite mixture and latent class models

Example on the EM algorithm for a finite mixture of Normal distributions

• A finite mixture of Normal distributions with common variance is considered

• Data consist of 500 observations simulated from a model with 2 components

• In order to select the number of components two criteria are commonly used:

Akaike Information Criterion: AIC = −2 ℓ(θ̂_k) + 2 × #param.

Bayesian Information Criterion: BIC = −2 ℓ(θ̂_k) + log(n) × #param.

• The second criterion is usually preferred (McLachlan & Peel, 2000)
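As a quick illustration, both criteria can be computed directly from a model's maximized log-likelihood; the log-likelihood values below are invented for illustration, not taken from the example (a k-component Normal mixture with common variance has k means + 1 variance + k − 1 weights = 2k free parameters):

```python
import numpy as np

def aic_bic(loglik, n_param, n_obs):
    """Return (AIC, BIC) for a model with maximized log-likelihood
    `loglik`, `n_param` free parameters and `n_obs` observations."""
    aic = -2.0 * loglik + 2.0 * n_param
    bic = -2.0 * loglik + np.log(n_obs) * n_param
    return aic, bic

# hypothetical maximized log-likelihoods for k = 1, 2, 3 components
for k, ll in [(1, -1450.2), (2, -1380.5), (3, -1377.9)]:
    print(k, aic_bic(ll, 2 * k, 500))
```

Since log(500) > 2, BIC penalizes extra parameters more heavily than AIC, which is why it tends to select more parsimonious models.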


Example on the EM algorithm for the LC model

• A latent class (LC) model for binary response variables is considered

• Data are collected on 216 subjects who responded to T = 4 items concerning social aspects (Goodman, 1974, Biometrika)

• Data may be represented by a 2⁴-dimensional vector of frequencies for all the response configurations

n = ( freq(0000), freq(0001), ..., freq(1111) )′ = ( 42, 23, ..., 20 )′

• Selection criteria AIC and BIC are used for the number of classes
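The frequency vector above can be tabulated from a raw binary data matrix; a minimal sketch, with a toy response matrix invented for illustration:

```python
import numpy as np

def pattern_frequencies(Y):
    """Tabulate the 2**T response configurations of a binary data
    matrix Y (n x T), in lexicographic order 00...0 to 11...1."""
    Y = np.asarray(Y)
    T = Y.shape[1]
    # read each response row as a binary number to index its cell
    idx = Y @ (2 ** np.arange(T - 1, -1, -1))
    return np.bincount(idx, minlength=2 ** T)

# toy data: 3 subjects, T = 2 items
print(pattern_frequencies([[0, 0], [1, 1], [0, 0]]))  # counts of 00, 01, 10, 11
```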


• For both finite mixture and LC models the likelihood may be multimodal

• A common strategy to overcome this problem is to try different starting values for the EM algorithm, which are randomly chosen

• In both cases the vector of probabilities π = {π_c} may be chosen to be proportional to a vector with elements drawn from U(0,1)

• For the finite mixture case we can draw each μ_c from N(ȳ, S) and let Σ = S

. ȳ: sample mean
. S: sample variance

• For the LC case we can draw every λtc from U(0,1)
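The multi-start strategy can be sketched for a univariate two-component mixture with common variance; the simulated data and the number of starts are invented for illustration, and the EM below follows the random initialization described above:

```python
import numpy as np

def em_mixture(y, k, rng, n_iter=200):
    """EM for a k-component univariate Normal mixture with common
    variance, using randomly chosen starting values."""
    y = np.asarray(y, float)
    n = len(y)
    ybar, s2 = y.mean(), y.var()
    pi = rng.random(k); pi /= pi.sum()          # weights prop. to U(0,1) draws
    mu = rng.normal(ybar, np.sqrt(s2), size=k)  # means drawn from N(ybar, s2)
    sig2 = s2                                   # common variance = sample variance
    for _ in range(n_iter):
        dens = np.exp(-0.5 * (y[:, None] - mu) ** 2 / sig2) \
               / np.sqrt(2 * np.pi * sig2)      # component densities, n x k
        joint = pi * dens
        loglik = np.log(joint.sum(1)).sum()
        post = joint / joint.sum(1, keepdims=True)   # E-step: posteriors
        nc = post.sum(0)                             # M-step
        pi = nc / n
        mu = (post * y[:, None]).sum(0) / nc
        sig2 = (post * (y[:, None] - mu) ** 2).sum() / n
    return loglik, pi, mu, sig2

# several random starts; keep the solution with the highest log-likelihood
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0, 1, 250), rng.normal(4, 1, 250)])
best = max((em_mixture(y, 2, rng) for _ in range(10)), key=lambda r: r[0])
print(best[2])  # estimated component means
```

Keeping the best of several runs is exactly the guard against multimodality: a single badly initialized run may converge to a minor local maximum.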


Latent regression model

• Two possible choices to include individual covariates:

1. on the measurement model, so that we have random intercepts (via a logit or probit parametrization):

λ_itc = p(y_it = 1 | u_i = ξ_c, X_i),

log[λ_itc / (1 − λ_itc)] = ξ_c + x_it′β,  i = 1, ..., n, t = 1, ..., T, c = 1, ..., k

2. on the model for the distribution of the latent variables (via a multinomial logit parameterization):

π_ic = p(u_i = ξ_c | X_i),  log(π_ic / π_i1) = x_i′β_c,  c = 2, ..., k

• Alternative parameterizations are possible with ordinal response variables or ordered latent classes
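A small sketch of parameterization 2: turning a covariate vector into class probabilities through the multinomial logit, with class 1 as reference; the covariate and coefficient values are invented for illustration:

```python
import numpy as np

def class_probabilities(x, Beta):
    """Multinomial-logit class probabilities p(u_i = xi_c | x_i).
    Beta holds one column of coefficients per class c = 2, ..., k;
    the linear predictor of the reference class 1 is fixed at 0."""
    eta = np.concatenate([[0.0], x @ Beta])  # linear predictors
    e = np.exp(eta - eta.max())              # stabilized softmax
    return e / e.sum()

# covariate vector (intercept, standardized age) and coefficients
# for classes 2 and 3 of a k = 3 class model (hypothetical values)
x = np.array([1.0, 0.5])
Beta = np.array([[0.2, -1.0],
                 [0.8,  0.3]])
p = class_probabilities(x, Beta)
print(p, p.sum())
```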


• The models based on the two extensions have a different interpretation:

1. the latent variables are used to account for the unobserved heterogeneity, and the model may then be seen as a discrete version of the logistic model with one random effect

2. the main interest is in a latent variable which is measured through the observable response variables (e.g. health status) and in how this latent variable depends on the covariates

• Only the M-step of the EM algorithm must be modified, by exploiting standard algorithms for the maximization of:

1. the weighted likelihood of a logit model

2. the likelihood of a multinomial logit model


Example on the EM algorithm for a latent regression model (type 2)

• Data about 1,093 elderly people, admitted in 2003 to 11 nursing homes in Umbria, who responded to 9 items about their health status:

Item %
1 [CC1] Does the patient show problems in recalling what recently happened (5 minutes)? 72.6
2 [CC2] Does the patient show problems in making decisions regarding tasks of daily life? 64.2
3 [CC3] Does the patient have problems in being understood? 43.9
4 [ADL1] Does the patient need support in moving to/from lying position, turning side to side and positioning body while in bed? 54.4
5 [ADL2] Does the patient need support in moving to/from bed, chair, wheelchair and standing position? 59.0
6 [ADL3] Does the patient need support for eating? 28.7
7 [ADL4] Does the patient need support for using the toilet room? 63.5
8 [SC1] Does the patient show presence of pressure ulcers? 15.4
9 [SC2] Does the patient show presence of other ulcers? 23.1


• Binary responses to items are coded so that 1 is a sign of bad health conditions

• The available covariates are:

. gender (0 = male, 1 = female)
. 11 dummies for the nursing homes
. age

• Many latent classes (k = 6) are selected through BIC; in order to have an easier interpretation of the classes, the constraint of monotonicity of the conditional probabilities should be used (ordered latent classes: λ_t1 ≤ ··· ≤ λ_tk, t = 1, ..., T)


Computation of the standard errors

• Differently from the Fisher-scoring and Newton-Raphson algorithms, the EM algorithm does not provide the information matrix of the incomplete data; this matrix allows us to obtain standard errors

• Many methods are available to obtain this matrix from the information matrix of the complete data that is used within the EM algorithm (McLachlan & Peel, 2000)

• A simple method has been used by Bartolucci & Farcomeni (2009, JASA); it is based on the fact that

s(θ̄) = ∂ℓ(θ)/∂θ |θ=θ̄ = ∂Q(θ|θ̄)/∂θ |θ=θ̄


• The score of the incomplete data at θ̄ is then equal to the score of the complete data (first derivative of the expected value of the complete data log-likelihood, computed at the same point θ̄)

• By computing (minus) the numerical derivative of s(θ) we obtain an approximate observed information matrix

Ĵ(θ̂) ≈ J(θ̂) = −∂²ℓ(θ)/∂θ∂θ′ |θ=θ̂

• The standard error for each estimate θ̂_j, se(θ̂_j), is then obtained as the square root of the corresponding diagonal element of Ĵ(θ̂)⁻¹
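The procedure can be sketched generically: differentiate the score numerically by central differences and invert the resulting matrix. The check below uses a Normal(μ, σ²) log-likelihood, where the observed information at the MLE is known in closed form; the data values are arbitrary:

```python
import numpy as np

def observed_information(score, theta, eps=1e-5):
    """Approximate J(theta) = -d s(theta)/d theta' by central
    differences of the score function."""
    p = len(theta)
    J = np.zeros((p, p))
    for j in range(p):
        step = np.zeros(p); step[j] = eps
        J[:, j] = -(score(theta + step) - score(theta - step)) / (2 * eps)
    return (J + J.T) / 2  # symmetrize away numerical noise

y = np.array([1.2, -0.3, 0.5, 2.1, 0.9])
n, mu, s2 = len(y), y.mean(), y.var()  # MLEs of mu and sigma^2

def score(theta):
    m, v = theta
    return np.array([(y - m).sum() / v,
                     -n / (2 * v) + ((y - m) ** 2).sum() / (2 * v ** 2)])

J = observed_information(score, np.array([mu, s2]))
se = np.sqrt(np.diag(np.linalg.inv(J)))
print(se)  # se of the mean should be close to sqrt(s2 / n)
```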


Item Response Theory models

Item Response Theory (IRT) models

• IRT models are tailored to the analysis of data arising from the administration of a questionnaire made of a series of items which measure a common (continuous) latent trait

• The main application of these models is then for educational assessment, where the latent trait corresponds to a certain type of ability of an examinee

• Main references: Fischer & Molenaar (1995), Hambleton & Swaminathan (1996), van der Linden & Hambleton (1997), Baker & Kim (2004)


• Main IRT assumptions:

. unidimensionality: for each subject i, the responses to the T items depend on the same latent variable u_i

. local independence: for each subject i, the responses to the T items are independent given u_i

. monotonicity: the probability p_t(u_i) = p(y_it = 1 | u_i) is a monotonic increasing function of u_i

• Most used Item Response Functions (IRF) for p_t(u_i):

. one-parameter logistic (1PL, Rasch, 1960):

p_t(u_i) = e^(u_i − β_t) / [1 + e^(u_i − β_t)]

∗ β_t: difficulty level of item t


. two-parameter logistic (2PL, Birnbaum, 1968):

p_t(u_i) = e^(α_t(u_i − β_t)) / [1 + e^(α_t(u_i − β_t))]

∗ α_t: discriminating index of item t, measuring how strongly the probability of success depends on the ability level

. three-parameter logistic (3PL, Birnbaum, 1968):

p_t(u_i) = γ_t + (1 − γ_t) e^(α_t(u_i − β_t)) / [1 + e^(α_t(u_i − β_t))]

∗ γ_t: guess parameter, corresponding to the probability of success for a subject with ability level tending to −∞
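The three IRFs can be written down directly from the formulas above; the parameter values in the check are arbitrary:

```python
import numpy as np

def irf_1pl(u, beta):
    """Rasch / 1PL item response function."""
    return 1.0 / (1.0 + np.exp(-(u - beta)))

def irf_2pl(u, alpha, beta):
    """2PL: alpha scales how sharply the success probability rises with u."""
    return 1.0 / (1.0 + np.exp(-alpha * (u - beta)))

def irf_3pl(u, alpha, beta, gamma):
    """3PL: gamma is the lower asymptote (guessing probability)."""
    return gamma + (1.0 - gamma) * irf_2pl(u, alpha, beta)

# at u = beta the 1PL/2PL curves pass through 1/2,
# and the 3PL curve through gamma + (1 - gamma)/2
print(irf_1pl(0.0, 0.0), irf_2pl(0.0, 2.0, 0.0), irf_3pl(0.0, 2.0, 0.0, 0.2))
```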


• Most used estimation methods:

. Joint Maximum Likelihood (JML): fixed-parameters approach which consists of maximizing the likelihood of the model with respect to the ability and item parameters jointly

. Conditional Maximum Likelihood (CML): applicable only to estimate the difficulty parameters of the Rasch model. It is based on the maximization of the conditional likelihood of these parameters given a set of sufficient statistics for the ability parameters

. Marginal Maximum Likelihood (MML): random-parameters approach which consists of maximizing the marginal likelihood corresponding to the manifest probability of the observed responses


Joint maximum likelihood method

• Local independence implies:

p(y_i | u_i) = ∏_t p_t(u_i)^(y_it) [1 − p_t(u_i)]^(1−y_it)

• The joint likelihood is then

L_J(θ) = ∏_i p(y_i | u_i) = ∏_i ∏_t p_t(u_i)^(y_it) [1 − p_t(u_i)]^(1−y_it)

. θ: parameter vector which contains the item parameters and the ability parameters (u_i)

• L_J(θ) is maximized by a standard Newton-Raphson algorithm (attention must be paid to the implementation with many subjects)

• The method is simple to apply, but it is known to lead to an inconsistent estimator
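For the Rasch IRF, log L_J(θ) can be evaluated directly; a minimal sketch, with toy data invented for illustration (the Newton-Raphson maximization itself is omitted):

```python
import numpy as np

def joint_loglik(Y, u, beta):
    """log L_J for the Rasch model: Y is the n x T binary data matrix,
    u the n ability parameters, beta the T difficulty parameters."""
    Y = np.asarray(Y, float)
    eta = u[:, None] - beta[None, :]          # logit of p_t(u_i), n x T
    # y*log(p) + (1-y)*log(1-p) simplifies to y*eta - log(1 + e^eta)
    return (Y * eta - np.log1p(np.exp(eta))).sum()

Y = np.array([[1, 0, 1], [1, 1, 0]])
print(joint_loglik(Y, u=np.array([0.2, 1.0]), beta=np.array([0.0, 0.5, 1.5])))
```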


Conditional maximum likelihood method

• The method exploits the conditional likelihood of y_i given y_i+ = Σ_t y_it:

p(y_i | y_i+, u_i) = p(y_i | u_i) / p(y_i+ | u_i) = p(y_i | y_i+)

which does not depend on u_i for the Rasch model

• The conditional likelihood is then

L_C(β) = ∏_i p(y_i | y_i+)

• L_C(β) is maximized by a Newton-Raphson algorithm, which also produces standard errors (attention must be paid to the implementation with many items)
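For the Rasch model the conditional probabilities have a closed form in terms of the elementary symmetric functions of ε_t = e^(−β_t): p(y | y+ = r) = e^(−Σ_t y_t β_t) / γ_r(ε). A brute-force sketch (practical implementations use recursive summation, since enumerating all patterns is exponential in T; the β values are arbitrary):

```python
import numpy as np
from itertools import combinations

def esf(eps, r):
    """Elementary symmetric function of order r of eps_1, ..., eps_T."""
    return sum(np.prod([eps[t] for t in c])
               for c in combinations(range(len(eps)), r))

def cond_prob(y, beta):
    """p(y | y+) under the Rasch model: free of the ability parameter."""
    eps = np.exp(-np.asarray(beta, float))
    r = int(sum(y))
    return np.prod(eps ** np.asarray(y)) / esf(eps, r)

beta = [0.0, 0.5, 1.0]
# the probabilities of all patterns with y+ = 1 sum to one
pats = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
print(sum(cond_prob(y, beta) for y in pats))
```

Note that, conditionally on y+ = 1, success on the easiest item (smallest β_t) is the most probable pattern, as it should be.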


• The method leads to a consistent estimator, but only for the difficulty parameters in β


Marginal maximum likelihood method

• The method exploits the manifest distribution of y_i:

p(y_i) = ∫ p(y_i | u_i) p(u_i) du_i

• The marginal likelihood is then L_M(θ) = ∏_i p(y_i)

. θ: parameter vector which contains the item parameters and the parameters of the latent distribution

• The distribution of u_i may be continuous or discrete; the latter case is seen as a semiparametric approach (Lindsay et al., 1991, JASA)

• Maximization of L_M(θ) is carried out via a Newton-Raphson algorithm (typically with a continuous latent distribution) or the EM algorithm (typically with a discrete latent distribution)
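With a discrete latent distribution the integral becomes a finite sum over support points ξ_c with masses π_c. A sketch of the marginal log-likelihood for the latent class Rasch model; the support points, weights, and toy data below are invented for illustration:

```python
import numpy as np

def marginal_loglik(Y, beta, xi, pi):
    """Marginal log-likelihood of the latent class Rasch model:
    the integral over the ability is replaced by a finite sum over
    support points xi_c with weights pi_c."""
    Y = np.asarray(Y, float)
    manifest = np.zeros(Y.shape[0])
    for x, w in zip(xi, pi):
        p = 1.0 / (1.0 + np.exp(-(x - np.asarray(beta))))   # p_t(xi_c)
        lik_c = np.prod(p ** Y * (1 - p) ** (1 - Y), axis=1)
        manifest += w * lik_c                                # p(y_i)
    return np.log(manifest).sum()

# toy check: 2 subjects, 2 items, 2 latent classes
Y = [[1, 0], [1, 1]]
ll = marginal_loglik(Y, beta=np.array([0.0, 1.0]),
                     xi=np.array([-0.6, 2.4]), pi=np.array([0.4, 0.6]))
print(ll)
```

Since the manifest probabilities of all 2^T response patterns sum to one, exponentiating the per-pattern log-likelihoods gives a quick sanity check of the implementation.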


Example

• Application based on a dataset provided by the Educational Testing Service (Bartolucci & Forcina, 2005, Psychometrika)

• Data concern responses of 1,510 students to 12 items on Math within the National Assessment of Educational Progress 1996 project:

1 Round to thousand place

2 Write fraction that represents shaded region

3 Multiply two negative integers

4 Reason about sample space (number correct)

5 Find amount of restaurant tip

6 Identify representative sample

7 Read dials on a meter

8 Find (x, y) solution of linear equation

9 Translate words to symbols

10 Find number of diagonals in polygon from a vertex

11 Find perimeter (quadrilateral)

12 Reason about betweenness


        estimate   s.e.   95%-conf.int.
β1       0.000     –        –       –
β2      -0.051   0.097   -0.241   0.138
β3       0.755   0.093    0.574   0.936
β4      -1.140   0.111   -1.357  -0.923
β5       1.672   0.092    1.491   1.853
β6       0.014   0.096   -0.175   0.202
β7       0.724   0.093    0.542   0.905
β8       1.305   0.092    1.125   1.485
β9       0.365   0.094    0.181   0.549
β10      0.574   0.093    0.391   0.756
β11      2.697   0.098    2.505   2.888
β12      2.751   0.098    2.558   2.944
u1      -0.080   0.674   -1.400   1.241
u2       1.193   0.662   -0.104   2.491
u3       1.193   0.662   -0.104   2.491
u4       0.770   0.649   -0.501   2.041
u5      -0.080   0.674   -1.400   1.241
...        ...     ...      ...     ...
u1510    2.158   0.750    0.689   3.626

Table 1: JML estimates of ability and item parameters under the Rasch model


        estimate   s.e.   95%-conf.int.
β1       0.000     –        –       –
β2      -0.047   0.092   -0.229   0.134
β3       0.691   0.088    0.517   0.864
β4      -1.040   0.106   -1.247  -0.833
β5       1.521   0.088    1.349   1.693
β6       0.013   0.092   -0.168   0.193
β7       0.662   0.089    0.489   0.836
β8       1.191   0.088    1.019   1.363
β9       0.334   0.090    0.158   0.511
β10      0.525   0.089    0.351   0.700
β11      2.427   0.092    2.246   2.607
β12      2.474   0.093    2.292   2.655

Table 2: CML estimates of the item parameters of the Rasch model


#classes (k)   ℓ_M(θ̂_k)   #parameters    BIC
     1          -11009         12        22106
     2          -10242         14        20586
     3          -10166         16        20450
     4          -10163         18        20458

Table 3: Selection of the number of classes for the latent class Rasch model (model with one discrete latent variable)

class   ability   probability
  1     -0.645       0.165
  2      0.970       0.457
  3      2.432       0.378

Table 4: MML estimates of the ability parameters of the Rasch model with 3 latent classes

• Estimates of the item parameters are very similar to those obtained with the CML approach
