
IJRRAS 2 (2) ● February 2010 Jafari ● Optimal Design for a Multinomial Logit Model


OPTIMAL DESIGN FOR A THREE-LEVEL NESTED MULTINOMIAL LOGIT MODEL IN DISCRETE CHOICE EXPERIMENTS

Habib Jafari

Institute for Mathematical Stochastics (IMST),

OvG University, Universitätsplatz 2, 39106 Magdeburg-Germany

ABSTRACT

In this paper we calculate the information matrix for a three-level nested multinomial logit model and derive the locally D-optimal design to estimate the parameter vector.

Keywords: Conjoint Analysis, Three-level Nested MNL Model, Discrete Choice Experiment, D-Optimal Criterion.

1. INTRODUCTION

The multinomial logit (MNL) model is the most widely used discrete choice model because of its closed-form choice probabilities and its consistency with random utility maximization (RUM). However, the MNL model suffers from the restrictive independence from irrelevant alternatives (IIA) property, which states that the ratio of two choice probabilities is independent of the other alternatives in the model. This implies that a change in an attribute of one alternative has the same proportional impact on the probability of each of the other alternatives being chosen. The nested multinomial logit (NMNL) model relaxes the IIA property by dividing the alternatives into subsets, or nests, so that the IIA assumption holds within each nest but not for alternatives in different nests. An analogous property still holds at the level of the nests, namely independence from irrelevant nests (IIN). As opposed to the more flexible multinomial probit and mixed logit models, the NMNL model has closed-form choice probabilities, so it can be estimated without resorting to simulation methods. Because of its simplicity and because it allows for a variety of substitution patterns, the NMNL model remains the most common extension of the MNL model in applied work. Daly and Zachary (1978) and McFadden (1978) have shown that the two-level NMNL model is consistent with RUM under the condition that the dissimilarity parameters are constrained within the unit interval (the DZM condition). In many practical applications, however, this condition has not been met. Börsch-Supan (1990) argues that the DZM condition is unnecessarily strong given that the NMNL model should be viewed as a local approximation. Building on the work of Börsch-Supan, Herriges and Kling (1996) derive the necessary conditions for local consistency of two-level NMNL models with random utility maximization: the two-level NMNL model is consistent with RUM when the dissimilarity parameters lie in the interval [0,1), and it can also be consistent when the dissimilarity parameters are greater than one. Therefore, the two-level NMNL model is consistent with RUM for some range of the characteristics of the attributes. A two-level NMNL model is not consistent with RUM when a dissimilarity parameter is less than zero.

In some two-level NMNL models, the IIA property may not hold within some or all of the nests. In this situation, we can divide the alternatives of these nests into several subsets, called sub-nests. This kind of NL model is termed the three-level NMNL model, since it involves three kinds of choice probabilities, which are discussed in Section 2.

The rest of this paper is structured as follows. Section 2 discusses the model specification of three-level NMNL models. Section 3 presents the information matrix for a three-level NMNL model (with two nests). The D-optimal criterion is introduced in Subsection 3.1.

2. MODEL SPECIFICATIONS

Following Gil-Molton and Hole (2004), let us consider a sample of $I$ individuals with $J$ discrete possible alternatives (in choice set $C$), which are produced by $K$ attributes, each with $L_k$ levels. In this paper, for the three-level NMNL model, the total number of alternatives is denoted by $J=\sum_{m=1}^{M}\sum_{h=1}^{H_m}J_{hm}$, where $J_{hm}$ is the number of alternatives in sub-nest $h$ of nest $m$. In this case, there are $S$ choice sets, each containing $J_s$ alternatives, to fit the model, where if $C_s$ is a choice set with $J_s$ alternatives then $C=\bigcup_{s=1}^{S}C_s$. In such a model, the total number of alternatives in choice set $C$ is $\prod_{k=1}^{K}L_k$ with regard to the attributes and their levels. The model is based on the selection of the alternative with the highest utility.

The utility that individual $i$ derives from choosing alternative $j$ in the three-level NMNL model (i.e. in choice set $C_s$) is denoted by $U_{ijhms}$. This utility is partitioned into a systematic component, $v_{ijhms}$, and a random component, $\varepsilon_{ijhms}$ ($s$ denotes the choice set $C_s$). Because the conditions are the same for all individuals, we drop the index $i$ and write:

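(1) $U_{jhms}=U_{j|hms}+U_{h|ms}+U_{ms},$

so that:

$U_{j|hms}=v_{j|hms}+\varepsilon_{j|hms},\quad U_{h|ms}=v_{h|ms}+\varepsilon_{h|ms},\quad U_{ms}=v_{ms}+\varepsilon_{ms},$

where the $\varepsilon_{j|hms}$ have an EVD (Extreme Value Distribution, type II) with variance $\sigma_{hm}^{2}$ (they are correlated within the same sub-nest, $\rho_{hm}=\mathrm{corr}(\varepsilon_{j|hms},\varepsilon_{j'|hms})$); the distribution of $\varepsilon_{h|ms}$ is such that the variable $\max_{j\in C_{hms}}U_{ij|hms}$ has variance $\sigma_{m}^{2}$ ($C_{hms}$ denotes a choice set ($s$) which includes the alternatives in sub-nest $h$ of nest $m$, and $\rho_{m}=\mathrm{corr}(\varepsilon_{h|ms},\varepsilon_{h'|ms})$); and the distribution of $\varepsilon_{ms}$ is such that the variable $\max_{h\in H_m}U_{ih|ms}$ has an EVD (type II) with variance $\sigma^{2}$, so that $\mathrm{corr}(\varepsilon_{ims},\varepsilon_{im's})=0$, where $H_m$ denotes the number of sub-nests in nest $m$. Naturally, these three error terms ($\varepsilon_{ij|hms}$, $\varepsilon_{ih|ms}$ and $\varepsilon_{ims}$) are independent. Now, with consideration to utility (1), $v_{j|hms}$ can be written as a regression function:

$v_{j|hms}=\mathbf{x}_{j|hms}^{T}\boldsymbol\beta=\sum_{k=1}^{K}\mathbf{x}_{j|hmks}^{T}\boldsymbol\beta_{k};\quad \boldsymbol\beta_{k}^{T}=(\beta_{k,1},\dots,\beta_{k,L_k}).$

According to effect-type coding:

$\beta_{k,L_k}=-\sum_{\ell=1}^{L_k-1}\beta_{k,\ell},\quad\text{i.e.}\quad \sum_{\ell=1}^{L_k}\beta_{k,\ell}=0.$

In this situation, $\boldsymbol\beta_{k}^{T}$ is rewritten as $\boldsymbol\beta_{k}^{T}=(\beta_{k,1},\dots,\beta_{k,L_k-1})$. Similarly, we have:

$\mathbf{x}_{j|hms}^{T}=(\mathbf{x}_{j|hm1s}^{T},\dots,\mathbf{x}_{j|hmks}^{T},\dots,\mathbf{x}_{j|hmKs}^{T});\quad \mathbf{x}_{j|hmks}^{T}=(x_{j|hmk,1,s},x_{j|hmk,2,s},\dots,x_{j|hmk,L_k-1,s}),$

where $\boldsymbol\beta_{k}^{T}$ and $\mathbf{x}_{j|hmks}^{T}$ denote the characteristics of attribute $k$ related to the choice of alternative $j$ by individual $i$ (index ignored) in sub-nest $h$ of nest $m$ according to choice set $C_s$. Now, according to the previous assumptions we have: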

,',0

;',),( '

mm

mmUUCov

m

smmsUs where

',

;',

', mm

hh

mhhm

hm

m ,

hmhms JmhmhmJhmhmhmjjhmhmUUCov JI )()1(),( 2222' ,

hmJmmmhjjhmmhhmUUCov J)(),( 22''', ,

where $i=1,2,\dots,I$, $j=1,2,\dots,J_{hms}$, $h=1,2,\dots,H_m$; $m=1,2,\dots,M$, $\mathbf{I}_r$ is the identity matrix of size $r$ and $\mathbf{J}_r=\mathbf{1}_r\mathbf{1}_r^{T}$.
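The effect-type coding used in this section can be illustrated with a short Python sketch. The helper names `effect_code` and `design_row` are hypothetical and introduced only for this illustration; the sketch assumes that the levels of each attribute are numbered 1, ..., L_k and that the last level is the one expressed through the constraint (coded as -1 in every column).

```python
import numpy as np

def effect_code(level: int, n_levels: int) -> np.ndarray:
    """Effect-code one attribute level (1-based) into n_levels - 1 columns.

    Levels 1..L-1 map to unit vectors; the last level L maps to a vector of
    -1 entries, which encodes the constraint that the part-worths sum to zero.
    """
    if level not in range(1, n_levels + 1):
        raise ValueError("level must lie in 1..n_levels")
    if level == n_levels:
        return -np.ones(n_levels - 1)
    x = np.zeros(n_levels - 1)
    x[level - 1] = 1.0
    return x

def design_row(levels, n_levels):
    """Concatenate the effect-coded blocks of all K attributes of one alternative."""
    return np.concatenate([effect_code(l, L) for l, L in zip(levels, n_levels)])

# Hypothetical example: three two-level attributes, alternative with levels (1, 2, 1).
row = design_row([1, 2, 1], [2, 2, 2])
```

For two-level attributes this reduces to the familiar +1/-1 coding, so each alternative is described by a vector of length K.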

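Now, with consideration to the utility (1), the following observation variables can be introduced:

$Y_{j|hms}=\begin{cases}1, & U_{j|hms}=\max_{j'\in C_{hms}}U_{j'|hms};\\ 0, & \text{otherwise},\end{cases}$

$Y_{h|ms}=\begin{cases}1, & U_{h|ms}=\max_{h'\in H_m}U_{h'|ms};\\ 0, & \text{otherwise},\end{cases}$

$Y_{ms}=\begin{cases}1, & U_{ms}=\max_{m'}U_{m's};\\ 0, & \text{otherwise}.\end{cases}$

Thus, when the variables $Y_{j|hms}$, $Y_{h|ms}$ and $Y_{ms}$ are independent, $Y_{jhms}=Y_{j|hms}\,Y_{h|ms}\,Y_{ms}$ and

(2) $p_{jhms}=p_{j|hms}\,p_{h|ms}\,p_{ms},$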

where $p_{j|hms}=\Pr(Y_{j|hms}=1)$ is the conditional probability of choosing alternative $j$, given that sub-nest $h$ and nest $m$ have been chosen, $p_{h|ms}=\Pr(Y_{h|ms}=1)$ is the conditional probability of choosing sub-nest $h$ when nest $m$ is chosen, $p_{ms}=\Pr(Y_{ms}=1)$ is the marginal probability of choosing nest $m$ (with respect to choice set $s$), and $\Pr$ denotes the probability of an event. Based on the distribution of the error terms of the utility, these probabilities can be calculated as (McFadden (1978)):

$p_{j|hms}=\dfrac{e^{v_{j|hms}/\lambda_{hm}}}{\sum_{j'=1}^{J_{hms}}e^{v_{j'|hms}/\lambda_{hm}}},\qquad p_{h|ms}=\dfrac{e^{(\lambda_{hm}/\mu_m)IV_{hms}}}{\sum_{h'=1}^{H_m}e^{(\lambda_{h'm}/\mu_m)IV_{h'ms}}},\qquad p_{ms}=\dfrac{e^{\mu_m IV_{ms}}}{\sum_{m'=1}^{M}e^{\mu_{m'}IV_{m's}}},$

where

$IV_{ms}=E\left(\max_{h\in H_m}U_{ih|ms}\right)=\ln\sum_{h=1}^{H_m}e^{(\lambda_{hm}/\mu_m)IV_{hms}},\qquad IV_{hms}=E\left(\max_{j\in C_{hms}}U_{j|hms}\right)=\ln\sum_{j=1}^{J_{hms}}e^{v_{j|hms}/\lambda_{hm}}.$
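The three choice probabilities and the two inclusive values can be evaluated directly. The following Python sketch is one possible implementation under the parametrisation used here, with $\lambda_{hm}$ the sub-nest and $\mu_m$ the nest dissimilarity parameters; the function name `three_level_probs` and the nested-list data layout are assumptions made for this illustration only.

```python
import numpy as np

def three_level_probs(v, lam, mu):
    """Joint probabilities p_jhms = p_{j|hms} * p_{h|ms} * p_ms for one choice set.

    v[m][h]   : systematic utilities of the alternatives in sub-nest h of nest m
    lam[m][h] : sub-nest dissimilarity parameter lambda_hm
    mu[m]     : nest dissimilarity parameter mu_m
    Returns p[m][h], an array with one joint probability per alternative.
    """
    M = len(v)
    p_j, p_h, iv_nest = [], [], []
    for m in range(M):
        cond, w = [], []
        for h, v_hm in enumerate(v[m]):
            e = np.exp(np.asarray(v_hm, dtype=float) / lam[m][h])
            cond.append(e / e.sum())                      # p_{j|hms}
            iv_hm = np.log(e.sum())                       # IV_hms
            w.append(np.exp(lam[m][h] / mu[m] * iv_hm))   # exp((lambda_hm/mu_m) IV_hms)
        w = np.array(w)
        p_j.append(cond)
        p_h.append(w / w.sum())                           # p_{h|ms}
        iv_nest.append(np.log(w.sum()))                   # IV_ms
    top = np.exp(np.asarray(mu, dtype=float) * np.array(iv_nest))
    p_m = top / top.sum()                                 # p_ms
    return [[p_m[m] * p_h[m][h] * p_j[m][h] for h in range(len(v[m]))]
            for m in range(M)]

# Hypothetical utilities: nest 1 has two sub-nests, nest 2 has a single one.
v = [[[0.2, -0.1], [0.5, 0.0]], [[0.3, -0.4]]]
p = three_level_probs(v, lam=[[0.5, 0.6], [0.8]], mu=[0.7, 0.8])
```

When all dissimilarity parameters equal one, the product collapses to the ordinary MNL probabilities, which gives a convenient sanity check for an implementation of this kind.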

3. INFORMATION MATRIX

There are several criteria for obtaining an optimal design, such as the D-, A- and G-criteria. In this paper, we use the D-optimal criterion (a function of the determinant of the information matrix) to obtain an optimal design. Thus, we must first obtain the information matrix for the three-level nested logit model. This requires the log-likelihood function, defined for the choice set $C_s$ and one individual as follows:

$\ell(\boldsymbol\theta;C_s)=\sum_{m=1}^{M}\sum_{h=1}^{H_m}\sum_{j=1}^{J_{hms}}y_{jhms}\ln(p_{jhms}),$

where $J_{hms}$ denotes the number of alternatives in sub-nest $h$ of nest $m$ corresponding to choice set $C_s$.

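Based on the definition of the information matrix (w.r.t. choice set $C_s$):

(3) $\mathbf{I}(\boldsymbol\theta;C_s)=-E\left[\dfrac{\partial^{2}\ell(\boldsymbol\theta;C_s)}{\partial\boldsymbol\theta\,\partial\boldsymbol\theta^{T}}\right]=-\sum_{m,h,j}p_{j|hms}\,p_{h|ms}\,p_{ms}\left[\dfrac{\partial^{2}\ln p_{j|hms}}{\partial\boldsymbol\theta\,\partial\boldsymbol\theta^{T}}+\dfrac{\partial^{2}\ln p_{h|ms}}{\partial\boldsymbol\theta\,\partial\boldsymbol\theta^{T}}+\dfrac{\partial^{2}\ln p_{ms}}{\partial\boldsymbol\theta\,\partial\boldsymbol\theta^{T}}\right],$

where:

(4) $\mathbf{I}(\boldsymbol\theta;C_s)=\begin{pmatrix}\mathbf{I}_{\beta\beta}&\mathbf{I}_{\beta\mu}&\mathbf{I}_{\beta\lambda}\\ \mathbf{I}_{\beta\mu}^{T}&\mathbf{I}_{\mu\mu}&\mathbf{I}_{\mu\lambda}\\ \mathbf{I}_{\beta\lambda}^{T}&\mathbf{I}_{\mu\lambda}^{T}&\mathbf{I}_{\lambda\lambda}\end{pmatrix}$

is the information matrix of the choice set $C_s$ and $\boldsymbol\theta$ is the full parameter vector, so that $\boldsymbol\theta^{T}=(\boldsymbol\beta^{T},\boldsymbol\mu^{T},\boldsymbol\lambda^{T})$ with $\boldsymbol\beta^{T}=(\boldsymbol\beta_{1}^{T},\boldsymbol\beta_{2}^{T},\dots,\boldsymbol\beta_{K}^{T})$; $\boldsymbol\beta_{k}^{T}=(\beta_{k,1},\dots,\beta_{k,L_k-1})$, $\boldsymbol\mu^{T}=(\mu_1,\mu_2,\dots,\mu_M)$ and $\boldsymbol\lambda^{T}=(\boldsymbol\lambda_{1}^{T},\boldsymbol\lambda_{2}^{T},\dots,\boldsymbol\lambda_{M}^{T})$; $\boldsymbol\lambda_{m}^{T}=(\lambda_{1m},\lambda_{2m},\dots,\lambda_{H_m m})$; $m=1,2,\dots,M$. This means that the parameter $\beta_{k,\ell}$ is related to the $\ell$th level of attribute $k$, $\mu_m$ is the dissimilarity parameter of nest $m$, and $\boldsymbol\lambda_{m}^{T}$ is the vector of dissimilarity parameters of nest $m$, where $\lambda_{hm}$ denotes the dissimilarity parameter of sub-nest $h$ in nest $m$. Thus, the number of parameters in the three-level NMNL model is:

$q=\underbrace{\sum_{k=1}^{K}(L_k-1)}_{\text{part-worth}}+\underbrace{\sum_{m=1}^{M}H_m}_{\text{sub-nests}}+\underbrace{M}_{\text{nests}}=q_1+q_2+M,$

where $q_1$ is the number of part-worth parameters, $q_2$ is the number of dissimilarity parameters of the sub-nests and $M$ is the number of dissimilarity parameters of the nests. Hence, the information matrix (4) is a symmetric positive semi-definite $q\times q$ matrix.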

In order to fit the three-level NMNL model, let us consider the following experiments:

(5) $\sum_{m=1}^{M}\sum_{h=1}^{H_m}J_{hm}\;/\;J_s\;/\;S;\qquad S\ge q_1+q_2+M,$

where $\sum_{m=1}^{M}\sum_{h=1}^{H_m}J_{hm}$ denotes the total number of alternatives in the population and $J_s=\sum_{m=1}^{M}\sum_{h=1}^{H_m}J_{hms}$ denotes the number of alternatives in choice set $s$, selected randomly from the population. In particular, suppose that $J_s=\mathrm{J}$ for all $s$; then there are $S$ choice sets, each with $\mathrm{J}$ alternatives. However, based on (5), $\tilde S\le S$ choice sets can be considered instead of $S$ (in reality the number of choice sets $S$ increases dramatically when the number of attributes and their levels increase, so $S$ must often be reduced to $\tilde S$ by employing a suitable technique; see Graßhoff et al. (2004)). Also, the number of alternatives selected from each sub-nest may vary. Thus, different classes can be used in order to obtain a sample of size $\mathrm{J}$ from the population:

(6) $S=\sum_{n=1}^{N}S_n,\qquad S_n=\binom{J_{11}}{J_{n11}}\cdots\binom{J_{H_1 1}}{J_{nH_1 1}}\cdots\binom{J_{hm}}{J_{nhm}}\cdots\binom{J_{1M}}{J_{n1M}}\cdots\binom{J_{H_M M}}{J_{nH_M M}},$

where $\sum_{m=1}^{M}\sum_{h=1}^{H_m}J_{nhm}=\mathrm{J}$; $n\in\{1,\dots,N\}$, and $S_n$ is the number of choice sets in class $n$, each including $\mathrm{J}$ alternatives.

Based on class $n$, $J_s$ can be rewritten as $J_{ns}=\sum_{m=1}^{M}\sum_{h=1}^{H_m}J_{nhms}$, where $J_{ns}=\mathrm{J}$ for all $n$ and all choice sets $s$ of class $n$, and $J_{nhms}=J_{nhms'}$ for any two choice sets $s\ne s'$ of the same class, but $J_{nhms}$ and $J_{n'hms'}$ (for a different class and a different choice set) may or may not be equal. To reduce the total number of choice sets ($S$) to a reasonable number ($\tilde S$), we reduce $S_n$ to $\tilde S_n$ in each class, where $\tilde S_n\ge q$ (avoiding a singular information matrix), and then consider $\sum_{m=1}^{M}\sum_{h=1}^{H_m}J_{hm}\,/\,\mathrm{J}\,/\,\tilde S_n$; $\tilde S_n\ge q$; $n\in\{1,\dots,N\}$ instead of (5). This involves choosing $\tilde S_n$ choice sets, each with $\mathrm{J}$ alternatives, in each class. Depending on the type of model, it is possible that $S_n$ is smaller than $q$ for every class $n$. In such a case, and in order to avoid a singular information matrix, we can combine the classes ($S_n$) to create a new design. The information matrix related to each choice set is obtained from (4).
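The class structure behind (6) can be made concrete with a short enumeration. The Python sketch below lists the classes for given sub-nest sizes and counts the choice sets in each class; the function names are hypothetical, and the rule that every sub-nest contributes at least one alternative (`min_per_group=1`) is an assumption of this sketch rather than something stated in the text.

```python
from itertools import combinations, product
from math import comb

def classes(pop_sizes, J, min_per_group=1):
    """All allocations (J_n11, ..., J_nHM) of J alternatives over the sub-nests."""
    ranges = [range(min_per_group, n + 1) for n in pop_sizes]
    return [c for c in product(*ranges) if sum(c) == J]

def n_choice_sets(pop_sizes, cls):
    """S_n: product of binomial coefficients, one factor per sub-nest."""
    out = 1
    for n, k in zip(pop_sizes, cls):
        out *= comb(n, k)
    return out

def choice_sets(groups, cls):
    """Enumerate the choice sets of one class; groups holds the alternative labels."""
    per_group = [list(combinations(g, k)) for g, k in zip(groups, cls)]
    return list(product(*per_group))

# Hypothetical population: three groups of two alternatives, choice sets of size 5.
sizes = (2, 2, 2)
all_classes = classes(sizes, 5)
S = sum(n_choice_sets(sizes, c) for c in all_classes)
```

With these sizes the enumeration yields three classes with two choice sets each, so six choice sets in total.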

Table 1: Three-level NMNL model with two nests

First Nest (1)                        Second Nest (2)
Sub-nest (1)       Sub-nest (2)
$J_{11}$           $J_{21}$           $J_{2}$


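Lemma 1: The information matrix related to a three-level nested logit model (choice set $C_s$) with two nests, where the first nest has two sub-nests with $J_{11s}$ and $J_{21s}$ alternatives and the second nest has $J_{2s}$ alternatives (Table 1 denotes a population with $J_{11}+J_{21}+J_{2}$ alternatives), is calculated as follows:

$\mathbf{I}(\boldsymbol\theta;C_s)=\begin{pmatrix}\mathbf{I}_{11s}&\mathbf{I}_{12s}&\mathbf{I}_{13s}&\mathbf{I}_{14s}&\mathbf{I}_{15s}\\ \mathbf{I}_{12s}^{T}&I_{22s}&I_{23s}&I_{24s}&I_{25s}\\ \mathbf{I}_{13s}^{T}&I_{23s}&I_{33s}&I_{34s}&I_{35s}\\ \mathbf{I}_{14s}^{T}&I_{24s}&I_{34s}&I_{44s}&I_{45s}\\ \mathbf{I}_{15s}^{T}&I_{25s}&I_{35s}&I_{45s}&I_{55s}\end{pmatrix},$

where $\boldsymbol\theta^{T}=(\boldsymbol\beta^{T},\mu_1,\mu_2,\lambda_{11},\lambda_{21})$. For simplicity, we suppose that $\beta_{1,1}=\beta_1,\dots,\beta_{K,L_K-1}=\beta_{q_1}$ and $\lambda_{11}=\lambda_1$, $\lambda_{21}=\lambda_2$; then $\boldsymbol\theta^{T}=(\beta_1,\dots,\beta_{q_1},\mu_1,\mu_2,\lambda_1,\lambda_2)$ (see Appendix AI).

To fit this model (Table 1), examine the following experiments:

$J_{11}+J_{21}+J_{2}\;/\;\mathrm{J}\;/\;\tilde S_n;\qquad q\le\tilde S_n\le S_n,$

where $\mathrm{J}=J_{n11}+J_{n21}+J_{n2}$ and:

$S_n=\binom{J_{11}}{J_{n11}}\binom{J_{21}}{J_{n21}}\binom{J_{2}}{J_{n2}};\qquad n=1,2,\dots,N.$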

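Corollary 1: To obtain a locally optimal design at $\boldsymbol\beta=\mathbf{0}$, the quantities entering the information matrix of Lemma 1 take the following form (see Appendix AI):

$\mathbf{B}_{1|1s}=\dfrac{1}{J_{11s}}\mathbf{X}_{1|1s}^{T}\mathbf{I}_{J_{11s}}\mathbf{X}_{1|1s},\qquad \mathbf{A}_{1|1s}=\dfrac{1}{J_{11s}}\mathbf{X}_{1|1s}^{T}\mathbf{1}_{J_{11s}},$

$\mathbf{B}_{2|1s}=\dfrac{1}{J_{21s}}\mathbf{X}_{2|1s}^{T}\mathbf{I}_{J_{21s}}\mathbf{X}_{2|1s},\qquad \mathbf{A}_{2|1s}=\dfrac{1}{J_{21s}}\mathbf{X}_{2|1s}^{T}\mathbf{1}_{J_{21s}},$

$\mathbf{B}_{2s}=\dfrac{1}{J_{2s}}\mathbf{X}_{2s}^{T}\mathbf{I}_{J_{2s}}\mathbf{X}_{2s},\qquad \mathbf{A}_{2s}=\dfrac{1}{J_{2s}}\mathbf{X}_{2s}^{T}\mathbf{1}_{J_{2s}},$

$a_{1|1s}=\ln J_{11s},\quad a_{2|1s}=\ln J_{21s},\quad a_{2s}=\ln J_{2s},\qquad p_{j|11s}=\dfrac{1}{J_{11s}},\quad p_{j|21s}=\dfrac{1}{J_{21s}},\quad p_{j|2s}=\dfrac{1}{J_{2s}},$

$a_{21|1s}=\ln\left(J_{11s}^{\lambda_1/\mu_1}+J_{21s}^{\lambda_2/\mu_1}\right),$

$\mathbf{X}_{h|1s}^{T}=(\mathbf{x}_{1|h1s},\dots,\mathbf{x}_{J_{h1s}|h1s}),\quad \mathbf{x}_{j|h1s}^{T}=(x_{j1|h1s},\dots,x_{jq_1|h1s});\ h=1,2,\qquad \mathbf{X}_{2s}^{T}=(\mathbf{x}_{1|2s},\dots,\mathbf{x}_{J_{2s}|2s});\ \mathbf{x}_{j|2s}^{T}=(x_{j1|2s},\dots,x_{jq_1|2s}),$

$p_{1s}=\dfrac{\left(\sum_{h=1}^{2}J_{h1s}^{\lambda_h/\mu_1}\right)^{\mu_1}}{\left(\sum_{h=1}^{2}J_{h1s}^{\lambda_h/\mu_1}\right)^{\mu_1}+J_{2s}^{\mu_2}};\quad p_{2s}=1-p_{1s},\qquad p_{1|1s}=\dfrac{J_{11s}^{\lambda_1/\mu_1}}{\sum_{h=1}^{2}J_{h1s}^{\lambda_h/\mu_1}};\quad p_{2|1s}=1-p_{1|1s},$

where $\mathbf{I}_r$ denotes the $r\times r$ identity matrix and $\mathbf{1}_r$ is an $r$-dimensional vector with all elements equal to one.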
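At $\boldsymbol\beta=\mathbf{0}$ the choice probabilities depend only on the numbers of alternatives and the dissimilarity parameters, which makes them easy to evaluate. The following Python sketch computes them for the two-nest structure of Table 1; the function name `probs_at_beta_zero` and the dictionary keys are hypothetical, and the formulas follow the nested logit probabilities of Section 2 evaluated at zero part-worths.

```python
def probs_at_beta_zero(J11, J21, J2, mu1, mu2, lam1, lam2):
    """Nest, sub-nest and within-group probabilities when all part-worths are zero."""
    w1 = J11 ** (lam1 / mu1)            # exp((lambda_1 / mu_1) * ln J_11s)
    w2 = J21 ** (lam2 / mu1)            # exp((lambda_2 / mu_1) * ln J_21s)
    top1 = (w1 + w2) ** mu1             # exp(mu_1 * IV of the first nest)
    top2 = J2 ** mu2                    # exp(mu_2 * IV of the second nest)
    p1 = top1 / (top1 + top2)
    return {
        "p_1s": p1, "p_2s": 1.0 - p1,
        "p_1|1s": w1 / (w1 + w2), "p_2|1s": w2 / (w1 + w2),
        "p_j|11s": 1.0 / J11, "p_j|21s": 1.0 / J21, "p_j|2s": 1.0 / J2,
    }

# Hypothetical values: choice set with 2 + 2 + 1 alternatives.
r = probs_at_beta_zero(2, 2, 1, mu1=0.2, mu2=0.4, lam1=0.1, lam2=0.1)
```

Because every alternative within a group has the same utility at zero part-worths, the within-group probabilities are uniform and only the two upper levels carry information about the dissimilarity parameters.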



3.1 D-Optimal Criterion

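Taking into account (5) and (6), consider the following designs to fit the model introduced in Table 1:

(7) $\xi_n=\begin{Bmatrix}C_{n1}&C_{n2}&\cdots&C_{nS_n}\\ w_{n1}&w_{n2}&\cdots&w_{nS_n}\end{Bmatrix};\qquad S_n\ge q_1+4,\ n\in\{1,\dots,N\},$

where $C_{ns}$; $s=1,2,\dots,S_n$ denotes a choice set in the $n$th class, which includes $\mathrm{J}$ alternatives. As the number of attributes ($K$) and their levels ($L_k$; $k=1,2,\dots,K$) increase (design (7)), the number of possible choice sets in each class ($S_n$) increases dramatically. In this situation, techniques are needed to reduce the number of support points, or sample size, $S_n$, so that a reasonable number of choice sets is obtained (see Graßhoff et al. (2004)). As is well known, the D-optimality criterion in linear models typically leads to an optimal number of support points equal to the number of unknown parameters, with the design taking an equal number of observations at each point (Silvey, 1980, p. 42). This bound also applies to most local optimality criteria and Bayesian criteria for linear models (Chernoff (1972)). In contrast, for non-linear models no such bound on the number of support points is available. We therefore consider the condition $\tilde S_n\ge q$; $n=1,2,\dots,N$ (reduced) to obtain the D-optimal criterion for design (7). The information matrix of design (7) is calculated as follows:

(8) $\mathbf{I}(\xi_n;\boldsymbol\theta)=\sum_{s=1}^{S_n}w_{ns}\,\mathbf{I}(\boldsymbol\theta;C_{ns}),$

where $w_{ns}$ is the weight (frequency) of the choice set $C_{ns}$, $\mathbf{I}(\boldsymbol\theta;C_{ns})$ is the information matrix of choice set $s$ in the $n$th class, which is calculated by Lemma 1, and the local D-optimality criterion at $\boldsymbol\theta_0$ is $\det\left(\mathbf{I}(\xi_n;\boldsymbol\theta_0)\right)^{-1}$ ($\boldsymbol\theta_0$ is the true value of the full parameter vector).

The $D_b$-optimal criterion in relation to the prior distribution $\pi(\boldsymbol\theta)$ on the parameters can be defined as follows (Atkinson et al. (2007)):

(9) $D_b(\xi_n)=E_{\boldsymbol\theta}\left[\det\left(\mathbf{I}(\xi_n;\boldsymbol\theta)\right)^{-1/q}\right]=\int_{\mathcal{B}}\int_{\mathcal{M}}\int_{\Lambda}\det\left(\mathbf{I}(\xi_n;\boldsymbol\theta)\right)^{-1/q}\pi(\boldsymbol\theta)\,d\boldsymbol\beta\,d\boldsymbol\mu\,d\boldsymbol\lambda,$

where $\mathcal{B}\subseteq\mathbb{R}^{q_1}$, $\mathcal{M}\subseteq\mathbb{R}^{M}$ and $\Lambda\subseteq\mathbb{R}^{q_2}$ are the spaces of $\boldsymbol\beta$, $\boldsymbol\mu$ and $\boldsymbol\lambda$, respectively, and $\pi(\boldsymbol\theta)=\pi(\boldsymbol\beta,\boldsymbol\mu,\boldsymbol\lambda)$. Specifically, suppose that $\boldsymbol\beta$ is independent of $\boldsymbol\lambda$ and $\boldsymbol\mu$, such that $\pi(\boldsymbol\beta,\boldsymbol\mu,\boldsymbol\lambda)=\pi(\boldsymbol\beta)\,\pi(\boldsymbol\mu,\boldsymbol\lambda)$. To keep matters simple, one may also assume independence between $\boldsymbol\lambda$ and $\boldsymbol\mu$, which means that $\pi(\boldsymbol\beta,\boldsymbol\mu,\boldsymbol\lambda)=\pi(\boldsymbol\beta)\,\pi(\boldsymbol\mu)\,\pi(\boldsymbol\lambda)$; for example, a uniform distribution for $\boldsymbol\lambda$ and $\boldsymbol\mu$ and a multivariate normal distribution for $\boldsymbol\beta$, $\boldsymbol\beta\sim N_{q_1}(\boldsymbol\beta_0,\boldsymbol\Sigma_0)$. Since there is usually no analytical expression for the quantity (9), it may be approximated by the Monte Carlo technique, which takes a large number, $R$, of independent draws, $\boldsymbol\theta_r$, of $\boldsymbol\theta$ from the prior distribution, $\pi(\boldsymbol\theta)$, and averages the local criterion $\det\left(\mathbf{I}(\xi_n;\boldsymbol\theta)\right)^{-1/q}$ over all draws. Thus, the weighted D-criterion is approximated by:

(10) $\tilde D_b(\xi_n)=\dfrac{1}{R}\sum_{r=1}^{R}\det\left(\mathbf{I}(\xi_n;\boldsymbol\theta_r)\right)^{-1/q}$

if

$\Pr\left(\lim_{R\to\infty}\dfrac{1}{R}\sum_{r=1}^{R}\det\left(\mathbf{I}(\xi_n;\boldsymbol\theta_r)\right)^{-1/q}=E_{\boldsymbol\theta}\left[\det\left(\mathbf{I}(\xi_n;\boldsymbol\theta)\right)^{-1/q}\right]\right)=1.$

The design $\xi_n^{*}$ that minimizes the approximate $D_b$-criterion (10) is called the $D_b$-optimal design and is approximated by the solution:

(11) $\xi_n^{*}=\arg\min_{\xi_n\in\Xi_n}\tilde D_b(\xi_n);$

then:

$\xi_n^{*}=\begin{Bmatrix}C_{n1}&C_{n2}&\cdots&C_{nS_n}\\ w_{n1}^{*}&w_{n2}^{*}&\cdots&w_{nS_n}^{*}\end{Bmatrix};\qquad n\in\{1,\dots,N\}$

is the D-optimal design in $\Xi_n=\left\{\xi_n:\ 0\le w_{ns}\le 1,\ \sum_{s=1}^{S_n}w_{ns}=1;\ s=1,2,\dots,S_n\right\}$. Finally, according to (10) and (11), $\xi_{n'}^{*}$ is the most suitable design for estimating the parameters if there exists a class $n'$ such that $\xi_{n'}^{*}=\arg\min_{\xi_n^{*}}\tilde D_b(\xi_n^{*})$.

As explained previously, $\det\left(\mathbf{I}(\xi;\boldsymbol\theta_0)\right)^{-1}$ is the local D-optimality criterion, where $\boldsymbol\theta_0$ is the true value of the parameters. Thus, we say that $\xi^{*}$ is a locally D-optimal design in $\Xi$ if $\xi^{*}=\arg\min_{\xi\in\Xi}\det\left(\mathbf{I}(\xi;\boldsymbol\theta_0)\right)^{-1}$, and this criterion has been used here to obtain a locally D-optimal design.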
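The Monte Carlo approximation of the $D_b$-criterion amounts to averaging $\det(\mathbf{I})^{-1/q}$ over prior draws. The Python sketch below shows the mechanics only: `db_criterion`, `toy_info` and `toy_prior` are hypothetical names, and the toy information matrix is a stand-in chosen for illustration, not the information matrix of Lemma 1.

```python
import numpy as np

def db_criterion(info_fn, sample_prior, q, R=1000, seed=0):
    """Average of det(I(xi; theta_r))**(-1/q) over R independent prior draws."""
    rng = np.random.default_rng(seed)
    values = []
    for _ in range(R):
        theta = sample_prior(rng)                 # one draw theta_r from the prior
        values.append(np.linalg.det(info_fn(theta)) ** (-1.0 / q))
    return float(np.mean(values))

# Toy stand-in for I(xi; theta): a positive definite 2 x 2 matrix depending on theta.
def toy_info(theta):
    return np.diag(np.exp(theta)) + 0.1 * np.eye(2)

# Toy prior: independent standard normal components (an assumption of this sketch).
def toy_prior(rng):
    return rng.normal(size=2)

value = db_criterion(toy_info, toy_prior, q=2, R=500)
```

In an application, `info_fn` would evaluate the weighted information matrix of a candidate design, and the resulting average would be minimized over the design weights.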

Table 2: Three-level NMNL model with six alternatives

First Nest (1)                        Second Nest (2)
Sub-nest 1(1)      Sub-nest 2(1)
$a_1, a_2$         $a_3, a_4$         $a_5, a_6$

Illustration:

Consider a population with three attributes, each with two levels. In this situation, consider a three-level NMNL model that includes six possible alternatives in two nests (Table 2; $\sum_{m=1}^{M}\sum_{h=1}^{H_m}J_{hm}=2+2+2$), where (1,1,1), (1,1,-1), (1,-1,1), (1,-1,-1), (-1,1,1), (-1,1,-1) characterize the alternatives $a_1,a_2,a_3,a_4,a_5$ and $a_6$. To fit this model (Table 2), consider the experiment $2+2+2\,/\,5\,/\,S$, based on Equation (5):

$S=S_1+S_2+S_3=\binom{2}{2}\binom{2}{2}\binom{2}{1}+\binom{2}{2}\binom{2}{1}\binom{2}{2}+\binom{2}{1}\binom{2}{2}\binom{2}{2}=6.$

In this case, three classes ($N=3$) are found to define the design and, because $S_n$ is smaller than 7 in every class, we can combine them in order to define a suitable design. Thus, there are six choice sets (Table 3) with their design matrices, as shown in Table 4. For clarity, suppose that $\beta_{1,1}=\beta_1$, $\beta_{2,1}=\beta_2$, $\beta_{3,1}=\beta_3$, $\lambda_{11}=\lambda_1$ and $\lambda_{21}=\lambda_2$; then $\boldsymbol\theta^{T}=(\beta_1,\beta_2,\beta_3,\mu_1,\mu_2,\lambda_1,\lambda_2)$ is the full parameter vector. In this case, keeping to the RUM conditions (Gil-Molton and Hole (2004)), we encounter the following two conditions:

S shppmp msmshmm

hm

ms

m , 2,1;11

)2 and 2,1;1

1 )1

.


Table 3: Three-level NMNL model with six choice sets

Choice Set    First Nest (1)                        Second Nest (2)
              Sub-nest (1)       Sub-nest (2)
$C_1$         $a_1, a_2$         $a_3, a_4$         $a_5$
$C_2$         $a_1, a_2$         $a_3, a_4$         $a_6$
$C_3$         $a_1, a_2$         $a_3$              $a_5, a_6$
$C_4$         $a_1, a_2$         $a_4$              $a_5, a_6$
$C_5$         $a_1$              $a_3, a_4$         $a_5, a_6$
$C_6$         $a_2$              $a_3, a_4$         $a_5, a_6$

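Table 4: The design matrix of three-level NMNL model with six choice sets

Choice Set    First Nest (1)                                  Second Nest (2)
              Sub-nest (1)            Sub-nest (2)
$C_1$         (1,1,1), (1,1,-1)       (1,-1,1), (1,-1,-1)     (-1,1,1)
$C_2$         (1,1,1), (1,1,-1)       (1,-1,1), (1,-1,-1)     (-1,1,-1)
$C_3$         (1,1,1), (1,1,-1)       (1,-1,1)                (-1,1,1), (-1,1,-1)
$C_4$         (1,1,1), (1,1,-1)       (1,-1,-1)               (-1,1,1), (-1,1,-1)
$C_5$         (1,1,1)                 (1,-1,1), (1,-1,-1)     (-1,1,1), (-1,1,-1)
$C_6$         (1,1,-1)                (1,-1,1), (1,-1,-1)     (-1,1,1), (-1,1,-1)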

To estimate the parameters of the model described in Table 2, and based on the experiment $2+2+2\,/\,5\,/\,6$ and Equation (7), consider the following design:

(12) $\xi=\begin{Bmatrix}C_1&C_2&C_3&C_4&C_5&C_6\\ w_1&w_2&w_3&w_4&w_5&w_6\end{Bmatrix}.$

The information matrix of design (12) is calculated by $\mathbf{I}(\xi;\boldsymbol\theta)=\sum_{s=1}^{6}w_s\,\mathbf{I}(\boldsymbol\theta;C_s)$. Specifically, let $\boldsymbol\beta=\mathbf{0}$. Now, according to Lemma 1 and Corollary 1, the elements of the information matrix $\mathbf{I}(\boldsymbol\theta;C_s)$ can be calculated.
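As a numerical cross-check on such calculations, the information matrix of a choice set can also be approximated without the closed-form elements, using the fact that for a single multinomial choice the Fisher information equals $\sum_j \nabla p_j\,\nabla p_j^{T}/p_j$. The Python sketch below does this with central finite differences for the six choice sets of Table 3. It is a sketch under stated assumptions, not the method of Lemma 1: the function names, the step size and the parameter values are illustrative choices, and the second nest is treated as a single group with scale $\mu_2$.

```python
import numpy as np

# Alternatives a1..a6 of Table 2 (effect-coded levels of the three attributes).
ALT = {1: (1, 1, 1), 2: (1, 1, -1), 3: (1, -1, 1),
       4: (1, -1, -1), 5: (-1, 1, 1), 6: (-1, 1, -1)}
# Choice sets C1..C6 of Table 3: (sub-nest 1, sub-nest 2, second nest).
SETS = [((1, 2), (3, 4), (5,)), ((1, 2), (3, 4), (6,)),
        ((1, 2), (3,), (5, 6)), ((1, 2), (4,), (5, 6)),
        ((1,), (3, 4), (5, 6)), ((2,), (3, 4), (5, 6))]

def probs(theta, cs):
    """Joint choice probabilities; theta = (b1, b2, b3, mu1, mu2, lam1, lam2)."""
    beta = np.asarray(theta[:3], dtype=float)
    mu1, mu2, lam1, lam2 = theta[3:]
    util = lambda alts: np.array([ALT[a] for a in alts], dtype=float) @ beta
    e1 = np.exp(util(cs[0]) / lam1)
    e2 = np.exp(util(cs[1]) / lam2)
    e3 = np.exp(util(cs[2]) / mu2)
    w = np.array([e1.sum() ** (lam1 / mu1), e2.sum() ** (lam2 / mu1)])
    top1, top2 = w.sum() ** mu1, e3.sum() ** mu2
    p1 = top1 / (top1 + top2)                      # probability of the first nest
    return np.concatenate([p1 * w[0] / w.sum() * e1 / e1.sum(),
                           p1 * w[1] / w.sum() * e2 / e2.sum(),
                           (1.0 - p1) * e3 / e3.sum()])

def info_choice_set(theta, cs, eps=1e-6):
    """Fisher information of one choice set via central finite differences."""
    theta = np.asarray(theta, dtype=float)
    p = probs(theta, cs)
    G = np.empty((p.size, theta.size))             # G[j, k] = d p_j / d theta_k
    for k in range(theta.size):
        d = np.zeros_like(theta)
        d[k] = eps
        G[:, k] = (probs(theta + d, cs) - probs(theta - d, cs)) / (2 * eps)
    return G.T @ (G / p[:, None])

def info_design(theta, weights):
    """Weighted sum of the choice-set information matrices."""
    return sum(w * info_choice_set(theta, cs) for w, cs in zip(weights, SETS))

# Illustrative parameter values and weights (assumptions of this sketch).
theta0 = np.array([0.0, 0.0, 0.0, 0.2, 0.4, 0.1, 0.1])
I = info_design(theta0, [0.3, 0.3, 0.1, 0.1, 0.1, 0.1])
sign, logdet = np.linalg.slogdet(I)
```

A single choice set with five alternatives can contribute at most rank four, which is why several choice sets have to be combined before the seven parameters can be estimated.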

According to the rule of permutation, permuting the levels of the third attribute in the second nest maps the choice sets $C_1$ and $C_2$ onto each other. Likewise, permuting the levels of the third attribute in the second sub-nest of the first nest interchanges the choice sets $C_3$ and $C_4$, and permuting the levels of the third attribute in the first sub-nest of the first nest interchanges the choice sets $C_5$ and $C_6$. Thus, we can define a new design to fit the model introduced in Table 2, according to Table 3, as follows:

(13) $\xi'=\begin{Bmatrix}C_2&C_1&C_4&C_3&C_6&C_5\\ w_1&w_2&w_3&w_4&w_5&w_6\end{Bmatrix}.$

For the two designs (12) and (13) to be equal, the following design can be considered:

(14) $\xi=\begin{Bmatrix}C_1&C_2&C_3&C_4&C_5&C_6\\ w_1&w_1&w_2&w_2&w_3&w_3\end{Bmatrix},$

where $w_1+w_2+w_3=\tfrac{1}{2}$.

Now, suppose that $\lambda_1=\lambda_2=\lambda$. Moreover, we know that $\lambda\le\mu_1$ and, according to Table 2, it is to be expected that $\mu_1\le\mu_2$. Then we can assume that $\mu_1=2\lambda$ and $\mu_2=4\lambda$, so that $\det\mathbf{I}(\xi;\boldsymbol\theta)$ reduces to a function of $\lambda$, $w_1$ and $w_2$, where $w_3=\tfrac{1}{2}-(w_1+w_2)$. In this situation, the RUM conditions are upheld when $0<\lambda\le 0.25$. Under this condition on $\lambda$, some locally optimal designs have been calculated in Table 5.

Table 5: $\mu_1=2\lambda$ and $\mu_2=4\lambda$, locally optimal design when $0<\lambda\le 0.25$.

$\lambda$    0.01      0.05      0.10      0.15      0.17      0.20      0.25
$w_1^{*}$    0.3092    0.3120    0.3150    0.3180    0.3195    0.3210    0.3270
$w_2^{*}$    0.0954    0.0940    0.0925    0.0910    0.0905    0.0899    0.0895
$w_3^{*}$    0.0954    0.0940    0.0925    0.0910    0.0900    0.0891    0.0877
$D$          0.00368   0.08985   0.35081   0.77476   0.98940   1.36021   2.11334

Table 5 shows that $w_1^{*}$ increases as $\lambda$ increases, whereas $w_2^{*}$ and $w_3^{*}$ decrease, because the combinations of alternatives (and attributes) in the two choice sets $C_1$ and $C_2$ are less similar than in the other choice sets. According to Table 4, the two sub-nests of the first nest are identical in the choice sets $C_1$ and $C_2$, but the second nest contains different alternatives. In this situation, because $\lambda_1$ and $\lambda_2$ are equal, $w_1^{*}$ increases as $\lambda$ increases. In choice sets $C_3$ and $C_4$, there are two different alternatives in the second sub-nest of the first nest. A similar situation holds for choice sets $C_5$ and $C_6$, where there are two different alternatives in the first sub-nest of the first nest (the second nest does not change across choice sets $C_3$ to $C_6$). Given the combination of the alternatives in the four choice sets $C_3$ to $C_6$, a similar result is obtained for $w_2^{*}$ and $w_3^{*}$: these two weights are almost equal and decrease as $\lambda$ increases ($0<\lambda\le 0.15$). However, $w_3^{*}$ decreases faster than $w_2^{*}$ when $\lambda\ge 0.17$, so the combinations of attributes and their levels in the two choice sets $C_5$ and $C_6$ are more similar than those in the choice sets $C_3$ and $C_4$ (Table 4).

Now, suppose that $\lambda_1=0.1$, $\mu_1=0.15$ and $\mu_2=0.25$; then the RUM conditions hold if $0<\lambda_2\le 0.15$. Table 6 reports several locally D-optimal designs based on Table 4. In this situation, $w_2^{*}$ increases as $\lambda_2$ increases but $w_3^{*}$ decreases (Table 6). That means the alternatives in the second sub-nest (first nest) of choice sets $C_5$ and $C_6$ are much more similar, whereas the alternatives in choice sets $C_3$ and $C_4$ (second sub-nest) are much more dissimilar (Table 6).

Table 6: $\mu_1=0.15$, $\mu_2=0.25$ and $\lambda_1=0.1$, locally optimal design when $0<\lambda_2\le 0.15$.

$\lambda_2$  0.01      0.05      0.06      0.08      0.10      0.12      0.15
$w_1^{*}$    0.3310    0.3338    0.3245    0.3133    0.3095    0.3103    0.3165
$w_2^{*}$    0.0000    0.0001    0.0285    0.0657    0.0953    0.1183    0.1458
$w_3^{*}$    0.1690    0.1661    0.1497    0.1210    0.0952    0.0713    0.0377
$D$          0.05879   0.15900   0.17522   0.19926   0.21502   0.22534   0.23478

Table 7 reports some locally D-optimal designs based on $\mu_1=\mu_2=\lambda_1=\lambda_2$. In this case, the RUM conditions hold if the common value lies in $(0,1]$. Table 7 shows that $w_1^{*}$ increases as the common value increases, but $w_2^{*}$ and $w_3^{*}$ decrease. Comparing the decreasing trends of $w_2^{*}$ and $w_3^{*}$, we can observe that $w_3^{*}$ decreases faster than $w_2^{*}$, because the alternatives in the choice sets $C_5$ and $C_6$ are more similar than those in $C_3$ and $C_4$.

Table 7: $\mu_1=\mu_2=\lambda_1=\lambda_2$, locally optimal design when the common value lies in $(0,1]$.

Common value  0.05      0.10      0.15      0.20      0.30      0.40      0.50
$w_1^{*}$     0.2889    0.2908    0.2926    0.2943    0.2979    0.3011    0.3040
$w_2^{*}$     0.1055    0.1046    0.1037    0.1029    0.1012    0.0998    0.0988
$w_3^{*}$     0.1055    0.1046    0.1037    0.1028    0.1009    0.0991    0.0971
$D_b$         0.02689   0.10466   0.22959   0.39868   0.86051   1.47810   2.24747

For the fixed values $\mu_1=0.1$ and $\lambda_1=\lambda_2=0.08$ (Table 8), $w_2^{*}$ and $w_3^{*}$ are equal and decrease as $\mu_2$ increases, whereas $w_1^{*}$ increases. Hence the alternatives in the second nest of the choice sets $C_3$ to $C_6$ are more similar than the alternatives in the second nest of the choice sets $C_1$ and $C_2$.

Table 8: $\mu_1=0.1$ and $\lambda_1=\lambda_2=0.08$, locally optimal design when $0<\mu_2\le 1$.

$\mu_2$      0.10      0.15      0.20      0.25      0.30      0.40      0.50
$w_1^{*}$    0.2964    0.3038    0.3074    0.3095    0.3110    0.3130    0.3150
$w_2^{*}$    0.1018    0.0981    0.0963    0.0952    0.0945    0.0933    0.0924
$w_3^{*}$    0.1018    0.0981    0.0963    0.0952    0.0945    0.0933    0.0924
$D$          0.09941   0.10191   0.10319   0.10409   0.10484   0.10619   0.10750


Suppose that $\mu_2=0.5$, $\lambda_1=0.1$ and $\lambda_2=0.2$. In this situation, the RUM conditions hold if $0.2\le\mu_1\le 0.5$. Table 9 shows that $w_1^{*}$ increases (at a decreasing rate) as $\mu_1$ increases. The third row of Table 9 shows that $w_2^{*}$ decreases (very slowly), while $w_3^{*}$ is equal to zero, as $\mu_1$ increases. That means that the alternatives in the choice sets $C_5$ and $C_6$ are much more similar than the others. We can also say that, if $\mu_2=0.5$, $\lambda_1=0.1$, $\lambda_2=0.2$ and $\mu_1=0.45$, then:

$\xi_{n'}^{*}=\begin{Bmatrix}C_1&C_2&C_3&C_4&C_5&C_6\\ 0.3364&0.3364&0.1636&0.1636&0.0000&0.0000\end{Bmatrix}$

is a locally D-optimal design in $\bigcup_{n=1}^{3}\Xi_n$.

According to the results obtained for the different classes in Tables 5 to 9, we can say that the alternatives in the two choice sets $C_1$ and $C_2$ are less similar than the others and the alternatives in the choice sets $C_5$ and $C_6$ are more similar than the others.

Table 9: $\mu_2=0.5$, $\lambda_1=0.1$ and $\lambda_2=0.2$, locally optimal design when $0.2\le\mu_1\le 0.5$.

$\mu_1$      0.20      0.25      0.30      0.35      0.40      0.45      0.50
$w_1^{*}$    0.3333    0.3347    0.3355    0.3360    0.3363    0.3364    0.3364
$w_2^{*}$    0.1667    0.1653    0.1645    0.1640    0.1637    0.1636    0.1636
$w_3^{*}$    0.0000    0.0000    0.0000    0.0000    0.0000    0.0000    0.0000
$D$          0.39644   0.56598   0.75789   0.97082   1.20394   1.45688   1.72959

Note: To obtain locally D-optimal design, 1;det 0I D (Table 5 to Table 9), Maple has been used

with initial values 1.0 ,2.0 132 . The Sequential Quadratic Programming (SQP) method was also

used and naturally the number 1000 was considered for the iteration limit.

4. CONCLUSION

In a two-level NMNL model, all alternatives are divided into several nests. Because the IIA property is required to hold within each nest, it may be necessary to divide the alternatives of some or all of the nests into several sub-nests.

In this paper, a D-optimal design, which is based on the determinant of the information matrix, has been used to fit the three-level NMNL model. We have also calculated the information matrix of a three-level NMNL model for the local D-optimality criterion when $\boldsymbol\beta=\mathbf{0}$. Based on an example, we have discussed different classes. We have observed that the optimal weights (choice sets) show an increasing or decreasing trend, and that this trend depends on the similarity between the alternatives in the choice sets. For example, there is not much similarity between the alternatives when the optimal weights (choice sets) show an increasing trend (as the dissimilarity parameter increases), and vice versa.



APPENDIX

Appendix AI:

The elements of the information matrix related to Lemma 1 are as follows

TssssT sssssT

sss

ss

s

ppppp

2222

2

21|21|2122

2

121

1|11|1112

1

111

11

..AABAABAABI

TssT sssT sssssT ssT ssT ssT sssss

ppppppp

221|21|2

2

1|21|11|1

2

1|1211|11|21|21|11|21|21|11|12

1

1|21|11...

..AAAAAAAAAAAAAA

T ssT sssTsssssssT ssT ssssss pppppppppp 1|21|21|11|1221|21|21|11|1211|11|21|21|11|21|121 ........ AAAAAAAAAA

ssssssssssss

ssss

sss

s

ppaapappp

aappp

21|21|21|11|11|2111|21|221|11|11

1

21

1|21|11|111|223

1

1|21|11

12

.. ......

-

....

AAA

AAI

s

T

ssssssss

T

ssss

s appppp

22

2

21|21|21|11|1212223

2

213

1... βAAAAβAABI

βAAAA

βAAAβAABI

T

sssssss

sss

T

ssss

sssT

sss

ss

s

appppp

appppp

1|11|1121|21|21|11|1

1

1|121

1|11|111|21|12

11

1|21|11

1|11|11|13

1

1|11

14

.....

..

...

βAAAA

βAAAβAABI

T

sssssss

sss

T

ssss

sssT

sss

ss

s

appppp

appppp

1|21|2221|11|11|11|2

2

1|221

1|21|221|11|22

12

1|21|11

1|21|21|23

2

1|21

15

.....

..

...

2 1|21211|21|221|11|111

1|2121

2

1|21|221|11|112

1

212

1|221|114

1

1|21|11

22

........2

-

.....

....

sssssss

sss

ssssss

ss

sss

s

appapapapp

apappp

aappp

I

sTsssssssss aapapapp

I 2221|21|221|11|111|21121

2123 ......

.

.

βA

sT sssssssss

s

T

sss

sss

s

aaapapppp

aaappp

I

1|111|11|2111|21|221|11|11

11

1|121

1|111|11|221|113

11

1|21|11

24

.......

..

....

..

βA

βA



sT sssssssss

s

T

sss

sss

s

aaapapppp

aaappp

I

1|221|21|2111|21|221|11|11

12

1|221

1|221|21|111|223

12

1|21|11

25

.......

..

....

..

βA

βA

sTsssTssTsssTss aappp

I 22222222

212224

2

233 ..

.

βAAββAABβ

sT sssTssss aappp

I 1|111|122221

1|121

34 ...

..

βAAβ

sT sssTsss

s aappp

I 1|221|222222

1|221

35 ...

..

βAAβ

sT sssTsssssT sssTsss aappppppp

I 1|111|11|111|11|122

11|22

1

2

1

1|11

1|11|11|14

1

1|11

44 .....

..

βAAββAABβ

sT sssTssss

s aapppp

I 1|221|21|111|12

122

121

1|21|11

45 ...1..

..

βAAβ

sT sssTsssssT sssTsss aappppppp

I 1|221|21|221|21|222

11|12

1

2

2

1|21

1|21|21|24

2

1|21

55 .....

..

βAAββAABβ

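where

$\mathbf{B}_{1|1s}=\mathbf{X}_{1|1s}^{T}\mathbf{P}_{\cdot|11s}\mathbf{X}_{1|1s},\quad \mathbf{B}_{2|1s}=\mathbf{X}_{2|1s}^{T}\mathbf{P}_{\cdot|21s}\mathbf{X}_{2|1s},\quad \mathbf{A}_{1|1s}=\mathbf{X}_{1|1s}^{T}\mathbf{p}_{\cdot|11s},\quad \mathbf{A}_{2|1s}=\mathbf{X}_{2|1s}^{T}\mathbf{p}_{\cdot|21s},$

$\mathbf{B}_{2s}=\mathbf{X}_{2s}^{T}\mathbf{P}_{\cdot|2s}\mathbf{X}_{2s},\quad \mathbf{A}_{2s}=\mathbf{X}_{2s}^{T}\mathbf{p}_{\cdot|2s},$

$\mathbf{X}_{h|1s}^{T}=(\mathbf{x}_{1|h1s},\dots,\mathbf{x}_{J_{h1s}|h1s}),\quad \mathbf{x}_{j|h1s}^{T}=(x_{j1|h1s},\dots,x_{jq_1|h1s});\ h=1,2,$

$\mathbf{P}_{\cdot|h1s}=\mathrm{diag}(p_{1|h1s},\dots,p_{J_{h1s}|h1s}),\quad \mathbf{p}_{\cdot|h1s}=(p_{1|h1s},\dots,p_{J_{h1s}|h1s})^{T},$

$\mathbf{X}_{2s}^{T}=(\mathbf{x}_{1|2s},\dots,\mathbf{x}_{J_{2s}|2s}),\quad \mathbf{x}_{j|2s}^{T}=(x_{j1|2s},\dots,x_{jq_1|2s}),\quad \mathbf{P}_{\cdot|2s}=\mathrm{diag}(p_{1|2s},\dots,p_{J_{2s}|2s}),\quad \mathbf{p}_{\cdot|2s}=(p_{1|2s},\dots,p_{J_{2s}|2s})^{T},$

$a_{1|1s}=\ln\sum_{j=1}^{J_{11s}}e^{\mathbf{x}_{j|11s}^{T}\boldsymbol\beta/\lambda_1},\quad a_{2|1s}=\ln\sum_{j=1}^{J_{21s}}e^{\mathbf{x}_{j|21s}^{T}\boldsymbol\beta/\lambda_2},\quad a_{2s}=\ln\sum_{j=1}^{J_{2s}}e^{\mathbf{x}_{j|2s}^{T}\boldsymbol\beta/\mu_2},$

$p_{j|11s}=\dfrac{e^{\mathbf{x}_{j|11s}^{T}\boldsymbol\beta/\lambda_1}}{\sum_{j'=1}^{J_{11s}}e^{\mathbf{x}_{j'|11s}^{T}\boldsymbol\beta/\lambda_1}},\quad p_{j|21s}=\dfrac{e^{\mathbf{x}_{j|21s}^{T}\boldsymbol\beta/\lambda_2}}{\sum_{j'=1}^{J_{21s}}e^{\mathbf{x}_{j'|21s}^{T}\boldsymbol\beta/\lambda_2}},\quad p_{j|2s}=\dfrac{e^{\mathbf{x}_{j|2s}^{T}\boldsymbol\beta/\mu_2}}{\sum_{j'=1}^{J_{2s}}e^{\mathbf{x}_{j'|2s}^{T}\boldsymbol\beta/\mu_2}},$

$a_{21|1s}=\ln\left[\left(\sum_{j=1}^{J_{11s}}e^{\mathbf{x}_{j|11s}^{T}\boldsymbol\beta/\lambda_1}\right)^{\lambda_1/\mu_1}+\left(\sum_{j=1}^{J_{21s}}e^{\mathbf{x}_{j|21s}^{T}\boldsymbol\beta/\lambda_2}\right)^{\lambda_2/\mu_1}\right],$

$p_{1s}=\dfrac{\left[\sum_{h=1}^{2}\left(\sum_{j=1}^{J_{h1s}}e^{\mathbf{x}_{j|h1s}^{T}\boldsymbol\beta/\lambda_h}\right)^{\lambda_h/\mu_1}\right]^{\mu_1}}{\left[\sum_{h=1}^{2}\left(\sum_{j=1}^{J_{h1s}}e^{\mathbf{x}_{j|h1s}^{T}\boldsymbol\beta/\lambda_h}\right)^{\lambda_h/\mu_1}\right]^{\mu_1}+\left(\sum_{j=1}^{J_{2s}}e^{\mathbf{x}_{j|2s}^{T}\boldsymbol\beta/\mu_2}\right)^{\mu_2}},$

$p_{h|1s}=\dfrac{\left(\sum_{j=1}^{J_{h1s}}e^{\mathbf{x}_{j|h1s}^{T}\boldsymbol\beta/\lambda_h}\right)^{\lambda_h/\mu_1}}{\sum_{h'=1}^{2}\left(\sum_{j=1}^{J_{h'1s}}e^{\mathbf{x}_{j|h'1s}^{T}\boldsymbol\beta/\lambda_{h'}}\right)^{\lambda_{h'}/\mu_1}}.$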

5. REFERENCES

[1] Atkinson, A.C., Donev, A.N. and Tobias, R.D. (2007). Optimum Experimental Designs, with SAS. Oxford Univ. Press.

[2] Ben-Akiva, M. (1973). The structure of travel demand models. PhD Thesis, MIT.

[3] Börsch-Supan, A. (1990). On the compatibility of nested logit models with utility maximization. Journal of Econometrics, Vol. 43, 373-388.

[4] Chernoff, H. (1972). Sequential Analysis and Optimal Design. Society for Industrial and Applied Mathematics, Philadelphia, PA.

[5] Daly, A. and Zachary, S. (1978). Improved multiple choice models, in D. Hensher and M. Dalvi, eds., Determinants of Travel Choice, Saxon House, Sussex.

[6] Gil-Molton, M. and Hole, A. (2004). Tests for the consistency of three-level nested logit models with utility maximization. Economics Letters, 85, 133-137.

[7] Graßhoff, U., Großmann, H., Holling, H. and Schwabe, R. (2004). Optimal Designs for Main-Effects in Linear Paired Comparison Models. Journal of Statistical Planning and Inference, 126, 361-376.

[8] Herriges, J.A. and Kling, C.L. (1996). Testing the consistency of nested logit models with utility maximization. Economics Letters, Vol. 50, No. 1, 33-39.

[9] McFadden, D. (1978). Modeling the choice of residential location, in: A. Karlquist et al., eds., Spatial interaction theory and residential location, North-Holland, Amsterdam, 75-96.

[10] Silvey, S.D. (1980). Optimal Design. Chapman and Hall, London.
