-
IJRRAS 2 (2) ● February 2010 Jafari ● Optimal Design for a
Multinomial Logit Model
181
OPTIMAL DESIGN FOR A THREE-LEVEL NESTED MULTINOMIAL
LOGIT MODEL IN DISCRETE CHOICE EXPERIMENTS
Habib Jafari
Institute for Mathematical Stochastics (IMST),
OvG University, Universitätsplatz 2, 39106 Magdeburg-Germany
ABSTRACT
In this paper we calculate the information matrix for a
three-level nested Multinomial logit model and derive the
locally D-optimal design to estimate the parameter vector.
Keywords: Conjoint Analysis, Three-level Nested MNL Model,
Discrete Choice Experiment, D-Optimal Criterion.
1. INTRODUCTION
The multinomial logit (MNL) model is the most widely used discrete choice model because of its closed-form choice probabilities and its consistency with random utility maximization (RUM). However, the MNL model suffers from the restrictive independence from irrelevant alternatives (IIA) property, which states that the ratio of two choice probabilities is independent of the other alternatives in the model. This implies that a change in an attribute of one alternative has the same proportional impact on the probability of each of the other alternatives being chosen. The nested multinomial logit (NMNL) model relaxes the IIA property by dividing the alternatives into subsets, or nests, so that the IIA assumption holds within each nest but not for alternatives in different nests. An analogous property nevertheless holds between nests, namely independence from irrelevant nests (IIN). In contrast to the more flexible multinomial probit and mixed logit models, the NMNL model has closed-form choice probabilities, so it can be estimated without resorting to simulation methods. Because of its simplicity and because it allows for a variety of substitution patterns, the NMNL model remains the most common extension of the MNL model in applied work. Daly and Zachary (1978) and McFadden (1978) have shown that the two-level NMNL model is consistent with RUM under the condition that the dissimilarity parameters are constrained within the unit interval (the DZM condition). In many practical applications, however, this condition has not been met. Börsch-Supan (1990) argues that the DZM condition is unnecessarily strong given that the NMNL model should be viewed as a local approximation. Building on the work of Börsch-Supan, Herriges and Kling (1996) derive the necessary conditions for local consistency with random utility maximization for two-level NMNL models: the two-level NMNL model can be consistent with RUM when the dissimilarity parameters lie in the interval [0,1) and also when they are greater than one. Therefore, the two-level NMNL model is consistent with RUM for some range of the characteristics of the attributes. A two-level NMNL model is not consistent with RUM when a dissimilarity parameter is less than zero.
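The IIA property described above can be checked numerically. The following is a minimal sketch, assuming a plain MNL model with hypothetical systematic utilities (the numbers and the helper name `mnl_probs` are illustrative only): improving the third alternative leaves the ratio of the first two probabilities unchanged and scales both by the same factor.

```python
import math

def mnl_probs(utilities):
    """Return MNL choice probabilities for a list of systematic utilities."""
    exps = [math.exp(v) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical systematic utilities of three alternatives.
base = mnl_probs([1.0, 0.5, 0.2])
# Improve the third alternative only.
changed = mnl_probs([1.0, 0.5, 1.5])

# IIA: the ratio p_1 / p_2 does not depend on the third alternative.
ratio_before = base[0] / base[1]
ratio_after = changed[0] / changed[1]

# Both remaining probabilities change by the same proportion.
drop_1 = changed[0] / base[0]
drop_2 = changed[1] / base[1]
```

In a nested model the same check would hold only for alternatives within one nest, which is the relaxation the NMNL model provides.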
In some two-level NMNL models, the IIA property may not hold within some or all of the nests. In this situation, we can divide the alternatives of these nests into several subsets, called sub-nests. This kind of NL model is termed the three-level NMNL model, since it involves three kinds of choice probabilities, which are discussed in Section 2.
The rest of this paper is structured as follows. Section 2 discusses the model specification of three-level NMNL models. Section 3 presents the information matrix for a three-level NMNL model (with two nests), and Subsection 3.1 introduces the D-optimality criterion.
2. MODEL SPECIFICATIONS
Following Gil-Moltó and Hole (2004), let us consider a sample of $I$ individuals with $J$ discrete possible alternatives (in choice set $C$), which are produced by $K$ attributes, each with $L_k$ levels. In this paper, for the three-level NMNL model, the total number of alternatives is given by
$$J = \sum_{m=1}^{M} \sum_{h=1}^{H_m} J_{hm},$$
where $J_{hm}$ is the number of alternatives in sub-nest $h$ of nest $m$. In this case, there are $S$ choice sets, each containing $J_s$ alternatives, to fit the model, where if $C_s$ is a choice set with $J_s$ alternatives then
$$C = \bigcup_{s=1}^{S} C_s.$$
In such a model, the total number of alternatives in choice set $C$, with regard to the attributes and their levels, is
$$\prod_{k=1}^{K} L_k.$$
The model is based on the selection of the alternative with the highest utility.
The utility related to the three-level NMNL model (i.e. choice set $C_s$) that individual $i$ derives from choosing alternative $j$ is denoted by $U_{ijhms}$. This utility is partitioned into a systematic component, $v_{ijhms}$, and a random component, $\varepsilon_{ijhms}$ ($s$ denotes the choice set $C_s$). Because the conditions are the same for all individuals, we ignore the index $i$ and obtain
$$U_{jhms} = U_{j|hms} + U_{h|ms} + U_{ms}, \qquad (1)$$
so that
$$U_{j|hms} = v_{j|hms} + \varepsilon_{j|hms}, \quad U_{h|ms} = v_{h|ms} + \varepsilon_{h|ms}, \quad U_{ms} = v_{ms} + \varepsilon_{ms},$$
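where the $\varepsilon_{j|hms}$ have an EVD (extreme value distribution, type II) with variance $\sigma^2_{hm}$ and are correlated within the same sub-nest, $\rho_{hm} = \mathrm{corr}(\varepsilon_{j|hms}, \varepsilon_{j'|hms})$. The distribution of $\varepsilon_{h|ms}$ is such that the variable $\max_{j \in C_{hms}} U_{ij|hms}$ has variance $\sigma^2_m$, where $C_{hms}$ denotes the part of choice set $s$ that includes the alternatives in sub-nest $h$ of nest $m$, and $\rho_m = \mathrm{corr}(\varepsilon_{h|ms}, \varepsilon_{h'|ms})$. The distribution of $\varepsilon_{ms}$ is such that the variable $\max_{h \in H_m} U_{ih|ms}$ has an EVD (type II) with variance $\sigma^2$, so that $\mathrm{corr}(\varepsilon_{ims}, \varepsilon_{im's}) = 0$, where $H_m$ denotes the number of sub-nests in nest $m$. The three error terms $\varepsilon_{ij|hms}$, $\varepsilon_{ih|ms}$ and $\varepsilon_{ims}$ are independent. Now, with regard to utility (1), $v_{j|hms}$ can be written as a regression function:
$$v_{j|hms} = \mathbf{x}^T_{j|hms}\boldsymbol{\beta} = \sum_{k=1}^{K} \mathbf{x}^T_{j|hmks}\boldsymbol{\beta}_k; \qquad \boldsymbol{\beta}_k^T = (\beta_{k,1}, \ldots, \beta_{k,L_k}).$$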
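According to effect-type coding:
$$\sum_{\ell=1}^{L_k} \beta_{k,\ell} = 0 \;\Longrightarrow\; \beta_{k,L_k} = -\sum_{\ell=1}^{L_k-1} \beta_{k,\ell}.$$
In this situation, $\boldsymbol{\beta}_k^T$ is rewritten as $\boldsymbol{\beta}_k^T = (\beta_{k,1}, \ldots, \beta_{k,L_k-1})$. Similarly, we have
$$\mathbf{x}^T_{j|hms} = (\mathbf{x}^T_{j|hm1s}, \ldots, \mathbf{x}^T_{j|hmks}, \ldots, \mathbf{x}^T_{j|hmKs}); \qquad \mathbf{x}^T_{j|hmks} = (x_{j|hmk,1,s}, x_{j|hmk,2,s}, \ldots, x_{j|hmk,L_k-1,s}),$$
where $\boldsymbol{\beta}_k^T$ and $\mathbf{x}^T_{j|hmks}$ denote the characteristics of attribute $k$ related to the choice of alternative $j$ by individual $i$ (ignored) in sub-nest $h$ of nest $m$ according to choice set $C_s$. Now, according to the previous assumptions, we have: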
,',0
;',),( '
mm
mmUUCov
m
smmsUs where
',
;',
', mm
hh
mhhm
hm
m ,
hmhms JmhmhmJhmhmhmjjhmhmUUCov JI )()1(),( 2222' ,
hmJmmmhjjhmmhhmUUCov J)(),( 22''', ,
where $i = 1,2,\ldots,I$, $j = 1,2,\ldots,J_{hms}$, $h = 1,2,\ldots,H_m$, $m = 1,2,\ldots,M$, $\mathbf{I}_r$ is the identity matrix of size $r$ and $\mathbf{J}_r = \mathbf{1}_r\mathbf{1}_r^T$.
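The effect-type coding used for the regression function above can be sketched as follows. This is a minimal illustration under the usual convention that the last level of an attribute is coded as $-1$ in every position; the function names `effect_code` and `code_alternative` are hypothetical and not part of the paper.

```python
def effect_code(level, n_levels):
    """Effect-type coding of one attribute level (levels numbered 1..n_levels).

    Levels 1..n_levels-1 get an indicator vector of length n_levels-1;
    the last level is coded as -1 in every position, so that the implied
    part-worth of the last level is minus the sum of the others.
    """
    if not 1 <= level <= n_levels:
        raise ValueError("level out of range")
    if level == n_levels:
        return [-1] * (n_levels - 1)
    return [1 if pos == level else 0 for pos in range(1, n_levels)]

def code_alternative(levels, n_levels_per_attribute):
    """Concatenate the coded vectors of all attributes of one alternative."""
    coded = []
    for level, n_levels in zip(levels, n_levels_per_attribute):
        coded.extend(effect_code(level, n_levels))
    return coded

# Three two-level attributes: level 1 is coded as 1 and level 2 as -1.
x = code_alternative([1, 2, 1], [2, 2, 2])
```

For two-level attributes the coded vector of an alternative therefore consists of entries $1$ and $-1$, one per attribute.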
Now, based on utility (1), the following observation variables can be introduced:
$$Y_{j|hms} = \begin{cases} 1, & U_{j|hms} = \max_{j' \in C_{hms}} U_{j'|hms}; \\ 0, & \text{otherwise}, \end{cases}$$
$$Y_{h|ms} = \begin{cases} 1, & U_{h|ms} = \max_{h' \in H_m} U_{h'|ms}; \\ 0, & \text{otherwise}, \end{cases}$$
$$Y_{ms} = \begin{cases} 1, & U_{ms} = \max_{m'} U_{m's}; \\ 0, & \text{otherwise}. \end{cases}$$
Thus, when the variables $Y_{j|hms}$, $Y_{h|ms}$ and $Y_{ms}$ are independent, $Y_{jhms} = Y_{j|hms}\,Y_{h|ms}\,Y_{ms}$ and
$$p_{jhms} = p_{j|hms}\; p_{h|ms}\; p_{ms}, \qquad (2)$$
where $p_{j|hms} = \Pr(Y_{j|hms} = 1)$ is the conditional probability of choosing alternative $j$, given that sub-nest $h$ and nest $m$ have been chosen, $p_{h|ms} = \Pr(Y_{h|ms} = 1)$ is the conditional probability of choosing sub-nest $h$ when nest $m$ is chosen, $p_{ms} = \Pr(Y_{ms} = 1)$ is the marginal probability of choosing nest $m$ (with respect to choice set $s$), and $\Pr$ denotes the probability of an event. Based on the distribution of the error terms of the utility, these probabilities can be calculated as (McFadden, 1978):
$$p_{j|hms} = \frac{e^{v_{j|hms}/\lambda_{hm}}}{\sum_{j'=1}^{J_{hms}} e^{v_{j'|hms}/\lambda_{hm}}}, \qquad p_{h|ms} = \frac{e^{(\lambda_{hm}/\mu_m)\,IV_{hms}}}{\sum_{h'=1}^{H_m} e^{(\lambda_{h'm}/\mu_m)\,IV_{h'ms}}}, \qquad p_{ms} = \frac{e^{\mu_m IV_{ms}}}{\sum_{m'=1}^{M} e^{\mu_{m'} IV_{m's}}},$$
where
$$IV_{ms} = E\Big(\max_{h \in H_m} U_{ih|ms}\Big) = \ln \sum_{h=1}^{H_m} e^{(\lambda_{hm}/\mu_m)\,IV_{hms}}, \qquad IV_{hms} = E\Big(\max_{j \in C_{hms}} U_{j|hms}\Big) = \ln \sum_{j=1}^{J_{hms}} e^{v_{j|hms}/\lambda_{hm}}.$$
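The three kinds of choice probabilities can be evaluated with a short script. The sketch below assumes the nested structure is passed as nested Python lists and that a nest without sub-nests is treated as a single sub-nest whose dissimilarity parameter equals that of the nest; both conventions, and the function name `three_level_probs`, are assumptions made for illustration.

```python
import math

def three_level_probs(v, lam, mu):
    """Choice probabilities of a three-level nested logit for one choice set.

    v[m][h]  : list of systematic utilities v_{j|hms} in sub-nest h of nest m
    lam[m][h]: sub-nest dissimilarity parameter lambda_{hm}
    mu[m]    : nest dissimilarity parameter mu_m
    Returns p[m][h][j] = p_{j|hms} * p_{h|ms} * p_{ms}.
    """
    # Inclusive values of the sub-nests: IV_hms = ln sum_j exp(v_j / lambda_hm)
    iv_sub = [[math.log(sum(math.exp(x / lam[m][h]) for x in v[m][h]))
               for h in range(len(v[m]))] for m in range(len(v))]
    # Inclusive values of the nests: IV_ms = ln sum_h exp((lambda_hm / mu_m) IV_hms)
    iv_nest = [math.log(sum(math.exp(lam[m][h] / mu[m] * iv_sub[m][h])
                            for h in range(len(v[m])))) for m in range(len(v))]
    denom_nest = sum(math.exp(mu[m] * iv_nest[m]) for m in range(len(v)))
    probs = []
    for m in range(len(v)):
        p_m = math.exp(mu[m] * iv_nest[m]) / denom_nest
        nest = []
        for h in range(len(v[m])):
            p_h = math.exp(lam[m][h] / mu[m] * iv_sub[m][h] - iv_nest[m])
            p_j = [math.exp(x / lam[m][h] - iv_sub[m][h]) for x in v[m][h]]
            nest.append([pj * p_h * p_m for pj in p_j])
        probs.append(nest)
    return probs

# Hypothetical choice set: nest 1 with two sub-nests of two alternatives each,
# nest 2 with one alternative; all systematic utilities are zero.
v = [[[0.0, 0.0], [0.0, 0.0]], [[0.0]]]
p = three_level_probs(v, lam=[[0.1, 0.1], [0.25]], mu=[0.15, 0.25])
p_mnl = three_level_probs(v, lam=[[1.0, 1.0], [1.0]], mu=[1.0, 1.0])
```

When all dissimilarity parameters equal one, the model collapses to the MNL model, which gives a convenient sanity check on any implementation.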
3. INFORMATION MATRIX
There are several criteria, such as the D-, A- and G-criteria, for obtaining an optimal design. In this paper, we use the D-optimality criterion (a function of the determinant of the information matrix) to obtain an optimal design. Thus, we must first obtain the information matrix for the three-level nested logit model. This requires the log-likelihood function, defined for the choice set $C_s$ and one individual as follows:
$$\ell(\boldsymbol{\theta}; C_s) = \sum_{m=1}^{M} \sum_{h=1}^{H_m} \sum_{j=1}^{J_{hms}} y_{jhms} \ln(p_{jhms}),$$
where $J_{hms}$ denotes the number of alternatives in sub-nest $hm$ corresponding to choice set $C_s$.
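Based on the definition of the information matrix (with respect to choice set $C_s$):
$$\mathbf{I}(\boldsymbol{\theta}; C_s) = -E\left(\frac{\partial^2 \ln(p_{j|hms}\,p_{h|ms}\,p_{ms})}{\partial\boldsymbol{\theta}\,\partial\boldsymbol{\theta}^T}\right) = -E\left(\frac{\partial^2 \ln p_{j|hms}}{\partial\boldsymbol{\theta}\,\partial\boldsymbol{\theta}^T} + \frac{\partial^2 \ln p_{h|ms}}{\partial\boldsymbol{\theta}\,\partial\boldsymbol{\theta}^T} + \frac{\partial^2 \ln p_{ms}}{\partial\boldsymbol{\theta}\,\partial\boldsymbol{\theta}^T}\right), \qquad (3)$$
where:
$$\mathbf{I}(\boldsymbol{\theta}; C_s) = \begin{pmatrix} \mathbf{I}_{\beta\beta} & \mathbf{I}_{\beta\mu} & \mathbf{I}_{\beta\lambda} \\ \mathbf{I}_{\beta\mu}^T & \mathbf{I}_{\mu\mu} & \mathbf{I}_{\mu\lambda} \\ \mathbf{I}_{\beta\lambda}^T & \mathbf{I}_{\mu\lambda}^T & \mathbf{I}_{\lambda\lambda} \end{pmatrix} \qquad (4)$$
is the information matrix of the choice set $C_s$ and $\boldsymbol{\theta}$ is the full parameter vector, so that $\boldsymbol{\theta}^T = (\boldsymbol{\beta}^T, \boldsymbol{\mu}^T, \boldsymbol{\lambda}^T)$ with $\boldsymbol{\beta}^T = (\boldsymbol{\beta}_1^T, \boldsymbol{\beta}_2^T, \ldots, \boldsymbol{\beta}_K^T)$; $\boldsymbol{\beta}_k^T = (\beta_{k,1}, \beta_{k,2}, \ldots, \beta_{k,L_k-1})$, $\boldsymbol{\mu}^T = (\mu_1, \mu_2, \ldots, \mu_M)$ and $\boldsymbol{\lambda}^T = (\boldsymbol{\lambda}_1^T, \boldsymbol{\lambda}_2^T, \ldots, \boldsymbol{\lambda}_M^T)$; $\boldsymbol{\lambda}_m^T = (\lambda_{1m}, \lambda_{2m}, \ldots, \lambda_{H_m m})$; $\forall m$. This means that parameter $\beta_{k,\ell}$ is related to the $\ell$th level of attribute $k$, $\mu_m$ is the dissimilarity parameter of nest $m$ and $\boldsymbol{\lambda}_m^T$ is the vector of dissimilarity parameters of nest $m$, where $\lambda_{hm}$ denotes the dissimilarity parameter of sub-nest $hm$ in nest $m$. Thus, the number of parameters in the three-level NMNL model is
$$q = \underbrace{\sum_{k=1}^{K} (L_k - 1)}_{\text{part-worth}} + \underbrace{\sum_{m=1}^{M} H_m}_{\text{sub-nests}} + \underbrace{M}_{\text{nests}} = q_1 + q_2 + M,$$
where $q_1$ is the number of part-worth parameters, $q_2$ is the number of dissimilarity parameters of the sub-nests and $M$ is the number of dissimilarity parameters of the nests. Hence, the information matrix (4) is a symmetric positive semi-definite $q \times q$ matrix.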
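As a worked instance of this count, take the configuration used later in the illustration of Section 3.1: $K = 3$ attributes with $L_k = 2$ levels each, $M = 2$ nests, and two sub-nests in the first nest. Counting no separate sub-nest parameter for the second nest is an assumption read off the seven-element parameter vector listed there.

$$q_1 = \sum_{k=1}^{3} (2 - 1) = 3, \qquad q_2 = 2, \qquad M = 2, \qquad q = q_1 + q_2 + M = 3 + 2 + 2 = 7.$$

The information matrix of each choice set is then a $7 \times 7$ matrix in this configuration.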
In order to fit the three-level NMNL model, let us consider the following experiments:
$$S \,/\, J_s \,/\, \sum_{m=1}^{M}\sum_{h=1}^{H_m} J_{hm}; \qquad S \ge q_1 + q_2 + M, \qquad (5)$$
where $\sum_{m=1}^{M}\sum_{h=1}^{H_m} J_{hm}$ denotes the total number of alternatives in the population and $J_s = \sum_{m=1}^{M}\sum_{h=1}^{H_m} J_{hms}$ denotes the number of alternatives in choice set $s$, selected randomly from the population. In particular, suppose that $J_s = J$; $\forall s$. In this case, there are $S$ choice sets, each with $J$ alternatives. However, based on (5), $\tilde{S} \le S$ choice sets can be considered instead of $S$ (in reality, the number of choice sets $S$ increases dramatically when the number of attributes and their levels increase, so $S$ must often be reduced to $\tilde{S}$ by employing a suitable technique; see Graßhoff et al. (2004)). Also, the number of alternatives selected from the sub-nests may vary. Thus, different classes can be used to obtain a sample of size $J$ from the population:
$$S_n = \binom{J_{11}}{J_{n11}} \cdots \binom{J_{H_1 1}}{J_{nH_1 1}} \cdots \binom{J_{hm}}{J_{nhm}} \cdots \binom{J_{1M}}{J_{n1M}} \cdots \binom{J_{H_M M}}{J_{nH_M M}}, \qquad (6)$$
where $J = \sum_{m=1}^{M}\sum_{h=1}^{H_m} J_{nhm}$; $n \in N$, and $S_n$ is the number of choice sets in class $n$, each including $J$ alternatives.
Based on class $n$, $J_s$ can be rewritten as $J_{ns} = \sum_{m=1}^{M}\sum_{h=1}^{H_m} J_{nhms}$, where $J_{ns} = J$ for every choice set $s$ of class $n$; $n \in N$, and $J_{nhms} = J_{nhms'}$ for two choice sets $s \ne s'$ of the same class, whereas $J_{nhms}$ and $J_{n'hms'}$ (for a different class and a different choice set) may or may not be equal. To reduce the total number of choice sets ($S$) to a reasonable number ($\tilde{S}$), we reduce $S_n$ to $\tilde{S}_n$ in each class, where $\tilde{S}_n \ge q$ (avoiding a singular information matrix), and then consider
$$\tilde{S}_n \,/\, J \,/\, \sum_{m=1}^{M}\sum_{h=1}^{H_m} J_{hm}; \qquad \tilde{S}_n \ge q;\ n \in N$$
instead of (5). This involves choosing $\tilde{S}_n$ choice sets, each with $J$ alternatives, in each class. Depending on the type of model, it is possible that $S_n < q$; $n \in N$. In such a case, and in order to avoid a singular information matrix, we can combine the classes ($S_n$) to create a new design. The information matrix of each choice set is obtained from (4).
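The class structure in (6) can be enumerated directly. The sketch below assumes that every sub-nest contributes at least one alternative to each choice set, and uses the population of the later illustration (three groups of two alternatives, choice sets of size five) as input; the function name `classes_and_counts` is hypothetical.

```python
from itertools import product
from math import comb

def classes_and_counts(population, set_size):
    """Enumerate the classes and count the choice sets in each class.

    population: list with the number of alternatives J_hm in each sub-nest
    set_size  : number of alternatives J in every choice set
    Returns a dict mapping each class (J_n11, ..., J_nH_M M) to S_n.
    """
    counts = {}
    # Assumption: every sub-nest contributes at least one alternative.
    ranges = [range(1, j_hm + 1) for j_hm in population]
    for draw in product(*ranges):
        if sum(draw) == set_size:
            s_n = 1
            for j_hm, j_nhm in zip(population, draw):
                s_n *= comb(j_hm, j_nhm)  # binomial factor of (6)
            counts[draw] = s_n
    return counts

counts = classes_and_counts([2, 2, 2], 5)
total = sum(counts.values())
```

For this input the enumeration yields three classes with two choice sets each, so no single class reaches the number of parameters and the classes have to be combined.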
Table 1: Three-level NMNL model with two nests
First Nest (1)                        Second Nest (2)
Sub-nest (1)       Sub-nest (2)
$J_{11}$           $J_{21}$           $J_2$
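Lemma 1: The information matrix of a three-level nested logit model (choice set $C_s$) with two nests, where the first nest has two sub-nests with $J_{11s}$ and $J_{21s}$ alternatives and the second nest has $J_{2s}$ alternatives (Table 1 denotes a population with $J_{11} + J_{21} + J_2$ alternatives), is calculated as follows:
$$\mathbf{I}(\boldsymbol{\theta}; C_s) = \begin{pmatrix} \mathbf{I}_{11s} & \mathbf{I}_{12s} & \mathbf{I}_{13s} & \mathbf{I}_{14s} & \mathbf{I}_{15s} \\ \mathbf{I}_{12s}^T & I_{22s} & I_{23s} & I_{24s} & I_{25s} \\ \mathbf{I}_{13s}^T & I_{23s} & I_{33s} & I_{34s} & I_{35s} \\ \mathbf{I}_{14s}^T & I_{24s} & I_{34s} & I_{44s} & I_{45s} \\ \mathbf{I}_{15s}^T & I_{25s} & I_{35s} & I_{45s} & I_{55s} \end{pmatrix},$$
where $\boldsymbol{\theta}^T = (\boldsymbol{\beta}^T, \mu_1, \mu_2, \lambda_{11}, \lambda_{21})$. For simplicity, we suppose that $\beta_{1,1} = \beta_1, \ldots, \beta_{K,L_K-1} = \beta_{q_1}$ and $\lambda_{11} = \lambda_1$, $\lambda_{21} = \lambda_2$; then $\boldsymbol{\theta}^T = (\beta_1, \ldots, \beta_{q_1}, \mu_1, \mu_2, \lambda_1, \lambda_2)$ (see Appendix AI).
To fit this model (Table 1), examine the following experiments:
$$\tilde{S}_n \,/\, J \,/\, J_{11} J_{21} J_2; \qquad q \le \tilde{S}_n \le S_n,$$
where $J = J_{n11} + J_{n21} + J_{n2}$ and:
$$S_n = \binom{J_{11}}{J_{n11}} \binom{J_{21}}{J_{n21}} \binom{J_2}{J_{n2}}; \qquad n = 1, 2, \ldots, N.$$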
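Corollary 1: To obtain a locally optimal design at $\boldsymbol{\beta} = \mathbf{0}$, the quantities entering the information matrix of Lemma 1 are rewritten as follows (see Appendix AI):
$$\mathbf{B}_{1|1s} = \frac{1}{J_{11s}} \mathbf{X}^T_{1|1s}\, \mathbf{I}_{J_{11s}}\, \mathbf{X}_{1|1s}, \qquad \mathbf{A}_{1|1s} = \frac{1}{J_{11s}} \mathbf{X}^T_{1|1s}\, \mathbf{1}_{J_{11s}},$$
$$\mathbf{B}_{2|1s} = \frac{1}{J_{21s}} \mathbf{X}^T_{2|1s}\, \mathbf{I}_{J_{21s}}\, \mathbf{X}_{2|1s}, \qquad \mathbf{A}_{2|1s} = \frac{1}{J_{21s}} \mathbf{X}^T_{2|1s}\, \mathbf{1}_{J_{21s}},$$
$$\mathbf{B}_{2s} = \frac{1}{J_{2s}} \mathbf{X}^T_{2s}\, \mathbf{I}_{J_{2s}}\, \mathbf{X}_{2s}, \qquad \mathbf{A}_{2s} = \frac{1}{J_{2s}} \mathbf{X}^T_{2s}\, \mathbf{1}_{J_{2s}},$$
$$a_{1|1s} = \ln J_{11s}, \quad a_{2|1s} = \ln J_{21s}, \quad a_{2s} = \ln J_{2s}, \qquad p_{j|11s} = \frac{1}{J_{11s}}, \quad p_{j|21s} = \frac{1}{J_{21s}}, \quad p_{j|2s} = \frac{1}{J_{2s}},$$
$$a_{21|1s} = \ln\left(J_{11s}^{\lambda_1/\mu_1} + J_{21s}^{\lambda_2/\mu_1}\right),$$
$$\mathbf{X}^T_{h|1s} = (\mathbf{x}_{1|h1s}, \ldots, \mathbf{x}_{J_{h1s}|h1s}), \quad \mathbf{x}^T_{j|h1s} = (x_{j1|h1s}, \ldots, x_{jq_1|h1s});\ h = 1, 2, \qquad \mathbf{X}^T_{2s} = (\mathbf{x}_{1|2s}, \ldots, \mathbf{x}_{J_{2s}|2s}); \quad \mathbf{x}^T_{j|2s} = (x_{j1|2s}, \ldots, x_{jq_1|2s}),$$
$$p_{1s} = \frac{\left(\sum_{h=1}^{2} J_{h1s}^{\lambda_h/\mu_1}\right)^{\mu_1}}{\left(\sum_{h=1}^{2} J_{h1s}^{\lambda_h/\mu_1}\right)^{\mu_1} + J_{2s}^{\mu_2}}; \quad p_{2s} = 1 - p_{1s}, \qquad p_{1|1s} = \frac{J_{11s}^{\lambda_1/\mu_1}}{\sum_{h=1}^{2} J_{h1s}^{\lambda_h/\mu_1}}; \quad p_{2|1s} = 1 - p_{1|1s},$$
where $\mathbf{I}_r$ denotes the $r \times r$ identity matrix and $\mathbf{1}_r$ is an $r$-dimensional vector whose elements are all equal to one.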
3.1 D-Optimal Criterion
Taking into account (5) and (6), consider the following designs to fit the model introduced in Table 1:
$$\xi_n = \begin{pmatrix} C_{1n} & C_{2n} & \cdots & C_{S_n n} \\ w_{1n} & w_{2n} & \cdots & w_{S_n n} \end{pmatrix}; \qquad S_n \ge q = q_1 + 4,\ n \in N, \qquad (7)$$
where $C_{sn}$; $s = 1, 2, \ldots, S_n$ denotes a choice set in the $n$th class, which includes $J$ alternatives. As the number of attributes ($K$) and their levels ($L_k$; $k = 1, 2, \ldots, K$) increase (design (7)), the number of possible choice sets in each class ($S_n$) increases dramatically. In this situation, techniques are needed to reduce the number of support points, or sample size, $S_n$; $n \in N$, so that a reasonable number of choice sets is obtained (see Graßhoff et al. (2004)). As noted above, the D-optimality criterion in linear models typically leads to a number of support points equal to the number of unknown parameters, with the design taking an equal number of observations at each point (Silvey, 1980, p. 42). This bound also applies to most local optimality criteria and Bayesian criteria for linear models (Chernoff, 1972). In contrast, for non-linear models no such bound on the number of support points is available. We therefore consider the condition $\tilde{S}_n \ge q$; $n = 1, 2, \ldots, N$ (reduced) to obtain the D-optimality criterion for design (7). The information matrix of design (7) is calculated as follows:
$$\mathbf{I}(\xi_n; \boldsymbol{\theta}) = \sum_{s=1}^{S_n} w_{sn}\, \mathbf{I}(\boldsymbol{\theta}; C_{sn}), \qquad (8)$$
where $w_{sn}$ is the weight (frequency) of the choice set $C_{sn}$, $\mathbf{I}(\boldsymbol{\theta}; C_{sn})$ is the information matrix of choice set $s$ in the $n$th class, which is calculated by Lemma 1, and the local D-optimality criterion at $\boldsymbol{\theta}_0$ is $\det(\mathbf{I}(\xi_n; \boldsymbol{\theta}_0))^{-1}$ ($\boldsymbol{\theta}_0$ is the true value of the full parameter vector).
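The $D_b$-optimality criterion with respect to a prior distribution $\pi(\boldsymbol{\theta})$ on the parameters can be defined as follows (Atkinson et al., 2007):
$$D_b(\xi_n) = E_{\boldsymbol{\theta}}\left[\det(\mathbf{I}(\xi_n; \boldsymbol{\theta}))^{-1/q}\right] = \int_{\mathcal{B}} \int_{\mathcal{M}} \int_{\Lambda} \det(\mathbf{I}(\xi_n; \boldsymbol{\theta}))^{-1/q}\, \pi(\boldsymbol{\theta})\, d\boldsymbol{\beta}\, d\boldsymbol{\mu}\, d\boldsymbol{\lambda}, \qquad (9)$$
where $\mathcal{B}$, $\mathcal{M}$ and $\Lambda$ are the spaces of $\boldsymbol{\beta}$, $\boldsymbol{\mu}$ and $\boldsymbol{\lambda}$, respectively, and $\pi(\boldsymbol{\theta}) = \pi(\boldsymbol{\beta}, \boldsymbol{\mu}, \boldsymbol{\lambda})$. Specifically, suppose that $\boldsymbol{\beta}$ is independent of $\boldsymbol{\lambda}$ and $\boldsymbol{\mu}$, so that $\pi(\boldsymbol{\beta}, \boldsymbol{\mu}, \boldsymbol{\lambda}) = \pi(\boldsymbol{\beta})\,\pi(\boldsymbol{\mu}, \boldsymbol{\lambda})$. To keep matters simple, suppose in addition that $\boldsymbol{\lambda}$ and $\boldsymbol{\mu}$ are independent, so that $\pi(\boldsymbol{\beta}, \boldsymbol{\mu}, \boldsymbol{\lambda}) = \pi(\boldsymbol{\beta})\,\pi(\boldsymbol{\mu})\,\pi(\boldsymbol{\lambda})$. For example, one may take uniform distributions for $\boldsymbol{\lambda}$ and $\boldsymbol{\mu}$ and a multivariate normal distribution for $\boldsymbol{\beta}$, $\boldsymbol{\beta} \sim N_{q_1}(\boldsymbol{\beta}_0, \boldsymbol{\Sigma})$. Since there is usually no analytical expression for (9), it may be approximated by the Monte Carlo technique, which takes a large number, $R$, of independent draws, $\boldsymbol{\theta}_r$, of $\boldsymbol{\theta}$ from the prior distribution $\pi(\boldsymbol{\theta})$ and averages the local criterion $\det(\mathbf{I}(\xi_n; \boldsymbol{\theta}_r))^{-1/q}$ over all draws. Thus, the weighted D-criterion is approximated by:
$$D_b(\xi_n) \approx \frac{1}{R} \sum_{r=1}^{R} \det(\mathbf{I}(\xi_n; \boldsymbol{\theta}_r))^{-1/q}, \qquad (10)$$
if
$$\Pr\left(\lim_{R \to \infty} \frac{1}{R} \sum_{r=1}^{R} \det(\mathbf{I}(\xi_n; \boldsymbol{\theta}_r))^{-1/q} = E_{\boldsymbol{\theta}}\left[\det(\mathbf{I}(\xi_n; \boldsymbol{\theta}))^{-1/q}\right]\right) = 1.$$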
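The weighted information matrix (8) and the Monte Carlo average (10) can be sketched in a few lines. To keep the example short, the sketch below substitutes the information matrix of a plain MNL choice set, $\mathbf{X}^T(\mathrm{diag}(\mathbf{p}) - \mathbf{p}\mathbf{p}^T)\mathbf{X}$, for the three-level matrix of Lemma 1; this substitution, the toy design, the normal prior with standard deviation 0.5 and all function names are assumptions made for illustration.

```python
import math
import random

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n, d = len(a), 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-14:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d

def mnl_information(x_rows, beta):
    """Information matrix X^T (diag(p) - p p^T) X of one MNL choice set."""
    u = [math.exp(sum(x * b for x, b in zip(row, beta))) for row in x_rows]
    total = sum(u)
    p = [ui / total for ui in u]
    q = len(beta)
    mean = [sum(p[j] * x_rows[j][a] for j in range(len(p))) for a in range(q)]
    return [[sum(p[j] * x_rows[j][a] * x_rows[j][b] for j in range(len(p)))
             - mean[a] * mean[b] for b in range(q)] for a in range(q)]

def design_information(choice_sets, weights, beta):
    """Weighted sum of choice-set information matrices, as in (8)."""
    q = len(beta)
    total = [[0.0] * q for _ in range(q)]
    for x_rows, w in zip(choice_sets, weights):
        info = mnl_information(x_rows, beta)
        for a in range(q):
            for b in range(q):
                total[a][b] += w * info[a][b]
    return total

def bayesian_d(choice_sets, weights, draws):
    """Monte Carlo average of det(I)^(-1/q) over prior draws, as in (10)."""
    q = len(draws[0])
    values = [det(design_information(choice_sets, weights, b)) ** (-1.0 / q)
              for b in draws]
    return sum(values) / len(values)

# Hypothetical toy design: three paired comparisons of two-attribute profiles.
sets = [[[1, 1], [-1, -1]], [[1, -1], [-1, 1]], [[1, 1], [1, -1]]]
weights = [1 / 3, 1 / 3, 1 / 3]
rng = random.Random(0)
draws = [[rng.gauss(0.0, 0.5) for _ in range(2)] for _ in range(500)]
local_det = det(design_information(sets, weights, [0.0, 0.0]))
d_b = bayesian_d(sets, weights, draws)
```

Minimizing `bayesian_d` over the weights would then give an approximate $D_b$-optimal design for this toy setting; the three-level case only changes the function that returns the information matrix of a choice set.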
The design $\xi_n^*$ that minimizes the approximate $D_b$-criterion (10) is called the $D_b$-optimal design and is approximated by the solution:
$$\xi_n^* = \arg\min_{\xi_n \in \Xi_n} D_b(\xi_n), \qquad (11)$$
then:
$$\xi_n^* = \begin{pmatrix} C_{1n} & C_{2n} & \cdots & C_{S_n n} \\ w^*_{1n} & w^*_{2n} & \cdots & w^*_{S_n n} \end{pmatrix}; \qquad n \in N$$
is the D-optimal design in
$$\Xi_n = \left\{ \xi_n : \sum_{s=1}^{S_n} w_{sn} = 1,\ 0 \le w_{sn} \le 1;\ s = 1, 2, \ldots, S_n \right\}.$$
Finally, according to (10) and (11), $\xi^*_{n'}$ is the most suitable design for estimating the parameters if there exists $n' \in N$ such that $\xi^*_{n'} = \arg\min_{\xi^*_n} D_b(\xi^*_n)$.
As explained previously, $\det(\mathbf{I}(\xi; \boldsymbol{\theta}_0))^{-1}$ is the local D-optimality criterion, where $\boldsymbol{\theta}_0$ is the true value of the parameters. Thus, we say that $\xi^*$ is the locally D-optimal design in $\Xi$ if $\xi^* = \arg\min_{\xi \in \Xi} \det(\mathbf{I}(\xi; \boldsymbol{\theta}_0))^{-1}$, and this criterion is used here to obtain a locally D-optimal design.
Table 2: Three-level NMNL model with six alternatives
First Nest (1)                        Second Nest (2)
Sub-nest 1(1)      Sub-nest 2(1)
$a_1, a_2$         $a_3, a_4$         $a_5, a_6$
Illustration:
Consider a population with three attributes, each comprising two levels. In this situation, consider a three-level NMNL model that includes six possible alternatives in two nests (Table 2; $\sum_{m=1}^{M}\sum_{h=1}^{H_m} J_{hm} = 2 + 2 + 2$), where (1,1,1), (1,1,-1), (1,-1,1), (1,-1,-1), (-1,1,1) and (-1,1,-1) characterize the alternatives $a_1, a_2, a_3, a_4, a_5$ and $a_6$, respectively. To fit this model (Table 2), consider the experiment $S/5/222$, based on Equation (5):
$$S = S_1 + S_2 + S_3 = \binom{2}{2}\binom{2}{2}\binom{2}{1} + \binom{2}{2}\binom{2}{1}\binom{2}{2} + \binom{2}{1}\binom{2}{2}\binom{2}{2} = 6.$$
In this case, three classes ($N = 3$) are found to define the design and, because $S_n < 7$; $\forall n \in N$, we can combine them in order to define a suitable design. Thus, there are six choice sets (Table 3), with their design matrices shown in Table 4. In this situation, suppose for clarity that $\beta_{1,1} = \beta_1$, $\beta_{2,1} = \beta_2$, $\beta_{3,1} = \beta_3$, $\lambda_{11} = \lambda_1$ and $\lambda_{21} = \lambda_2$; then $\boldsymbol{\theta}^T = (\beta_1, \ldots, \beta_3, \mu_1, \mu_2, \lambda_1, \lambda_2)$ is the full parameter vector. In this case, and in keeping with the RUM conditions (Gil-Moltó and Hole, 2004), we encounter the following two conditions:
S shppmp msmshmm
hm
ms
m , 2,1;11
)2 and 2,1;1
1 )1
.
Table 3: Three-level NMNL model with six choice sets
Choice Set   First Nest (1)                      Second Nest (2)
             Sub-nest (1)      Sub-nest (2)
$C_1$        $a_1, a_2$        $a_3, a_4$        $a_5$
$C_2$        $a_1, a_2$        $a_3, a_4$        $a_6$
$C_3$        $a_1, a_2$        $a_3$             $a_5, a_6$
$C_4$        $a_1, a_2$        $a_4$             $a_5, a_6$
$C_5$        $a_1$             $a_3, a_4$        $a_5, a_6$
$C_6$        $a_2$             $a_3, a_4$        $a_5, a_6$
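Table 4: The design matrices of the three-level NMNL model with six choice sets (each row of a design matrix is the coded level vector of one alternative)
Choice Set   First Nest (1)                                  Second Nest (2)
             Sub-nest (1)          Sub-nest (2)
$C_1$        (1,1,1), (1,1,-1)     (1,-1,1), (1,-1,-1)       (-1,1,1)
$C_2$        (1,1,1), (1,1,-1)     (1,-1,1), (1,-1,-1)       (-1,1,-1)
$C_3$        (1,1,1), (1,1,-1)     (1,-1,1)                  (-1,1,1), (-1,1,-1)
$C_4$        (1,1,1), (1,1,-1)     (1,-1,-1)                 (-1,1,1), (-1,1,-1)
$C_5$        (1,1,1)               (1,-1,1), (1,-1,-1)       (-1,1,1), (-1,1,-1)
$C_6$        (1,1,-1)              (1,-1,1), (1,-1,-1)       (-1,1,1), (-1,1,-1)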
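The six choice sets and their design matrices can be generated mechanically from the alternative definitions. The following sketch assumes that each choice set contains at least one alternative from each of the three groups of Table 2; the dictionary `levels`, the list `groups` and the function `choice_sets` are illustrative names only.

```python
from itertools import combinations, product

# Coded attribute levels of the six alternatives (from the illustration).
levels = {
    "a1": (1, 1, 1), "a2": (1, 1, -1),
    "a3": (1, -1, 1), "a4": (1, -1, -1),
    "a5": (-1, 1, 1), "a6": (-1, 1, -1),
}
# Grouping of Table 2: two sub-nests of the first nest, then the second nest.
groups = [("a1", "a2"), ("a3", "a4"), ("a5", "a6")]

def choice_sets(groups, set_size):
    """All choice sets of the given size with at least one alternative per group."""
    per_group = [[c for k in range(1, len(g) + 1) for c in combinations(g, k)]
                 for g in groups]
    result = []
    for parts in product(*per_group):
        if sum(len(p) for p in parts) == set_size:
            result.append(parts)
    return result

sets = choice_sets(groups, 5)
# One design matrix per choice set, split by sub-nest / nest.
design_matrices = [[[levels[a] for a in part] for part in cs] for cs in sets]
```

The enumeration order differs from the numbering $C_1, \ldots, C_6$ of Table 3, but the same six choice sets are produced.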
To estimate the parameters of the model described in Table 2, and based on the experiment $6/5/222$ and Equation (7), consider the following design:
$$\xi = \begin{pmatrix} C_1 & C_2 & C_3 & C_4 & C_5 & C_6 \\ w_1 & w_2 & w_3 & w_4 & w_5 & w_6 \end{pmatrix}. \qquad (12)$$
The information matrix of design (12) is calculated by $\mathbf{I}(\xi; \boldsymbol{\theta}) = \sum_{s=1}^{6} w_s\, \mathbf{I}(\boldsymbol{\theta}; C_s)$. Specifically, let $\boldsymbol{\beta} = \mathbf{0}$. Now, according to Lemma 1 and Corollary 1, the elements of the information matrix $\mathbf{I}(\boldsymbol{\theta}; C_s)$ can be calculated.
The choice sets $C_1$ and $C_2$ differ only by a permutation of the levels of the third attribute in the second nest. Likewise, $C_3$ and $C_4$ differ only by a permutation of the levels of the third attribute in the second sub-nest of the first nest, and $C_5$ and $C_6$ differ only by a permutation of the levels of the third attribute in the first sub-nest of the first nest. Thus, we can define a new design to fit the model introduced in Table 2, according to Table 3, as follows:
$$\xi' = \begin{pmatrix} C_2 & C_1 & C_4 & C_3 & C_6 & C_5 \\ w_1 & w_2 & w_3 & w_4 & w_5 & w_6 \end{pmatrix}. \qquad (13)$$
In this situation, for the two designs (12) and (13) to be equivalent, the following design can be considered:
$$\xi = \begin{pmatrix} C_1 & C_2 & C_3 & C_4 & C_5 & C_6 \\ \omega_1 & \omega_1 & \omega_2 & \omega_2 & \omega_3 & \omega_3 \end{pmatrix}, \qquad (14)$$
where $\omega_1 + \omega_2 + \omega_3 = \frac{1}{2}$.
Now, suppose that $\lambda_1 = \lambda_2 = \lambda$. Moreover, we know that $\lambda \le \mu_1$ and, according to Table 2, it is to be expected that $\mu_1 \ne \mu_2$. Then we can assume that $\mu_1 = 2\lambda$ and $\mu_2 = 4\lambda$, so that $\det(\mathbf{I}(\xi; \boldsymbol{\theta}))$ reduces to a function of $\lambda$, $\omega_1$ and $\omega_2$, where $\omega_3 = \frac{1}{2} - \omega_1 - \omega_2$. In this situation, the RUM conditions are upheld when $0 < \lambda \le 0.25$. Under this condition on $\lambda$, some locally optimal designs have been calculated in Table 5.
Table 5: $\mu_1 = 2\lambda$ and $\mu_2 = 4\lambda$, locally optimal design when $0 < \lambda \le 0.25$.
$\lambda$      0.01     0.05     0.10     0.15     0.17     0.20     0.25
$\omega_1^*$   0.3092   0.3120   0.3150   0.3180   0.3195   0.3210   0.3270
$\omega_2^*$   0.0954   0.0940   0.0925   0.0910   0.0905   0.0899   0.0895
$\omega_3^*$   0.0954   0.0940   0.0925   0.0910   0.0900   0.0891   0.0877
$D$            0.00368  0.08985  0.35081  0.77476  0.98940  1.36021  2.11334
Table 5 shows that $\omega_1^*$ increases as $\lambda$ increases, whereas $\omega_2^*$ and $\omega_3^*$ decrease, because the combinations of alternatives (and attributes) in the two choice sets $C_1$ and $C_2$ are less similar than in the other choice sets. According to Table 4, the two sub-nests of the first nest are identical in the choice sets $C_1$ and $C_2$, but the second nest contains a different alternative in each. In this situation, because $\lambda_1 = \lambda_2$, $\omega_1^*$ increases as $\lambda$ increases. In choice sets $C_3$ and $C_4$, the second sub-nest of the first nest contains a different alternative in each. A similar situation holds for choice sets $C_5$ and $C_6$, where the first sub-nest of the first nest contains a different alternative in each (the second nest does not change across choice sets $C_3$ to $C_6$). Given the combination of the alternatives in the four choice sets $C_3$ to $C_6$, similar results are obtained for $\omega_2^*$ and $\omega_3^*$: these two weights are almost equal and decrease as $\lambda$ increases ($0 < \lambda \le 0.15$). However, $\omega_3^*$ decreases faster than $\omega_2^*$ when $\lambda \ge 0.17$, so the combinations of attributes and levels in the two choice sets $C_5$ and $C_6$ are more similar than those in the choice sets $C_3$ and $C_4$ (Table 4).
Now, suppose that $\lambda_1 = 0.1$, $\mu_1 = 0.15$ and $\mu_2 = 0.25$; then the RUM conditions hold if $0 < \lambda_2 \le 0.15$. Table 6 reports several locally D-optimal designs based on Table 4. In this situation, $\omega_2^*$ increases as $\lambda_2$ increases, whereas $\omega_3^*$ decreases (Table 6). This means that the alternatives in the second sub-nest (first nest) of choice sets $C_5$ and $C_6$ are much more similar, whereas the alternatives in choice sets $C_3$ and $C_4$ (second sub-nest) are much more dissimilar (Table 6).
Table 6: $\mu_1 = 0.15$, $\mu_2 = 0.25$ and $\lambda_1 = 0.1$, locally optimal design when $0 < \lambda_2 \le 0.15$.
$\lambda_2$    0.01     0.05     0.06     0.08     0.10     0.12     0.15
$\omega_1^*$   0.3310   0.3338   0.3245   0.3133   0.3095   0.3103   0.3165
$\omega_2^*$   0.0000   0.0001   0.0285   0.0657   0.0953   0.1183   0.1458
$\omega_3^*$   0.1690   0.1661   0.1497   0.1210   0.0952   0.0713   0.0377
$D$            0.05879  0.15900  0.17522  0.19926  0.21502  0.22534  0.23478
Table 7 reports some further locally D-optimal designs, calculated for $\lambda_1 = \lambda_2 = \mu_1 = \mu_2 = \lambda$. In this case, the RUM conditions hold if $0 < \lambda \le 1$. Table 7 shows that $\omega_1^*$ increases as $\lambda$ increases, whereas $\omega_2^*$ and $\omega_3^*$ decrease. Comparing the two decreasing trends, $\omega_3^*$ decreases faster than $\omega_2^*$ because the alternatives in the choice sets $C_5$ and $C_6$ are more similar than those in $C_3$ and $C_4$.
Table 7: $\lambda_1 = \lambda_2 = \mu_1 = \mu_2 = \lambda$, locally optimal design when $0 < \lambda \le 1$.
$\lambda$      0.05     0.10     0.15     0.20     0.30     0.40     0.50
$\omega_1^*$   0.2889   0.2908   0.2926   0.2943   0.2979   0.3011   0.3040
$\omega_2^*$   0.1055   0.1046   0.1037   0.1029   0.1012   0.0998   0.0988
$\omega_3^*$   0.1055   0.1046   0.1037   0.1028   0.1009   0.0991   0.0971
$D_b$          0.02689  0.10466  0.22959  0.39868  0.86051  1.47810  2.24747
With the fixed values $\mu_1 = 0.1$ and $\lambda_1 = \lambda_2 = 0.08$ (Table 8), $\omega_2^*$ and $\omega_3^*$ are equal and decrease as $\mu_2$ increases, whereas $\omega_1^*$ increases. Thus, the alternatives in the second nest (choice sets $C_3$ to $C_6$) are more similar than the alternatives in the second nest of the choice sets $C_1$ and $C_2$.
Table 8: $\mu_1 = 0.1$ and $\lambda_1 = \lambda_2 = 0.08$, locally optimal design when $0 < \mu_2 \le 1$.
$\mu_2$        0.10     0.15     0.20     0.25     0.30     0.40     0.50
$\omega_1^*$   0.2964   0.3038   0.3074   0.3095   0.3110   0.3130   0.3150
$\omega_2^*$   0.1018   0.0981   0.0963   0.0952   0.0945   0.0933   0.0924
$\omega_3^*$   0.1018   0.0981   0.0963   0.0952   0.0945   0.0933   0.0924
$D$            0.09941  0.10191  0.10319  0.10409  0.10484  0.10619  0.10750
Suppose that $\mu_2 = 0.5$ and $\lambda_1 = 0.1$, $\lambda_2 = 0.2$. In this situation, the RUM conditions hold if $0.2 \le \mu_1 \le 0.5$. Table 9 shows that $\omega_1^*$ increases (almost always, at a decreasing rate) as $\mu_1$ increases. The third row of Table 9 shows that $\omega_2^*$ decreases (very slowly), and $\omega_3^*$ is equal to zero, as $\mu_1$ increases. This means that the alternatives in the choice sets $C_5$ and $C_6$ are much more similar than the others. We can say that if $\mu_2 = 0.5$, $\lambda_1 = 0.1$, $\lambda_2 = 0.2$ and $\mu_1 = 0.45$, then:
$$\xi^*_{n'} = \begin{pmatrix} C_1 & C_2 & C_3 & C_4 & C_5 & C_6 \\ 0.3364 & 0.3364 & 0.1636 & 0.1636 & 0.0000 & 0.0000 \end{pmatrix}$$
is the locally D-optimal design in $\bigcup_{n=1}^{3} \Xi_n$.
According to the results obtained for the different classes in Tables 5 to 9, we can say that the alternatives in the two choice sets $C_1$ and $C_2$ are less similar than the others and the alternatives in the choice sets $C_5$ and $C_6$ are more similar than the others.
Table 9: $\mu_2 = 0.5$ and $\lambda_1 = 0.1$, $\lambda_2 = 0.2$, locally optimal design when $0.2 \le \mu_1 \le 0.5$.
$\mu_1$        0.20     0.25     0.30     0.35     0.40     0.45     0.50
$\omega_1^*$   0.3333   0.3347   0.3355   0.3360   0.3363   0.3364   0.3364
$\omega_2^*$   0.1667   0.1653   0.1645   0.1640   0.1637   0.1636   0.1636
$\omega_3^*$   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
$D$            0.39644  0.56598  0.75789  0.97082  1.20394  1.45688  1.72959
Note: To obtain the locally D-optimal designs, with $D = \det(\mathbf{I}(\xi^*; \boldsymbol{\theta}_0))^{-1}$ (Tables 5 to 9), Maple was used with the initial values $\omega_1 = 0.1$ and $\omega_2 = \omega_3 = 0.2$. The Sequential Quadratic Programming (SQP) method was used, with an iteration limit of 1000.
4. CONCLUSION
In a two-level NMNL model, all alternatives are divided into several nests. Because the IIA property must hold within each nest, it may be necessary to divide the alternatives of some or all of the nests into several sub-nests.
In this paper, a D-optimal design, which is based on a function of the determinant of the information matrix, has been used to fit the three-level NMNL model. We have also calculated the information matrix of a three-level NMNL model for the local D-optimality criterion when $\boldsymbol{\beta} = \mathbf{0}$. Based on an example, we have discussed different classes. We have observed that the optimal weights (of the choice sets) show an increasing or decreasing trend and that this trend depends on the similarity between the alternatives in the choice sets. For example, there is not much similarity between the alternatives when the optimal weights (choice sets) have an increasing trend (as the dissimilarity parameter increases), and vice versa.
APPENDIX
Appendix AI:
The elements of the information matrix related to Lemma 1 are as
follows
TssssT sssssT
sss
ss
s
ppppp
2222
2
21|21|2122
2
121
1|11|1112
1
111
11
..AABAABAABI
TssT sssT sssssT ssT ssT ssT sssss
ppppppp
221|21|2
2
1|21|11|1
2
1|1211|11|21|21|11|21|21|11|12
1
1|21|11...
..AAAAAAAAAAAAAA
T ssT sssTsssssssT ssT ssssss pppppppppp
1|21|21|11|1221|21|21|11|1211|11|21|21|11|21|121 ........
AAAAAAAAAA
ssssssssssss
ssss
sss
s
ppaapappp
aappp
21|21|21|11|11|2111|21|221|11|11
1
21
1|21|11|111|223
1
1|21|11
12
.. ......
-
....
AAA
AAI
s
T
ssssssss
T
ssss
s appppp
22
2
21|21|21|11|1212223
2
213
1... βAAAAβAABI
βAAAA
βAAAβAABI
T
sssssss
sss
T
ssss
sssT
sss
ss
s
appppp
appppp
1|11|1121|21|21|11|1
1
1|121
1|11|111|21|12
11
1|21|11
1|11|11|13
1
1|11
14
.....
..
...
βAAAA
βAAAβAABI
T
sssssss
sss
T
ssss
sssT
sss
ss
s
appppp
appppp
1|21|2221|11|11|11|2
2
1|221
1|21|221|11|22
12
1|21|11
1|21|21|23
2
1|21
15
.....
..
...
2 1|21211|21|221|11|111
1|2121
2
1|21|221|11|112
1
212
1|221|114
1
1|21|11
22
........2
-
.....
....
sssssss
sss
ssssss
ss
sss
s
appapapapp
apappp
aappp
I
sTsssssssss aapapapp
I 2221|21|221|11|111|21121
2123 ......
.
.
βA
sT sssssssss
s
T
sss
sss
s
aaapapppp
aaappp
I
1|111|11|2111|21|221|11|11
11
1|121
1|111|11|221|113
11
1|21|11
24
.......
..
....
..
βA
βA
sT sssssssss
s
T
sss
sss
s
aaapapppp
aaappp
I
1|221|21|2111|21|221|11|11
12
1|221
1|221|21|111|223
12
1|21|11
25
.......
..
....
..
βA
βA
sTsssTssTsssTss aappp
I 22222222
212224
2
233 ..
.
βAAββAABβ
sT sssTssss aappp
I 1|111|122221
1|121
34 ...
..
βAAβ
sT sssTsss
s aappp
I 1|221|222222
1|221
35 ...
..
βAAβ
sT sssTsssssT sssTsss aappppppp
I 1|111|11|111|11|122
11|22
1
2
1
1|11
1|11|11|14
1
1|11
44 .....
..
βAAββAABβ
sT sssTssss
s aapppp
I 1|221|21|111|12
122
121
1|21|11
45 ...1..
..
βAAβ
sT sssTsssssT sssTsss aappppppp
I 1|221|21|221|21|222
11|12
1
2
2
1|21
1|21|21|24
2
1|21
55 .....
..
βAAββAABβ
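where
$$\mathbf{B}_{1|1s} = \mathbf{X}^T_{1|1s}\mathbf{P}_{.|11s}\mathbf{X}_{1|1s}, \quad \mathbf{B}_{2|1s} = \mathbf{X}^T_{2|1s}\mathbf{P}_{.|21s}\mathbf{X}_{2|1s}, \quad \mathbf{A}_{1|1s} = \mathbf{X}^T_{1|1s}\mathbf{p}_{.|11s}, \quad \mathbf{A}_{2|1s} = \mathbf{X}^T_{2|1s}\mathbf{p}_{.|21s},$$
$$\mathbf{B}_{2s} = \mathbf{X}^T_{2s}\mathbf{P}_{.|2s}\mathbf{X}_{2s}, \quad \mathbf{A}_{2s} = \mathbf{X}^T_{2s}\mathbf{p}_{.|2s},$$
$$\mathbf{X}^T_{h|1s} = (\mathbf{x}_{1|h1s}, \ldots, \mathbf{x}_{J_{h1s}|h1s}), \quad \mathbf{x}^T_{j|h1s} = (x_{j1|h1s}, \ldots, x_{jq_1|h1s});\ h = 1, 2,$$
$$\mathbf{P}_{.|h1s} = \mathrm{diag}(p_{1|h1s}, \ldots, p_{J_{h1s}|h1s}), \quad \mathbf{p}_{.|h1s} = (p_{1|h1s}, \ldots, p_{J_{h1s}|h1s})^T,$$
$$\mathbf{X}^T_{2s} = (\mathbf{x}_{1|2s}, \ldots, \mathbf{x}_{J_{2s}|2s}), \quad \mathbf{x}^T_{j|2s} = (x_{j1|2s}, \ldots, x_{jq_1|2s}), \quad \mathbf{P}_{.|2s} = \mathrm{diag}(p_{1|2s}, \ldots, p_{J_{2s}|2s}), \quad \mathbf{p}_{.|2s} = (p_{1|2s}, \ldots, p_{J_{2s}|2s})^T,$$
$$a_{1|1s} = \ln \sum_{j=1}^{J_{11s}} e^{\mathbf{x}^T_{j|11s}\boldsymbol{\beta}/\lambda_1}, \qquad a_{2|1s} = \ln \sum_{j=1}^{J_{21s}} e^{\mathbf{x}^T_{j|21s}\boldsymbol{\beta}/\lambda_2}, \qquad a_{2s} = \ln \sum_{j=1}^{J_{2s}} e^{\mathbf{x}^T_{j|2s}\boldsymbol{\beta}/\mu_2},$$
$$p_{j|11s} = \frac{e^{\mathbf{x}^T_{j|11s}\boldsymbol{\beta}/\lambda_1}}{\sum_{j'=1}^{J_{11s}} e^{\mathbf{x}^T_{j'|11s}\boldsymbol{\beta}/\lambda_1}}, \qquad p_{j|21s} = \frac{e^{\mathbf{x}^T_{j|21s}\boldsymbol{\beta}/\lambda_2}}{\sum_{j'=1}^{J_{21s}} e^{\mathbf{x}^T_{j'|21s}\boldsymbol{\beta}/\lambda_2}}, \qquad p_{j|2s} = \frac{e^{\mathbf{x}^T_{j|2s}\boldsymbol{\beta}/\mu_2}}{\sum_{j'=1}^{J_{2s}} e^{\mathbf{x}^T_{j'|2s}\boldsymbol{\beta}/\mu_2}},$$
$$a_{21|1s} = \ln\left[\left(\sum_{j=1}^{J_{11s}} e^{\mathbf{x}^T_{j|11s}\boldsymbol{\beta}/\lambda_1}\right)^{\lambda_1/\mu_1} + \left(\sum_{j=1}^{J_{21s}} e^{\mathbf{x}^T_{j|21s}\boldsymbol{\beta}/\lambda_2}\right)^{\lambda_2/\mu_1}\right],$$
$$p_{1s} = \frac{\left[\sum_{h=1}^{2}\left(\sum_{j=1}^{J_{h1s}} e^{\mathbf{x}^T_{j|h1s}\boldsymbol{\beta}/\lambda_h}\right)^{\lambda_h/\mu_1}\right]^{\mu_1}}{\left[\sum_{h=1}^{2}\left(\sum_{j=1}^{J_{h1s}} e^{\mathbf{x}^T_{j|h1s}\boldsymbol{\beta}/\lambda_h}\right)^{\lambda_h/\mu_1}\right]^{\mu_1} + \left(\sum_{j=1}^{J_{2s}} e^{\mathbf{x}^T_{j|2s}\boldsymbol{\beta}/\mu_2}\right)^{\mu_2}},$$
$$p_{1|1s} = \frac{\left(\sum_{j=1}^{J_{11s}} e^{\mathbf{x}^T_{j|11s}\boldsymbol{\beta}/\lambda_1}\right)^{\lambda_1/\mu_1}}{\sum_{h=1}^{2}\left(\sum_{j=1}^{J_{h1s}} e^{\mathbf{x}^T_{j|h1s}\boldsymbol{\beta}/\lambda_h}\right)^{\lambda_h/\mu_1}}.$$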
5. REFERENCES
[1] Atkinson, A.C., A.N. Donev and R.D. Tobias (2007). Optimum experimental designs, with SAS, Oxford Univ. Press.
[2] Ben-Akiva, M. (1973). The structure of travel demand models, PhD Thesis, MIT.
[3] Börsch-Supan, A. (1990). On the compatibility of nested logit models with utility maximization, Journal of Econometrics, Vol. 43, 373-388.
[4] Chernoff, H. (1972). Sequential analysis and optimal design. Society for Industrial and Applied Mathematics, Philadelphia PA.
[5] Daly, A. and S. Zachary (1978). Improved multiple choice models, in D. Hensher and M. Dalvi, eds., Determinants of Travel Choice, Saxon House, Sussex.
[6] Gil-Moltó, M. and A. Hole (2004). Tests for the consistency of three-level nested logit models with utility maximization, Economics Letters, 85, 133-137.
[7] Graßhoff, U., H. Großmann, H. Holling and R. Schwabe (2004). Optimal Designs for Main Effects in Linear Paired Comparison Models, Journal of Statistical Planning and Inference, 126, 361-376.
[8] Herriges, J.A. and C.L. Kling (1996). Testing the consistency of nested logit models with utility maximization, Economics Letters, Vol. 50, No. 1, 33-39.
[9] McFadden, D. (1978). Modeling the choice of residential location, in: A. Karlquist et al., eds., Spatial interaction theory and residential location, North-Holland, Amsterdam, 75-96.
[10] Silvey, S.D. (1980). Optimal design. Chapman and Hall, London.