  • IJRRAS 2 (2) ● February 2010 Jafari ● Optimal Design for a Multinomial Logit Model

    OPTIMAL DESIGN FOR A THREE-LEVEL NESTED MULTINOMIAL LOGIT MODEL IN DISCRETE CHOICE EXPERIMENTS

    Habib Jafari

    Institute for Mathematical Stochastics (IMST),

    OvG University, Universitätsplatz 2, 39106 Magdeburg-Germany

    ABSTRACT

    In this paper we calculate the information matrix for a three-level nested Multinomial logit model and derive the

    locally D-optimal design to estimate the parameter vector.

    Keywords: Conjoint Analysis, Three-level Nested MNL Model, Discrete Choice Experiment, D-Optimal Criterion.

    1. INTRODUCTION

    The multinomial logit (MNL) model is the most widely used discrete choice model, owing to its closed-form choice probabilities and its consistency with random utility maximization (RUM). However, the MNL model suffers from the restrictive independence from irrelevant alternatives (IIA) property, which states that the ratio of two choice probabilities is independent of the other alternatives in the model. This implies that a change in an attribute of one alternative has the same proportional impact on the probability of each of the other alternatives being chosen. The nested multinomial logit (NMNL) model relaxes the IIA property by dividing the alternatives into subsets, or nests, so that the IIA assumption holds within each nest but not for alternatives in different nests. The analogous property still holds between the nests, namely independence from irrelevant nests (IIN). In contrast to the more flexible multinomial probit and mixed logit models, the NMNL model has closed-form choice probabilities, so it can be estimated without resorting to simulation methods. Because it is simple and allows for a variety of substitution patterns, the NMNL model remains the most common extension of the MNL model in applied work.

    Daly and Zachary (1978) and McFadden (1978a) have shown that the two-level NMNL model is consistent with RUM under the condition that the dissimilarity parameters are constrained within the unit interval (the Daly-Zachary-McFadden, or DZM, condition). In many practical applications, however, this condition has not been met. Börsch-Supan (1990) argues that the DZM condition is unnecessarily strong, given that the NMNL model should be viewed as a local approximation. Building on the work of Börsch-Supan, Herriges and Kling (1996) derive necessary conditions for local consistency with random utility maximization for two-level NMNL models: the two-level NMNL model is consistent with RUM when the dissimilarity parameters lie in the interval [0,1), and it can remain consistent when the dissimilarity parameters are greater than one. In the latter case the model is consistent with RUM only for some range of the attribute characteristics. A two-level NMNL model is not consistent with RUM when any dissimilarity parameter is less than zero.

    In some two-level NMNL models, the IIA property may not hold within some or all of the nests. In this situation, we can divide the alternatives of these nests into several subsets, called sub-nests. This kind of NL model is termed the three-level NMNL model, since it involves three kinds of choice probabilities, which are discussed in Section 2.

    The rest of this paper is structured as follows. Section 2 discusses the model specification of three-level NMNL models. Section 3 presents the information matrix for a three-level NMNL model (with two nests). The D-optimality criterion is introduced in Subsection 3.1.

    2. MODEL SPECIFICATIONS

    Following Gil-Moltó and Hole (2004), let us consider a sample of $I$ individuals with $J$ discrete possible alternatives (in choice set $C$), which are generated by $K$ attributes, each with $L_k$ levels. In this paper, for the three-level NMNL model, the total number of alternatives is denoted by $\sum_{m=1}^{M}\sum_{h=1}^{H_m} J_{hm}$, where $J_{hm}$ is the number of alternatives in sub-nest $h$ of nest $m$. In this case, there are $S$ choice sets, each containing $J_s$ alternatives, to fit the model, where if $C_s$ is a choice set with $J_s$ alternatives then $C = \bigcup_{s=1}^{S} C_s$. In such a model, the total number of alternatives in choice set $C$ is $\prod_{k=1}^{K} L_k$ with regard to the attributes and their levels. The model is based on the selection of the alternative with the highest utility.

    The utility in the three-level NMNL model (for choice set $C_s$) that individual $i$ derives from choosing alternative $j$ is denoted by $U_{ijhms}$. This utility is partitioned into a systematic component, $v_{ijhms}$, and a random component, $\varepsilon_{ijhms}$ ($s$ denotes the choice set $C_s$). Because the conditions are the same for all individuals, we drop the index $i$ and write:

    $$U_{jhms} = U_{j|hms} + U_{h|ms} + U_{ms}, \tag{1}$$

    so that

    $$U_{j|hms} = v_{j|hms} + \varepsilon_{j|hms}, \qquad U_{h|ms} = v_{h|ms} + \varepsilon_{h|ms}, \qquad U_{ms} = v_{ms} + \varepsilon_{ms},$$

    where the $\varepsilon_{j|hms}$ have an EVD (extreme value distribution, type II) with variance $\sigma_{hm}^{2}$ (they are correlated within the same sub-nest, $\rho_{hm} = \mathrm{corr}(\varepsilon_{j|hms}, \varepsilon_{j'|hms})$); the distribution of $\varepsilon_{h|ms}$ is such that the variable $\max_{j \in C_{hms}} U_{ij|hms}$ has an EVD (type II) with variance $\sigma_{m}^{2}$ ($C_{hms}$ denotes the part of choice set $s$ that includes the alternatives in sub-nest $h$ of nest $m$, and $\rho_{m} = \mathrm{corr}(\varepsilon_{h|ms}, \varepsilon_{h'|ms})$); and the distribution of $\varepsilon_{ms}$ is such that the variable $\max_{h \in H_m} U_{ih|ms}$ has an EVD (type II) with variance $\sigma^{2}$, so that $\mathrm{corr}(\varepsilon_{ims}, \varepsilon_{im's}) = 0$, where $H_m$ denotes the number of sub-nests in nest $m$. These three error terms ($\varepsilon_{ij|hms}$, $\varepsilon_{ih|ms}$ and $\varepsilon_{ims}$) are independent. Now, with regard to utility (1), $v_{j|hms}$ can be written as a regression function:

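    $$v_{j|hms} = \mathbf{x}_{j|hms}^{T}\boldsymbol{\beta} = \sum_{k=1}^{K}\mathbf{x}_{j|hmks}^{T}\boldsymbol{\beta}_k; \qquad \boldsymbol{\beta}_k^{T} = (\beta_{k,1}, \ldots, \beta_{k,L_k}).$$

    According to effect-type coding:

    $$\sum_{\ell=1}^{L_k}\beta_{k,\ell} = 0 \;\Rightarrow\; \beta_{k,L_k} = -\sum_{\ell=1}^{L_k-1}\beta_{k,\ell}.$$

    In this situation, $\boldsymbol{\beta}_k^{T}$ is rewritten as $(\beta_{k,1}, \ldots, \beta_{k,L_k-1})$. Similarly, we have

    $$\mathbf{x}_{j|hms}^{T} = (\mathbf{x}_{j|hm1s}^{T}, \ldots, \mathbf{x}_{j|hmks}^{T}, \ldots, \mathbf{x}_{j|hmKs}^{T}); \qquad \mathbf{x}_{j|hmks}^{T} = (x_{j|hmk,1,s}, x_{j|hmk,2,s}, \ldots, x_{j|hmk,L_k-1,s}),$$

    where $\boldsymbol{\beta}_k$ and $\mathbf{x}_{j|hmks}$ describe the characteristics of attribute $k$ related to the choice of alternative $j$ by individual $i$ (index dropped) in sub-nest $h$ of nest $m$ for choice set $C_s$. Now, according to the previous assumptions we have: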

    ,',0

    ;',),( '

    mm

    mmUUCov

    m

    smmsUs where

    ',

    ;',

    ', mm

    hh

    mhhm

    hm

    m ,

    hmhms JmhmhmJhmhmhmjjhmhmUUCov JI )()1(),( 2222' ,

    hmJmmmhjjhmmhhmUUCov J)(),( 22''', ,

    where $i = 1, 2, \ldots, I$, $j = 1, 2, \ldots, J_{hms}$, $h = 1, 2, \ldots, H_m$; $m = 1, 2, \ldots, M$, $\mathbf{I}_r$ is the identity matrix of size $r$ and $\mathbf{J}_r = \mathbf{1}_r\mathbf{1}_r^{T}$.

    Now, with regard to utility (1), the following observation variables can be introduced:

    $$Y_{j|hms} = \begin{cases} 1, & U_{j|hms} = \max_{j' \in C_{hms}} U_{j'|hms}; \\ 0, & \text{otherwise}, \end{cases}$$

    $$Y_{h|ms} = \begin{cases} 1, & U_{h|ms} = \max_{h' \in H_m} U_{h'|ms}; \\ 0, & \text{otherwise}, \end{cases}$$

    $$Y_{ms} = \begin{cases} 1, & U_{ms} = \max_{m'} U_{m's}; \\ 0, & \text{otherwise}. \end{cases}$$

    Thus, when the variables $Y_{j|hms}$, $Y_{h|ms}$ and $Y_{ms}$ are independent, $Y_{jhms} = Y_{j|hms}\,Y_{h|ms}\,Y_{ms}$ and

    $$p_{jhms} = p_{j|hms}\,p_{h|ms}\,p_{ms}, \tag{2}$$

    where $p_{j|hms} = \Pr(Y_{j|hms} = 1)$ is the conditional probability of choosing alternative $j$, given that sub-nest $h$ and nest $m$ have been chosen, $p_{h|ms} = \Pr(Y_{h|ms} = 1)$ is the conditional probability of choosing sub-nest $h$ when nest $m$ is chosen, and $p_{ms} = \Pr(Y_{ms} = 1)$ is the marginal probability of choosing nest $m$ (with respect to choice set $s$); $\Pr$ denotes the probability of an event. Based on the distribution of the error terms of the utility, these probabilities can be calculated as (McFadden (1978)):

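    $$p_{j|hms} = \frac{e^{v_{j|hms}/\lambda_{hm}}}{\sum_{j'=1}^{J_{hms}} e^{v_{j'|hms}/\lambda_{hm}}}, \qquad
      p_{h|ms} = \frac{e^{(\lambda_{hm}/\mu_m)\,IV_{hms}}}{\sum_{h'=1}^{H_m} e^{(\lambda_{h'm}/\mu_m)\,IV_{h'ms}}}, \qquad
      p_{ms} = \frac{e^{\mu_m IV_{ms}}}{\sum_{m'=1}^{M} e^{\mu_{m'} IV_{m's}}},$$

    where

    $$IV_{ms} = E\Big(\max_{h \in H_m} U_{ih|ms}\Big) = \ln\sum_{h=1}^{H_m} e^{(\lambda_{hm}/\mu_m)\,IV_{hms}}, \qquad
      IV_{hms} = E\Big(\max_{j \in C_{hms}} U_{j|hms}\Big) = \ln\sum_{j=1}^{J_{hms}} e^{v_{j|hms}/\lambda_{hm}}.$$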
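    The three probabilities and the two inclusive values can be evaluated numerically. The sketch below is only an illustration: it assumes the forms $p_{j|hm} \propto e^{v_{j|hm}/\lambda_{hm}}$, $p_{h|m} \propto e^{(\lambda_{hm}/\mu_m) IV_{hm}}$ and $p_{m} \propto e^{\mu_m IV_{m}}$, and it treats a nest without sub-nests as a nest with a single sub-nest. The function name and the nested-list layout are hypothetical choices, not part of the paper.

```python
import math


def nested_logit_probabilities(v, lam, mu):
    """Joint choice probabilities p_{jhm} = p_{j|hm} * p_{h|m} * p_m.

    v[m][h]   -- list of systematic utilities v_{j|hm} in sub-nest h of nest m
    lam[m][h] -- dissimilarity parameter of sub-nest h in nest m
    mu[m]     -- dissimilarity parameter of nest m
    """
    n_nests = len(v)
    # Inclusive value of each sub-nest: IV_hm = ln sum_j exp(v_j / lam_hm)
    iv_sub = [[math.log(sum(math.exp(vj / lam[m][h]) for vj in v[m][h]))
               for h in range(len(v[m]))] for m in range(n_nests)]
    # Inclusive value of each nest: IV_m = ln sum_h exp((lam_hm / mu_m) * IV_hm)
    iv_nest = [math.log(sum(math.exp(lam[m][h] / mu[m] * iv_sub[m][h])
                            for h in range(len(v[m])))) for m in range(n_nests)]
    nest_total = sum(math.exp(mu[m] * iv_nest[m]) for m in range(n_nests))

    probs = []
    for m in range(n_nests):
        p_m = math.exp(mu[m] * iv_nest[m]) / nest_total
        nest = []
        for h in range(len(v[m])):
            p_h = math.exp(lam[m][h] / mu[m] * iv_sub[m][h] - iv_nest[m])
            nest.append([p_m * p_h * math.exp(vj / lam[m][h] - iv_sub[m][h])
                         for vj in v[m][h]])
        probs.append(nest)
    return probs


# Nest 1 with two sub-nests of two alternatives, nest 2 with one alternative.
# All utilities are zero (beta = 0); the parameter values are illustrative only.
v = [[[0.0, 0.0], [0.0, 0.0]], [[0.0]]]
lam = [[0.1, 0.2], [0.25]]
mu = [0.15, 0.25]
p = nested_logit_probabilities(v, lam, mu)
total = sum(pj for nest in p for sub in nest for pj in sub)
```

    With all utilities equal to zero, the alternatives inside one sub-nest are equally likely, which is the situation used later for locally optimal designs at $\boldsymbol{\beta} = \mathbf{0}$.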

    3. INFORMATION MATRIX

    There are several criteria, such as the D-, A- and G-criteria, for obtaining an optimal design. In this paper we use the D-optimality criterion (a function of the determinant of the information matrix) to obtain an optimal design. Thus, we must first obtain the information matrix for the three-level nested logit model. For this, the log-likelihood function is required, which for choice set $C_s$ and one individual is defined as follows:

    $$\ell(\boldsymbol{\theta}; C_s) = \sum_{m=1}^{M}\sum_{h=1}^{H_m}\sum_{j=1}^{J_{hms}} y_{jhms}\ln(p_{jhms}),$$

    where $J_{hms}$ denotes the number of alternatives in sub-nest $h$ of nest $m$ corresponding to choice set $C_s$.

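    Based on the definition of the information matrix (with respect to choice set $C_s$):

    $$-E\left(\frac{\partial^{2}\ell(\boldsymbol{\theta}; C_s)}{\partial\boldsymbol{\theta}\,\partial\boldsymbol{\theta}^{T}}\right) = -\sum_{j,h,m} p_{j|hms}\,p_{h|ms}\,p_{ms}\left(\frac{\partial^{2}\ln p_{j|hms}}{\partial\boldsymbol{\theta}\,\partial\boldsymbol{\theta}^{T}} + \frac{\partial^{2}\ln p_{h|ms}}{\partial\boldsymbol{\theta}\,\partial\boldsymbol{\theta}^{T}} + \frac{\partial^{2}\ln p_{ms}}{\partial\boldsymbol{\theta}\,\partial\boldsymbol{\theta}^{T}}\right), \tag{3}$$

    where:

    $$\mathbf{I}(\boldsymbol{\theta}; C_s) = \begin{pmatrix} \mathbf{I}_{\beta\beta} & \mathbf{I}_{\beta\mu} & \mathbf{I}_{\beta\lambda} \\ \mathbf{I}_{\beta\mu}^{T} & \mathbf{I}_{\mu\mu} & \mathbf{I}_{\mu\lambda} \\ \mathbf{I}_{\beta\lambda}^{T} & \mathbf{I}_{\mu\lambda}^{T} & \mathbf{I}_{\lambda\lambda} \end{pmatrix} \tag{4}$$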

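    is the information matrix of the choice set $C_s$ and $\boldsymbol{\theta}$ is the full parameter vector, so that $\boldsymbol{\theta}^{T} = (\boldsymbol{\beta}^{T}, \boldsymbol{\mu}^{T}, \boldsymbol{\lambda}^{T})$ with $\boldsymbol{\beta}^{T} = (\boldsymbol{\beta}_1^{T}, \boldsymbol{\beta}_2^{T}, \ldots, \boldsymbol{\beta}_K^{T})$; $\boldsymbol{\beta}_k^{T} = (\beta_{k,1}, \ldots, \beta_{k,L_k-1})$, $\boldsymbol{\mu}^{T} = (\mu_1, \mu_2, \ldots, \mu_M)$ and $\boldsymbol{\lambda}^{T} = (\boldsymbol{\lambda}_1^{T}, \boldsymbol{\lambda}_2^{T}, \ldots, \boldsymbol{\lambda}_M^{T})$; $\boldsymbol{\lambda}_m^{T} = (\lambda_{1m}, \lambda_{2m}, \ldots, \lambda_{H_m m})$ for all $m$. This means that parameter $\beta_{k,\ell}$ is related to the $\ell$th level of attribute $k$, $\mu_m$ is the dissimilarity parameter of nest $m$ and $\boldsymbol{\lambda}_m$ is the vector of dissimilarity parameters of nest $m$, where $\lambda_{hm}$ denotes the dissimilarity parameter of sub-nest $h$ in nest $m$. Thus, the number of parameters in the three-level NMNL model is:

    $$q = \underbrace{\sum_{k=1}^{K}(L_k - 1)}_{\text{part-worth}} + \underbrace{\sum_{m=1}^{M} H_m}_{\text{sub-nests}} + \underbrace{M}_{\text{nests}} = q_1 + q_2 + M,$$

    where $q_1$ is the number of part-worth parameters, $q_2$ is the number of dissimilarity parameters of the sub-nests and $M$ is the number of dissimilarity parameters of the nests. Hence, the information matrix (4) is a symmetric positive semi-definite $q \times q$ matrix.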
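    The parameter count is easy to check by hand. The short sketch below (a hypothetical helper, not from the paper) evaluates $q = q_1 + q_2 + M$ for the model used later in the illustration, under the assumption that a nest without sub-nests contributes $H_m = 0$.

```python
def count_parameters(levels, subnests):
    """Number of parameters q = q1 + q2 + M.

    levels   -- number of levels L_k of each attribute
    subnests -- number of sub-nests H_m of each nest (0 if the nest is not split)
    """
    q1 = sum(L - 1 for L in levels)   # part-worth parameters under effects coding
    q2 = sum(subnests)                # dissimilarity parameters of the sub-nests
    M = len(subnests)                 # dissimilarity parameters of the nests
    return q1 + q2 + M


# Three two-level attributes; nest 1 has two sub-nests, nest 2 has none.
q = count_parameters(levels=[2, 2, 2], subnests=[2, 0])
```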

    In order to fit the three-level NMNL model, let us consider the following experiments:

    (5) , S S; / J/ 21s1 1

    MqqJM

    m

    H

    h

    hm

    m

    where $\sum_{m=1}^{M}\sum_{h=1}^{H_m} J_{hm}$ denotes the total number of alternatives in the population and $J_s = \sum_{m=1}^{M}\sum_{h=1}^{H_m} J_{hms}$ denotes the number of alternatives in choice set $s$, selected randomly from the population. In particular, suppose that $J_s = J$ for all $s$; in this case there are $S$ choice sets, each with $J$ alternatives. However, based on (5), a reduced number of choice sets can be considered instead of $S$ (in practice the number of choice sets $S$ increases dramatically when the number of attributes and their levels increases, so $S$ must often be reduced by employing a suitable technique; see Graßhoff et al. (2004)). Also, the number of alternatives selected from each sub-nest may vary. Thus, different classes can be used to obtain a sample of size $J$ from the population:

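    $$S_n = \binom{J_{11}}{J_{n11}} \cdots \binom{J_{H_1 1}}{J_{nH_1 1}} \cdots \binom{J_{hm}}{J_{nhm}} \cdots \binom{J_{1M}}{J_{n1M}} \cdots \binom{J_{H_M M}}{J_{nH_M M}}, \tag{6}$$

    where $\sum_{m=1}^{M}\sum_{h=1}^{H_m} J_{nhm} = J$ for each class $n = 1, 2, \ldots, N$, and $S_n$ is the number of choice sets in class $n$, each including $J$ alternatives.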
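    The classes and their sizes $S_n$ can be enumerated directly. The following sketch assumes that a class is any allocation $(J_{nhm})$ with $0 \le J_{nhm} \le J_{hm}$ that sums to $J$ (whether the paper allows an empty sub-nest is not stated, so the lower bound is an assumption); the function names are hypothetical.

```python
from itertools import product
from math import comb, prod


def classes(population, sample_size):
    """All allocations (J_n,hm) over the sub-nests that sum to the sample size J."""
    ranges = [range(size + 1) for size in population]
    return [alloc for alloc in product(*ranges) if sum(alloc) == sample_size]


def choice_sets_in_class(population, allocation):
    """S_n as a product of binomial coefficients, one factor per sub-nest."""
    return prod(comb(size, k) for size, k in zip(population, allocation))


# Population with sub-nest sizes (J_11, J_21, J_2) = (2, 2, 2) and J = 5.
population = (2, 2, 2)
found = classes(population, 5)
sizes = [choice_sets_in_class(population, alloc) for alloc in found]
```

    For this population the enumeration gives three classes with two choice sets each, six choice sets in total, which equals the number of ways of choosing five alternatives out of six.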

    Based on class n to create an experiment, sJ can be rewrite as ns

    M

    m

    H

    h

    m

    JJ1 1

    nhms

    , where

    Nn ,sJ;Jns nS and nS s's ; JJ nhms'nhms but nhmsJ and hms'n'J (for different class and

    different choice set) may be equal or not equal. According to reduce the total number of choice sets (S ) to a

    reasonable number ( S ), we reduce nS to nS in each class, where qSn (avoiding singular information matrix)

    then consider NnqJM

    m

    H

    h

    hm

    m

    ; S ;S / J/ nn1 1

    instead of (5). This involves choosing nS choice sets each of

    them with J alternatives in each class. According to the type of model, it is possible that Nnq ;nS . In

    such a case and in order to avoid a singularity information matrix, we can combine them ( nS ) together to create a new design. To obtain information matrix related to each choice set this is used (4).

    Table 1: Three-level NMNL model with two nests

    First Nest (1)                      Second Nest (2)
    Sub-nest (1)      Sub-nest (2)
    $J_{11}$          $J_{21}$          $J_2$

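    Lemma 1: The information matrix of a three-level nested logit model (choice set $C_s$) with two nests, where the first nest has two sub-nests with $J_{11s}$ and $J_{21s}$ alternatives and the second nest has $J_{2s}$ alternatives (Table 1 denotes a population with $J_{11} + J_{21} + J_2$ alternatives), is calculated as follows:

    $$\mathbf{I}(\boldsymbol{\theta}; C_s) = \begin{pmatrix}
      \mathbf{I}_{11s} & \mathbf{I}_{12s} & \mathbf{I}_{13s} & \mathbf{I}_{14s} & \mathbf{I}_{15s} \\
      \mathbf{I}_{12s}^{T} & I_{22s} & I_{23s} & I_{24s} & I_{25s} \\
      \mathbf{I}_{13s}^{T} & I_{23s} & I_{33s} & I_{34s} & I_{35s} \\
      \mathbf{I}_{14s}^{T} & I_{24s} & I_{34s} & I_{44s} & I_{45s} \\
      \mathbf{I}_{15s}^{T} & I_{25s} & I_{35s} & I_{45s} & I_{55s}
    \end{pmatrix},$$

    where $\boldsymbol{\theta}^{T} = (\boldsymbol{\beta}^{T}, \mu_1, \mu_2, \lambda_{11}, \lambda_{21})$. For simplicity, we write $\beta_{1,1} = \beta_1, \ldots, \beta_{K,L_K-1} = \beta_{q_1}$ and $\lambda_{11} = \lambda_1$, $\lambda_{21} = \lambda_2$, so that $\boldsymbol{\theta}^{T} = (\beta_1, \ldots, \beta_{q_1}, \mu_1, \mu_2, \lambda_1, \lambda_2)$ (see Appendix AI).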

    To fit this model (Table 1), examine the following experiments:

    nS nn SqSJJJ 122111 ; J// ,

    where $J = J_{n11} + J_{n21} + J_{n2}$ and:

    $$S_n = \binom{J_{11}}{J_{n11}}\binom{J_{21}}{J_{n21}}\binom{J_{2}}{J_{n2}}; \qquad n = 1, 2, \ldots, N.$$

    Corollary 1: To obtain a locally optimal design at $\boldsymbol{\beta} = \mathbf{0}$, the information matrix of Lemma 1 can be rewritten by means of the following table (see Appendix AI):

    s

    T

    ss 1|1J1|1

    11s

    1|1 11sJ

    1XIXB

    11sJ1|1

    11s

    1|1J

    11XA

    T

    ss

    s

    T

    ss 1|2J1|2

    21s

    1|2 21sJ

    1XIXB

    21sJ1|2

    21s

    1|2J

    11XA

    T

    ss

    s

    T

    ss 2J2

    2s

    2 2sJ

    1XIXB

    2sJ2

    2s

    2J

    11XA

    T

    ss

    11s1|1 Jlnsa , 21s1|2 Jlnsa , 2s2 Jlnsa 11s

    11|J

    1sjp

    ,

    21s

    21|J

    1sjp

    ,

    2s

    2|J

    1sjp

    1

    2

    1

    1

    21s11s1|21 JJln

    sa ssT s 1|J1|11| 1,..., xxX

    2,1;,..., 1|1|11| 1 sjqsjT

    sj xxx ssTs s 2|J2|12 2,...,xxX ; sjqsjT

    sj xx 2|2|12| 1,...,x

    ss

    h

    h

    s ppph

    h

    12

    2s

    2

    1

    h1s

    2

    1

    h1s

    1 1;

    JJ

    J

    2

    1

    1

    1

    1

    ss

    h

    s ppph

    1|11|22

    1

    h1s

    11s1|1 1;

    J

    J

    1

    1

    1

    ,

    where $\mathbf{I}_r$ denotes the $r \times r$ identity matrix and $\mathbf{1}_r$ is an $r$-dimensional vector whose elements are all equal to one.

    3.1 D-Optimal Criterion

    Taking into account (5) and (6), consider the following designs to fit the model introduced in Table 1:

    $$\xi_n = \begin{Bmatrix} C_{n1} & C_{n2} & \cdots & C_{nS_n} \\ w_{n1} & w_{n2} & \cdots & w_{nS_n} \end{Bmatrix}; \qquad S_n \ge q = q_1 + 4,\ n \le N, \tag{7}$$

    where $C_{ns}$; $s = 1, 2, \ldots, S_n$ denotes a choice set in the $n$th class, which includes $J$ alternatives. As the number of attributes ($K$) and their levels ($L_k$; $k = 1, 2, \ldots, K$) increase (design (7)), the number of possible choice sets ($S_n$) increases dramatically. In this situation, techniques are needed to reduce the number of support points, or sample size, $S_n$; $n \le N$, so that a reasonable number of choice sets is obtained (see Graßhoff et al. (2004)). As noted in the literature, the D-optimality criterion in linear models typically leads to an optimal number of support points equal to the number of unknown parameters, and the design takes an equal number of observations at each point (Silvey, 1980, p. 42). The bound also applies to most local optimality criteria and Bayesian criteria for linear models (Chernoff (1972)). In contrast, for non-linear models no such bound on the number of support points is available. We therefore impose the condition $S_n \ge q$; $n = 1, 2, \ldots, N$ (on the reduced number of choice sets) to obtain the D-optimality criterion for design (7). The information matrix of design (7) is calculated as follows:

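    $$\mathbf{I}(\boldsymbol{\theta}; \xi_n) = \sum_{s=1}^{S_n} w_{ns}\,\mathbf{I}(\boldsymbol{\theta}; C_{ns}), \tag{8}$$

    where $\xi_n$ denotes design (7), $w_{ns}$ is the weight (frequency) of the choice set $C_{ns}$, $\mathbf{I}(\boldsymbol{\theta}; C_{ns})$ is the information matrix of choice set $s$ in the $n$th class, which is calculated by Lemma 1, and the local D-optimality criterion at $\boldsymbol{\theta}_0$ is $\det \mathbf{I}(\boldsymbol{\theta}_0; \xi_n)^{-1}$ ($\boldsymbol{\theta}_0$ is the true value of the full parameter vector).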

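    The $D_b$-optimality criterion with respect to a prior distribution $\pi(\boldsymbol{\theta})$ on the parameters can be defined as follows (Atkinson et al. (2007)):

    $$D_b(\xi_n) = E_{\boldsymbol{\theta}}\!\left[\det \mathbf{I}(\boldsymbol{\theta}; \xi_n)^{-1/q}\right] = \int\!\!\int\!\!\int \det \mathbf{I}(\boldsymbol{\theta}; \xi_n)^{-1/q}\,\pi(\boldsymbol{\theta})\,d\boldsymbol{\beta}\,d\boldsymbol{\mu}\,d\boldsymbol{\lambda}, \tag{9}$$

    where the integrals are taken over the spaces of $\boldsymbol{\beta}$, $\boldsymbol{\mu}$ and $\boldsymbol{\lambda}$, respectively, and $\pi(\boldsymbol{\theta}) = \pi(\boldsymbol{\beta}, \boldsymbol{\mu}, \boldsymbol{\lambda})$. Specifically, suppose that $\boldsymbol{\beta}$ is independent of $\boldsymbol{\lambda}$ and $\boldsymbol{\mu}$, so that $\pi(\boldsymbol{\beta}, \boldsymbol{\mu}, \boldsymbol{\lambda}) = \pi(\boldsymbol{\beta})\,\pi(\boldsymbol{\mu}, \boldsymbol{\lambda})$. To keep matters simple, $\boldsymbol{\lambda}$ and $\boldsymbol{\mu}$ may also be taken as independent, which means that $\pi(\boldsymbol{\beta}, \boldsymbol{\mu}, \boldsymbol{\lambda}) = \pi(\boldsymbol{\beta})\,\pi(\boldsymbol{\mu})\,\pi(\boldsymbol{\lambda})$; for example, a uniform distribution for $\boldsymbol{\lambda}$ and $\boldsymbol{\mu}$ and a $q_1$-variate normal distribution with mean $\boldsymbol{\beta}_0$ for $\boldsymbol{\beta}$. Since there is usually no analytical expression for the quantity (9), it may be approximated by a Monte Carlo technique that takes a large number, $R$, of independent draws, $\boldsymbol{\theta}_r$, of $\boldsymbol{\theta}$ from the prior distribution $\pi(\boldsymbol{\theta})$ and averages the local criterion $\det \mathbf{I}(\boldsymbol{\theta}; \xi_n)^{-1/q}$ over all draws. Thus, the weighted D-criterion is approximated by:

    $$D_b(\xi_n) \approx \frac{1}{R}\sum_{r=1}^{R} \det \mathbf{I}(\boldsymbol{\theta}_r; \xi_n)^{-1/q} \tag{10}$$

    if

    $$\Pr\!\left(\lim_{R \to \infty} \frac{1}{R}\sum_{r=1}^{R} \det \mathbf{I}(\boldsymbol{\theta}_r; \xi_n)^{-1/q} = E_{\boldsymbol{\theta}}\!\left[\det \mathbf{I}(\boldsymbol{\theta}; \xi_n)^{-1/q}\right]\right) = 1.$$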
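    The weighted information matrix (8) and the Monte Carlo average (10) can be sketched as follows. This is a minimal illustration, not the author's implementation: `info_fn` is a hypothetical callback that returns the per-choice-set information matrices at a given parameter draw, and `toy_info` is a made-up stand-in for the matrices of Lemma 1.

```python
import random


def determinant(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    n, det = len(a), 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-14:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return det


def design_information(weights, infos):
    """Weighted sum of per-choice-set information matrices, as in (8)."""
    q = len(infos[0])
    return [[sum(w * info[i][j] for w, info in zip(weights, infos))
             for j in range(q)] for i in range(q)]


def approx_db(weights, info_fn, draws):
    """Average of det(I)^(-1/q) over the parameter draws, as in (10)."""
    q = len(draws[0])
    values = [determinant(design_information(weights, info_fn(theta))) ** (-1.0 / q)
              for theta in draws]
    return sum(values) / len(values)


def toy_info(theta):
    # Made-up 2x2 information matrices for two choice sets (illustration only).
    return [[[1.0 + theta[0] ** 2, 0.0], [0.0, 1.0]],
            [[3.0, 0.0], [0.0, 3.0 + theta[1] ** 2]]]


rng = random.Random(0)
# Assumed prior: standard normal for the first parameter, uniform(0, 1) for the second.
draws = [[rng.gauss(0.0, 1.0), rng.uniform(0.0, 1.0)] for _ in range(1000)]
estimate = approx_db([0.5, 0.5], toy_info, draws)
```

    Minimizing `approx_db` over the weights (subject to the weights being non-negative and summing to one) then gives an approximate $D_b$-optimal design.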

    The design $\xi_n^{*}$ that minimizes the approximate $D_b$-criterion (10) is called the $D_b$-optimal design and is approximated by the solution:

    $$\xi_n^{*} = \arg\min_{\xi_n \in \Xi_n} D_b(\xi_n), \tag{11}$$

    so that

    $$\xi_n^{*} = \begin{Bmatrix} C_{n1} & C_{n2} & \cdots & C_{nS_n} \\ w_{n1}^{*} & w_{n2}^{*} & \cdots & w_{nS_n}^{*} \end{Bmatrix}; \qquad n \le N$$

    is the D-optimal design in $\Xi_n = \{\xi_n : \sum_{s=1}^{S_n} w_{ns} = 1,\ 0 \le w_{ns} \le 1;\ s = 1, 2, \ldots, S_n\}$. Finally, according to (10) and (11), $\xi_{n'}^{*}$ is the most suitable design for estimating the parameters if there exists $n' \le N$ such that $\xi_{n'}^{*} = \arg\min_{\xi_n^{*}} D_b(\xi_n^{*})$.

    As explained previously, $\det \mathbf{I}(\boldsymbol{\theta}_0; \xi)^{-1}$ is the local D-optimality criterion, where $\boldsymbol{\theta}_0$ is the true value of the parameters. Thus, we say that $\xi^{*}$ is a locally D-optimal design in $\Xi$ if $\xi^{*} = \arg\min_{\xi \in \Xi} \det \mathbf{I}(\boldsymbol{\theta}_0; \xi)^{-1}$, and this criterion is used here to obtain locally D-optimal designs.

    Table 2: Three-level NMNL model with six alternatives

    First Nest (1)                          Second Nest (2)
    Sub-nest 1(1)       Sub-nest 2(1)
    $a_1, a_2$          $a_3, a_4$          $a_5, a_6$

    Illustration:

    Consider a population with three attributes, each with two levels. In this situation, consider a three-level NMNL model that includes six possible alternatives in two nests (Table 2; $\sum_{m=1}^{M}\sum_{h=1}^{H_m} J_{hm} = 2 + 2 + 2$), where (1,1,1), (1,1,-1), (1,-1,1), (1,-1,-1), (-1,1,1), (-1,1,-1) characterize alternatives $a_1, a_2, a_3, a_4, a_5$ and $a_6$. To fit this model (Table 2), consider the experiment $2+2+2\,/\,5\,/\,S$, based on Equation (5):

    $$S = \underbrace{\binom{2}{2}\binom{2}{2}\binom{2}{1}}_{S_1} + \underbrace{\binom{2}{2}\binom{2}{1}\binom{2}{2}}_{S_2} + \underbrace{\binom{2}{1}\binom{2}{2}\binom{2}{2}}_{S_3} = 6.$$

    In this case, three classes ($N = 3$) were found to define the design and, because $S_n < 7$ for every class $n \le N$, we can combine them in order to define a suitable design. Thus, there are six choice sets (Table 3) with their design matrices, as shown in Table 4. In this situation, for clarity, write $\beta_{1,1} = \beta_1$, $\beta_{2,1} = \beta_2$, $\beta_{3,1} = \beta_3$, $\lambda_{11} = \lambda_1$ and $\lambda_{21} = \lambda_2$; then $\boldsymbol{\theta}^{T} = (\beta_1, \ldots, \beta_3, \mu_1, \mu_2, \lambda_1, \lambda_2)$ is the full parameter vector. In this case, keeping to the RUM conditions (Gil-Moltó and Hole (2004)), we encounter the following two conditions:

    S shppmp msmshmm

    hm

    ms

    m , 2,1;11

    )2 and 2,1;1

    1 )1

    .

    Table 3: Three-level NMNL model with six choice sets

    Choice Set    First Nest (1)                        Second Nest (2)
                  Sub-nest (1)      Sub-nest (2)
    $C_1$         $a_1, a_2$        $a_3, a_4$          $a_5$
    $C_2$         $a_1, a_2$        $a_3, a_4$          $a_6$
    $C_3$         $a_1, a_2$        $a_3$               $a_5, a_6$
    $C_4$         $a_1, a_2$        $a_4$               $a_5, a_6$
    $C_5$         $a_1$             $a_3, a_4$          $a_5, a_6$
    $C_6$         $a_2$             $a_3, a_4$          $a_5, a_6$

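    Table 4: The design matrix of three-level NMNL model with six choice sets (one row vector per alternative)

    Choice Set    First Nest (1)                                    Second Nest (2)
                  Sub-nest (1)            Sub-nest (2)
    $C_1$         (1,1,1), (1,1,-1)       (1,-1,1), (1,-1,-1)       (-1,1,1)
    $C_2$         (1,1,1), (1,1,-1)       (1,-1,1), (1,-1,-1)       (-1,1,-1)
    $C_3$         (1,1,1), (1,1,-1)       (1,-1,1)                  (-1,1,1), (-1,1,-1)
    $C_4$         (1,1,1), (1,1,-1)       (1,-1,-1)                 (-1,1,1), (-1,1,-1)
    $C_5$         (1,1,1)                 (1,-1,1), (1,-1,-1)       (-1,1,1), (-1,1,-1)
    $C_6$         (1,1,-1)                (1,-1,1), (1,-1,-1)       (-1,1,1), (-1,1,-1)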
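    The design matrices in Table 4 follow mechanically from the coding of the alternatives $a_1, \ldots, a_6$ given in the illustration and the composition of the choice sets in Table 3. The sketch below rebuilds them; the dictionary layout and the function name are hypothetical.

```python
# Effects-coded attribute levels of the six alternatives (from the illustration).
ALTERNATIVES = {
    "a1": (1, 1, 1), "a2": (1, 1, -1), "a3": (1, -1, 1),
    "a4": (1, -1, -1), "a5": (-1, 1, 1), "a6": (-1, 1, -1),
}

# Choice sets of Table 3: (sub-nest 1 of nest 1, sub-nest 2 of nest 1, nest 2).
CHOICE_SETS = {
    "C1": (("a1", "a2"), ("a3", "a4"), ("a5",)),
    "C2": (("a1", "a2"), ("a3", "a4"), ("a6",)),
    "C3": (("a1", "a2"), ("a3",), ("a5", "a6")),
    "C4": (("a1", "a2"), ("a4",), ("a5", "a6")),
    "C5": (("a1",), ("a3", "a4"), ("a5", "a6")),
    "C6": (("a2",), ("a3", "a4"), ("a5", "a6")),
}


def design_matrix(name):
    """Design matrix of a choice set, split into its three groups of rows."""
    return [[list(ALTERNATIVES[a]) for a in group] for group in CHOICE_SETS[name]]


matrices = {name: design_matrix(name) for name in CHOICE_SETS}
```

    Each choice set contributes five rows in total, matching the sample size $J = 5$ of the experiment.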

    To estimate the parameters of the model described in Table 2, and based on the experiment $2+2+2\,/\,5\,/\,6$ and Equation (7), consider the following design:

    $$\xi = \begin{Bmatrix} C_1 & C_2 & C_3 & C_4 & C_5 & C_6 \\ w_1 & w_2 & w_3 & w_4 & w_5 & w_6 \end{Bmatrix}. \tag{12}$$

    The information matrix of design (12) is calculated by $\mathbf{I}(\boldsymbol{\theta}; \xi) = \sum_{s=1}^{6} w_s\,\mathbf{I}(\boldsymbol{\theta}; C_s)$. Specifically, let $\boldsymbol{\beta} = \mathbf{0}$. Now, according to Lemma 1 and Corollary 1, the elements of the information matrix $\mathbf{I}(\boldsymbol{\theta}; C_s)$ can be calculated.

    Permuting the levels of the third attribute in the second nest interchanges the choice sets $C_1$ and $C_2$. Likewise, permuting the levels of the third attribute in the second sub-nest of the first nest interchanges $C_3$ and $C_4$, and permuting the levels of the third attribute in the first sub-nest of the first nest interchanges $C_5$ and $C_6$. Thus, we can define a new design to fit the model introduced in Table 2, according to Table 3, as follows:

    $$\begin{Bmatrix} C_2 & C_1 & C_4 & C_3 & C_6 & C_5 \\ w_1 & w_2 & w_3 & w_4 & w_5 & w_6 \end{Bmatrix}. \tag{13}$$

    In this situation, in order for the two designs (12) and (13) to be equal, the following design can be considered:

    $$\begin{Bmatrix} C_1 & C_2 & C_3 & C_4 & C_5 & C_6 \\ \omega_1 & \omega_1 & \omega_2 & \omega_2 & \omega_3 & \omega_3 \end{Bmatrix}, \tag{14}$$

    where $\omega_1 + \omega_2 + \omega_3 = \tfrac{1}{2}$.

    Now, suppose that $\lambda_1 = \lambda_2 = \lambda$. Moreover, we know that $\mu_m \le 1$ and, according to Table 2, it is to be expected that $\mu_1 \le \mu_2$. Then we can assume that $\mu_1 = 2\lambda$ and $\mu_2 = 4\lambda$, so that $\det \mathbf{I}(\boldsymbol{\theta}; \xi)$ becomes a function of $\lambda$, $\omega_1$ and $\omega_2$ only, where $\omega_3 = \tfrac{1}{2} - \omega_1 - \omega_2$. In this situation, the RUM conditions are upheld when $0 < \lambda \le 0.25$. Under this condition on $\lambda$, some locally optimal designs have been calculated in Table 5.

    Table 5: $\mu_1 = 2\lambda$ and $\mu_2 = 4\lambda$, locally optimal design when $0 < \lambda \le 0.25$.

    $\lambda$        0.01      0.05      0.10      0.15      0.17      0.20      0.25
    $\omega_1^*$     0.3092    0.3120    0.3150    0.3180    0.3195    0.3210    0.3270
    $\omega_2^*$     0.0954    0.0940    0.0925    0.0910    0.0905    0.0899    0.0895
    $\omega_3^*$     0.0954    0.0940    0.0925    0.0910    0.0900    0.0891    0.0877
    $D$              0.00368   0.08985   0.35081   0.77476   0.98940   1.36021   2.11334

    Table 5 shows that $\omega_1^*$ increases as $\lambda$ increases, whereas $\omega_2^*$ and $\omega_3^*$ decrease as $\lambda$ increases, because the combinations of alternatives (and attributes) in the two choice sets $C_1$ and $C_2$ are less similar than in the other choice sets. According to Table 4, the two sub-nests of the first nest are identical in the choice sets $C_1$ and $C_2$, but the second nest contains two different alternatives. In this situation, because $\lambda_1$ and $\lambda_2$ are equal, $\omega_1^*$ increases as $\lambda$ increases. In choice sets $C_3$ and $C_4$, there are two different alternatives in the second sub-nest of the first nest. The situation is similar for choice sets $C_5$ and $C_6$, where there are two different alternatives in the first sub-nest of the first nest (the second nest does not change across choice sets $C_3$ to $C_6$). Given the combination of the alternatives in the four choice sets $C_3$ to $C_6$, a similar result is obtained for $\omega_2^*$ and $\omega_3^*$: these two weights are almost equal and decrease as $\lambda$ increases ($0 < \lambda \le 0.15$). However, $\omega_3^*$ decreases faster than $\omega_2^*$ when $\lambda \ge 0.17$, so the combinations of attributes and levels in the two choice sets $C_5$ and $C_6$ are more similar than those in the choice sets $C_3$ and $C_4$ (Table 4).

    Now, suppose that $\lambda_1 = 0.1$, $\mu_1 = 0.15$ and $\mu_2 = 0.25$; then the RUM conditions hold if $0 < \lambda_2 \le 0.15$. Table 6 lists several locally D-optimal designs based on Table 4. In this situation, $\omega_2^*$ increases as $\lambda_2$ increases but $\omega_3^*$ decreases (Table 6). That means the alternatives in the second sub-nest (first nest) of choice sets $C_5$ and $C_6$ are very similar, but the alternatives in choice sets $C_3$ and $C_4$ (second sub-nest) are much more dissimilar (Table 6).

    Table 6: $\mu_1 = 0.15$, $\mu_2 = 0.25$ and $\lambda_1 = 0.1$, locally optimal design when $0 < \lambda_2 \le 0.15$.

    $\lambda_2$      0.01      0.05      0.06      0.08      0.10      0.12      0.15
    $\omega_1^*$     0.3310    0.3338    0.3245    0.3133    0.3095    0.3103    0.3165
    $\omega_2^*$     0.0000    0.0001    0.0285    0.0657    0.0953    0.1183    0.1458
    $\omega_3^*$     0.1690    0.1661    0.1497    0.1210    0.0952    0.0713    0.0377
    $D$              0.05879   0.15900   0.17522   0.19926   0.21502   0.22534   0.23478

    Table 7 contains some locally D-optimal designs calculated for $\lambda_1 = \lambda_2 = \mu_1 = \mu_2 = \lambda$. In this case, the RUM conditions hold if $0 < \lambda \le 1$. Table 7 shows that $\omega_1^*$ increases as $\lambda$ increases, but $\omega_2^*$ and $\omega_3^*$ decrease. Comparing the decreasing trends of $\omega_2^*$ and $\omega_3^*$, we observe that $\omega_3^*$ decreases faster than $\omega_2^*$, because the alternatives in the choice sets $C_5$ and $C_6$ are more similar than those in $C_3$ and $C_4$.

    Table 7: $\lambda_1 = \lambda_2 = \mu_1 = \mu_2 = \lambda$, locally optimal design when $0 < \lambda \le 1$.

    $\lambda$        0.05      0.10      0.15      0.20      0.30      0.40      0.50
    $\omega_1^*$     0.2889    0.2908    0.2926    0.2943    0.2979    0.3011    0.3040
    $\omega_2^*$     0.1055    0.1046    0.1037    0.1029    0.1012    0.0998    0.0988
    $\omega_3^*$     0.1055    0.1046    0.1037    0.1028    0.1009    0.0991    0.0971
    $D_b$            0.02689   0.10466   0.22959   0.39868   0.86051   1.47810   2.24747

    With respect to fixed values for 1.01 and 08.021 (Table 8), *

    2 and *

    3 are equal and they decrease

    as 2 increases, but *

    1 increases. Then, the alternatives in the second nest (choice sets 3C to 6C ) are more similar

    than the alternatives in the second nest of the choice sets 1C and 2C .

    Table 8: $\lambda_1 = 0.1$ and $\sigma_1 = \sigma_2 = 0.08$; locally optimal design when $0 < \lambda_2 \le 1$.

    $\lambda_2$   0.10      0.15      0.20      0.25      0.30      0.40      0.50
    $\xi_1^*$     0.2964    0.3038    0.3074    0.3095    0.3110    0.3130    0.3150
    $\xi_2^*$     0.1018    0.0981    0.0963    0.0952    0.0945    0.0933    0.0924
    $\xi_3^*$     0.1018    0.0981    0.0963    0.0952    0.0945    0.0933    0.0924
    $D$           0.09941   0.10191   0.10319   0.10409   0.10484   0.10619   0.10750


    Suppose that $\lambda_2 = 0.5$ and $\sigma_1 = 0.1$, $\sigma_2 = 0.2$. In this situation, the RUM conditions hold if $0.2 \le \lambda_1 \le 0.5$. Table 9 shows that $\xi_1^*$ increases almost throughout, at a decreasing rate, as $\lambda_1$ increases. The third row of Table 9 shows that $\xi_2^*$ decreases very slowly, while $\xi_3^*$ stays equal to zero as $\lambda_1$ increases. This means that the alternatives in choice sets $C_5$ and $C_6$ are much more similar than the others. For example, if $\lambda_2 = 0.5$, $\sigma_1 = 0.1$, $\sigma_2 = 0.2$ and $\lambda_1 = 0.45$, then

    $$\xi^* = \begin{pmatrix} C_1 & C_2 & C_3 & C_4 & C_5 & C_6 \\ 0.3364 & 0.3364 & 0.1636 & 0.1636 & 0.0000 & 0.0000 \end{pmatrix}$$

    is a locally D-optimal design among the designs with weights $\xi_n$, $n = 1, 2, 3$.

    According to the results obtained for the different classes in Table 5 to Table 9, the alternatives in the two choice sets $C_1$ and $C_2$ are less dissimilar than the others, and the alternatives in choice sets $C_5$ and $C_6$ are more similar than the others.

    Table 9: $\lambda_2 = 0.5$ and $\sigma_1 = 0.1$, $\sigma_2 = 0.2$; locally optimal design when $0.2 \le \lambda_1 \le 0.5$.

    $\lambda_1$   0.20      0.25      0.30      0.35      0.40      0.45      0.50
    $\xi_1^*$     0.3333    0.3347    0.3355    0.3360    0.3363    0.3364    0.3364
    $\xi_2^*$     0.1667    0.1653    0.1645    0.1640    0.1637    0.1636    0.1636
    $\xi_3^*$     0.0000    0.0000    0.0000    0.0000    0.0000    0.0000    0.0000
    $D$           0.39644   0.56598   0.75789   0.97082   1.20394   1.45688   1.72959
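As a quick consistency check on Table 9, the sketch below assumes (reading from the design displayed above, where each optimal weight is attached to two choice sets) that the three tabulated weights cover the six choice sets $C_1$ to $C_6$ in pairs, so that twice their sum should equal one. The dictionary layout and function name are illustrative only.

```python
# Columns of Table 9: key = varied parameter value, value = the three optimal weights.
table9 = {
    0.20: (0.3333, 0.1667, 0.0000),
    0.25: (0.3347, 0.1653, 0.0000),
    0.30: (0.3355, 0.1645, 0.0000),
    0.35: (0.3360, 0.1640, 0.0000),
    0.40: (0.3363, 0.1637, 0.0000),
    0.45: (0.3364, 0.1636, 0.0000),
    0.50: (0.3364, 0.1636, 0.0000),
}

def total_weight(weights):
    """Total design weight when each tabulated weight is used for two choice sets."""
    return 2.0 * sum(weights)

for value, weights in table9.items():
    print(f"{value:.2f}: total weight = {total_weight(weights):.4f}")
```

Under this reading every column of Table 9 is a proper design, with total weight equal to one up to the rounding of the tabulated values.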

    Note: The locally D-optimal designs and the corresponding values of $D$, obtained from the determinant of the information matrix $I$ at $\beta = 0$ (Table 5 to Table 9), were computed with Maple, using the initial values $\xi_1 = 0.1$, $\xi_2 = \xi_3 = 0.2$. The Sequential Quadratic Programming (SQP) method was used, with an iteration limit of 1000.
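The kind of numerical search described in the note can be illustrated on a much smaller problem. The Python sketch below is not the Maple/SQP computation used for the tables: it assumes a plain MNL model at $\beta = 0$ (not the three-level NMNL model), four hypothetical paired choice sets with two attributes coded -1/+1, and it replaces SQP by the classical multiplicative algorithm for D-optimal weights. All function names are illustrative.

```python
# Hypothetical paired choice sets; each alternative has two attributes coded -1/+1.
CHOICE_SETS = [
    ((1, 1), (-1, -1)),
    ((1, -1), (-1, 1)),
    ((1, 1), (1, -1)),
    ((1, 1), (-1, 1)),
]

def mnl_information(choice_set):
    """X^T (diag(p) - p p^T) X for a plain MNL model at beta = 0, where p_j = 1/J."""
    J, q = len(choice_set), len(choice_set[0])
    mean = [sum(x[k] for x in choice_set) / J for k in range(q)]
    return [[sum((x[k] - mean[k]) * (x[l] - mean[l]) for x in choice_set) / J
             for l in range(q)] for k in range(q)]

def weighted_sum(weights, mats):
    """Information matrix of the design: sum_s w_s * M_s (2 x 2 case)."""
    return [[sum(w * M[k][l] for w, M in zip(weights, mats)) for l in range(2)]
            for k in range(2)]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def inv2(M):
    d = det2(M)
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]

def trace_prod(A, B):
    """trace(A B) for 2 x 2 matrices."""
    return sum(A[i][k] * B[k][i] for i in range(2) for k in range(2))

def d_optimal_weights(mats, iterations=1000):
    """Multiplicative algorithm: w_s <- w_s * trace(M(w)^-1 M_s) / q, with q = 2."""
    w = [1.0 / len(mats)] * len(mats)  # start from the uniform design
    for _ in range(iterations):
        M_inv = inv2(weighted_sum(w, mats))
        w = [w_s * trace_prod(M_inv, M_s) / 2.0 for w_s, M_s in zip(w, mats)]
    return w

mats = [mnl_information(cs) for cs in CHOICE_SETS]
w_opt = d_optimal_weights(mats)
det_opt = det2(weighted_sum(w_opt, mats))
print([round(w, 4) for w in w_opt], round(det_opt, 4))
```

In this toy problem the weight concentrates on the two choice sets whose alternatives differ in both attributes, which mirrors the qualitative finding of the tables that choice sets with very similar alternatives receive little or no weight. The iteration count of 1000 only echoes the iteration limit mentioned in the note.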

    4. CONCLUSION

    In the two-level NMNL model, the alternatives are divided into several nests. Since the IIA property is required to hold within each nest, it may be necessary to divide the alternatives of some or all of the nests into several sub-nests.

    In this paper, the D-optimality criterion, which is a function of the determinant of the information matrix, was used to design experiments for fitting the three-level NMNL model. We calculated the information matrix of a three-level NMNL model for the local D-optimality criterion when $\beta = 0$. Based on an example, we discussed different classes of parameter values. We observed that the optimal weights of the choice sets show an increasing or a decreasing trend, and that this trend depends on the similarity between the alternatives in the choice sets. For example, the alternatives are not very similar when the optimal weights show an increasing trend (as the dissimilarity parameter increases), and vice versa.


    APPENDIX

    Appendix AI:

    The elements of the information matrix related to Lemma 1 are as follows

    [Equations: the fifteen blocks $I_{11}$, $I_{12}$, $I_{13}$, $I_{14}$, $I_{15}$, $I_{22}$, $I_{23}$, $I_{24}$, $I_{25}$, $I_{33}$, $I_{34}$, $I_{35}$, $I_{44}$, $I_{45}$ and $I_{55}$ of the symmetric information matrix, expressed in terms of the choice probabilities $p$, the inclusive values $a$, the parameter vector $\beta$ and the matrices $A$ and $B$ defined below.]

    where

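    $B_{1|1,s} = X_{1|1,s}^T P_{\cdot|1|1,s} X_{1|1,s}$, $B_{2|1,s} = X_{2|1,s}^T P_{\cdot|2|1,s} X_{2|1,s}$, $A_{1|1,s} = X_{1|1,s}^T p_{\cdot|1|1,s}$, $A_{2|1,s} = X_{2|1,s}^T p_{\cdot|2|1,s}$,

    $B_{2s} = X_{2s}^T P_{\cdot|2,s} X_{2s}$, $A_{2s} = X_{2s}^T p_{\cdot|2,s}$,

    and, for the sub-nests $m = 1, 2$ of the first nest,

    $X_{m|1,s}^T = (x_{1|m|1,s}, \dots, x_{J_{m|1}|m|1,s})$, $x_{j|m|1,s}^T = (x_{j|m|1,s,1}, \dots, x_{j|m|1,s,q})$,

    $P_{\cdot|m|1,s} = \mathrm{diag}(p_{1|m|1,s}, \dots, p_{J_{m|1}|m|1,s})$, $p_{\cdot|m|1,s} = (p_{1|m|1,s}, \dots, p_{J_{m|1}|m|1,s})^T$,

    while for the second nest

    $X_{2s}^T = (x_{1|2,s}, \dots, x_{J_2|2,s})$, $x_{j|2,s}^T = (x_{j|2,s,1}, \dots, x_{j|2,s,q})$,

    $P_{\cdot|2,s} = \mathrm{diag}(p_{1|2,s}, \dots, p_{J_2|2,s})$, $p_{\cdot|2,s} = (p_{1|2,s}, \dots, p_{J_2|2,s})^T$.

    The inclusive values are

    $$a_{1|1,s} = \ln \sum_{j=1}^{J_{1|1}} e^{x_{j|1|1,s}^T \beta / \sigma_1}, \qquad a_{2|1,s} = \ln \sum_{j=1}^{J_{2|1}} e^{x_{j|2|1,s}^T \beta / \sigma_2}, \qquad a_{2s} = \ln \sum_{j=1}^{J_2} e^{x_{j|2,s}^T \beta / \lambda_2},$$

    and the conditional choice probabilities are

    $$p_{j|1|1,s} = \frac{e^{x_{j|1|1,s}^T \beta / \sigma_1}}{\sum_{j=1}^{J_{1|1}} e^{x_{j|1|1,s}^T \beta / \sigma_1}}, \qquad p_{j|2|1,s} = \frac{e^{x_{j|2|1,s}^T \beta / \sigma_2}}{\sum_{j=1}^{J_{2|1}} e^{x_{j|2|1,s}^T \beta / \sigma_2}}, \qquad p_{j|2,s} = \frac{e^{x_{j|2,s}^T \beta / \lambda_2}}{\sum_{j=1}^{J_2} e^{x_{j|2,s}^T \beta / \lambda_2}},$$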


    [Equations: an inclusive value $a$ for the first nest, given as the logarithm of a combination of the sums of exponentials over $j = 1, \dots, J_{1|1}$ and $j = 1, \dots, J_{2|1}$ of its two sub-nests, followed by two further choice probabilities $p$, each given as a ratio of such sums with the sub-nest index $h = 1, 2$.]
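The building blocks used in this appendix (conditional choice probabilities, the inclusive value, $A = X^T p$ and $B = X^T P X$ with $P = \mathrm{diag}(p)$) can be sketched for a single sub-nest as follows. This is only an illustration under stated assumptions: the function name, the argument `scale` (standing for the dissimilarity parameter of the sub-nest) and the small attribute matrix are hypothetical and not taken from the paper.

```python
import math

def subnest_quantities(X, beta, scale):
    """X: list of J attribute rows of length q; beta: list of length q;
    scale: dissimilarity parameter of the sub-nest (illustrative name)."""
    # Scaled utilities x_j^T beta / scale.
    v = [sum(x_k * b_k for x_k, b_k in zip(x, beta)) / scale for x in X]
    v_max = max(v)  # subtract the maximum before exponentiating, for stability
    expv = [math.exp(v_j - v_max) for v_j in v]
    total = sum(expv)
    p = [e / total for e in expv]   # conditional choice probabilities
    a = v_max + math.log(total)     # inclusive value ln sum_j exp(v_j)
    J, q = len(X), len(beta)
    # A = X^T p, a vector of length q.
    A = [sum(X[j][k] * p[j] for j in range(J)) for k in range(q)]
    # B = X^T diag(p) X, a q x q matrix.
    B = [[sum(X[j][k] * p[j] * X[j][l] for j in range(J)) for l in range(q)]
         for k in range(q)]
    return p, a, A, B

# Three hypothetical alternatives with two attributes, evaluated at beta = 0.
X = [[1, 1], [1, -1], [-1, 1]]
p, a, A, B = subnest_quantities(X, beta=[0.0, 0.0], scale=0.5)
print(p, a, A, B)
```

At $\beta = 0$ the probabilities are uniform whatever the value of `scale`, and the inclusive value reduces to the logarithm of the number of alternatives, which is why the local D-optimality criterion at $\beta = 0$ is comparatively simple to evaluate.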

    5. REFERENCES

    [1] Atkinson, A.C., Donev, A.N. and Tobias, R.D. (2007). Optimum Experimental Designs, with SAS. Oxford University Press.

    [2] Ben-Akiva, M. (1973). The structure of travel demand models. PhD Thesis, MIT.

    [3] Börsch-Supan, A. (1990). On the compatibility of nested logit models with utility maximization. Journal of Econometrics, 43, 373-388.

    [4] Chernoff, H. (1972). Sequential Analysis and Optimal Design. Society for Industrial and Applied Mathematics, Philadelphia, PA.

    [5] Daly, A. and Zachary, S. (1978). Improved multiple choice models. In: D. Hensher and M. Dalvi (eds.), Determinants of Travel Choice. Saxon House, Sussex.

    [6] Gil-Moltó, M. and Hole, A. (2004). Tests for the consistency of three-level nested logit models with utility maximization. Economics Letters, 85, 133-137.

    [7] Graßhoff, U., Großmann, H., Holling, H. and Schwabe, R. (2004). Optimal designs for main effects in linear paired comparison models. Journal of Statistical Planning and Inference, 126, 361-376.

    [8] Herriges, J.A. and Kling, C.L. (1996). Testing the consistency of nested logit models with utility maximization. Economics Letters, 50(1), 33-39.

    [9] McFadden, D. (1978). Modeling the choice of residential location. In: A. Karlqvist et al. (eds.), Spatial Interaction Theory and Residential Location. North-Holland, Amsterdam, 75-96.

    [10] Silvey, S.D. (1980). Optimal Design. Chapman and Hall, London.