What's New in Stata 10: Discrete Choice Models

Discrete choice models are used to analyze individual choice behavior. In transportation studies, for example, they have been widely used to describe transportation demand. Here we briefly discuss two new commands for choice models: the alternative-specific conditional logit model (asclogit) and the alternative-specific rank-ordered probit model (asroprobit).


Alternative Specific Conditional Choice Models

Let's first take a look at the structure of the data set for alternative specific models.

webuse choice, clear
clist in 1/10, noobs

       id        sex     income        car       size     choice     dealer
        1       male       46.7   American          2          0         18
        1       male       46.7      Japan          2          0          8
        1       male       46.7     Europe          2          1          5
        2       male       26.1   American          3          1         17
        2       male       26.1      Japan          3          0          6
        2       male       26.1     Europe          3          0          2
        3       male       32.7   American          4          1         12
        3       male       32.7      Japan          4          0          6
        3       male       32.7     Europe          4          0          2
        4     female       49.2   American          2          0         18

The outcome variable, choice, indicates which type of car was chosen by each individual, identified by the variable id. There are two types of predictor variables. Variables such as sex and income are individual-level variables. Their values may differ across individuals but never change within an individual. These variables are said to be case-specific. The variable dealer is the number of dealers of each type of car in the individual's neighborhood. For example, the first individual has 18 American car dealers, 8 Japanese car dealers and 5 European car dealers in the neighborhood. Variables of this type are said to be alternative-specific.
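The distinction between the two types of variables can be checked directly in the data. The following is a minimal sketch, assuming the choice data set is loaded as above; the variable names nchosen and sd_dealer are hypothetical helpers introduced only for this illustration.

```stata
webuse choice, clear

* Case-specific variables must be constant within each id
bysort id (car): assert sex == sex[1] & income == income[1]

* Every case should contain exactly one chosen alternative
bysort id: egen nchosen = total(choice)
assert nchosen == 1

* An alternative-specific variable such as dealer varies within id,
* so its within-case standard deviation should generally be positive
bysort id: egen sd_dealer = sd(dealer)
summarize sd_dealer
```

If either assert fails, the data are probably not in the long format (one row per case and alternative) that the alternative-specific commands expect.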

When we have only alternative-specific predictor variables, we can use either clogit or asclogit. They give exactly the same results, although the syntax looks somewhat different. For example, the clogit and asclogit commands below fit exactly the same model; the tab command first creates the indicator variables y1-y3 for the three car types, which clogit needs as alternative-specific intercepts.

tab car, gen(y)
clogit choice dealer y2 y3, group(id)
asclogit choice dealer, case(id) alternative(car)
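One way to confirm that the two commands agree is to compare their log likelihoods and the coefficient on dealer. This is a sketch rather than part of the original analysis; the scalar names are hypothetical, and it assumes that asclogit stores the dealer coefficient in an equation named after the alternatives variable (car), as the output layout shown later on this page suggests.

```stata
webuse choice, clear
tab car, gen(y)

* Fit with clogit and keep the log likelihood and dealer coefficient
quietly clogit choice dealer y2 y3, group(id)
scalar ll_clogit = e(ll)
scalar b_clogit  = _b[dealer]

* Fit the same model with asclogit
quietly asclogit choice dealer, case(id) alternative(car)
scalar ll_asclogit = e(ll)
scalar b_asclogit  = _b[car:dealer]

display "log likelihoods:     " ll_clogit "  " ll_asclogit
display "dealer coefficients: " b_clogit  "  " b_asclogit
```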

Now if we also have case-specific predictor variables, clogit will not work directly. For example, the variable sex in the following clogit command will be dropped, since clogit is a fixed-effects model and sex does not vary within individuals.

clogit choice dealer sex y2 y3, group(id)
note: sex omitted because of no within-group variance.
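To see why a case-specific variable drops out, consider a sketch of the conditional logit probability. The notation is an assumption introduced here for illustration and does not appear in the Stata output: x_{ij} holds the alternative-specific variables (such as dealer) for individual i and alternative j, z_i is a case-specific variable (such as sex), and \beta, \gamma are coefficients.

```latex
% A case-specific term with a single coefficient cancels from the ratio:
\[
\Pr(y_i = j)
= \frac{\exp(x_{ij}\beta + z_i\gamma)}{\sum_{k}\exp(x_{ik}\beta + z_i\gamma)}
= \frac{\exp(z_i\gamma)\,\exp(x_{ij}\beta)}{\exp(z_i\gamma)\,\sum_{k}\exp(x_{ik}\beta)}
= \frac{\exp(x_{ij}\beta)}{\sum_{k}\exp(x_{ik}\beta)}
\]
% It becomes estimable once its coefficient varies by alternative,
% with \gamma_j = 0 for the base alternative:
\[
\Pr(y_i = j)
= \frac{\exp(x_{ij}\beta + z_i\gamma_j)}{\sum_{k}\exp(x_{ik}\beta + z_i\gamma_k)}
\]
```

The second form is what an alternative-specific model estimates: one coefficient on the case-specific variable for each non-base alternative.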

Here the new Stata 10 command asclogit comes in handy.

asclogit choice dealer, casevars(sex) case(id) alternative(car) nolog

Alternative-specific conditional logit         Number of obs      =        885
Case variable: id                              Number of cases    =        295

Alternative variable: car                      Alts per case: min =          3
                                                              avg =        3.0
                                                              max =          3

                                                  Wald chi2(3)    =       7.19
Log likelihood =  -255.5512                       Prob > chi2     =     0.0661

------------------------------------------------------------------------------
      choice |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
car          |
      dealer |   .0436522   .0329333     1.33   0.185     -.020896    .1082003
-------------+----------------------------------------------------------------
American     |   (base alternative)
-------------+----------------------------------------------------------------
Japan        |
         sex |  -.6390918   .3105483    -2.06   0.040    -1.247755   -.0304283
       _cons |  -.1604799   .4730256    -0.34   0.734    -1.087593    .7666333
-------------+----------------------------------------------------------------
Europe       |
         sex |   .4242441   .4496724     0.94   0.345    -.4570977    1.305586
       _cons |  -1.255797   .6455486    -1.95   0.052    -2.521049    .0094554
------------------------------------------------------------------------------
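After fitting the model, the postestimation commands can be used to inspect the alternatives and to obtain predicted choice probabilities. The sketch below assumes the same data and model as above; p and psum are hypothetical variable names chosen for this example.

```stata
webuse choice, clear
quietly asclogit choice dealer, casevars(sex) case(id) alternative(car)

* Frequencies and labels of the alternatives used in estimation
estat alternatives

* Predicted probability of each alternative for each case
predict p, pr

* Within a case, the predicted probabilities should sum to 1
bysort id: egen psum = total(p)
assert abs(psum - 1) < 1e-5

list id car choice p in 1/6, sepby(id)
```

Because the coefficients for Japan and Europe are measured relative to the base alternative (American), the predicted probabilities are often easier to interpret than the coefficients themselves.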

This does not mean that clogit cannot fit this model, but it requires that interaction terms between the alternative indicators and the case-specific variables be created and included in the model. Below is the code for performing the same analysis using clogit.

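* drop the indicators if they already exist, then recreate them
capture drop y1 y2 y3
tab car, gen(y)
* interact the Japan (y2) and Europe (y3) indicators with sex;
* American (y1) is the base alternative, so it needs no interaction
gen ysex2 = y2*sex
gen ysex3 = y3*sex
clogit choice dealer ysex3 ysex2 y2 y3, group(id)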

As one can imagine, this process gets more complicated for models with more case-specific variables, whereas the new asclogit command in Stata 10 remains fairly straightforward.


Alternative Specific Rank Ordered Probit Models

Again, let's take a look at the data structure for alternative specific rank ordered models.

clear
set mem 500m
webuse wlsrank, clear
keep if noties
(11244 observations deleted)

sort id jobchar
clist in 1/12, noobs

       id    jobchar     female      score       rank       high        low     noties
       13     esteem          0   .3246512          4          0          1          1
       13    variety          0   .3246512          2          1          0          1
       13   autonomy          0   .3246512          1          0          0          1
       13   security          0   .3246512          3          0          1          1
       19     esteem          1   .0492111          3          0          0          1
       19    variety          1   .0492111          2          0          0          1
       19   autonomy          1   .0492111          4          0          0          1
       19   security          1   .0492111          1          1          0          1
       22     esteem          1   1.426412          4          1          0          1
       22    variety          1   1.426412          1          0          0          1
       22   autonomy          1   1.426412          2          1          0          1
       22   security          1   1.426412          3          0          0          1

Each individual, identified by the id variable, was asked to rank the importance of four job characteristics. Again, we have two types of predictor variables: alternative-specific (choice-specific) variables and case-specific variables. Stata's rologit command can be used to fit a rank-ordered logistic regression model, as shown below.

rologit rank high low female if noties, group(id) nolog
female omitted because of no within-id variance

Rank-ordered logistic regression                Number of obs      =      1660
Group variable: id                              Number of groups   =       415

No ties in data                                 Obs per group: min =         4
                                                               avg =      4.00
                                                               max =         4

                                                LR chi2(2)         =     10.57
Log likelihood = -1313.608                      Prob > chi2        =    0.0051

------------------------------------------------------------------------------
        rank |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        high |  -.2475398   .0832944    -2.97   0.003    -.4107939   -.0842857
         low |   .0128058   .0989479     0.13   0.897    -.1811285    .2067402
------------------------------------------------------------------------------

But there are several potential problems. First, rologit drops all case-specific predictor variables, since they do not vary within individuals. Second, rologit, being a type of logit model, assumes IIA, the independence of irrelevant alternatives. This assumption may not be valid in many situations. An alternative is the alternative-specific rank-ordered probit model, which is implemented in Stata 10 as asroprobit. It relaxes the IIA assumption and allows case-specific variables as well.
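To make the IIA assumption concrete, here is a short sketch in notation introduced only for this illustration: V_{ij} denotes the systematic part of the utility of alternative j for individual i, and P_{ij} the logit probability that j is chosen.

```latex
\[
P_{ij} = \frac{\exp(V_{ij})}{\sum_{m}\exp(V_{im})}
\quad\Longrightarrow\quad
\frac{P_{ij}}{P_{ik}} = \frac{\exp(V_{ij})}{\exp(V_{ik})} = \exp(V_{ij} - V_{ik})
\]
```

The ratio depends only on alternatives j and k, so adding, removing or changing any other alternative leaves the relative odds of j versus k unchanged. A probit model with correlated errors does not impose this restriction.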

asroprobit rank high low if noties, casevars(female) case(id) alternative(jobchar) 
note: variable high has 107 cases that are not alternative-specific: there is no within-case variability
note: variable low has 193 cases that are not alternative-specific: there is no within-case variability

Iteration 0:   log simulated-likelihood = -1105.5751  
...................................................
Iteration 17:  log simulated-likelihood = -1086.4176  

Alternative-specific rank-ordered probit       Number of obs      =       1660
Case variable: id                              Number of cases    =        415

Alternative variable: jobchar                  Alts per case: min =          4
                                                              avg =        4.0
                                                              max =          4
Integration sequence:      Hammersley
Integration points:               200             Wald chi2(5)    =      23.35
Log simulated-likelihood = -1086.4176             Prob > chi2     =     0.0003

------------------------------------------------------------------------------
        rank |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
jobchar      |
        high |  -.3663908   .0915408    -4.00   0.000    -.5458074   -.1869741
         low |   .0678953    .108842     0.62   0.533     -.145431    .2812217
-------------+----------------------------------------------------------------
esteem       |  (base alternative)
-------------+----------------------------------------------------------------
variety      |
      female |  -.1397973   .1840925    -0.76   0.448    -.5006119    .2210173
       _cons |  -1.749604   .1443756   -12.12   0.000    -2.032575   -1.466633
-------------+----------------------------------------------------------------
autonomy     |
      female |  -.2584587   .1655809    -1.56   0.119    -.5829913     .066074
       _cons |  -.7324097   .1203629    -6.09   0.000    -.9683166   -.4965028
-------------+----------------------------------------------------------------
security     |
      female |   -.227936   .2053046    -1.11   0.267    -.6303256    .1744537
       _cons |  -1.306075    .156829    -8.33   0.000    -1.613454   -.9986955
-------------+----------------------------------------------------------------
     /lnl2_2 |   .1838218   .0755158     2.43   0.015     .0358137      .33183
     /lnl3_3 |   .4872013   .0790135     6.17   0.000     .3323377     .642065
-------------+----------------------------------------------------------------
       /l2_1 |   .6217402   .1154746     5.38   0.000     .3954142    .8480663
       /l3_1 |   .4413976   .1440186     3.06   0.002     .1591263    .7236689
       /l3_2 |   .2089166    .121502     1.72   0.086    -.0292228    .4470561
------------------------------------------------------------------------------
(jobchar=esteem is the alternative normalizing location)
(jobchar=variety is the alternative normalizing scale)

The asroprobit command, like asmprobit, is fit by maximum simulated likelihood. With probit models, we can also choose among different correlation structures for the error terms. As a result, these models require more computing resources and more attention from the user.
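The estimated error structure can be inspected after fitting the model. The following is a minimal sketch that refits the model shown above and then displays the estimated covariance and correlation matrices of the latent errors; it assumes the estat covariance and estat correlation postestimation commands are available for asroprobit in your version of Stata.

```stata
webuse wlsrank, clear
keep if noties
asroprobit rank high low if noties, casevars(female) case(id) alternative(jobchar)

* Estimated variance-covariance matrix of the latent-variable errors
estat covariance

* The same structure expressed as correlations and standard deviations
estat correlation
```

Other error structures can be requested with the correlation() and stddev() options; see the asroprobit help file for the valid combinations. Note also that, by default, asroprobit treats the highest value of the rank variable as the most preferred; if 1 denotes the most preferred alternative in your data, the reverse option changes that interpretation.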


For more information on Stata's new commands for choice models, visit http://stata.com/stata10/choice.html

For a general discussion on choice models, visit http://roso.epfl.ch/mbi/papers/discretechoice/paper.html

For a general discussion on maximum simulated likelihood, visit http://www.aae.wisc.edu/pubs/sps/pdf/stpap421.pdf


These pages are Copyrighted (c) by UCLA Academic Technology Services

