UCLA Academic Technology Services

Multinomial Logistic Regression, Contrived Examples

1. ratio 1.2 and 1.4

-------------------------------------------------------

1. contrived example, 1.2 and 1.4

The examples used here are similar to those used in the binary logistic examples. In those examples, we predicted whether a wife in a family worked (or did not work) based on the income of the husband. Here, the example is more complex because the work status of the wife has 3 values: 0 = not working, 1 = working part time, and 2 = working full time.

When your outcome has 3 or more values, you can analyze the data using a multinomial logistic regression. This breaks the regression up into a series of binary comparisons, each comparing one group to a baseline group. For example, wifework has 3 values: 0 = not working, 1 = part time, 2 = full time. If we choose not working (0) as the baseline group, multinomial logistic regression will assess the odds of working part time vs. not working, and the odds of working full time vs. not working. It is much as if you had performed 2 binary logistic regressions, where the first treats working part time as a 1 and not working as a 0, and the second treats working full time as a 1 and not working as a 0.
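In symbols, the two comparisons can be written as a pair of log-odds equations. This is only a sketch in notation that does not appear on the original page: \alpha_j and \beta_j are assumed names for the intercept and the income coefficient of outcome j, with not working (0) as the baseline.

```latex
\begin{align*}
\ln\frac{P(\text{wifework}=1)}{P(\text{wifework}=0)} &= \alpha_1 + \beta_1\,\text{inc} \\
\ln\frac{P(\text{wifework}=2)}{P(\text{wifework}=0)} &= \alpha_2 + \beta_2\,\text{inc} \\
\frac{\text{odds}_j(\text{inc}+1)}{\text{odds}_j(\text{inc})} &= e^{\beta_j}, \qquad j = 1, 2
\end{align*}
```

Under this form, raising inc by one unit multiplies the odds of outcome j vs. the baseline by the same factor, exp(\beta_j), at every income level.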

Let's use a contrived data file for exploring multinomial logistic regression and understanding the results it produces.

clear
input inc count2 count1 count0
10 100  100 100
11 140  120 100
12 196  144 100
13 274  173 100
14 384  207 100
15 538  249 100
16 752  296 100
17 1054 358 100
18 1476 430 100
19 2066 516 100
end
reshape long count, i(inc) j(wifework)
(note:  j = 0 1 2)

Data                               wide   ->  long
-----------------------------------------------------------------------------
Number of obs.                       10   ->      30
Number of variables                   4   ->       3
j variable (3 values)                     ->   wifework
xij variables:
                   count0 count1 count2   ->   count
-----------------------------------------------------------------------------
expand count
(10543 observations created)
tabulate inc wifework 
           |             wifework
       inc |         0          1          2 |     Total
-----------+---------------------------------+----------
        10 |       100        100        100 |       300 
        11 |       100        120        140 |       360 
        12 |       100        144        196 |       440 
        13 |       100        173        274 |       547 
        14 |       100        207        384 |       691 
        15 |       100        249        538 |       887 
        16 |       100        296        752 |      1148 
        17 |       100        358       1054 |      1512 
        18 |       100        430       1476 |      2006 
        19 |       100        516       2066 |      2682 
-----------+---------------------------------+----------
     Total |      1000       2593       6980 |     10573 

save multrat1 , replace
file multrat1.dta saved
use multrat1 , clear
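
The counts were chosen so that the ratios named in the section title are approximately built into the table. As a quick check, here is a sketch that uses only Stata's display command with counts copied from the table above. It divides the odds at one income level by the same odds one level lower; the part-time results should be close to 1.2 and the full-time results close to 1.4 (not always exact, since the counts are whole numbers).

```stata
* odds of part time vs. not working: inc = 12 relative to inc = 11
display (144/100) / (120/100)
* same comparison for inc = 13 relative to inc = 12
display (173/100) / (144/100)
* odds of full time vs. not working: inc = 12 relative to inc = 11
display (196/100) / (140/100)
* same comparison for inc = 13 relative to inc = 12
display (274/100) / (196/100)
```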

We perform a multinomial logistic regression using those who do not work (0) as the baseline group.

mlogit wifework inc, rrr base(0)
Iteration 0:   log likelihood = -8901.2117
Iteration 1:   log likelihood = -8523.0599
Iteration 2:   log likelihood = -8484.4998
Iteration 3:   log likelihood = -8484.3002
Iteration 4:   log likelihood = -8484.3002

Multinomial regression                            Number of obs   =      10573
                                                  LR chi2(2)      =     833.82
                                                  Prob > chi2     =     0.0000
Log likelihood = -8484.3002                       Pseudo R2       =     0.0468

------------------------------------------------------------------------------
wifework |        RRR   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
1        |
     inc |   1.199894   .0160932     13.587   0.000       1.168762    1.231854
---------+--------------------------------------------------------------------
2        |
     inc |    1.39992   .0178082     26.446   0.000       1.365448    1.435262
------------------------------------------------------------------------------
(Outcome wifework==0 is the comparison group)

You see two sets of results in the output. The first set shows that inc has an RRR (relative risk ratio, which is interpreted like an odds ratio) of 1.1999 (let's round that to 1.2). This RRR refers to the odds of the wife working part time (1) vs. not working at all (0): for each one-unit increase in inc, those odds are multiplied by about 1.2.

The second set of results shows the RRR to be 1.3999 (let's round this to 1.4). In this case, the RRR refers to the odds of a wife working full time (2) vs. not working at all (0).
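
The RRR is the exponentiated regression coefficient. As a sketch of that link, assuming the data above are still in memory: running mlogit without the rrr option reports the coefficients on the log scale, and the natural log of each reported RRR should match them (roughly 0.18 and 0.34).

```stata
* same model, reported as coefficients rather than relative risk ratios
mlogit wifework inc, base(0)
* log of the reported RRRs; compare with the inc coefficients above
display ln(1.199894)
display ln(1.39992)
```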

The first set of results is as though you performed a simple binary logistic regression with the full-time working wives omitted. In fact, if we run that analysis, the odds ratio from the logistic command is strikingly similar to the RRR from the mlogit command.

logistic wifework inc if wifework != 2
Logit estimates                                   Number of obs   =       3593
                                                  LR chi2(1)      =     186.42
                                                  Prob > chi2     =     0.0000
Log likelihood = -2031.5391                       Pseudo R2       =     0.0439

------------------------------------------------------------------------------
wifework | Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
     inc |   1.199817   .0162769     13.428   0.000       1.168336    1.232147
------------------------------------------------------------------------------

Likewise, the second set of results is as though you performed a simple logistic regression with the part-time working wives omitted. We show that analysis below, and again the odds ratio and the RRR are very similar.

logistic wifework inc if wifework != 1
Logit estimates                                   Number of obs   =       7980
                                                  LR chi2(1)      =     707.64
                                                  Prob > chi2     =     0.0000
Log likelihood = -2657.6675                       Pseudo R2       =     0.1175

------------------------------------------------------------------------------
wifework | Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
     inc |   1.400033   .0181139     26.008   0.000       1.364977    1.435989
------------------------------------------------------------------------------
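
Both logistic commands above can use wifework directly because logistic treats any nonzero, nonmissing value of the dependent variable as a positive outcome, so the code 2 counts as a 1. The sketch below makes the two comparisons explicit with 0/1 indicators. The variable names parttime and fulltime are made up for illustration and are not part of the original data file.

```stata
* 1 = part time, 0 = not working; missing for full-time wives
generate byte parttime = (wifework == 1) if wifework != 2
logistic parttime inc

* 1 = full time, 0 = not working; missing for part-time wives
generate byte fulltime = (wifework == 2) if wifework != 1
logistic fulltime inc
```

Observations with a missing dependent variable are dropped from each fit, so these two commands should use the same observations as the if conditions shown above.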

We can generalize about how the multinomial regression model works from this example. Say we had 5 groups, and we set group 1 to be the baseline group. The multinomial regression model would have 4 sets of results, and the RRRs would describe the odds of
- Being in group 2 (as compared to group 1)
- Being in group 3 (as compared to group 1)
- Being in group 4 (as compared to group 1)
- Being in group 5 (as compared to group 1)

You can choose which group is to be the baseline group. For our example, group 0 was one natural comparison group. This allowed us to assess the odds of working part time vs. not working, and the odds of working full time vs. not working.
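
Changing the baseline only re-expresses the same model. As a sketch, suppose full time (2) is chosen as the baseline instead. The new RRRs should then be ratios of the ones reported earlier, which can be previewed with display: about 0.71 for not working vs. full time and about 0.86 for part time vs. full time.

```stata
* refit with full time (2) as the comparison group
mlogit wifework inc, rrr base(2)
* expected RRR for not working vs. full time: reciprocal of 1.39992
display 1/1.39992
* expected RRR for part time vs. full time: ratio of the two earlier RRRs
display 1.199894/1.39992
```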

Let's look at other aspects of the multinomial logistic model. We first run the model that we ran before.

mlogit wifework inc, rrr base(0)
Iteration 0:   log likelihood = -8901.2117
Iteration 1:   log likelihood = -8523.0599
Iteration 2:   log likelihood = -8484.4998
Iteration 3:   log likelihood = -8484.3002
Iteration 4:   log likelihood = -8484.3002

Multinomial regression                            Number of obs   =      10573
                                                  LR chi2(2)      =     833.82
                                                  Prob > chi2     =     0.0000
Log likelihood = -8484.3002                       Pseudo R2       =     0.0468

------------------------------------------------------------------------------
wifework |        RRR   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
1        |
     inc |   1.199894   .0160932     13.587   0.000       1.168762    1.231854
---------+--------------------------------------------------------------------
2        |
     inc |    1.39992   .0178082     26.446   0.000       1.365448    1.435262
------------------------------------------------------------------------------
(Outcome wifework==0 is the comparison group)

We can use the adjust command to get the predicted odds, exp(xb), of working part time as compared to not working at each level of inc. The 1 in equation(1) refers to the code for those working part time.

adjust , by(inc) exp equation(1)
-------------------------------------------------------------------------------
Dependent variable: wifework     Equation: 1     Command: mlogit
-------------------------------------------------------------------------------

----------+-----------
      inc |    exp(xb)
----------+-----------
       10 |    .999379
       11 |    1.19915
       12 |    1.43885
       13 |    1.72647
       14 |    2.07158
       15 |    2.48567
       16 |    2.98254
       17 |    3.57873
       18 |     4.2941
       19 |    5.15246
----------+-----------
Key:  exp(xb)    =  exp(xb)

We can divide the odds for those earning $13k (1.726) by the odds for those earning $12k (1.4388) and see that the value is 1.1999, or about 1.2. This is indeed the RRR reported by the mlogit command. The values of $12k and $13k are arbitrary; you could pick any 2 adjacent values and get the same result.
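
The division can be checked in Stata itself. The sketch below uses display with values copied from the adjust table, then shows one possible way to rebuild the exp(xb) column with predict. That may be useful where adjust is not available, since newer Stata releases replace it with margins. The variable names xb1 and odds1 are assumptions, and the mlogit model is assumed to be the most recent estimation.

```stata
* ratio of adjacent odds from the table: inc = 13 vs. inc = 12
display 1.72647/1.43885
* another adjacent pair: inc = 19 vs. inc = 18
display 5.15246/4.2941

* rebuild the exp(xb) column for the part-time equation
predict double xb1, xb outcome(1)
generate double odds1 = exp(xb1)
tabstat odds1, by(inc) statistics(mean)
```

Both display lines should give a value very close to 1.2.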

We can use the adjust command again to get the predicted odds of working full time as compared to not working. The equation(2) option tells Stata that we want the equation that compares the full time workers (coded 2) with those not working (the comparison group).

adjust , by(inc) exp equation(2)
-------------------------------------------------------------------------------
Dependent variable: wifework     Equation: 2     Command: mlogit
-------------------------------------------------------------------------------

----------+-----------
      inc |    exp(xb)
----------+-----------
       10 |    1.00007
       11 |    1.40002
       12 |    1.95991
       13 |    2.74372
       14 |    3.84099
       15 |    5.37707
       16 |    7.52747
       17 |    10.5379
       18 |    14.7522
       19 |    20.6518
----------+-----------
Key:  exp(xb)    =  exp(xb)

You can choose any 2 adjacent values and divide the odds to see that the odds increase by a factor of 1.4. For simplicity, comparing those who earn $11k with those who earn $10k gives 1.4 / 1.0 = 1.4.
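
The two sets of odds together also determine the predicted probabilities: the probability of not working is 1 / (1 + odds of part time + odds of full time), and each of the other probabilities is its own odds times that value. Here is a sketch with values copied from the two adjust tables at inc = 10, where the observed split is 100 / 100 / 100, so all three probabilities should be close to one third.

```stata
* ratio of adjacent odds, full time vs. not working: inc = 13 vs. inc = 12
display 2.74372/1.95991
* predicted probability of not working at inc = 10
display 1/(1 + .999379 + 1.00007)
* predicted probabilities of part time and full time at inc = 10
display .999379/(1 + .999379 + 1.00007)
display 1.00007/(1 + .999379 + 1.00007)
```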


These pages are Copyrighted (c) by UCLA Academic Technology Services

