### Stata Library: Understanding RR Ratios in Multinomial Logistic Regression

#### 1. Contrived example, RR ratio of 1.2 and 1.4

The examples used here are similar to those used in the binary logistic examples. In those examples, we predicted whether a wife in a family worked (or did not work) based on the income of the husband. Here we have a more complex example because the work status of the wife has three values: 0 = not working, 1 = working part time, and 2 = working full time.

When your outcome has three or more values, you can analyze the data using multinomial logistic regression. The model breaks the problem into a series of binary comparisons, each contrasting one group with a baseline group. For example, wifework has three values: 0 = not working, 1 = part time, 2 = full time. If we choose not working (0) as the baseline group, multinomial logistic regression assesses the odds of working part time vs. not working, and the odds of working full time vs. not working. It is much as if you had performed two binary logistic regressions, the first treating working part time as 1 and not working as 0, and the second treating working full time as 1 and not working as 0.
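
In symbols, the two comparisons can be sketched as follows. The names alpha_j and beta_j are notation assumed here for the intercept and the inc coefficient of equation j; they do not appear in the Stata output.

```latex
\ln\!\left(\frac{\Pr(\text{wifework}=j \mid \text{inc})}{\Pr(\text{wifework}=0 \mid \text{inc})}\right)
  = \alpha_j + \beta_j\,\text{inc}, \qquad j = 1, 2

\mathrm{RRR}_j = e^{\beta_j}
```

Under this sketch, each one-unit increase in inc multiplies the odds of outcome j vs. outcome 0 by the exponentiated inc coefficient, which is the quantity Stata labels RRR.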

Let's use a contrived data file for exploring multinomial logistic regression and understanding the results it produces. This file contains the income of the husband (in thousands of dollars) in inc and the work status of the wife in wifework, coded 0 = not working, 1 = part time, 2 = full time. Below we see the crosstab of inc and wifework.

    use multrat1 , clear
    tabulate inc wifework

               |             wifework
           inc |         0          1          2 |     Total
    -----------+---------------------------------+----------
            10 |       100        100        100 |       300
            11 |       100        120        140 |       360
            12 |       100        144        196 |       440
            13 |       100        173        274 |       547
            14 |       100        207        384 |       691
            15 |       100        249        538 |       887
            16 |       100        296        752 |      1148
            17 |       100        358       1054 |      1512
            18 |       100        430       1476 |      2006
            19 |       100        516       2066 |      2682
    -----------+---------------------------------+----------
         Total |      1000       2593       6980 |     10573
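
Before fitting any model, it can help to look at the odds directly in the crosstab. The following is a minimal sketch in Python (not part of the original Stata session) that copies the cell counts from the table and computes, for each $1k step in income, how much the observed odds of working vs. not working change. The variable names are made up for this illustration.

```python
# Cell counts copied from the crosstab (rows are inc = 10..19).
inc = list(range(10, 20))
not_working = [100] * 10
part_time = [100, 120, 144, 173, 207, 249, 296, 358, 430, 516]
full_time = [100, 140, 196, 274, 384, 538, 752, 1054, 1476, 2066]

# Observed odds of each working status relative to not working.
odds_part = [p / n for p, n in zip(part_time, not_working)]
odds_full = [f / n for f, n in zip(full_time, not_working)]

# Ratio of the odds at each income level to the odds one level lower.
ratio_part = [hi / lo for lo, hi in zip(odds_part, odds_part[1:])]
ratio_full = [hi / lo for lo, hi in zip(odds_full, odds_full[1:])]

for x, rp, rf in zip(inc[1:], ratio_part, ratio_full):
    print(f"inc {x - 1} -> {x}: part time {rp:.3f}, full time {rf:.3f}")
```

The step-to-step ratios hover around 1.2 for part time and 1.4 for full time, which is what the section title refers to; they are not exactly constant because the counts are whole numbers.
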

We perform a multinomial logistic regression using those who don't work (0) as the baseline group. The rrr option asks Stata to report relative risk ratios (RRRs) rather than coefficients.

    mlogit wifework inc, rrr base(0)

    Iteration 0:   log likelihood = -8901.2117
    Iteration 1:   log likelihood = -8523.0599
    Iteration 2:   log likelihood = -8484.4998
    Iteration 3:   log likelihood = -8484.3002
    Iteration 4:   log likelihood = -8484.3002

    Multinomial regression                            Number of obs   =      10573
                                                      LR chi2(2)      =     833.82
                                                      Prob > chi2     =     0.0000
    Log likelihood = -8484.3002                       Pseudo R2       =     0.0468

    ------------------------------------------------------------------------------
    wifework |        RRR   Std. Err.       z     P>|z|       [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
    1        |
         inc |   1.199894   .0160932     13.587   0.000       1.168762    1.231854
    ---------+--------------------------------------------------------------------
    2        |
         inc |    1.39992   .0178082     26.446   0.000       1.365448    1.435262
    ------------------------------------------------------------------------------
    (Outcome wifework==0 is the comparison group)

You see two sets of results in the output. The first set shows that inc has an RRR (relative risk ratio, which is interpreted much like an odds ratio) of 1.1999 (let's round that to 1.2). This RRR refers to the odds of the wife working part time (1) rather than not working at all (0): each additional $1k of husband's income multiplies those odds by about 1.2.

The second set of results shows the RRR to be 1.3999 (let's round this to 1.4). In this case, the RRR refers to the odds of a wife working full time (2) rather than not working at all (0), so each additional $1k of income multiplies those odds by about 1.4.
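
To see how the other columns of the output relate to the RRR, here is a small Python sketch (again, not part of the Stata session) that starts from the RRR and its standard error for equation 1 and recovers the z statistic and the confidence interval. It assumes the usual approach of working on the log (coefficient) scale, with the standard error of the RRR obtained by the delta method; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# RRR and standard error for inc in equation 1, copied from the mlogit output.
rrr = 1.199894
se_rrr = 0.0160932

# Move to the coefficient (log) scale; by the delta method SE(b) = SE(RRR) / RRR.
b = math.log(rrr)
se_b = se_rrr / rrr

# z statistic and 95% confidence limits, computed on the log scale
# and then exponentiated back to the RRR scale.
z = b / se_b
crit = NormalDist().inv_cdf(0.975)  # about 1.96
ci_low = math.exp(b - crit * se_b)
ci_high = math.exp(b + crit * se_b)

# The RRR read as a percentage change in the odds per $1k of income.
pct_change = (rrr - 1) * 100

print(f"coefficient = {b:.4f}, z = {z:.3f}")
print(f"95% CI for the RRR: {ci_low:.4f} to {ci_high:.4f}")
print(f"change in odds per unit of inc: {pct_change:.1f}%")
```

The z and interval computed this way should agree closely with the row for equation 1 in the output, and the same steps apply to equation 2.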

The first set of results is as though you had performed a simple binary logistic regression with the full time working wives omitted. In fact, if we do that analysis, the odds ratio from the logistic command is strikingly similar to the RRR from the mlogit command (similar rather than identical, because mlogit estimates both comparisons simultaneously).

    logistic wifework inc if wifework != 2

    Logit estimates                                   Number of obs   =       3593
                                                      LR chi2(1)      =     186.42
                                                      Prob > chi2     =     0.0000
    Log likelihood = -2031.5391                       Pseudo R2       =     0.0439

    ------------------------------------------------------------------------------
    wifework | Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
         inc |   1.199817   .0162769     13.428   0.000       1.168336    1.232147
    ------------------------------------------------------------------------------

Likewise, the second set of results is as though you had performed a simple logistic regression with the part time wives omitted. (The logistic command treats any nonzero, nonmissing value of the outcome as a positive outcome, so wifework coded 2 is handled like a 1 here.) We show that analysis below, and again the odds ratio and the RRR are very similar.

    logistic wifework inc if wifework != 1

    Logit estimates                                   Number of obs   =       7980
                                                      LR chi2(1)      =     707.64
                                                      Prob > chi2     =     0.0000
    Log likelihood = -2657.6675                       Pseudo R2       =     0.1175

    ------------------------------------------------------------------------------
    wifework | Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
         inc |   1.400033   .0181139     26.008   0.000       1.364977    1.435989
    ------------------------------------------------------------------------------
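
If you want to check the two binary fits without Stata, the following Python sketch fits a binary logit to the grouped counts from the crosstab by Newton-Raphson. It is an illustration written for this page under stated assumptions, not the algorithm Stata uses internally: the function name fit_binary_logit is hypothetical, the iteration count is fixed rather than tested for convergence, and income is centered at 10 purely for numerical convenience (which leaves the slope unchanged).

```python
import math

# Grouped counts copied from the crosstab (rows are inc = 10..19).
inc = list(range(10, 20))
not_working = [100] * 10
part_time = [100, 120, 144, 173, 207, 249, 296, 358, 430, 516]
full_time = [100, 140, 196, 274, 384, 538, 752, 1054, 1476, 2066]


def fit_binary_logit(x, successes, failures, iterations=25):
    """Fit logit(p) = a + b*x to grouped counts by Newton-Raphson."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for xi, s, f in zip(x, successes, failures):
            n = s + f
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            resid = s - n * p        # contribution to the score (gradient)
            w = n * p * (1.0 - p)    # contribution to the information matrix
            g_a += resid
            g_b += resid * xi
            h_aa += w
            h_ab += w * xi
            h_bb += w * xi * xi
        # Solve the 2x2 system (information matrix) * step = score.
        det = h_aa * h_bb - h_ab * h_ab
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b


# Center income at 10; this shifts the intercept but not the slope.
x = [v - 10 for v in inc]

_, b_part = fit_binary_logit(x, part_time, not_working)
_, b_full = fit_binary_logit(x, full_time, not_working)

or_part = math.exp(b_part)
or_full = math.exp(b_full)
print(f"part time vs. not working: odds ratio {or_part:.4f}")
print(f"full time vs. not working: odds ratio {or_full:.4f}")
```

The two odds ratios should land close to the values reported by the logistic command above, and therefore close to the RRRs from mlogit.
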

We can generalize about how the multinomial regression model works from this example. Say we had 5 groups, and we set group 1 to be the baseline group. The multinomial regression model would have 4 sets of results, and each RRR would describe the odds of

- Being in group 2 (as compared to group 1)
- Being in group 3 (as compared to group 1)
- Being in group 4 (as compared to group 1)
- Being in group 5 (as compared to group 1)

You can choose which group is to be the baseline group. For our example, group 0 was a natural comparison group. This allowed us to assess the odds of working part time vs. not working, and the odds of working full time vs. not working, but you can choose any of the 3 groups to be the reference group.
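
Changing the baseline does not change the fitted model, only the comparisons that are reported. As a rough Python sketch (not Stata output), the RRRs you would expect with part time (1) as the baseline can be derived from the two RRRs already shown, because the coefficients for a new baseline are differences of the original coefficients. The variable names are illustrative.

```python
import math

# RRRs for inc with not working (0) as the baseline, from the mlogit output.
rrr_part_vs_none = 1.199894
rrr_full_vs_none = 1.39992

# Coefficients on the log scale.
b_part = math.log(rrr_part_vs_none)
b_full = math.log(rrr_full_vs_none)

# With part time (1) as the baseline, each coefficient is measured
# relative to b_part instead of relative to zero.
rrr_full_vs_part = math.exp(b_full - b_part)
rrr_none_vs_part = math.exp(0.0 - b_part)

print(f"full time vs. part time:   RRR {rrr_full_vs_part:.4f}")
print(f"not working vs. part time: RRR {rrr_none_vs_part:.4f}")
```

In other words, the RRR for full time vs. part time is simply the ratio of the two RRRs reported against group 0, and the RRR for not working vs. part time is the reciprocal of the part time RRR.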

Let's look at other aspects of the multinomial logistic model. We first run the model that we ran before.

    mlogit wifework inc, rrr base(0)

    Iteration 0:   log likelihood = -8901.2117
    Iteration 1:   log likelihood = -8523.0599
    Iteration 2:   log likelihood = -8484.4998
    Iteration 3:   log likelihood = -8484.3002
    Iteration 4:   log likelihood = -8484.3002

    Multinomial regression                            Number of obs   =      10573
                                                      LR chi2(2)      =     833.82
                                                      Prob > chi2     =     0.0000
    Log likelihood = -8484.3002                       Pseudo R2       =     0.0468

    ------------------------------------------------------------------------------
    wifework |        RRR   Std. Err.       z     P>|z|       [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
    1        |
         inc |   1.199894   .0160932     13.587   0.000       1.168762    1.231854
    ---------+--------------------------------------------------------------------
    2        |
         inc |    1.39992   .0178082     26.446   0.000       1.365448    1.435262
    ------------------------------------------------------------------------------
    (Outcome wifework==0 is the comparison group)

We can use the adjust command to get the predicted odds of working part time as compared to not working at each level of inc. The exp option exponentiates the linear prediction, and the 1 in equation(1) refers to the code for those working part time.

    adjust , by(inc) exp equation(1)

    -------------------------------------------------------------------------------
         Dependent variable: wifework     Equation: 1     Command: mlogit
    -------------------------------------------------------------------------------

    ----------+-----------
          inc |    exp(xb)
    ----------+-----------
           10 |    .999379
           11 |    1.19915
           12 |    1.43885
           13 |    1.72647
           14 |    2.07158
           15 |    2.48567
           16 |    2.98254
           17 |    3.57873
           18 |     4.2941
           19 |    5.15246
    ----------+-----------
         Key:  exp(xb)    =  exp(xb)

We can divide the odds for those earning $13k (1.726) by the odds for those earning $12k (1.4388) and see the value is 1.199, or 1.2. This is indeed the RRR reported by the mlogit command. (The values of $12k and $13k are arbitrary; you could pick any 2 adjacent values and get the same result.)
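
The claim that any pair of adjacent values gives the same ratio is easy to check. Here is a short Python sketch (not part of the Stata session) that copies the exp(xb) column for equation 1 and divides each value by the one before it; the list name is illustrative.

```python
# Predicted odds of part time vs. not working for inc = 10..19,
# copied from the exp(xb) column of the adjust output for equation 1.
odds_part = [0.999379, 1.19915, 1.43885, 1.72647, 2.07158,
             2.48567, 2.98254, 3.57873, 4.2941, 5.15246]

# Ratio of each predicted odds to the one at the previous income level.
ratios = [hi / lo for lo, hi in zip(odds_part, odds_part[1:])]

for x, r in zip(range(11, 20), ratios):
    print(f"odds at inc {x} / odds at inc {x - 1} = {r:.4f}")
```

Every ratio matches the reported RRR of about 1.1999 up to the rounding of the displayed values, because the model assumes the same multiplicative change in odds for every one-unit step in inc.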

We can use the adjust command in the same way to get the predicted odds of working full time as compared to not working. The equation(2) tells Stata that we want the equation that compares the full time workers (since full time is coded 2) with those not working (since those not working are the comparison group).

    adjust , by(inc) exp equation(2)

    -------------------------------------------------------------------------------
         Dependent variable: wifework     Equation: 2     Command: mlogit
    -------------------------------------------------------------------------------

    ----------+-----------
          inc |    exp(xb)
    ----------+-----------
           10 |    1.00007
           11 |    1.40002
           12 |    1.95991
           13 |    2.74372
           14 |    3.84099
           15 |    5.37707
           16 |    7.52747
           17 |    10.5379
           18 |    14.7522
           19 |    20.6518
    ----------+-----------
         Key:  exp(xb)    =  exp(xb)

You can choose any two adjacent values and divide the odds to see that the odds increase by a factor of 1.4. For simplicity, comparing those who earn $11k with those who earn $10k gives 1.4 / 1.0 = 1.4. As you can see, interpreting RRRs is much like interpreting odds ratios in a binary logistic regression.
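
The two sets of predicted odds can also be combined into predicted probabilities for all three outcomes. The sketch below (Python, not Stata output) relies on the fact that both sets of odds are relative to the same baseline group, so the baseline has odds of 1 and each probability is that outcome's odds divided by the sum of all three. The list names are illustrative.

```python
# Predicted odds relative to not working for inc = 10..19,
# copied from the two adjust tables (equation 1 and equation 2).
odds_part = [0.999379, 1.19915, 1.43885, 1.72647, 2.07158,
             2.48567, 2.98254, 3.57873, 4.2941, 5.15246]
odds_full = [1.00007, 1.40002, 1.95991, 2.74372, 3.84099,
             5.37707, 7.52747, 10.5379, 14.7522, 20.6518]

probs = []
for o1, o2 in zip(odds_part, odds_full):
    total = 1.0 + o1 + o2  # the baseline group has odds of 1 by definition
    probs.append((1.0 / total, o1 / total, o2 / total))

for x, (p0, p1, p2) in zip(range(10, 20), probs):
    print(f"inc {x}: not working {p0:.3f}, part time {p1:.3f}, full time {p2:.3f}")
```

At inc = 10 the three probabilities are each close to one third, in line with the 100/100/100 split in the crosstab, and at inc = 19 the predicted probability of not working is close to the observed 100 out of 2682.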
