When your outcome has 3 or more values, you can analyze the data using multinomial logistic regression. This breaks the regression up into a series of binary comparisons, each comparing one group to a baseline group. For example, wifework has 3 values: 0=not working, 1=part time, 2=full time. If we choose not working (0) as the baseline group, multinomial logistic regression will assess the odds of working part time vs. not working, and the odds of working full time vs. not working. It is much as if you had performed 2 binary logistic regressions, where the first treats working part time as 1 and not working as 0, and the second treats working full time as 1 and not working as 0.
Let's use a contrived data file for exploring multinomial logistic regression and understanding the results it produces. This file contains the income of the husband (in thousands of dollars) in inc and the work status of the wife in wifework, coded 0=not working, 1=part time, 2=full time. Below we see the crosstab of inc and wifework.
use multrat1 , clear
tabulate inc wifework
           |             wifework
       inc |         0          1          2 |     Total
-----------+---------------------------------+----------
        10 |       100        100        100 |       300
        11 |       100        120        140 |       360
        12 |       100        144        196 |       440
        13 |       100        173        274 |       547
        14 |       100        207        384 |       691
        15 |       100        249        538 |       887
        16 |       100        296        752 |      1148
        17 |       100        358       1054 |      1512
        18 |       100        430       1476 |      2006
        19 |       100        516       2066 |      2682
-----------+---------------------------------+----------
     Total |      1000       2593       6980 |     10573
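Because exactly 100 wives are not working at every income level in this contrived file, the observed odds can be read straight off the crosstab. As a rough hand check (plain arithmetic with display, not part of the original analysis), the counts from the first two rows can be compared:

```stata
* Observed odds of part time vs. not working at inc = 10 and inc = 11
display 100/100
display 120/100

* Ratio of the odds at inc = 11 to the odds at inc = 10 (part time)
display (120/100) / (100/100)

* Same ratio for full time vs. not working
display (140/100) / (100/100)
```

The two ratios come out to 1.2 and 1.4. The counts in later rows are rounded to whole numbers, so ratios taken further down the table are only approximately equal to these.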
We perform a multinomial logistic regression using those who do not work (0) as the
baseline group.
mlogit wifework inc, rrr base(0)
Iteration 0: log likelihood = -8901.2117
Iteration 1: log likelihood = -8523.0599
Iteration 2: log likelihood = -8484.4998
Iteration 3: log likelihood = -8484.3002
Iteration 4: log likelihood = -8484.3002
Multinomial regression Number of obs = 10573
LR chi2(2) = 833.82
Prob > chi2 = 0.0000
Log likelihood = -8484.3002 Pseudo R2 = 0.0468
------------------------------------------------------------------------------
wifework | RRR Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
1 |
inc | 1.199894 .0160932 13.587 0.000 1.168762 1.231854
---------+--------------------------------------------------------------------
2 |
inc | 1.39992 .0178082 26.446 0.000 1.365448 1.435262
------------------------------------------------------------------------------
(Outcome wifework==0 is the comparison group)
You see two sets of results in the output. The first set shows that inc has an RRR (relative risk ratio, which is interpreted much like an odds ratio) of 1.1999 (let's round that to 1.2). This RRR refers to the odds of the wife working part time (1) vs. not working at all (0): each additional $1,000 of the husband's income multiplies those odds by about 1.2.
The second set of results shows the RRR to be 1.3999 (let's round this to 1.4). In this case, the RRR refers to the odds of a wife working full time (2) as compared to not working at all (0).
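In equation form, the model with outcome 0 as the baseline can be sketched as follows. The notation is not part of the Stata output: the symbols \beta_{j0} and \beta_{j1} are labels chosen here for the intercept and the inc coefficient of outcome j.

```latex
\begin{align*}
\ln\frac{\Pr(\text{wifework}=j)}{\Pr(\text{wifework}=0)}
  &= \beta_{j0} + \beta_{j1}\,\text{inc}, \qquad j = 1, 2 \\
\text{RRR}_j &= e^{\beta_{j1}}
\end{align*}
```

In the output above, RRR_1 is about 1.2 and RRR_2 is about 1.4.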
The first set of results is as though you had performed a simple binary logistic regression with the full time working wives omitted. In fact, if we run that analysis, the odds ratio from the logistic command (1.1998) is nearly identical to the RRR from the mlogit command (1.1999).
logistic wifework inc if wifework != 2
Logit estimates Number of obs = 3593
LR chi2(1) = 186.42
Prob > chi2 = 0.0000
Log likelihood = -2031.5391 Pseudo R2 = 0.0439
------------------------------------------------------------------------------
wifework | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
inc | 1.199817 .0162769 13.428 0.000 1.168336 1.232147
------------------------------------------------------------------------------
Likewise, the second set of results is as though you had
performed a simple logistic regression with the part time wives omitted. We show
that analysis below, and again the odds ratio (1.4000) and the RRR (1.3999) are
very similar.
logistic wifework inc if wifework != 1
Logit estimates Number of obs = 7980
LR chi2(1) = 707.64
Prob > chi2 = 0.0000
Log likelihood = -2657.6675 Pseudo R2 = 0.1175
------------------------------------------------------------------------------
wifework | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
inc | 1.400033 .0181139 26.008 0.000 1.364977 1.435989
------------------------------------------------------------------------------
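The if qualifiers above work because logistic treats any nonzero, nonmissing value of the outcome as 1. The same two comparisons can also be written with explicit 0/1 indicators, which some readers may find easier to follow. This is only a sketch: the variable names parttime and fulltime are made up for illustration and are not in multrat1.

```stata
* 1 = part time, 0 = not working; full time wives are set to missing
generate byte parttime = (wifework == 1) if wifework != 2
logistic parttime inc

* 1 = full time, 0 = not working; part time wives are set to missing
generate byte fulltime = (wifework == 2) if wifework != 1
logistic fulltime inc
```

Observations with a missing outcome are dropped from estimation, so these two commands should reproduce the two logistic results shown above.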
We can generalize about how the multinomial regression
model works from this example. Say we had 5 groups, and we set group 1 to be the baseline
group. The multinomial regression model would have 4 sets of results, and each RRR
would describe the odds of being in one of the other groups (2, 3, 4, or 5) as compared to group 1.
You can choose which group is to be the baseline group. For our example, group 0 was one natural comparison group. This allowed us to assess the odds of working part time vs. not working, and the odds of working full time vs. not working, but you can choose any of the 3 groups to be the reference group.
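For instance, to compare each group with the full time workers instead, the baseline can be switched to outcome 2. This is a sketch that reuses the base() option from the command above; recent Stata releases document the option under the name baseoutcome().

```stata
* Same model, but with full time (2) as the comparison group
mlogit wifework inc, rrr base(2)

* What the new RRRs should be close to, from the base(0) results
display 1.199894 / 1.39992
display 1 / 1.39992
```

Changing the baseline does not change the fitted model, only which comparisons are reported. The RRR for part time vs. full time should come out at about 0.86, and the RRR for not working vs. full time at about 0.71.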
Let's look at other aspects of the multinomial logistic model. We first run the model that we ran before.
mlogit wifework inc, rrr base(0)
Iteration 0: log likelihood = -8901.2117
Iteration 1: log likelihood = -8523.0599
Iteration 2: log likelihood = -8484.4998
Iteration 3: log likelihood = -8484.3002
Iteration 4: log likelihood = -8484.3002
Multinomial regression Number of obs = 10573
LR chi2(2) = 833.82
Prob > chi2 = 0.0000
Log likelihood = -8484.3002 Pseudo R2 = 0.0468
------------------------------------------------------------------------------
wifework | RRR Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
1 |
inc | 1.199894 .0160932 13.587 0.000 1.168762 1.231854
---------+--------------------------------------------------------------------
2 |
inc | 1.39992 .0178082 26.446 0.000 1.365448 1.435262
------------------------------------------------------------------------------
(Outcome wifework==0 is the comparison group)
We can use the adjust command to get
the predicted odds of working part time as compared to not
working at each level of inc. The 1 in equation(1) refers to the code for
those working part time.
adjust , by(inc) exp equation(1)
-------------------------------------------------------------------------------
Dependent variable: wifework Equation: 1 Command: mlogit
-------------------------------------------------------------------------------
----------+-----------
inc | exp(xb)
----------+-----------
10 | .999379
11 | 1.19915
12 | 1.43885
13 | 1.72647
14 | 2.07158
15 | 2.48567
16 | 2.98254
17 | 3.57873
18 | 4.2941
19 | 5.15246
----------+-----------
Key: exp(xb) = exp(xb)
We can divide the odds when the husband earns $13k (1.726) by
the odds when he earns $12k (1.4388) and see the value is 1.1999, or 1.2. This is
indeed the RRR reported by the mlogit command (the
values of $12k and $13k are arbitrary; you could pick any 2 adjacent values and get the
same result).
We can use the adjust command in the same way to get the predicted odds of working full time as compared to not working. The equation(2) tells Stata that we want the equation that compares the full time workers (since full time is coded 2) with those not working (since those not working are the comparison group).
adjust , by(inc) exp equation(2)
-------------------------------------------------------------------------------
Dependent variable: wifework Equation: 2 Command: mlogit
-------------------------------------------------------------------------------
----------+-----------
inc | exp(xb)
----------+-----------
10 | 1.00007
11 | 1.40002
12 | 1.95991
13 | 2.74372
14 | 3.84099
15 | 5.37707
16 | 7.52747
17 | 10.5379
18 | 14.7522
19 | 20.6518
----------+-----------
Key: exp(xb) = exp(xb)
You can choose any two adjacent values and divide the odds
to see that the odds increase by a factor of 1.4. For simplicity, comparing
$11k with $10k gives 1.4 / 1.0 = 1.4. As you can see, interpreting RRRs is much
like interpreting odds ratios in a binary logistic regression.
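The divisions described above can be checked with display, which works as a hand calculator in Stata. The values are copied from the two adjust tables, so this is only arithmetic on the printed output and agrees with the RRRs up to rounding.

```stata
* Part time vs. not working: odds at inc = 13 divided by odds at inc = 12
display 1.72647 / 1.43885

* Full time vs. not working: odds at inc = 11 divided by odds at inc = 10
display 1.40002 / 1.00007

* Over several steps the RRR compounds: inc = 19 vs. inc = 10 is nine steps
display 5.15246 / .999379
display 1.199894^9
```

The first two results are about 1.2 and 1.4. The last two should agree closely with each other, because moving from $10k to $19k multiplies the odds by the RRR nine times.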
These pages are Copyrighted (c) by UCLA Academic Technology Services