Where L(m*) denotes the likelihood of the respective model, and ll(m*) the natural log of that model's likelihood.
This statistic is distributed chi-squared with degrees of freedom equal to the difference in the number of degrees of freedom between the two models (i.e., the number of variables added to the model). In order to perform the likelihood ratio test we will need to run both models and make note of their final log likelihoods. We will run the models using Stata and use commands to store the log likelihoods. We could also just copy the log likelihoods down (i.e., by writing them down, or cutting and pasting), but using commands is a little easier and is less likely to result in errors.

The first line of syntax below reads in the dataset from our website. The second line of syntax runs a logistic regression model, predicting hiwrite based on students' gender (female) and reading scores (read). The third line of code stores the value of the log likelihood for the model, which is temporarily stored as the returned estimate e(ll) (for more information type help return in the Stata command window), in the scalar named m1.

use http://www.ats.ucla.edu/stat/stata/faq/nested_tests, clear
logit hiwrite female read
scalar m1 = e(ll)

Below is the output. In order to perform the likelihood ratio test we will need to keep track of the log likelihood (-102.45); the syntax for this example (above) does this by storing the value in a scalar. Since it is not our primary concern here, we will skip the interpretation of the rest of the logistic regression model. Note that storing the returned estimate does not produce any output.

Iteration 0: log likelihood = -137.41698
Iteration 1: log likelihood = -104.79885
Iteration 2: log likelihood = -102.52269
Iteration 3: log likelihood = -102.44531
Iteration 4: log likelihood = -102.44518
Logistic regression Number of obs = 200
LR chi2(2) = 69.94
Prob > chi2 = 0.0000
Log likelihood = -102.44518 Pseudo R2 = 0.2545
------------------------------------------------------------------------------
hiwrite | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | 1.403022 .3671964 3.82 0.000 .6833301 2.122713
read | .1411402 .0224042 6.30 0.000 .0972287 .1850517
_cons | -7.798179 1.235685 -6.31 0.000 -10.22008 -5.376281
------------------------------------------------------------------------------
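
To see what logit leaves behind besides e(ll), the returned estimates can be listed directly. The lines below are a minimal sketch, assuming the two-predictor model above is still the most recent estimation command and that the scalar m1 has already been created:

```stata
* List every result that the last estimation command stored in e()
ereturn list

* Show the log likelihood and the model degrees of freedom on their own
display e(ll)
display e(df_m)

* Confirm that the scalar m1 holds a copy of the log likelihood
scalar list m1
```

Copying e(ll) into a scalar matters because the e() results are overwritten as soon as the next model is estimated.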
The first line of syntax below runs the second model, that is, the model
with all four predictor variables. The second line of code stores the value of
the log likelihood for the model (-84.42), which is temporarily stored as the
returned estimate e(ll), in the scalar named m2. Again, we won't say
much about the output except to note that the coefficients for math and
science are both statistically significant. So we know that, individually,
they are statistically significant predictors of hiwrite.
logit hiwrite female read math science
scalar m2 = e(ll)
Iteration 0: log likelihood = -137.41698
Iteration 1: log likelihood = -90.166892
Iteration 2: log likelihood = -84.909776
Iteration 3: log likelihood = -84.42653
Iteration 4: log likelihood = -84.419844
Iteration 5: log likelihood = -84.419842
Logistic regression Number of obs = 200
LR chi2(4) = 105.99
Prob > chi2 = 0.0000
Log likelihood = -84.419842 Pseudo R2 = 0.3857
------------------------------------------------------------------------------
hiwrite | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | 1.805528 .4358101 4.14 0.000 .9513555 2.6597
read | .0529536 .0275925 1.92 0.055 -.0011268 .107034
math | .1319787 .0318836 4.14 0.000 .069488 .1944694
science | .0577623 .027586 2.09 0.036 .0036947 .1118299
_cons | -13.26097 1.893801 -7.00 0.000 -16.97275 -9.549188
------------------------------------------------------------------------------
Now that we have the log likelihoods from both models, we can perform a likelihood ratio test.
The first line of syntax below calculates the likelihood ratio test statistic. The second
line finds the p-value associated with our test statistic with two
degrees of freedom. Looking below we see that the test statistic is 36.05, and that the
associated p-value is very low (less than 0.0001). The results show that adding math and science as predictor variables together (not just
individually) results in a statistically significant improvement in model fit. Note that if we performed a likelihood ratio test for adding a single variable to the model, the results would usually be very close to, though not numerically identical to, the significance test for the coefficient for that variable presented in the
table above, because the likelihood ratio and Wald tests are only asymptotically equivalent.
di "chi2(2) = " 2*(m2-m1)
di "Prob > chi2 = "chi2tail(2, 2*(m2-m1))

chi2(2) = 36.050677
Prob > chi2 = 1.485e-08
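
The arithmetic behind these two numbers can be checked by hand from the log likelihoods reported in the two sets of output above. The second part of the sketch below relies on the fact that a chi-squared variable with exactly two degrees of freedom has upper tail probability exp(-x/2); this shortcut does not carry over to other degrees of freedom.

```latex
\begin{align*}
LR &= 2\,\bigl(ll(m_2) - ll(m_1)\bigr) \\
   &= 2\,\bigl(-84.419842 - (-102.44518)\bigr) \\
   &= 2 \times 18.025338 \approx 36.05 \\[4pt]
p  &= \Pr\bigl(\chi^2_2 > LR\bigr) = e^{-LR/2}
    \approx e^{-18.025} \approx 1.485 \times 10^{-8}
\end{align*}
```

The small difference in the last digit of the test statistic (36.050676 here versus 36.050677 in the output) arises because Stata uses the full-precision stored scalars rather than the rounded values printed in the output.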
use http://www.ats.ucla.edu/stat/stata/faq/nested_tests, clear
logit hiwrite female read
estimates store m1

Below is the output. Since it is not our primary concern here, we will skip the interpretation of the logistic regression model. Note that storing the estimates does not produce any output.

Iteration 0: log likelihood = -137.41698
Iteration 1: log likelihood = -104.79885
Iteration 2: log likelihood = -102.52269
Iteration 3: log likelihood = -102.44531
Iteration 4: log likelihood = -102.44518
Logistic regression Number of obs = 200
LR chi2(2) = 69.94
Prob > chi2 = 0.0000
Log likelihood = -102.44518 Pseudo R2 = 0.2545
------------------------------------------------------------------------------
hiwrite | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | 1.403022 .3671964 3.82 0.000 .6833301 2.122713
read | .1411402 .0224042 6.30 0.000 .0972287 .1850517
_cons | -7.798179 1.235685 -6.31 0.000 -10.22008 -5.376281
------------------------------------------------------------------------------

The first line of syntax below this paragraph runs the second model, that is, the model with all four predictor variables. The second line of syntax saves the estimates from this model and names them m2. Below the syntax is the output generated. Again, we won't say much about the output except to note that the coefficients for math and science are both statistically significant. So we know that, individually, they are statistically significant predictors of hiwrite. The tests below will allow us to test whether adding both of these variables to the model significantly improves the fit of the model, compared to a model that contains just female and read.

logit hiwrite female read math science
estimates store m2

Iteration 0: log likelihood = -137.41698
Iteration 1: log likelihood = -90.166892
Iteration 2: log likelihood = -84.909776
Iteration 3: log likelihood = -84.42653
Iteration 4: log likelihood = -84.419844
Iteration 5: log likelihood = -84.419842
Logistic regression Number of obs = 200
LR chi2(4) = 105.99
Prob > chi2 = 0.0000
Log likelihood = -84.419842 Pseudo R2 = 0.3857
------------------------------------------------------------------------------
hiwrite | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | 1.805528 .4358101 4.14 0.000 .9513555 2.6597
read | .0529536 .0275925 1.92 0.055 -.0011268 .107034
math | .1319787 .0318836 4.14 0.000 .069488 .1944694
science | .0577623 .027586 2.09 0.036 .0036947 .1118299
_cons | -13.26097 1.893801 -7.00 0.000 -16.97275 -9.549188
------------------------------------------------------------------------------

The syntax below tells Stata that we want to run an lr test, and that we want to compare the estimates we have saved as m1 to those we have saved as m2. The output reminds us that this test assumes that A is nested in B, which it is. It also gives us the chi-squared value for the test (36.05) as well as the p-value for a chi-squared of 36.05 with two degrees of freedom. Note that the degrees of freedom for the lr test, as for the other two tests, are equal to the number of parameters that are constrained (i.e., removed from the model), in our case, 2. Note that the results are the same as when we calculated the lr test by hand above. Adding math and science as predictor variables together (not just individually) results in a statistically significant improvement in model fit. As noted when we calculated the likelihood ratio test by hand, if we performed a likelihood ratio test for adding a single variable to the model, the results would usually be very close to, though not numerically identical to, the significance test for the coefficient for that variable presented in the table above.

lrtest m1 m2

Likelihood-ratio test LR chi2(2) = 36.05
(Assumption: A nested in B) Prob > chi2 = 0.0000

The entire syntax for a likelihood ratio test, all in one block, looks like this:
logit hiwrite female read
estimates store m1
logit hiwrite female read math science
estimates store m2
lrtest m1 m2
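
If the test results are needed later (for example, to report the p-value with more digits than the 0.0000 shown in the output), they can be picked up from the results that lrtest returns in r(). This is a minimal sketch, assuming m1 and m2 have been stored as above; the scalar names lr_chi2, lr_df, and lr_p are illustrative and not part of the original example.

```stata
* Run the test, then copy its returned results into scalars right away,
* before another command overwrites r()
lrtest m1 m2
scalar lr_chi2 = r(chi2)
scalar lr_df   = r(df)
scalar lr_p    = r(p)

* Display the stored values at full precision
display "LR chi2 = " lr_chi2
display "df = " lr_df
display "Prob > chi2 = " lr_p
```

Typing return list immediately after lrtest shows everything the command returned.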
quietly: logit hiwrite female read math science
test math science

( 1) math = 0
( 2) science = 0

chi2( 2) = 27.53
Prob > chi2 = 0.0000
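
The Wald test above constrains two coefficients at once. For a single coefficient, the Wald chi-squared with one degree of freedom is the square of the z statistic reported in the regression table. The sketch below illustrates this for math, assuming the four-predictor model and using Stata's _b[] and _se[] accessors for the coefficient and its standard error:

```stata
* Refit the full model without printing the output
quietly logit hiwrite female read math science

* Wald test that the coefficient on math is zero
test math
display "Wald chi2(1) from test: " r(chi2)

* The same quantity computed as the squared z statistic
display "z squared: " (_b[math]/_se[math])^2
```

The two displayed values should agree, which is why a one-variable Wald test adds nothing beyond the z test already shown in the coefficient table.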
Please note that the user-written testomit is no longer available in Stata.
In order to perform the score test, you will need to download two user-written packages for Stata. These packages are called enumopt and testomit. If your computer is online, you can type findit enumopt in the Stata command window. (For more information or help see our FAQ page How do I use findit to search for programs and additional help?) Assuming the necessary packages are installed, the syntax below shows how to run a score test. The first line of syntax runs the model with just female and read as predictor variables (recall that the score test uses a model with fewer variables and tests for omitted variables). The next line uses the command predict to generate a new variable called test that contains the score for each case. Without going into too much detail, the scores here are based on the model estimated and the values of the variables in the model for each case. The third line of syntax uses the testomit command to examine whether the variables math and/or science were incorrectly omitted from the model. The option score(test) tells Stata the name of the variable containing the scores; although it appears in the options section (i.e., after the comma), it is required.
quietly: logit hiwrite female read
predict test, score
testomit math science, score(test)

logit: score tests for omitted variables

Term | score df p
---------------------+----------------------
math | 28.94 1 0.0000
science | 15.39 1 0.0001
---------------------+----------------------
simultaneous test | 35.51 2 0.0000
---------------------+----------------------

The first part of the output gives the type of model run, followed by a table of results. The score test statistic is distributed chi-squared with degrees of freedom equal to the number of variables added to the model. The table has three columns: the first gives the value of the test statistic, the second the number of degrees of freedom for the test, and the third the p-value associated with a chi-squared of that value with that number of degrees of freedom. The variables math and science appear separately in their own rows; these first two rows contain the results for a test of whether adding either (but not both) of these variables to the model would significantly improve the fit of the model. The bottom row, labeled simultaneous test, tests whether adding both variables to the model will significantly improve the fit of the model. The results shown in the table are consistent with the Wald and lr tests we performed above. They are also consistent with the regression output above, in which the coefficients for math and science were statistically significant.
The command testomit behaves somewhat differently for different estimation
commands. Below are examples of how to use testomit with several other regression commands.
Most multiple-equation commands will use a syntax similar to the syntax for mlogit. The exceptions
are ologit and oprobit, which share a syntax, and regress; these are shown separately.
For mlogit and many other multiple-equation commands:
mlogit perform read write
predict mslo mshi, score
testomit (low: science math) (high: science math), score(mslo mshi)
For ologit and oprobit:
ologit perform read write
predict coef cut1 cut2, score
testomit (LP: math science), score(coef)
For regress:
reg read write math
predict regs, score
testomit (mean: science female), score(regs)