Mplus FAQ
Why is Mplus excluding cases with missing values when the model does not specify listwise deletion?

Note: this page was created using Mplus 5.2; other versions of Mplus may behave differently.

Mplus can estimate a model in which some of the variables have missing values using full information maximum likelihood (FIML). Starting in version 5 this is done by default; in earlier versions this type of estimation could be requested with the type = missing; option. However, for some models, Mplus drops cases with missing values on any of the predictors. The example below uses a count model, but we have encountered this behavior with other types of models (e.g., models with categorical outcomes). We hope this page will help you recognize the behavior when it occurs and understand what Mplus is doing. A way of specifying the model so that cases with missing values on the predictors are included is presented at the bottom of the page.

Below are the descriptive statistics for a small dataset of 200 cases (the output is from another package). The variable d1 is a binary predictor, x1 and x2 are continuous predictors, and count is the outcome (true to its name, it is a count variable). We know from working with the data that there are 150 cases with complete data on all of the variables. Note that x1, x2, and count all have missing values. The Mplus version of the dataset is fiml_count.dat.

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          d1 |       200        .545    .4992205          0          1
          x1 |       180    52.71111    9.388861         33         75
          x2 |       188    52.35106    10.83193         26         71
       count |       177    1.615819    1.796265          0          7
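
The Obs column and the number of complete cases can be checked against the raw data file. The following is a minimal Python sketch, not part of the original analysis. It assumes that fiml_count.dat is whitespace-delimited, that its columns are in the order d1 x1 x2 count, and that -9999 is the missing value code. The function names and the sample rows are hypothetical and are used only for illustration.

```python
MISSING = -9999.0
NAMES = ["d1", "x1", "x2", "count"]  # assumed column order


def parse_rows(text):
    """Parse whitespace-delimited numeric rows into lists of floats."""
    return [[float(tok) for tok in line.split()]
            for line in text.splitlines() if line.strip()]


def count_observed(rows):
    """Return non-missing counts per variable and the number of complete cases."""
    observed = {name: 0 for name in NAMES}
    complete = 0
    for row in rows:
        flags = [value != MISSING for value in row]
        for name, ok in zip(NAMES, flags):
            observed[name] += ok  # True counts as 1
        complete += all(flags)
    return observed, complete


# Made-up rows for illustration only (not taken from fiml_count.dat).
sample = """
1 50 48 2
0 -9999 55 0
1 61 -9999 -9999
0 47 52 -9999
"""
observed, complete = count_observed(parse_rows(sample))
print(observed, complete)
```

If the assumptions about the file layout hold, passing the contents of fiml_count.dat to parse_rows and then to count_observed should reproduce the Obs column above and the 150 complete cases.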

Below is the Mplus input file to run a model with x1, x2, and d1 predicting count.

Data:
  File is fiml_count.dat ;
Variable:
  Names are d1 x1 x2 count;
  Missing are all (-9999) ;
  count is count;
Model:
  count on x1 x2 d1;

When we run the model we receive the following output. A warning message tells us that Mplus has excluded 31 cases because they have missing values on the x-variables (predictors). Mplus has also excluded 19 cases with missing values on "all variables except the x-variables," that is, cases missing on the outcome. Although 23 cases are missing count, only 19 are reported in the second warning, because the other 4 are also missing a predictor and were already counted in the first warning. Below the warning messages we see that the number of observations used to estimate the model was 150 (200 - 31 - 19 = 150).

*** WARNING
  Data set contains cases with missing on x-variables.
  These cases were not included in the analysis.
  Number of cases with missing on x-variables:  31
*** WARNING
  Data set contains cases with missing on all variables except
  x-variables.  These cases were not included in the analysis.
  Number of cases with missing on all variables except x-variables:  19
   2 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS
   
SUMMARY OF ANALYSIS

Number of groups                                                 1
Number of observations                                         150
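
The way the two warnings partition the cases can be sketched in Python. This is only an illustration of the rule implied by the warning messages, not Mplus's actual implementation. The order of the checks (predictors first, then outcome) is an assumption, though it is consistent with the counts reported above. The function name, column positions, and rows are hypothetical.

```python
MISSING = -9999
X_COLS = (0, 1, 2)  # assumed positions of d1, x1, x2
Y_COLS = (3,)       # assumed position of count


def classify(rows):
    """Count cases dropped for missing x, dropped for missing all y, and used."""
    dropped_x = dropped_y = used = 0
    for row in rows:
        if any(row[i] == MISSING for i in X_COLS):
            dropped_x += 1   # missing on at least one x-variable
        elif all(row[i] == MISSING for i in Y_COLS):
            dropped_y += 1   # missing on all variables except x-variables
        else:
            used += 1
    return dropped_x, dropped_y, used


# Made-up rows for illustration only (not taken from fiml_count.dat).
rows = [
    (1, 50, 48, 2),
    (0, MISSING, 55, 0),
    (1, 61, MISSING, MISSING),  # missing x2 and count: counted under x only
    (0, 47, 52, MISSING),
    (1, 58, 60, 3),
]
dropped_x, dropped_y, used = classify(rows)
print(dropped_x, dropped_y, used)
```

A case that is missing both a predictor and the outcome is counted only once, under the x-variable warning, which is why the two warning counts and the number of observations sum to the total number of cases.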

Why has Mplus excluded cases with missing values on the predictor variables, when it typically includes such cases? For some models (including count models), the predictor variables (called observed covariates by Mplus) are not included in the model in the same manner as other variables, and hence their missing values cannot be handled using maximum likelihood based techniques. In other models, for example a model similar to this one but with a continuous outcome, the predictor variables are considered part of the model, and cases with missing values on these variables are included in the analysis. Predictors that would not otherwise be considered part of the model can be brought into it explicitly (see below), which allows them to have missing values. Note that once the predictors are included in the model, the same distributional assumptions that are made about other variables in the model (e.g., normality) are also made about the predictor variables.

There are two differences between the model shown below and the one shown above. First, at the bottom of the input file, we have added [x1 x2 d1] to the model command; placing the name of a predictor variable in square brackets includes the mean of that variable in the model. Second, we have added the analysis command with the integration = montecarlo option, because this model requires the use of Monte Carlo integration. If we leave out this option, Mplus will prompt us to include it.

Data:
  file is fiml_count.dat ;
Variable:
  names are d1 x1 x2 count;
  missing are all (-9999) ;
  count is count;
Analysis:
  integration = montecarlo;
Model:
  count on x1 x2 d1;
  [x1 x2 d1];

When we run the model, we find that Mplus does not print any messages about excluded cases, and the number of observations is 200 (i.e., all cases were included in the analysis).

SUMMARY OF ANALYSIS

Number of groups                                                 1
Number of observations                                         200
