Statistical Computing Seminar Introduction to Multilevel Modeling Using MLwiN

This seminar is based on the examples in the paper Using SAS Proc Mixed to Fit Multilevel Models, Hierarchical Models, and Individual Growth Models by Judith Singer. The paper can be downloaded from Professor Singer's web site.

MLwiN data files: hsb12.ws, hsb12a.ws and willett.ws

Outline

• Starting MLwiN and opening data
• Model 1: Predicting MathAch from meanses
• Model 2: Predicting MathAch from cses
• Model 3: Predicting MathAch from lots of things
• Repeating the examples
• Inputting data into MLwiN
• Model 1 (again): Predicting MathAch from meanses
• Model 2 (again): Predicting MathAch from cses
• Model 3 (again): Predicting MathAch from lots of things

Starting MLwiN and opening data

To open an existing MLwiN worksheet, you can do the following.

• File
Open Worksheet
Choose hsb12.ws

Once you have the data open, you can see the variable names, number of cases (valid and missing) and min/max like this.

• Data Manipulation
Names

You can view the data in a kind of spreadsheet like this.

• Data Manipulation
View or Edit Data

Click on the View button to select more/different variables to view.

Model 1: Predicting MathAch from meanses

This model is from page 331 of the Singer article.

This model uses a school level predictor, meanses, to predict MathAch.  Further, we recognize that each school may have a different intercept, so the intercept is random at level 2 and level 1.  We can write this model using multiple equations as shown below.

Level 1: MathAchij = β0j + rij
Level 2: β0j =  γ00 + γ01(MeanSES) + u0j

Combining the two equations into one by substituting the level 2 equation into the level 1 equation, we have the equation below, with the random effects identified by placing them in square brackets.

MathAchij = γ00 + γ01(MeanSES) + [ u0j + rij ]

• The term u0j is a random effect at level 2, representing random variation in the average math achievement among (between) schools.

• The term rij is a random effect at level 1, representing random variation in the math achievement of students within schools.
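To make the roles of the two random effects concrete, here is a minimal Python sketch that generates data from the combined equation above. It is only an illustration, not something MLwiN runs: the function name, the number of students per school, and the distribution of MeanSES are hypothetical, and the default parameter values are borrowed from the estimates reported later in this section.

```python
import random

def simulate_model1(n_schools=160, n_students=45, gamma00=12.65, gamma01=5.863,
                    var_u0=2.59, var_r=39.16, seed=1):
    """Draw MathAch_ij = gamma00 + gamma01*MeanSES_j + [u0j + r_ij]."""
    rng = random.Random(seed)
    rows = []
    for school in range(1, n_schools + 1):
        meanses = rng.gauss(0.0, 0.4)         # hypothetical school-level SES
        u0 = rng.gauss(0.0, var_u0 ** 0.5)    # one draw per school (level 2)
        for student in range(1, n_students + 1):
            r = rng.gauss(0.0, var_r ** 0.5)  # one draw per student (level 1)
            mathach = gamma00 + gamma01 * meanses + u0 + r
            rows.append((school, student, meanses, u0, r, mathach))
    return rows

# Small example: 3 schools with 4 students each.
rows = simulate_model1(n_schools=3, n_students=4)
for row in rows[:4]:
    print(row)
```

Every student in a school shares the same u0j, while rij differs from student to student; that shared term is what makes observations within a school dependent.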

We can create and run this model in MLwiN as shown below.

• Model
Equations
• Click y
Select MathAch as Y variable
Select 2 levels
Level 2 is school

Level 1 is student

Click done
• Click β0
Select cons
Click j(school) to indicate it is random at level 2

Click i(student) to indicate it is random at level 1
Click Done
• Click Add Term and click β1
Select meanses
Click Done
• Show names by clicking Name
• Show subscripts by clicking Subs
• Expand the model by clicking + +

The Equations window should then look like the one below.

• Run the model by clicking Start
• Show estimates by clicking Estimates a couple of times.

The results of the model should look like the output below.

• Fixed Effects
1. The coefficient for meanses is 5.863 with a standard error of 0.359.  You can divide 5.863 / 0.359 to get a Z test (or Wald test) for this coefficient; since the result (about 16.3) is greater than 2, the coefficient is significant.  This indicates that when the average SES for a school increases by 1 point, the student's math achievement is predicted to increase by 5.863 points.
2. The coefficient for the constant is the predicted math achievement when all predictors are 0, so when the average school SES is 0, the student's math achievement is predicted to be 12.65.  A range of plausible values for school means, given that all schools have meanses of zero, is 12.65 ± 1.96 * (2.59)^(1/2) = (9.50, 15.80).
• Random Effects
1. Do school achievement means still vary significantly once meanses is controlled? The variance of the intercept is 2.59 with a standard error of 0.392, suggesting that after we  control for meanses, significant variation among school mean math achievement still remains to be explained.
2. We can also calculate the conditional intraclass correlation conditional on the values of meanses. 2.59/(2.59 + 39.16) = 0.06 measures the degree of dependence among observations within schools that are of the same meanses.
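The arithmetic behind these interpretations can be checked with a few lines of Python. This is a sketch that simply plugs in the estimates quoted above; the variable names are our own and are not MLwiN output.

```python
import math

# Estimates quoted in the text for Model 1.
coef_meanses, se_meanses = 5.863, 0.359
intercept = 12.65
var_u0, se_var_u0 = 2.59, 0.392   # level 2 (between-school) intercept variance
var_r = 39.16                      # level 1 (within-school) residual variance

# Wald z for the meanses coefficient: estimate divided by its standard error.
z_meanses = coef_meanses / se_meanses

# 95% plausible value range for school means when meanses is 0.
half_width = 1.96 * math.sqrt(var_u0)
plausible = (intercept - half_width, intercept + half_width)

# Rough z for the intercept variance, and the conditional intraclass correlation.
z_var_u0 = var_u0 / se_var_u0
icc = var_u0 / (var_u0 + var_r)

print(f"z for meanses:   {z_meanses:.2f}")
print(f"plausible range: ({plausible[0]:.2f}, {plausible[1]:.2f})")
print(f"z for variance:  {z_var_u0:.2f}")
print(f"conditional ICC: {icc:.3f}")
```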

Model 2: Predicting MathAch from cses

This model is from page 335 of the Singer article.

This model is referred to as a random-coefficient model by Raudenbush and Bryk. Pretend that we will run a regression of MathAch on centered ses for each school; that is, we are going to run 160 regressions. We can then ask the following questions.

1. What would be the average of the 160 regression equations (both intercept and slope)?
2. How much do the regression equations vary from school to school?
3. What is the correlation between the intercepts and slopes?

Here is the model expressed using multiple equations.

Level 1: MathAchij = β0j + β1j (cses) + rij
Level 2: β0j =  γ00  + u0j
Level 2: β1j =  γ10  + u1j

Combining the equations into one by substituting the level 2 equations into the level 1 equation, we have the following, with the random effects identified by placing them in square brackets.

MathAchij = γ00  + γ10(cses) +  [ u1j(cses) + u0j + rij ]

Instead of using meanses we would like to use centered SES or cses to predict math achievement.  Change meanses to cses and make it random at level 2. The Equation window should look like this.

• Run the model by clicking Start
• Show estimates by clicking Estimates a couple of times.

The results of the model should look like the output below.

• Fixed Effects
1. When cses is held at 0, the predicted MathAch is 12.65. The 95% plausible value range for the school means is 12.65 ± 1.96 * (8.62)^(1/2) = (6.89, 18.40).
2. The average cses-MathAch slope is 2.19. The 95% plausible value range for the SES-achievement slope is 2.19 ± 1.96 * (.68)^(1/2) = (.57, 3.81).
• Random Effects
1. Notice that the residual variance is now 36.70, compared with the residual variance of 39.15 in the prior model without cses. We can compute the proportion of variance explained at level 1 by (39.15 - 36.70) / 39.15 = .063. This means using student-level SES as a predictor of math achievement reduced the within-school variance by 6.3%.
2. The estimated variance of the cses slopes is 0.678 with standard error 0.284. Since 0.678 is more than 2 times the standard error of 0.284, there remain significant differences in slopes among schools.
3. The covariance estimate is 0.050 with standard error 0.393. This gives no evidence that the effect of cses depends upon the average math achievement in the school.
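The quantities discussed for Model 2 can be reproduced in Python from the quoted estimates. This is a sketch with our own variable names; the last line converts the intercept-slope covariance into a correlation (covariance divided by the product of the two standard deviations), which speaks to the third question posed at the start of this model.

```python
import math

# Estimates quoted in the text for Model 2.
gamma00, gamma10 = 12.65, 2.19
var_u0, var_u1, cov_u01 = 8.62, 0.678, 0.050
se_var_u1, se_cov = 0.284, 0.393
var_r_prior, var_r_now = 39.15, 36.70

def plausible_range(mean, variance, z=1.96):
    """Return mean +/- z * sqrt(variance) as a (low, high) tuple."""
    half = z * math.sqrt(variance)
    return (mean - half, mean + half)

intercept_range = plausible_range(gamma00, var_u0)
# The text rounds the slope variance to .68 before taking the square root.
slope_range = plausible_range(gamma10, round(var_u1, 2))

# Proportion of level 1 variance explained by adding cses.
prop_explained = (var_r_prior - var_r_now) / var_r_prior

# Rough z statistics for the slope variance and the covariance.
z_var_u1 = var_u1 / se_var_u1
z_cov = cov_u01 / se_cov

# Correlation between school intercepts and school slopes.
corr_u01 = cov_u01 / math.sqrt(var_u0 * var_u1)

print(intercept_range, slope_range)
print(prop_explained, z_var_u1, z_cov, corr_u01)
```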

Model 3: Predicting MathAch from meanses, sector, cses, meansesBYcses and sectorBYcses

This model is shown in the Singer article on pages 337 and 338.

This model is referred to as an intercepts and slopes-as-outcomes model by Raudenbush and Bryk. The questions that we are interested in are:

1. Do meanses and sector significantly predict the intercept?
2. Do meanses and sector significantly predict the within-school slopes?
3. How much variation in the intercepts and the slopes is explained by meanses and sector?

We can express this as a multiple equation model like this.

Level 1: MathAchij = β0j + β1j (cses) + rij
Level 2: β0j =  γ00  + γ01(MeanSES) + γ02(Sector) + u0j
Level 2: β1j =  γ10  + γ11(MeanSES) + γ12(Sector) + u1j

Combining the two equations into one by substituting the level 2 equations into the level 1 equation, we have the following equation.  The random effects are identified by placing them in square brackets.

MathAchij = γ00  + γ01(MeanSES) + γ02(Sector) + γ10 (cses) + γ11(MeanSES*cses) +  γ12(Sector*cses) + [  u0j + u1j(cses) +  rij ]

We can do this in MLwiN like this.

• Click Add Term and click β2
Select meanses
Click Done
• Click Add Term and click β3
Select sector
Click Done
• Click Add Term and click β4
Select meansesBYcses
Click Done
• Click Add Term and click β5
Select sectorBYcses
Click Done

Your model should look like this (but maybe out of order).

• Run the model by clicking Start
• Show estimates by clicking Estimates a couple of times.

The results of the model should look like the output below.

Let us focus on the interactions with cses. The sectorBYcses interaction is significant.  When sector is 0 (public), the slope for cses is 2.939 and when sector is 1 (Catholic) the slope for cses is 2.939 + -1.644 = 1.295. We can view this interaction by generating the predicted values and graphing them.

• Model
• Predictions

• Graph
• Customized Graph
• Click Other and then Group Code
• Fill in values as below

Click Apply

The slope of cses for public schools (sector = 0) is 2.939 because this is the coefficient for cses.  Catholic schools (sector = 1) have a slope of 2.939 + -1.644 = 1.295 (the coefficient for cses plus the coefficient for sectorBYcses), as illustrated in the graph above.

The meansesBYcses interaction is tricky to interpret.  The interaction term is 1.042, which means that for every one unit increase in the average SES for a school, the slope between cses and MathAch is expected to increase by 1.042.  In short, as the SES for a school increases, we predict the slope of cses would increase.  We can visually see this as shown below.

Viewing predicted values for meansesBYcses interaction

• Model
• Predictions

• Graph
• Customized Graph
• Fill in values as below

Click Apply.

The slope of cses when meanses is 0 is 2.939 (for a public school, sector = 0).  As meanses increases, so does the slope of cses, because the meansesBYcses interaction term is positive (1.042).  In fact, when meanses increases by 1 point, the slope for cses increases by 1.042.  So if a public school has a meanses of 1, then the predicted slope for cses would be 2.939 + 1.042 = 3.981.
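Both interactions with cses can be summarized by one small function. The Python sketch below uses the coefficients quoted in this section (2.939 for cses, 1.042 for meansesBYcses, -1.644 for sectorBYcses); the function name is our own and the code is only an illustration of the arithmetic, not MLwiN output.

```python
# Coefficients quoted in the text for Model 3.
COEF_CSES = 2.939
COEF_MEANSES_BY_CSES = 1.042
COEF_SECTOR_BY_CSES = -1.644

def cses_slope(meanses, sector):
    """Predicted within-school slope of MathAch on cses.

    meanses: school average SES; sector: 0 = public, 1 = Catholic.
    """
    return (COEF_CSES
            + COEF_MEANSES_BY_CSES * meanses
            + COEF_SECTOR_BY_CSES * sector)

# Slopes for a few combinations of meanses and sector.
for sector in (0, 1):
    for meanses in (-1, 0, 1):
        print(sector, meanses, round(cses_slope(meanses, sector), 3))
```

Holding meanses at 0, the function returns the public and Catholic slopes discussed above; holding sector fixed, each one point increase in meanses adds 1.042 to the slope.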

Repeating the Examples

In the examples above, we have simplified matters in a number of ways to help focus on the process of creating models in the MLwiN equation editor and getting the results.  However, we have concealed a number of tasks that needed to be performed to be able to run the models; namely, how to input the data into MLwiN and how to create some of the variables used in the models.  We will repeat the steps of running examples 1, 2 and 3 again but this time show how to input the data into MLwiN and how to create the variables.

Inputting Data

We believe that the easiest way to get your data into MLwiN is to convert it from a Stata data file to an MLwiN data file.  There are other methods for getting your data into MLwiN, but we have found them to be much more difficult than this method.  We have the data file as a Stata data file called hsb12a.dta .  Notice this is called hsb12a -- this file differs from the prior file not only because it is a Stata file, but also because it does not have several of the variables already computed for us, namely meanses, cses, meansesBYcses and sectorBYcses.

We use a Stata command called stata2mlwin to help convert the Stata data file to an MLwiN data file.  You can get stata2mlwin from within Stata like this.

net from http://www.ats.ucla.edu/stat/stata/ado/analysis
net install stata2mlwin

Assuming you have downloaded hsb12a.dta as c:\alda\hsb12a.dta you can convert the file like this.

stata2mlwin using hsb12a

Then, you can complete the conversion by starting MLwiN and doing the following steps.

• Data Manipulation
Command Interface
at the command prompt type obey c:\alda\hsb12a.obe
• To save the file choose File then Save as Worksheet and then C:\alda\hsb12a

Model 1 (again): Predicting MathAch from meanses

This model is from page 331 of the Singer article.

This process is much like before; however, we need to perform a couple of extra steps that were done for us before:
1) We must sort the data on school and then student (our level 2 and level 1 variables)
2) We need to create meanses
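For readers who want to see what these two steps compute, here is a minimal Python sketch on a handful of made-up records (the data values are hypothetical). It sorts on school then student, and then averages ses within each block of rows that share a school. Note that groupby only groups consecutive rows, which is why the sort comes first in this sketch; this illustrates the idea and is not a description of MLwiN's internals.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical records: (school, student, ses)
records = [
    (2, 1, 0.50), (1, 2, -0.30), (1, 1, 0.10), (2, 2, 0.10), (1, 3, 0.80),
]

# Step 1: sort on two keys, school (level 2) then student (level 1).
records.sort(key=itemgetter(0, 1))

# Step 2: average ses within each block of consecutive rows sharing a school,
# repeating the school mean once per student so it lines up with the records.
meanses = []
for school, block in groupby(records, key=itemgetter(0)):
    block = list(block)
    school_mean = sum(ses for _, _, ses in block) / len(block)
    meanses.extend([school_mean] * len(block))

for record, m in zip(records, meanses):
    print(record, round(m, 3))
```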

• Model
Equations
• Click y
Select MathAch as Y variable
Select 2 levels
Level 2 is school

Level 1 is student

Click done
• Sort the data by school  and student
Choose Data Manipulation then Sort

Sort on two keys
school
student
Choose all input columns
Under Output Columns click Same as input
Click Execute
Close window
• Click β0
Select cons
Click j(school) to indicate it is random at level 2

Click i(student) to indicate it is random at level 1
Click Done
• Make meanses by
Choose Data Manipulation
Multivariate Data Manipulations
Operation - Average
On blocks defined by - School
Input Columns - ses
Output Column - Choose c13 and press Ctrl-N to rename it to meanses

Click Execute

Close window
• Click Add Term and click β1
Select meanses
Click Done
• Show names by clicking Name
• Show subscripts by clicking Subs
• Expand the model by clicking the + button, and then the + button again.

The Equations window should then look like the one below.

• Run the model by clicking Start
• Show estimates by clicking Estimates a couple of times.

The results of the model should look like the output below.

Model 2 (again): Predicting MathAch from cses

This model is from page 335 of the Singer article.

We need to add an extra step that we did not perform before -- we need to create cses (the group mean centered version of ses).  We do this by taking ses minus meanses, the average value of ses within the school.
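Group mean centering can be sketched in a few lines of Python. The data values below are hypothetical; the point is only that cses is ses minus the school's own mean, so cses averages to zero within every school.

```python
# Hypothetical data: one entry per student.
school = [1, 1, 1, 2, 2]
ses = [0.10, -0.30, 0.80, 0.50, 0.10]

# Accumulate the total and count of ses for each school.
totals, counts = {}, {}
for s, x in zip(school, ses):
    totals[s] = totals.get(s, 0.0) + x
    counts[s] = counts.get(s, 0) + 1

# meanses: the school mean repeated for each student in that school.
meanses = [totals[s] / counts[s] for s in school]

# cses: each student's ses minus the mean of his or her school.
cses = [x - m for x, m in zip(ses, meanses)]

for s, x, m, c in zip(school, ses, meanses, cses):
    print(s, x, round(m, 3), round(c, 3))
```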

• Make cses by choosing Data Manipulation then Calculate

Click on c14 and press Ctrl-N to rename it to cses
Double click on cses

Click = button then ses then - button then meanses then Calculate button.
Close window
• Click β1
Change meanses to cses
Check the school box to make it random at level 2.

The Equations window should then look like the one below.

• Run the model by clicking Start
• Show estimates by clicking Estimates a couple of times.

The results of the model should look like the output below.

Model 3 (again): Predicting MathAch from meanses, sector, cses, meansesBYcses and sectorBYcses

This model is shown in the Singer article on pages 337 and 338.

To run this model we need to create meansesBYcses by multiplying meanses by cses.  We also need to create the interaction of cses by sector (a continuous by categorical interaction).
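Both interaction variables are simple elementwise products. The Python sketch below uses hypothetical data values and assumes sector is coded 0 for public and 1 for Catholic, as elsewhere in this seminar.

```python
# Hypothetical data: one entry per student.
meanses = [0.2, 0.2, 0.2, 0.3, 0.3]
cses = [-0.1, -0.5, 0.6, 0.2, -0.2]
sector = [0, 0, 0, 1, 1]   # 0 = public, 1 = Catholic

# Continuous by continuous interaction: product of the two variables.
meansesBYcses = [m * c for m, c in zip(meanses, cses)]

# Continuous by categorical (0/1) interaction: cses for Catholic schools,
# zero for public schools.
sectorBYcses = [s * c for s, c in zip(sector, cses)]

print(meansesBYcses)
print(sectorBYcses)
```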

• Click Add Term and click β2
Select meanses
Click Done
• Make meansesBYcses by choosing Data Manipulation then Calculate

Click on c15 and press Ctrl-N to rename it to meansesBYcses
Double click on meansesBYcses

Click = button then double click meanses then * button then double click cses then Calculate button.

Close window
• Click Add Term and click β3
Select meansesBYcses
Click Done
• Create and add cses by sector interaction by clicking Model then Main Effects and Interactions
Under categorical pick sector

Select sector

Enter 0 for public and 1 for Catholic then Apply

Close the window
Under categorical, select sector (again)
Under continuous select cses
Click Main Effects and fill in the boxes to get the main effects and interactions like this.

Click Build
Close the window

Your model should look like this (but maybe out of order).

• Run the model by clicking Start
• Show estimates by clicking Estimates a couple of times.

The results of the model should look like the output below.