
**Input Data and Creating the "MDM" file**

- from a level-1 and a level-2 SPSS file
- from a single SPSS file
- from a level-1 and a level-2 SAS data file

**Exploratory Data Analysis**

- summary statistics
- data-based graphs

**Model Building**

- unconditional means model
- regression with means-as-outcomes
- random-coefficient model
- intercepts and slopes-as-outcomes model

**Hypothesis Testing, Model Fit**

- Multivariate hypothesis tests on fixed effects
- Multivariate tests of variance-covariance components specification
- Model-based graphs

**Other Issues**

- Modeling Heterogeneity of Level-1 Variances
- Models Without a Level-1 Intercept
- Constraints on Fixed Effects

The data file used for this presentation is a subsample from the 1982 High School and Beyond Survey and
is used extensively in *Hierarchical Linear Models* by Raudenbush and Bryk.
It consists of 7185 students nested in 160 schools. Here is a list of 15 or so
rows from the data file.

Let's list all the variables used in this presentation.

- **id**: school id, the linking variable that defines the 2-level structure
- **mathach**: student-level math achievement score, the continuous outcome variable
- student-level: **female** and **ses**, the socioeconomic status at the student level
- school-level: **schtype**, school type (0 = public and 1 = private), and **meanses** (**ses** aggregated to the school level)

HLM 6 uses an "MDM" file (Multivariate Data Matrix) for hierarchical linear models. An MDM file is a binary file and is constructed based on an MDM template file. A template file is an ASCII file containing information on the location and the structure of the data files. Once the MDM file is created, HLM does not need the original data files anymore for the subsequent analyses. This enables HLM to perform very efficient calculations for the models.

It is worth mentioning that HLM does not have any data management capability. That is, most of the variables in a model have to be created outside HLM, in another statistical package such as SPSS. For example, if you have a categorical variable at level-1 and you want to include it, and possibly some interaction terms with other level-1 variables, in the model, then you have to create all the dummy variables and all the interaction terms before entering your data into HLM. In short, HLM assumes that you have cleaned your data files, have done all the exploratory statistical analysis, and are ready to do your multilevel analysis.
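As a rough illustration of the preparation described above, the sketch below builds dummy variables and interaction terms in Python before the data would be saved for HLM. The variable `track` and the records are hypothetical and are not part of the HS&B file; in practice this step would be done in SPSS or a similar package.

```python
# Hypothetical student records; "track" is an illustrative categorical
# variable that is NOT part of the HS&B data used in this seminar.
students = [
    {"id": 1224, "ses": -1.528, "track": "academic"},
    {"id": 1224, "ses": -0.588, "track": "general"},
    {"id": 1288, "ses": 0.342, "track": "vocational"},
]


def add_dummies(rows, var, reference):
    """Add one 0/1 column per non-reference category of `var`."""
    levels = sorted({row[var] for row in rows} - {reference})
    for row in rows:
        for level in levels:
            row[f"{var}_{level}"] = 1 if row[var] == level else 0
    return levels


def add_interactions(rows, dummy_names, other):
    """Add the product of each dummy with the numeric variable `other`."""
    for row in rows:
        for name in dummy_names:
            row[f"{name}_x_{other}"] = row[name] * row[other]


levels = add_dummies(students, "track", reference="academic")
add_interactions(students, [f"track_{lv}" for lv in levels], "ses")

for row in students:
    print(row)
```

The original string column would still have to be dropped before creating the MDM file, since every variable other than the ID must be numeric.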

1. Creating MDM from level-1 and level-2 data files in SPSS format

The HLM website has many examples, including some detailed ones with screen shots, on how to create an MDM file from SPSS input files.

- Two data sets are usually required for a two-level model: a level-1 data file and a level-2 data file. The two files are linked by a common level-2 id variable.
- Level-1 cases must be grouped together by their level-2 id. A usual strategy is to sort both the level-1 data file and the level-2 data file by the level-2 id variable and save them before entering them into HLM.
- The ID variable can be either numeric or character.
- All other variables in the data file must be numeric.
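The requirements above can be sketched as follows. This is only an illustration with made-up rows (not actual HS&B values): it sorts the level-1 records by the level-2 id and derives a level-2 table with one row per school, aggregating **ses** to the school level the way **meanses** is defined.

```python
from itertools import groupby
from statistics import mean

# Made-up rows for illustration only, not actual HS&B values.
level1 = [
    {"id": 1288, "ses": 0.2, "mathach": 10.0},
    {"id": 1224, "ses": -1.0, "mathach": 6.0},
    {"id": 1288, "ses": 0.6, "mathach": 14.0},
    {"id": 1224, "ses": -0.5, "mathach": 9.0},
]

# Level-1 cases must be grouped together by their level-2 id, so sort first.
level1.sort(key=lambda row: row["id"])

# One row per school: aggregate ses to the school level (meanses).
level2 = [
    {"id": school, "meanses": mean(row["ses"] for row in rows)}
    for school, rows in groupby(level1, key=lambda row: row["id"])
]

print(level2)
```

Both tables would then be saved, sorted by `id`, before being read into HLM.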

2. Creating MDM from a single SPSS data file

One improvement in HLM 6.x is that it allows the use of a single data file containing both the level-1 and level-2 variables. The single data set should be sorted by the level-2 id variable, and the steps are basically the same as for separate level-1 and level-2 data files, except that the same data file is used twice, once for level-1 and once for level-2. HLM will figure out that it has to aggregate the single data file to get the level-2 variables. If the single file is huge, it might be more efficient to use the two-file approach.

For level-1, we choose these variables:

For level-2, we choose these variables:

The last steps consist of a couple of clicks: Make MDM => Check Stats => Done.

3. Creating MDM from level-1 and level-2 data files in SAS format

Let's say that we have the HS&B file in SAS sas7bdat format,
hsb1.sas7bdat and hsb2.sas7bdat. We can
follow a similar routine to import the data files. HLM uses DBMSCOPY to import data
files of different formats. For example, to import files in .sas7bdat format, the
first thing to do is to set the type of data to **other non-ASCII** data via
the **File** then **Preferences** pull-down menu.

Following steps similar to those described in the example of importing SPSS files, and choosing the right data file type when we "Browse" for the files, we will get to the following window:

The rest of the routine is fairly straightforward; we will demonstrate it during the seminar and skip the minute details here.

4. What files have been created?

Let's now go back to the approach of using a single SPSS input file and find out what files have been created and how to use them in the future. Here is the list of files that are created during the process of creating the MDM file:

The MDM file test.mdm can be opened directly in HLM for analyses. What deserves
attention is the template file. The template file **test.mdmt** is an
ASCII file, and here is what it contains:

    #HLM2 MDM CREATION TEMPLATE
    growthmodel:n
    rawdattype:spss
    l1fname:C:\Data\for_hlm.sav
    l2fname:C:\Data\for_hlm.sav
    l1missing:n
    timeofdeletion:now
    mdmname:test.mdm
    *begin l1vars
    level2id:ID
    MINORITY
    FEMALE
    SES
    MATHACH
    *end l1vars
    *begin l2vars
    level2id:ID
    SECTOR
    MEANSES
    *end l2vars

If we just want to add a few new variables from the original data file, we can open this template file from within HLM or edit the template file directly.

The .STS file contains the descriptive statistics and is useful in checking if the data file used in creating the MDM file is what we think it is.

LEVEL-1 DESCRIPTIVE STATISTICS

| VARIABLE NAME | N | MEAN | SD | MINIMUM | MAXIMUM |
|---|---|---|---|---|---|
| MINORITY | 7185 | 0.27 | 0.45 | 0.00 | 1.00 |
| FEMALE | 7185 | 0.53 | 0.50 | 0.00 | 1.00 |
| SES | 7185 | 0.00 | 0.78 | -3.76 | 2.69 |
| MATHACH | 7185 | 12.75 | 6.88 | -2.83 | 24.99 |

LEVEL-2 DESCRIPTIVE STATISTICS

| VARIABLE NAME | N | MEAN | SD | MINIMUM | MAXIMUM |
|---|---|---|---|---|---|
| SECTOR | 160 | 0.44 | 0.50 | 0.00 | 1.00 |
| MEANSES | 160 | -0.00 | 0.41 | -1.19 | 0.83 |

HLM offers some really nice data-based graphs. It is always a good idea to plot our data before constructing our models.

**1. Box-whisker plot**

**2. Scatter plot**

Model 1: Unconditional Means Model

This model is referred to as a one-way random-effects ANOVA and is the
simplest possible random-effects linear model. The motivation for this model is
the question of how much schools vary in their mean mathematics
achievement. In terms of equations, we have the following, where $r_{ij} \sim N(0, \sigma^{2})$ and $u_{0j} \sim N(0, \tau^{2})$:

$$\text{MATHACH}_{ij} = \beta_{0j} + r_{ij}$$

$$\beta_{0j} = \gamma_{00} + u_{0j}$$

- The data source for this run = C:\Data\test.mdm
- The command file for this run = whlmtemp.hlm
- Output file name = C:\Data\hlm2.txt
- The maximum number of level-1 units = 7185
- The maximum number of level-2 units = 160
- The maximum number of iterations = 100
- Method of estimation: restricted maximum likelihood

Weighting Specification

| | Weighting? | Weight Variable Name | Normalized? |
|---|---|---|---|
| Level 1 | no | | |
| Level 2 | no | | |
| Precision | no | | |

The outcome variable is MATHACH

The model specified for the fixed effects was:

| Level-1 Coefficients | Level-2 Predictors |
|---|---|
| INTRCPT1, B0 | INTRCPT2, G00 |

The model specified for the covariance components was:

- Sigma squared (constant across level-2 units)
- Tau dimensions: INTRCPT1

Summary of the model specified (in equation format):

Level-1 Model

Y = B0 + R

Level-2 Model

B0 = G00 + U0

Iterations stopped due to small change in likelihood function

ITERATION 4

Sigma_squared = 39.14831

Tau INTRCPT1,B0 8.61431

Tau (as correlations) INTRCPT1,B0 1.000

| Random level-1 coefficient | Reliability estimate |
|---|---|
| INTRCPT1, B0 | 0.901 |

The value of the likelihood function at iteration 4 = -2.355840E+004

The outcome variable is MATHACH

Final estimation of fixed effects:

| Fixed Effect | Coefficient | Standard Error | T-ratio | Approx. d.f. | P-value |
|---|---|---|---|---|---|
| For INTRCPT1, B0 | | | | | |
| INTRCPT2, G00 | 12.636972 | 0.244412 | 51.704 | 159 | 0.000 |

The outcome variable is MATHACH

Final estimation of fixed effects (with robust standard errors):

| Fixed Effect | Coefficient | Standard Error | T-ratio | Approx. d.f. | P-value |
|---|---|---|---|---|---|
| For INTRCPT1, B0 | | | | | |
| INTRCPT2, G00 | 12.636972 | 0.243628 | 51.870 | 159 | 0.000 |

Final estimation of variance components:

| Random Effect | Standard Deviation | Variance Component | df | Chi-square | P-value |
|---|---|---|---|---|---|
| INTRCPT1, U0 | 2.93501 | 8.61431 | 159 | 1660.23259 | 0.000 |
| level-1, R | 6.25686 | 39.14831 | | | |

Statistics for current covariance components model:

- Deviance = 47116.793477
- Number of estimated parameters = 2

**Notes:**

- The model we fit was

$$\text{MATHACH}_{ij} = \beta_{0j} + r_{ij}$$

$$\beta_{0j} = \gamma_{00} + u_{0j}$$

Filling in the parameter estimates we get

$$\text{MATHACH}_{ij} = \beta_{0j} + r_{ij}$$

$$\beta_{0j} = 12.64 + u_{0j}$$

$$V(r_{ij}) = 39.15$$

$$V(u_{0j}) = 8.61$$

- If we describe our model in terms of a single equation, we have to substitute the level-2 equation into the level-1 equation. Here is how it looks as a single equation, as shown in the HLM "**mixed**" window: $\text{MATHACH}_{ij} = \gamma_{00} + u_{0j} + r_{ij}$.
- The estimated between-school variance, $\tau^{2}$, corresponds to the term **INTRCPT1** in the output of Final estimation of variance components, and the estimated within-school variance, $\sigma^{2}$, corresponds to the term **level-1** in the same output section.
- Based on the variance estimates, we can compute the intra-class correlation: 8.61431/(8.61431 + 39.14831) = .18. This tells us the portion of the total variance that occurs between schools.
- To measure the magnitude of the variation among schools in their mean achievement levels, we can calculate the *plausible values range* for these means, based on the between-school variance we obtained from the model: $12.64 \pm 1.96 \times (8.61)^{1/2} = (6.89, 18.39)$.
- The **reliability** of the random effect of the level-1 intercept is the average reliability across the level-2 units. It measures the overall reliability of the OLS estimates of the school intercepts.
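The two hand calculations in the notes above can be reproduced with a few lines of Python. This is a minimal sketch using the rounded estimates quoted in the notes; the function names are our own and are not part of HLM.

```python
import math

# Rounded estimates as quoted in the notes above (Model 1 output).
gamma00 = 12.64   # grand mean of math achievement (G00)
tau = 8.61        # between-school variance (INTRCPT1, U0)
sigma2 = 39.15    # within-school variance (level-1, R)


def intraclass_correlation(between, within):
    """Share of the total variance that lies between level-2 units."""
    return between / (between + within)


def plausible_range(center, variance, z=1.96):
    """Interval expected to hold about 95% of the level-2 values."""
    half_width = z * math.sqrt(variance)
    return center - half_width, center + half_width


icc = intraclass_correlation(tau, sigma2)
low, high = plausible_range(gamma00, tau)

print(f"ICC = {icc:.2f}")
print(f"95% plausible range for school means = ({low:.2f}, {high:.2f})")
```

The plausible range assumes the school means are normally distributed around the grand mean, which is the assumption the model itself makes for $u_{0j}$.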

Model 2: Including Effects of School-Level (Level-2) Predictors -- predicting mathach from meanses

This model is referred to as regression with means-as-outcomes by Raudenbush and
Bryk. The motivation for this model is the question of whether schools with high
**MEANSES** also have high math achievement. In other words, we want to
understand why schools differ in mathematics achievement. In terms
of regression equations, we have the following.

$$\text{MATHACH}_{ij} = \beta_{0j} + r_{ij}$$

$$\beta_{0j} = \gamma_{00} + \gamma_{01}(\text{MEANSES}) + u_{0j}$$

Final estimation of fixed effects:

| Fixed Effect | Coefficient | Standard Error | T-ratio | Approx. d.f. | P-value |
|---|---|---|---|---|---|
| For INTRCPT1, B0 | | | | | |
| INTRCPT2, G00 | 12.649436 | 0.149280 | 84.736 | 158 | 0.000 |
| MEANSES, G01 | 5.863538 | 0.361457 | 16.222 | 158 | 0.000 |

The outcome variable is MATHACH

Final estimation of fixed effects (with robust standard errors):

| Fixed Effect | Coefficient | Standard Error | T-ratio | Approx. d.f. | P-value |
|---|---|---|---|---|---|
| For INTRCPT1, B0 | | | | | |
| INTRCPT2, G00 | 12.649436 | 0.148377 | 85.252 | 158 | 0.000 |
| MEANSES, G01 | 5.863538 | 0.320211 | 18.311 | 158 | 0.000 |

Final estimation of variance components:

| Random Effect | Standard Deviation | Variance Component | df | Chi-square | P-value |
|---|---|---|---|---|---|
| INTRCPT1, U0 | 1.62441 | 2.63870 | 158 | 633.51744 | 0.000 |
| level-1, R | 6.25756 | 39.15708 | | | |

Statistics for current covariance components model:

- Deviance = 46959.446959
- Number of estimated parameters = 2

**Notes:**

- The model we fit was

$$\text{MATHACH}_{ij} = \beta_{0j} + r_{ij}$$

$$\beta_{0j} = \gamma_{00} + \gamma_{01}(\text{MEANSES}) + u_{0j}$$

Filling in the parameter estimates we get

$$\text{MATHACH}_{ij} = \beta_{0j} + r_{ij}$$

$$\beta_{0j} = 12.65 + 5.86(\text{MEANSES}) + u_{0j}$$

$$V(r_{ij}) = 39.16$$

$$V(u_{0j}) = 2.64$$

- In a single equation our model will be written as: $\text{MATHACH}_{ij} = \gamma_{00} + \gamma_{01}(\text{MEANSES}) + u_{0j} + r_{ij}$.
- The coefficient for the constant is the predicted math achievement when all predictors are 0, so when the school has a mean SES of 0, the students' math achievement is predicted to be 12.65.
- A range of plausible values for school means, for schools with **meanses** of zero, is $12.65 \pm 1.96 \times (2.64)^{1/2} = (9.47, 15.83)$.
- The variance component representing variation between schools decreases greatly (from 8.61 to 2.64). This means that the level-2 variable **meanses** explains a large portion of the school-to-school variation in mean math achievement. More precisely, the proportion of variance explained by **meanses** is (8.61 - 2.64)/8.61 = .69; that is, about 69% of the explainable variation in school mean math achievement scores can be explained by **meanses**.
- Do school achievement means still vary significantly once **meanses** is controlled? The output of Final estimation of variance components tests whether the variance component for INTRCPT1 is zero, with a chi-square of 633.52 on 158 degrees of freedom. This is highly significant. Therefore, we conclude that after controlling for **meanses**, significant variation among school mean math achievement still remains to be explained.
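The proportion of between-school variance explained, and the predicted school mean at a given **meanses**, can be checked with the short sketch below. It uses the rounded estimates from the notes; the helper names and the example value of 0.5 for **meanses** are our own choices for illustration.

```python
def proportion_variance_explained(var_baseline, var_conditional):
    """Proportional reduction in a variance component between two models."""
    return (var_baseline - var_conditional) / var_baseline


def predicted_school_mean(meanses, gamma00=12.65, gamma01=5.86):
    """Fixed part of the level-2 equation: gamma00 + gamma01 * MEANSES."""
    return gamma00 + gamma01 * meanses


tau_model1 = 8.61  # between-school variance, unconditional means model
tau_model2 = 2.64  # between-school variance after adding meanses

r2_level2 = proportion_variance_explained(tau_model1, tau_model2)

print(f"Proportion of between-school variance explained: {r2_level2:.2f}")
print(f"Predicted mean at meanses = 0.5: {predicted_school_mean(0.5):.2f}")
```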

Model 3: Including Effects of Student-Level Predictors -- predicting mathach from student-level ses

This model is referred to as a random-coefficient model by Raudenbush and Bryk.
Suppose that we run a regression of **mathach** on **ses** within
each school; that is, we run 160 regressions.

- What would be the average of the 160 regression equations (both intercept and slope)?
- How much do the regression equations vary from school to school?
- What is the correlation between the intercepts and slopes?

These are some of the questions that motivate the following model.

$$\text{MATHACH}_{ij} = \beta_{0j} + \beta_{1j}(\text{SES}) + r_{ij}$$

$$\beta_{0j} = \gamma_{00} + u_{0j}$$

$$\beta_{1j} = \gamma_{10} + u_{1j}$$

Sigma_squared = 36.82835

Tau

| | INTRCPT1,B0 | SES,B1 |
|---|---|---|
| INTRCPT1,B0 | 4.82978 | -0.15399 |
| SES,B1 | -0.15399 | 0.41828 |

Tau (as correlations)

| | INTRCPT1,B0 | SES,B1 |
|---|---|---|
| INTRCPT1,B0 | 1.000 | -0.108 |
| SES,B1 | -0.108 | 1.000 |

| Random level-1 coefficient | Reliability estimate |
|---|---|
| INTRCPT1, B0 | 0.797 |
| SES, B1 | 0.179 |

The value of the likelihood function at iteration 21 = -2.331928E+004

The outcome variable is MATHACH

Final estimation of fixed effects:

| Fixed Effect | Coefficient | Standard Error | T-ratio | Approx. d.f. | P-value |
|---|---|---|---|---|---|
| For INTRCPT1, B0 | | | | | |
| INTRCPT2, G00 | 12.664935 | 0.189874 | 66.702 | 159 | 0.000 |
| For SES slope, B1 | | | | | |
| INTRCPT2, G10 | 2.393878 | 0.118278 | 20.240 | 159 | 0.000 |

The outcome variable is MATHACH

Final estimation of fixed effects (with robust standard errors):

| Fixed Effect | Coefficient | Standard Error | T-ratio | Approx. d.f. | P-value |
|---|---|---|---|---|---|
| For INTRCPT1, B0 | | | | | |
| INTRCPT2, G00 | 12.664935 | 0.189251 | 66.921 | 159 | 0.000 |
| For SES slope, B1 | | | | | |
| INTRCPT2, G10 | 2.393878 | 0.117697 | 20.339 | 159 | 0.000 |

Final estimation of variance components:

| Random Effect | Standard Deviation | Variance Component | df | Chi-square | P-value |
|---|---|---|---|---|---|
| INTRCPT1, U0 | 2.19768 | 4.82978 | 159 | 905.26472 | 0.000 |
| SES slope, U1 | 0.64675 | 0.41828 | 159 | 216.21178 | 0.002 |
| level-1, R | 6.06864 | 36.82835 | | | |

Statistics for current covariance components model:

- Deviance = 46638.560929
- Number of estimated parameters = 4

**Notes:**

- The model we fit was

$$\text{MATHACH}_{ij} = \beta_{0j} + \beta_{1j}(\text{SES}) + r_{ij}$$

$$\beta_{0j} = \gamma_{00} + u_{0j}$$

$$\beta_{1j} = \gamma_{10} + u_{1j}$$

Filling in the parameter estimates we get

$$\text{MATHACH}_{ij} = \beta_{0j} + \beta_{1j}(\text{SES}) + r_{ij}$$

$$\beta_{0j} = 12.66 + u_{0j}$$

$$\beta_{1j} = 2.39 + u_{1j}$$

$$V(r_{ij}) = 36.83$$

$$V(u_{0j}) = 4.83$$

$$V(u_{1j}) = .42$$

- In a single equation our model will be written as:

$$\text{MATHACH}_{ij} = \gamma_{00} + u_{0j} + (\gamma_{10} + u_{1j})(\text{SES}) + r_{ij} = \gamma_{00} + \gamma_{10}(\text{SES}) + u_{0j} + u_{1j}(\text{SES}) + r_{ij}$$

- The estimate for the variance of the slope for **ses** is 0.42. The p-value is .002. Because the test is significant, we reject the hypothesis that the slopes of **ses** do not differ among schools.
- The 95% plausible value range for the school means when **ses** is zero is $12.66 \pm 1.96 \times (4.83)^{1/2} = (8.35, 16.97)$.
- The 95% plausible value range for the SES-achievement slope is $2.39 \pm 1.96 \times (.42)^{1/2} = (1.12, 3.66)$.
- Notice that the residual variance is now 36.83, compared with the residual variance of 39.15 in the one-way ANOVA with random effects model. We can compute the proportion of variance explained at level 1 as (39.15 - 36.83) / 39.15 = .059. This means that using student-level SES as a predictor of math achievement reduced the within-school variance by about 6%.
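Two of the quantities discussed for this model can be recomputed directly from the printed output: the correlation between intercepts and slopes implied by the Tau matrix, and the proportional reduction in level-1 variance. The sketch below uses the unrounded values from the output; the variable names are our own.

```python
import math

# Tau matrix elements from the Model 3 output.
tau00 = 4.82978    # variance of the intercepts
tau01 = -0.15399   # covariance between intercepts and SES slopes
tau11 = 0.41828    # variance of the SES slopes

# Level-1 residual variances from the Model 1 and Model 3 outputs.
sigma2_model1 = 39.14831
sigma2_model3 = 36.82835

# A correlation is the covariance divided by the product of the SDs.
corr_intercept_slope = tau01 / math.sqrt(tau00 * tau11)

# Proportional reduction in within-school variance due to student-level ses.
r2_level1 = (sigma2_model1 - sigma2_model3) / sigma2_model1

print(f"Correlation between intercepts and slopes: {corr_intercept_slope:.3f}")
print(f"Proportion of level-1 variance explained: {r2_level1:.3f}")
```

The first value should agree with the off-diagonal entry of "Tau (as correlations)" in the output.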

Model 4: Including Both Level-1 and Level-2 Predictors -- predicting mathach from meanses, schtype, group-centered ses and the cross-level interaction of meanses and schtype with group-centered ses.

This model is referred to as an intercepts and slopes-as-outcomes model by Raudenbush and Bryk. We have examined the variability of the regression equations across schools. Now we are ready to build our final model based on our theory and our preliminary analyses.

$$\text{MATHACH}_{ij} = \beta_{0j} + \beta_{1j}(\text{SES} - \text{MEANSES}) + r_{ij}$$

$$\beta_{0j} = \gamma_{00} + \gamma_{01}(\text{SCHTYPE}) + \gamma_{02}(\text{MEANSES}) + u_{0j}$$

$$\beta_{1j} = \gamma_{10} + \gamma_{11}(\text{SCHTYPE}) + \gamma_{12}(\text{MEANSES}) + u_{1j}$$

Sigma_squared = 36.70313

Tau

| | INTRCPT1,B0 | SES,B1 |
|---|---|---|
| INTRCPT1,B0 | 2.37996 | 0.19058 |
| SES,B1 | 0.19058 | 0.14892 |

Tau (as correlations)

| | INTRCPT1,B0 | SES,B1 |
|---|---|---|
| INTRCPT1,B0 | 1.000 | 0.320 |
| SES,B1 | 0.320 | 1.000 |

| Random level-1 coefficient | Reliability estimate |
|---|---|
| INTRCPT1, B0 | 0.733 |
| SES, B1 | 0.073 |

The value of the likelihood function at iteration 61 = -2.325094E+004

The outcome variable is MATHACH

Final estimation of fixed effects:

| Fixed Effect | Coefficient | Standard Error | T-ratio | Approx. d.f. | P-value |
|---|---|---|---|---|---|
| For INTRCPT1, B0 | | | | | |
| INTRCPT2, G00 | 12.096006 | 0.198734 | 60.865 | 157 | 0.000 |
| SCHTYPE, G01 | 1.226384 | 0.306272 | 4.004 | 157 | 0.000 |
| MEANSES, G02 | 5.333056 | 0.369161 | 14.446 | 157 | 0.000 |
| For SES slope, B1 | | | | | |
| INTRCPT2, G10 | 2.937981 | 0.157135 | 18.697 | 157 | 0.000 |
| SCHTYPE, G11 | -1.640954 | 0.242905 | -6.756 | 157 | 0.000 |
| MEANSES, G12 | 1.034427 | 0.302566 | 3.419 | 157 | 0.001 |

The outcome variable is MATHACH

Final estimation of fixed effects (with robust standard errors):

| Fixed Effect | Coefficient | Standard Error | T-ratio | Approx. d.f. | P-value |
|---|---|---|---|---|---|
| For INTRCPT1, B0 | | | | | |
| INTRCPT2, G00 | 12.096006 | 0.173699 | 69.638 | 157 | 0.000 |
| SCHTYPE, G01 | 1.226384 | 0.308484 | 3.976 | 157 | 0.000 |
| MEANSES, G02 | 5.333056 | 0.334600 | 15.939 | 157 | 0.000 |
| For SES slope, B1 | | | | | |
| INTRCPT2, G10 | 2.937981 | 0.147620 | 19.902 | 157 | 0.000 |
| SCHTYPE, G11 | -1.640954 | 0.237401 | -6.912 | 157 | 0.000 |
| MEANSES, G12 | 1.034427 | 0.332785 | 3.108 | 157 | 0.003 |

Final estimation of variance components:

| Random Effect | Standard Deviation | Variance Component | df | Chi-square | P-value |
|---|---|---|---|---|---|
| INTRCPT1, U0 | 1.54271 | 2.37996 | 157 | 605.29503 | 0.000 |
| SES slope, U1 | 0.38590 | 0.14892 | 157 | 162.30867 | 0.369 |
| level-1, R | 6.05831 | 36.70313 | | | |

Statistics for current covariance components model:

- Deviance = 46501.875643
- Number of estimated parameters = 4

**Notes:**

- The model we fit was

$$\text{MATHACH}_{ij} = \beta_{0j} + \beta_{1j}(\text{SES} - \text{MEANSES}) + r_{ij}$$

$$\beta_{0j} = \gamma_{00} + \gamma_{01}(\text{SCHTYPE}) + \gamma_{02}(\text{MEANSES}) + u_{0j}$$

$$\beta_{1j} = \gamma_{10} + \gamma_{11}(\text{SCHTYPE}) + \gamma_{12}(\text{MEANSES}) + u_{1j}$$

Filling in the parameter estimates we get

$$\text{MATHACH}_{ij} = \beta_{0j} + \beta_{1j}(\text{SES} - \text{MEANSES}) + r_{ij}$$

$$\beta_{0j} = 12.10 + 1.23(\text{SCHTYPE}) + 5.33(\text{MEANSES}) + u_{0j}$$

$$\beta_{1j} = 2.94 - 1.64(\text{SCHTYPE}) + 1.03(\text{MEANSES}) + u_{1j}$$

$$V(r_{ij}) = 36.70$$

$$V(u_{0j}) = 2.38$$

$$V(u_{1j}) = .15$$

- In a single equation our model will be written as:

$$\text{MATHACH}_{ij} = \gamma_{00} + \gamma_{01}(\text{SCHTYPE}) + \gamma_{02}(\text{MEANSES}) + u_{0j} + \left(\gamma_{10} + \gamma_{11}(\text{SCHTYPE}) + \gamma_{12}(\text{MEANSES}) + u_{1j}\right)(\text{SES} - \text{MEANSES}) + r_{ij}$$

$$= \gamma_{00} + \gamma_{01}(\text{SCHTYPE}) + \gamma_{02}(\text{MEANSES}) + \gamma_{10}(\text{SES} - \text{MEANSES}) + \gamma_{11}(\text{SCHTYPE})(\text{SES} - \text{MEANSES}) + \gamma_{12}(\text{MEANSES})(\text{SES} - \text{MEANSES}) + u_{0j} + u_{1j}(\text{SES} - \text{MEANSES}) + r_{ij}$$

- The estimate for the variance of the SES slope is .15 with a p-value of .369. That means we cannot reject the hypothesis that the slope of group-centered **ses** does not vary randomly among schools once **schtype** and **meanses** are in the model. We may want to use a simpler model in which the slope of SES varies non-randomly, as a function of the level-2 variables **meanses** and **schtype** only. We will show later how to compare the two models.
- The correlation between the level-1 intercept and the slope for SES is given as .32 in the earlier part of the output.
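To see what the fixed effects of this model imply, the sketch below plugs school characteristics into the two level-2 equations and returns the expected intercept and SES slope for a school. The coefficients are copied from the output above; the function name is our own, and the random effects $u_{0j}$ and $u_{1j}$ are set to zero.

```python
# Fixed-effect estimates from the Model 4 output.
G00, G01, G02 = 12.096006, 1.226384, 5.333056   # intercept equation
G10, G11, G12 = 2.937981, -1.640954, 1.034427   # SES slope equation


def expected_coefficients(schtype, meanses):
    """Expected level-1 intercept and SES slope for a school.

    schtype: 0 = public, 1 = private; meanses: school mean SES.
    """
    intercept = G00 + G01 * schtype + G02 * meanses
    slope = G10 + G11 * schtype + G12 * meanses
    return intercept, slope


public = expected_coefficients(schtype=0, meanses=0.0)
private = expected_coefficients(schtype=1, meanses=0.0)

print(f"Public,  meanses = 0: intercept {public[0]:.2f}, slope {public[1]:.2f}")
print(f"Private, meanses = 0: intercept {private[0]:.2f}, slope {private[1]:.2f}")
```

Because **ses** is group-centered, the intercept here is the expected math achievement of a student at the school's own mean SES.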

1. Multivariate Hypothesis Tests on Fixed Effects

We will test the effect of **schtype** on the intercept and on the slope of
**ses** simultaneously. This will be a test of two degrees of freedom.

Click on the box labeled "1" and then fill out the boxes below to indicate that we wish to test
jointly that $\gamma_{01} = 0$ and $\gamma_{11} = 0$.

Results of General Linear Hypothesis Testing

| | Coefficients | Contrast | |
|---|---|---|---|
| For INTRCPT1, B0 | | | |
| INTRCPT2, G00 | 12.096006 | 0.000 | 0.000 |
| SCHTYPE, G01 | 1.226384 | 1.000 | 0.000 |
| MEANSES, G02 | 5.333056 | 0.000 | 0.000 |
| For SES slope, B1 | | | |
| INTRCPT2, G10 | 2.937981 | 0.000 | 0.000 |
| SCHTYPE, G11 | -1.640954 | 0.000 | 2.000 |
| MEANSES, G12 | 1.034427 | 0.000 | 0.000 |

- Chi-square statistic = 60.596880
- Degrees of freedom = 2
- P-value = 0.000000

2. Multivariate Tests of Variance-Covariance Components Specification

In Model 4, which we ran before, we saw that the variance of the slope of
group-centered **ses** is not very large and is not statistically significant. This suggests that we may not want to model
the slope of group-centered **ses** as a random effect. A simpler model is one in which the slope of **ses**
varies non-randomly, as a function of the level-2 variables **schtype** and **meanses**. We may
want to compare these two models to decide whether the simpler model is just about as
good as the previous one.

- REML (restricted maximum likelihood) vs. FML (full maximum likelihood)
    - REML and FML will usually produce similar results for the level-1 residual variance ($\sigma^{2}$), but there can be noticeable differences for the variance-covariance matrix of the random effects.
    - REML is the default estimation method for HLM.
    - If the number of level-2 units is large, then the difference will be small.
    - If the number of level-2 units is small, then FML variance estimates will be smaller than REML estimates, leading to artificially short confidence intervals and overly liberal significance tests.
- Nested Models
    - If the fixed effects are the same and one model only has fewer random effects, then either REML or FML is fine for likelihood ratio tests.
    - If one model has fewer fixed effects and possibly fewer random effects, then use FML to compare the models using likelihood ratio tests.

To compare two models, we have to obtain the deviance (which is just
-2*log likelihood) for the first model and enter it in the **Hypothesis Testing**
dialog before running the second model.

Final estimation of variance components:

| Random Effect | Standard Deviation | Variance Component | df | Chi-square | P-value |
|---|---|---|---|---|---|
| INTRCPT1, U0 | 1.54118 | 2.37524 | 157 | 604.29895 | 0.000 |
| level-1, R | 6.06351 | 36.76611 | | | |

Statistics for current covariance components model:

- Deviance = 46502.952743
- Number of estimated parameters = 2

Statistics for current covariance components model:

- Deviance = 46501.875643
- Number of estimated parameters = 4

Variance-Covariance components test:

- Chi-square statistic = 1.07710
- Number of degrees of freedom = 2
- P-value = >.500
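The test above is a likelihood ratio test: the chi-square statistic is the difference between the two deviances, and the degrees of freedom are the difference in the number of estimated covariance parameters. A minimal sketch in Python follows; it relies on the fact that for 2 degrees of freedom the chi-square upper-tail probability has the closed form exp(-x/2), so no statistics library is needed. The function name is our own.

```python
import math


def chi2_upper_tail_df2(x):
    """Upper-tail probability of a chi-square variable with 2 df."""
    return math.exp(-x / 2.0)


deviance_simple = 46502.952743   # SES slope not random, 2 parameters
deviance_full = 46501.875643     # random SES slope, 4 parameters

chi_square = deviance_simple - deviance_full
df = 4 - 2
p_value = chi2_upper_tail_df2(chi_square)

print(f"Chi-square = {chi_square:.5f}, df = {df}, p = {p_value:.3f}")
```

The resulting p-value is consistent with the ">.500" that HLM reports, so the simpler model without a random SES slope fits about as well.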

3. Model-based Graphs

HLM 6 offers many model-based graphs. The graphs below are based on the following model.

Level 1 equation Graphing:

Level-2 EB/OLS coefficient confidence intervals

1. Modeling Heterogeneity of Level-1 Variances

Sometimes, the level-1 variance might be heterogeneous. For example, we may expect that female students and male students have different variances. Thus, we want to model the level-1 variance to be a function of variable female.

From the pull-down menu, choose **Other Settings** => **Estimation Settings** =>
**Heterogeneous sigma^2**. We then have a choice of which variable(s) to
use to model the heterogeneity. Here we picked the variable **female**.

RESULTS FOR HETEROGENEOUS SIGMA-SQUARED (macro iteration 4)

Var(R) = Sigma_squared and log(Sigma_squared) = alpha0 + alpha1(FEMALE)

Model for level-1 variance

| Parameter | Coefficient | Standard Error | Z-ratio | P-value |
|---|---|---|---|---|
| INTRCPT1, alpha0 | 3.66570 | 0.024718 | 148.301 | 0.000 |
| FEMALE, alpha1 | -0.12106 | 0.033936 | -3.567 | 0.001 |

Summary of Model Fit

| Model | Number of Parameters | Deviance |
|---|---|---|
| 1. Homogeneous sigma_squared | 10 | 46494.59261 |
| 2. Heterogeneous sigma_squared | 11 | 46482.09334 |

| Model Comparison | Chi-square | df | P-value |
|---|---|---|---|
| Model 1 vs Model 2 | 12.49926 | 1 | 0.001 |
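Since the model is for the log of the level-1 variance, the coefficients have to be exponentiated to get the variances themselves. The sketch below does this for male and female students and also recomputes the p-value of the 1-df model comparison, using the identity that the chi-square upper tail with 1 df equals erfc(sqrt(x/2)). The function names are our own.

```python
import math

# Coefficients of the log-linear model for the level-1 variance.
alpha0 = 3.66570    # INTRCPT1
alpha1 = -0.12106   # FEMALE


def level1_variance(female):
    """Implied level-1 variance: exp(alpha0 + alpha1 * FEMALE)."""
    return math.exp(alpha0 + alpha1 * female)


def chi2_upper_tail_df1(x):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


var_male = level1_variance(0)
var_female = level1_variance(1)
p_value = chi2_upper_tail_df1(46494.59261 - 46482.09334)

print(f"Level-1 variance, males:   {var_male:.2f}")
print(f"Level-1 variance, females: {var_female:.2f}")
print(f"Model comparison p-value:  {p_value:.4f}")
```

The negative coefficient for FEMALE means the estimated within-school variance is smaller for female students than for male students.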

2. Models Without a Level-1 Intercept

Sometimes, we may want to exclude the intercept from our model. For example, we may have a level-1 categorical variable and want to include all the categories of this variable in the model. In that case we have to exclude the intercept; otherwise our model will be over-parameterized. For this example, we create another binary variable for male (= 1 - female). As we have mentioned before, since HLM does not have any data management facility, we have to create this variable outside HLM. We chose SPSS for this task and modified the template file created earlier to create a new MDM file.

The outcome variable is MATHACH

Final estimation of fixed effects (with robust standard errors):

| Fixed Effect | Coefficient | Standard Error | T-ratio | Approx. d.f. | P-value |
|---|---|---|---|---|---|
| For FEMALE slope, B1 | | | | | |
| INTRCPT2, G10 | 10.684432 | 0.298122 | 35.839 | 158 | 0.000 |
| SCHTYPE, G11 | 2.932540 | 0.446512 | 6.568 | 158 | 0.000 |
| For MALE slope, B2 | | | | | |
| INTRCPT2, G20 | 12.174859 | 0.322616 | 37.738 | 158 | 0.000 |
| SCHTYPE, G21 | 2.597771 | 0.487027 | 5.334 | 158 | 0.000 |

Final estimation of variance components:

| Random Effect | Standard Deviation | Variance Component | df | Chi-square | P-value |
|---|---|---|---|---|---|
| FEMALE slope, U1 | 2.41260 | 5.82064 | 121 | 481.99916 | 0.000 |
| MALE slope, U2 | 2.64370 | 6.98917 | 121 | 483.25462 | 0.000 |
| level-1, R | 6.22438 | 38.74285 | | | |
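In a model without a level-1 intercept, the coefficients for FEMALE and MALE play the role of group-specific intercepts. The sketch below, with a helper name of our own, turns the fixed effects above into expected math achievement for each combination of sex and school type (0 = public, 1 = private), with the random effects set to zero.

```python
# Robust fixed-effect estimates from the no-intercept output above.
FEMALE_COEF = {"intercept": 10.684432, "schtype": 2.932540}   # G10, G11
MALE_COEF = {"intercept": 12.174859, "schtype": 2.597771}     # G20, G21


def expected_mathach(female, schtype):
    """Expected math achievement for a given sex and school type."""
    coef = FEMALE_COEF if female == 1 else MALE_COEF
    return coef["intercept"] + coef["schtype"] * schtype


for female, label in ((1, "female"), (0, "male")):
    for schtype, school in ((0, "public"), (1, "private")):
        value = expected_mathach(female, schtype)
        print(f"{label:6s} {school:7s} {value:.2f}")
```

The two school-type effects (G11 and G21) are fairly close to each other, which is what motivates testing an equality constraint on them.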

3. Constraints on Fixed Effects

Let's say that we believe that the effect of **schtype** is the same for
both female and male students. We need to impose the constraint
$\gamma_{11} = \gamma_{21}$.

The outcome variable is MATHACH

Final estimation of fixed effects (with robust standard errors):

| Fixed Effect | Coefficient | Standard Error | T-ratio | Approx. d.f. | P-value |
|---|---|---|---|---|---|
| For FEMALE slope, B1 | | | | | |
| INTRCPT2, G10 | 10.723664 | 0.295717 | 36.263 | 158 | 0.000 |
| SCHTYPE, G11 \* | 2.804823 | 0.417646 | 6.716 | 158 | 0.000 |
| For MALE slope, B2 | | | | | |
| INTRCPT2, G20 | 12.103608 | 0.313462 | 38.613 | 159 | 0.000 |

The "\*" gammas have been constrained. See the table on the header page.

Final estimation of variance components:

| Random Effect | Standard Deviation | Variance Component | df | Chi-square | P-value |
|---|---|---|---|---|---|
| FEMALE slope, U1 | 2.40847 | 5.80071 | 121 | 484.11557 | 0.000 |
| MALE slope, U2 | 2.63048 | 6.91943 | 121 | 483.35444 | 0.000 |
| level-1, R | 6.22449 | 38.74426 | | | |

HLM has some very nice features for multilevel data analysis, including

- a very intuitive interface for specifying the model using a multi-equation format;
- easy creation of cross-level interactions;
- many data-based and model-based graphs;
- latent variable regression;
- use of multiply imputed data;
- use of sampling weights.

- Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling by Tom Snijders and Roel Bosker
- Introduction to Multilevel Modeling by Ita Kreft and Jan de Leeuw
- Multilevel Analysis: Techniques and Applications by Joop Hox
- Hierarchical Linear Models, Second Edition by Stephen Raudenbush and Anthony Bryk
- HLM 6 - Hierarchical Linear and Nonlinear Modeling by Raudenbush et al.
