[http://www.ats.ucla.edu/stat/_headers/header1.htm][http://www.ats.ucla.edu/stat/sas/faq/header.htm][http://www.ats.ucla.edu/stat/_headers/header2.htm]

SAS FAQ:
How do I fit a variogram model to my spatial data in SAS using Proc Mixed?

We often examine data with the aim of making predictions.  Spatial data analysis is no exception.  Given measurements of a variable at a set of points in a region, we might like to extrapolate to points in the region where the variable was not measured or, possibly, to points outside the region that we believe will behave similarly.  We can base these predictions on our measured values alone by kriging or we can incorporate covariates and make predictions using a regression model.  In both scenarios, we will need to first fit a variogram model to our data. 

You can fit a variogram model graphically using proc variogram to calculate and then plot the possible models; or you can fit several variogram models using proc mixed and compare the model fits. This page walks through the second approach.  For an example of the other approach, see SAS FAQ: How do I fit a variogram model to my spatial data in SAS using Proc Variogram?.

There are several shapes that a variogram might follow and, in fitting a variogram model, we aim to mathematically describe the shape. Some commonly used variogram models are the spherical, exponential and Gaussian models.  In all three of these models, the variogram increases with distance at small distances and then levels off.  This general shape is suggestive of a spatial correlation that is positive and strong at small distances and becomes less so as distances increase until reaching a certain distance d.  Pairs of points separated by a distance greater than d appear uncorrelated. 

Proc mixed supports these three model types, as well as several others.  We will be using the thick dataset provided in the SAS documentation for proc variogram, which includes the measured thickness of coal seams at different coordinates.

data thick; 
  input east north thick @@; 
  datalines; 
   0.7  59.6  34.1   2.1  82.7  42.2   4.7  75.1  39.5  
   4.8  52.8  34.3   5.9  67.1  37.0   6.0  35.7  35.9 
   6.4  33.7  36.4   7.0  46.7  34.6   8.2  40.1  35.4    
  13.3   0.6  44.7  13.3  68.2  37.8  13.4  31.3  37.8 
  17.8   6.9  43.9  20.1  66.3  37.7  22.7  87.6  42.8  
  23.0  93.9  43.6  24.3  73.0  39.3  24.8  15.1  42.3 
  24.8  26.3  39.7  26.4  58.0  36.9  26.9  65.0  37.8  
  27.7  83.3  41.8  27.9  90.8  43.3  29.1  47.9  36.7 
  29.5  89.4  43.0  30.1   6.1  43.6  30.8  12.1  42.8 
  32.7  40.2  37.5  34.8   8.1  43.3  35.3  32.0  38.8 
  37.0  70.3  39.2  38.2  77.9  40.7  38.9  23.3  40.5 
  39.4  82.5  41.4  43.0   4.7  43.3  43.7   7.6  43.1 
  46.4  84.1  41.5  46.7  10.6  42.6  49.9  22.1  40.7 
  51.0  88.8  42.0  52.8  68.9  39.3  52.9  32.7  39.2 
  55.5  92.9  42.2  56.0   1.6  42.7  60.6  75.2  40.1 
  62.1  26.6  40.1  63.0  12.7  41.8  69.0  75.6  40.1 
  70.5  83.7  40.9  70.9  11.0  41.7  71.5  29.5  39.8 
  78.1  45.5  38.7  78.2   9.1  41.7  78.4  20.0  40.8 
  80.5  55.9  38.7  81.1  51.0  38.6  83.8   7.9  41.6 
  84.5  11.0  41.5  85.2  67.3  39.4  85.5  73.0  39.8  
  86.7  70.4  39.6  87.2  55.7  38.8  88.1   0.0  41.6 
  88.4  12.1  41.3  88.4  99.6  41.2  88.8  82.9  40.5  
  88.9   6.2  41.5  90.6   7.0  41.5  90.7  49.6  38.9  
  91.5  55.4  39.0  92.9  46.8  39.1  93.4  70.9  39.7  
  94.8  71.5  39.7  96.2  84.3  40.3  98.2  58.2  39.5 
; 

We will be looking at the fit statistics of models with our three different covariance structures and comparing the likelihoods of these models. For a baseline likelihood, we can run a model without specifying a covariance structure and obtain the likelihood of a "null" model that we hope to improve with information about the spatial structure of the data. 


proc mixed data = thick;	
  model thick = ;
run;

The Mixed Procedure

                  Model Information
Data Set                     WORK.THICK
Dependent Variable           thick
Covariance Structure         Diagonal
Estimation Method            REML
Residual Variance Method     Profile
Fixed Effects SE Method      Model-Based
Degrees of Freedom Method    Residual

           Fit Statistics
-2 Res Log Likelihood           341.8
AIC (smaller is better)         343.8
AICC (smaller is better)        343.8
BIC (smaller is better)         346.1

Next, we can run the same model with a spherical covariate structure. To indicate a spatial covariance structure in proc mixed, we use the type= option in the repeated statement. We indicate spatial with sp and then, in parentheses, specify the type of spatial structure.  Then, we provide the coordinate variables.  For example, the spherical model is specified with type = sp(sph)(east north).

proc mixed data = thick;	
  model thick = ;
  repeated / subject = intercept type = sp(sph)(east north);
run;

The Mixed Procedure

                  Model Information
Data Set                     WORK.THICK
Dependent Variable           thick
Covariance Structure         Spatial Spherical
Subject Effect               Intercept
Estimation Method            REML
Residual Variance Method     Profile
Fixed Effects SE Method      Model-Based
Degrees of Freedom Method    Between-Within

           Fit Statistics
-2 Res Log Likelihood           139.3
AIC (smaller is better)         143.3
AICC (smaller is better)        143.5
BIC (smaller is better)         147.9


  Null Model Likelihood Ratio Test
    DF    Chi-Square      Pr > ChiSq
     1        202.46          <.0001

We can see that the difference in the likelihoods between these two models (341.8-139.3 = 202.46) is statistically significant, so the spherical covariance structure better fits the data than the diagonal covariance structure that is the default in SAS. Let's repeat this process for exponential and Gaussian models.

proc mixed data = thick;	
  model thick = ;
  repeated / subject = intercept type = sp(exp)(east north);
run;
The Mixed Procedure

                  Model Information
Data Set                     WORK.THICK
Dependent Variable           thick
Covariance Structure         Spatial Exponential
Subject Effect               Intercept
Estimation Method            REML
Residual Variance Method     Profile
Fixed Effects SE Method      Model-Based
Degrees of Freedom Method    Between-Within

           Fit Statistics
-2 Res Log Likelihood           155.0
AIC (smaller is better)         159.0
AICC (smaller is better)        159.1
BIC (smaller is better)         163.6


  Null Model Likelihood Ratio Test
    DF    Chi-Square      Pr > ChiSq
     1        186.78          <.0001

proc mixed data = thick;	
  model thick = ;
  repeated / subject = intercept type = sp(gau)(east north);
run;

The Mixed Procedure

                  Model Information

Data Set                     WORK.THICK
Dependent Variable           thick
Covariance Structure         Spatial Gaussian
Subject Effect               Intercept
Estimation Method            REML
Residual Variance Method     Profile
Fixed Effects SE Method      Model-Based
Degrees of Freedom Method    Between-Within

           Fit Statistics
-2 Res Log Likelihood            80.8
AIC (smaller is better)          84.8
AICC (smaller is better)         85.0
BIC (smaller is better)          89.4


  Null Model Likelihood Ratio Test
    DF    Chi-Square      Pr > ChiSq
     1        260.95          <.0001

From the fit statistics of these models, we can see that the Gaussian covariance structure best fits our data.  If we were to further model coal seam thickness in this dataset and wished to indicate a spatial correlation in the outcome, we would choose Gaussian. This is consistent with the findings of the graphical fitting of a variogram model seen in SAS FAQ: How do I fit a variogram model to my spatial data in SAS using Proc Variogram?.  See the SAS OnlineDoc 9.1.3 for the Web for further details on the spatial covariance structures suppored by proc mixed.  

See also

References

 

[http://www.ats.ucla.edu/stat/sas/footer.htm]