UCLA Academic Technology Services

SAS FAQ:
How can I calculate Moran's I in SAS?

Note: The Moran's I calculation in the SAS variogram procedure (proc variogram) is experimental in version 9.2 and unavailable in earlier versions of SAS.

Moran's I is a measure of spatial autocorrelation: how strongly the values of a variable are related to one another, given the locations where they were measured.  Using the autocorrelation option in proc variogram, we can compute Moran's I in SAS.
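
For readers who want to see what the statistic measures, here is a minimal sketch in plain Python (not SAS) of the usual textbook form of Moran's I: n divided by the sum of all weights, times the weighted sum of cross-products of deviations from the mean, divided by the sum of squared deviations. The function names, the four-point toy data, and the neighbor weights are made up for illustration, and proc variogram's internal computation may differ in detail.

```python
def morans_i(values, weights):
    """Textbook Moran's I for a list of values and a square weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # S0: sum of all weights (the diagonal is expected to be zero).
    s0 = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations over every ordered pair (i, j).
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


def expected_morans_i(n):
    """Expected value of Moran's I under no spatial autocorrelation."""
    return -1.0 / (n - 1)


# Hypothetical toy data: four points on a line, each linked to its adjacent neighbors.
toy_values = [1.0, 2.0, 3.0, 4.0]
toy_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

print(morans_i(toy_values, toy_weights))   # positive: similar values sit next to each other
print(round(expected_morans_i(32), 4))     # -0.0323, the expected value for 32 locations
```

Values near the expected value suggest no spatial pattern, values well above it suggest that nearby locations have similar values, and values well below it suggest that nearby locations have dissimilar values.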

Let's look at an example. Our dataset, ozone.sas7bdat, contains ozone measurements from thirty-two stations in the Los Angeles area, aggregated over one month. The dataset includes the station number (Station), the latitude and longitude of the station (Lat and Lon), and the average of the highest eight-hour daily averages (Av8top). These data, and other spatial datasets, can be downloaded from the University of Illinois's Spatial Analysis Lab. We can print the first ten observations to get a sense of the variables and of the locations under consideration.

proc print data = ozone (obs = 10);
run;

Obs    Station     Av8top      Lat         Lon

  1       60      7.22581    34.1358    -117.924
  2       69      5.89919    34.1761    -118.315
  3       72      4.05289    33.8236    -118.188
  4       74      7.18145    34.1994    -118.535
  5       75      6.07661    34.0669    -117.751
  6       84      3.15726    33.9292    -118.210
  7       85      5.20161    34.0150    -118.060
  8       87      4.71774    34.0672    -118.226
  9       88      6.53226    34.0833    -118.107
 10       89      7.54032    34.3875    -118.535

To calculate Moran's I using proc variogram, we use the autocorrelation option (abbreviated autoc below) in the compute statement.  In parentheses, we indicate whether the weights matrix should contain distance-based values or binary values.  The novar option tells SAS to skip the semivariogram computation itself, since here we only want the autocorrelation statistics.  In the coordinates statement, we give the names of our x- and y-coordinate variables (here Lon and Lat).  Finally, in the var statement, we name the variable we want to test for spatial autocorrelation.  In this example, we will look at the Av8top variable.

proc variogram data=ozone; 
  compute  novar autoc (weights=distance); 
  coordinates xc=Lon yc=Lat; 
  var Av8top; 
run; 

The VARIOGRAM Procedure
Dependent Variable: Av8top

Number of Observations Read          32
Number of Observations Used          32

             Pairs Information
Number of Lags                            11
Lag Distance                            0.25
Maximum Data Distance in Lon            2.30
Maximum Data Distance in Lat            1.06
Maximum Data Distance                   2.53

               Pairwise Distance Intervals
                                     Number
  Lag                                    of    Percentage
Class    ---------Bounds---------     Pairs      of Pairs
    0          0.00          0.13        13        2.62%
    1          0.13          0.38       106       21.37%
    2          0.38          0.63       122       24.60%
    3          0.63          0.89       101       20.36%
    4          0.89          1.14        71       14.31%
    5          1.14          1.39        41        8.27%
    6          1.39          1.65        16        3.23%
    7          1.65          1.90        13        2.62%
    8          1.90          2.15        10        2.02%
    9          2.15          2.41         3        0.60%
   10          2.41          2.66         0        0.00%

                        Autocorrelation Statistics
Assumption   Coefficient   Observed   Expected   Std Dev       Z   Pr > |Z|
Normality    Moran's I       0.0484    -0.0323    0.0085    9.49     <.0001
Normality    Geary's c       0.9392     1.0000    0.0263   -2.31     0.0206

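The observed Moran's I (0.0484) is larger than its expected value under no spatial autocorrelation (-0.0323), and the associated p-value is below .0001.  We can therefore reject the null hypothesis that there is zero spatial autocorrelation in the values of Av8top.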
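
As a quick arithmetic check on the Autocorrelation Statistics table, the Z column works out to (Observed - Expected) / Std Dev, and Pr > |Z| is consistent with the two-sided tail probability of a standard normal. The sketch below redoes that arithmetic in plain Python (not SAS). z_and_p is a hypothetical helper name, and because the inputs are the rounded values printed by SAS, the results can differ from the table in the last digit.

```python
import math


def z_and_p(observed, expected, std_dev):
    """Return the Z score and two-sided normal p-value for one table row."""
    z = (observed - expected) / std_dev
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Rounded values copied from the Autocorrelation Statistics table.
z_moran, p_moran = z_and_p(0.0484, -0.0323, 0.0085)
z_geary, p_geary = z_and_p(0.9392, 1.0000, 0.0263)

print(f"Moran's I: Z = {z_moran:.2f}, p = {p_moran:.2e}")
print(f"Geary's c: Z = {z_geary:.2f}, p = {p_geary:.4f}")
```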

The weights matrix used in the calculation above is a 32x32 matrix in which each off-diagonal entry [i, j] is equal to 1/(1 + distance between point i and point j).  Note that this is just one of several ways to construct an inverse-distance weights matrix.  If there is some threshold distance d such that pairs closer than d are "connected" or "close" and pairs farther apart than d are not, you can tell SAS to use binary weights instead.  The proc variogram code below does this for d = .75: in the compute statement, we indicate that our weights are binary and give a lagdistance of .75.

proc variogram data=ozone; 
  compute novar lagdistance = .75 autoc (weights=binary); 
  coordinates xc=Lon yc=Lat; 
  var Av8top; 
run;

The VARIOGRAM Procedure
Dependent Variable: Av8top

Number of Observations Read          32
Number of Observations Used          32


             Pairs Information

Number of Lags                            11
Lag Distance                            0.25
Maximum Data Distance in Lon            2.30
Maximum Data Distance in Lat            1.06
Maximum Data Distance                   2.53


               Pairwise Distance Intervals

                                     Number
  Lag                                    of    Percentage
Class    ---------Bounds---------     Pairs      of Pairs

    0          0.00          0.13        13        2.62%
    1          0.13          0.38       106       21.37%
    2          0.38          0.63       122       24.60%
    3          0.63          0.89       101       20.36%
    4          0.89          1.14        71       14.31%
    5          1.14          1.39        41        8.27%
    6          1.39          1.65        16        3.23%
    7          1.65          1.90        13        2.62%
    8          1.90          2.15        10        2.02%
    9          2.15          2.41         3        0.60%
   10          2.41          2.66         0        0.00%


                        Autocorrelation Statistics

Assumption   Coefficient   Observed   Expected   Std Dev       Z   Pr > |Z|

Normality    Moran's I        0.188    -0.0323    0.0323    6.82     <.0001
Normality    Geary's c        0.794     1.0000    0.0851   -2.42     0.0156

This change in weighting scheme does not change our interpretation.  Based on the p-value of the reported Moran's I (below .0001), we can again reject the null hypothesis that there is zero spatial autocorrelation in the values of Av8top.
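
The two weighting schemes used on this page can be sketched in a few lines of plain Python (not SAS). This only illustrates the description given earlier, namely 1/(1 + distance) for distance weights and a 0/1 indicator for pairs closer than the threshold. It is not SAS's internal code: the helper names, the use of plain Euclidean distance on the Lon and Lat values, and the choice of three stations from the listing are assumptions.

```python
import math

# (Lon, Lat) for stations 60, 74 and 75, copied from the proc print listing.
coords = [(-117.924, 34.1358), (-118.535, 34.1994), (-117.751, 34.0669)]


def distance(p, q):
    """Euclidean distance in coordinate units (assumed, not confirmed for SAS)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def inverse_distance_weights(points):
    """Off-diagonal entry [i][j] is 1 / (1 + d_ij); the diagonal is zero."""
    return [[0.0 if i == j else 1.0 / (1.0 + distance(p, q))
             for j, q in enumerate(points)]
            for i, p in enumerate(points)]


def binary_weights(points, threshold):
    """Off-diagonal entry [i][j] is 1 when d_ij < threshold, otherwise 0."""
    return [[1 if i != j and distance(p, q) < threshold else 0
             for j, q in enumerate(points)]
            for i, p in enumerate(points)]


for row in inverse_distance_weights(coords):
    print([round(w, 3) for w in row])
for row in binary_weights(coords, 0.75):
    print(row)
```

With a threshold of .75, stations 74 and 75 (about 0.80 coordinate units apart) are not connected, while station 60 is connected to both.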


These pages are Copyrighted (c) by UCLA Academic Technology Services

