UCLA Academic Technology Services

SAS FAQ
How can I do path analysis in SAS?

It is possible to estimate recursive path models using ordinary least squares regression, but using SAS proc tcalis can make the process easier and will also provide estimates of direct and indirect effects.

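Let's say that we want to estimate the following path model using the hsb2 (hsb2.sas7bdat) dataset. In the model, read and write are correlated exogenous variables; both have direct paths to math and to science, and math has a direct path to science.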
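
Because the model is recursive, it can also be written as two ordinary least squares regressions, as mentioned at the top of the page. The following is a minimal sketch using proc reg, assuming hsb2.sas7bdat sits in C:\data (adjust the path for your computer). The labels math_eq and science_eq are names we made up for illustration, and the stb option requests standardized coefficients.

```sas
/* Sketch: the path model written as two OLS regressions.          */
/* math_eq and science_eq are illustrative labels, not SAS keywords. */
proc reg data='C:\data\hsb2';
  math_eq:    model math    = read write      / stb;
  science_eq: model science = math read write / stb;
run;
quit;
```

The proc tcalis specification further down gives both math paths the same parameter name, so its estimates for the math equation need not match these OLS results exactly.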

We will begin by computing the correlation between the two exogenous variables, read and write. We assume that the data file, hsb2.sas7bdat, is located in the data directory on the C: drive. You may need to change this location for your particular computer configuration.
proc corr data='C:\data\hsb2';
  var read write;
run;


The CORR Procedure

2 Variables: READ WRITE


Simple Statistics

Variable  N    Mean      Std Dev    Sum     Minimum    Maximum  Label
READ     200  52.23000  10.25294   10446   28.00000   76.00000   reading score
WRITE    200  52.77500   9.47859   10555   31.00000   67.00000   writing score


Pearson Correlation Coefficients, N = 200
Prob > |r| under H0: Rho=0

                    READ          WRITE

READ              1.00000       0.59678
reading score                    <.0001

WRITE             0.59678       1.00000
writing score      <.0001
This path analysis is really just two regression models. The first model is math = constant + read + write, while the second model is science = constant + math + read + write. In proc tcalis we set up the model in the path statement: each path is written as the response variable, an arrow (<-), the predictor variable and the name of the parameter being estimated. In the effpart statement we list the paths for which we want direct and indirect effects.
proc tcalis data='C:\data\hsb2';
  path                      /* specification of path model */
   science <- math  beta1,
   science <- read  beta2,
   science <- write beta3,
   math    <- read  beta4,
   math    <- write beta4;  /* same name as the line above, so both math paths share one estimate */
  effpart                   /* for direct and indirect effects */
   science <- read write; 
run;
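
In the specification above, the parameter name beta4 appears on both math paths; in proc tcalis, paths that share a parameter name are constrained to have the same estimate. If separate coefficients for read and write are wanted, as the two-regression description suggests, each path needs its own name. A minimal sketch of that variant follows; beta5 is a name we introduce here for illustration and does not appear in the original command.

```sas
/* Sketch of an unconstrained variant: every path has its own parameter name. */
proc tcalis data='C:\data\hsb2';
  path
   science <- math  beta1,
   science <- read  beta2,
   science <- write beta3,
   math    <- read  beta4,
   math    <- write beta5;  /* beta5 is a hypothetical name, distinct from beta4 */
  effpart
   science <- read write;
run;
```

With five distinct path names this variant has 10 free parameters for the 10 moments of the four variables, so it is just-identified with 0 degrees of freedom. The output reproduced on this page comes from the original specification (9 parameters, 1 degree of freedom), not from this sketch.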
We can now run proc tcalis, which produces the output shown below. There is a lot of output, but we will focus on the standardized results given near the end.
The TCALIS Procedure
Covariance Structure Analysis: Model and Initial Values

     Modeling Information

Data Set            WC000001.HSB2
N Records Read      200
N Records Used      200
N Obs               200
Model Type          PATH


       Variables in the Model

Endogenous    Manifest    MATH  SCIENCE
             Latent
Exogenous     Manifest    READ  WRITE
             Latent

 Number of Endogenous Variables = 2
 Number of Exogenous Variables  = 2


        Initial Estimates for PATH List

---------Path---------    Parameter      Estimate

SCIENCE    <-    MATH     beta1                 .
SCIENCE    <-    READ     beta2                 .
SCIENCE    <-    WRITE    beta3                 .
MATH       <-    READ     beta4                 .
MATH       <-    WRITE    beta4                 .


  Initial Estimates for Variance Parameters

Variance
Type         Variable    Parameter      Estimate

Exogenous    READ        _Add1                 .
             WRITE       _Add2                 .
Error        MATH        _Add3                 .
             SCIENCE     _Add4                 .

NOTE: Parameters with prefix '_Add' are added by PROC TCALIS.


Initial Estimates for Covariances Among Exogenous Variables

Var1     Var2    Parameter      Estimate

WRITE    READ    _Add5                 .

NOTE: Parameters with prefix '_Add' are added by PROC TCALIS.

                Simple Statistics

       Variable                  Mean       Std Dev

READ       reading score      52.23000      10.25294
WRITE      writing score      52.77500       9.47859
MATH       math score         52.64500       9.36845
SCIENCE    science score      51.85000       9.90089


       Initial Estimation Methods

      1    Observed Moments of Variables
      2    McDonald Method


             Optimization Start
             Parameter Estimates

   N    Parameter      Estimate      Gradient

   1    beta1           0.31901    1.4539E-15
   2    beta2           0.30153    -6.995E-16
   3    beta3           0.20653    -5.578E-16
   4    beta4           0.38137       0.00682
   5    _Add1         105.12271    1.3207E-19
   6    _Add2          89.84359    5.1911E-20
   7    _Add3          42.65279    7.0238E-18
   8    _Add4          49.01931    -2.963E-18
   9    _Add5          57.99673    -5.525E-20

 Value of Objective Function = 0.0026412093


Levenberg-Marquardt Optimization

Scaling Update of More (1978)

Parameter Estimates                                 9
Functions (Observations)                           10

                             Optimization Start

Active Constraints                     0    Objective Function    0.0026412093
Max Abs Gradient Element    0.0068163427    Radius                           1


                                                                                         Ratio
                                                                                       Between
                                                                                        Actual
                                                      Objective   Max  Abs                 and
                Function       Active      Objective   Function   Gradient           Predicted
Iter   Restarts    Calls  Constraints       Function     Change    Element   Lambda     Change
  1          0         4            0        0.00264   1.593E-6   3.735E-8        0      1.000

                                     Optimization Results

Iterations                                    1  Function Calls                              7
Jacobian Calls                                3  Active Constraints                          0
Objective Function                  0.002639616  Max Abs Gradient Element          3.735413E-8
Lambda                                        0  Actual Over Pred Change          0.9999999993
Radius                             0.0035701627

Convergence criterion (ABSGCONV=0.00001) satisfied.


                           Fit Summary

Modeling Info       N Observations                             200
                    N Variables                                  4
                    N Moments                                   10
                    N Parameters                                 9
                    N Active Constraints                         0
                    Independence Model Chi-Square         369.6536
                    Independence Model Chi-Square DF             6
Absolute Index      Fit Function                            0.0026
                    Chi-Square                              0.5253
                    Chi-Square DF                                1
                    Pr > Chi-Square                         0.4686
                    Z-Test of Wilson & Hilferty             0.0617
                    Hoelter Critical N                        1457
                    Root Mean Square Residual (RMSR)        0.6981
                    Standardized RMSR (SRMSR)               0.0075
                    Goodness of Fit Index (GFI)             0.9987
Parsimony Index     Adjusted GFI (AGFI)                     0.9868
                    Parsimonious GFI                        0.1664
                    RMSEA Estimate                          0.0000
                    RMSEA Lower 90% Confidence Limit             .
                    RMSEA Upper 90% Confidence Limit        0.1673
                    Probability of Close Fit                0.5686
                    ECVI Estimate                           0.0954
                    ECVI Lower 90% Confidence Limit              .
                    ECVI Upper 90% Confidence Limit         0.1264
                    Akaike Information Criterion           -1.4747
                    Bozdogan CAIC                          -5.7730
                    Schwarz Bayesian Criterion             -4.7730
                    McDonald Centrality                     1.0012
Incremental Index   Bentler Comparative Fit Index           1.0000
                    Bentler-Bonett NFI                      0.9986
                    Bentler-Bonett Non-normed Index         1.0078
                    Bollen Normed Index Rho1                0.9915
                    Bollen Non-normed Index Delta2          1.0013
                    James et al. Parsimonious NFI           0.1664


                                 PATH List

                                                      Standard
---------Path---------    Parameter      Estimate         Error       t Value

SCIENCE    <-    MATH     beta1           0.31901       0.07599       4.19778
SCIENCE    <-    READ     beta2           0.30153       0.06691       4.50637
SCIENCE    <-    WRITE    beta3           0.20653       0.07139       2.89302
MATH       <-    READ     beta4           0.38090       0.02625      14.50821
MATH       <-    WRITE    beta4           0.38090       0.02625      14.50821


                           Variance Parameters

Variance                                              Standard
Type         Variable    Parameter      Estimate         Error       t Value

Exogenous    READ        _Add1         105.12271      10.53865       9.97497
             WRITE       _Add2          89.84359       9.00690       9.97497
Error        MATH        _Add3          42.65279       4.27598       9.97497
             SCIENCE     _Add4          49.01931       4.91423       9.97497


              Covariances Among Exogenous Variables

                                             Standard
Var1     Var2    Parameter      Estimate         Error       t Value
WRITE    READ    _Add5          57.99673       8.02265       7.22912


        Squared Multiple Correlations

                Error         Total
Variable      Variance      Variance    R-Square

MATH          42.65279      87.76788      0.5140
SCIENCE       49.01931      97.93776      0.4995

The TCALIS Procedure
Covariance Structure Analysis: Maximum Likelihood Estimation

Stability Coefficient of Reciprocal Causation = 0

Stability Coefficient < 1

Total and Indirect Effects Converge


                   Effects on SCIENCE
          Effect / Std Error / tValue / pValue

                 Total            Direct          Indirect

READ            0.4230            0.3015            0.1215
                0.0609            0.0669            0.0301
                6.9458            4.5064            4.0324
                     0                 0                 0

WRITE           0.3280            0.2065            0.1215
                0.0658            0.0714            0.0301
                4.9860            2.8930            4.0324
                     0          0.003816                 0


                    Standardized Results for PATH List

                                                      Standard
---------Path---------    Parameter      Estimate         Error       t Value

SCIENCE    <-    MATH     beta1           0.30199       0.07066       4.27365
SCIENCE    <-    READ     beta2           0.31240       0.06792       4.59947
SCIENCE    <-    WRITE    beta3           0.19781       0.06789       2.91347
MATH       <-    READ     beta4           0.41686       0.02202      18.93443
MATH       <-    WRITE    beta4           0.38538       0.02065      18.66302


               Standardized Results for Variance Parameters

Variance                                              Standard
Type         Variable    Parameter      Estimate         Error       t Value

Exogenous    READ        _Add1           1.00000
             WRITE       _Add2           1.00000
Error        MATH        _Add3           0.48597       0.04940       9.83792
             SCIENCE     _Add4           0.50051       0.05015       9.98091


  Standardized Results for Covariances Among Exogenous Variables

                                              Standard
Var1     Var2    Parameter      Estimate         Error       t Value

WRITE    READ    _Add5           0.59678       0.04564      13.07520


             Standardized Effects on SCIENCE
          Effect / Std Error / tValue / pValue

                 Total            Direct          Indirect
READ            0.4383            0.3124            0.1259
                0.0594            0.0679            0.0304
                7.3808            4.5995            4.1472
                     0                 0                 0

WRITE           0.3142            0.1978            0.1164
                0.0613            0.0679            0.0281
                5.1272            2.9135            4.1365
                     0          0.003574                 0
We will focus our attention on three parts of the output above: the standardized results for the path list, the standardized results for the variance parameters and the standardized effects on science. We will use the standardized estimates as our path coefficients and the square root of the standardized error variance estimates for the error paths. Although both math paths share the parameter name beta4, their standardized estimates differ (0.41686 and 0.38538) because each is rescaled by the standard deviation of its own predictor. The error values are sqrt(0.48597) = 0.6971 (approximately 0.7) for math and sqrt(0.50051) = 0.7075 (approximately 0.7) for science. These path coefficients and error values are what we would add to the path diagram.
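
The error values can be reproduced with a short data step. This is only a sketch: the data set name error_paths and its variable names are ours, and the two variances are typed in from the standardized results for variance parameters above.

```sas
/* Error paths from the standardized error variances reported above. */
data error_paths;
  length variable $ 8;
  input variable $ std_error_var;
  error_path = sqrt(std_error_var);  /* path from the error term to the variable */
  r_square   = 1 - std_error_var;    /* proportion of variance explained */
  datalines;
math    0.48597
science 0.50051
;
run;

proc print data=error_paths noobs;
  format error_path r_square 6.4;
run;
```

The r_square column is included as a check: one minus the standardized error variance should agree with the R-Square values in the squared multiple correlations table (0.5140 for math and 0.4995 for science).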

Because we included the effpart statement in our model, proc tcalis also provides estimates of the direct, indirect and total effects for the two exogenous variables. From these results we see that the indirect effect of read is about 40% of its direct effect (0.1259 versus 0.3124), while for write the indirect effect is a bit more than half the size of the direct effect (0.1164 versus 0.1978). For this example, the estimates for all of the direct and indirect effects were statistically significant. This is not necessarily a common occurrence.
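
The effects reported by effpart follow the usual path-tracing arithmetic: the indirect effect of a predictor on science is the product of its path to math and the path from math to science, and the total effect is the direct effect plus the indirect effect. The following data step is a small sketch that reproduces the standardized effects from the standardized path estimates above; the data set name effects and the variable names are ours.

```sas
/* Direct, indirect and total standardized effects on science. */
data effects;
  length predictor $ 8;
  b_sci_math = 0.30199;               /* science <- math, copied from the output */
  input predictor $ b_math_x direct;  /* math <- x and science <- x */
  indirect = b_math_x * b_sci_math;   /* x -> math -> science */
  total    = direct + indirect;
  ratio    = indirect / direct;       /* indirect effect relative to direct effect */
  datalines;
read  0.41686 0.31240
write 0.38538 0.19781
;
run;

proc print data=effects noobs;
  var predictor direct indirect total ratio;
  format direct indirect total ratio 6.4;
run;
```

The indirect and total columns should agree with the standardized effects table to four decimal places, and the ratio column comes out at roughly 0.40 for read and 0.59 for write.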


These pages are Copyrighted (c) by UCLA Academic Technology Services

