UCLA Academic Technology Services

SAS Class Notes
Analyzing Data


1.0 SAS statements and procs in this unit

proc ttest       t-tests, including one sample, two sample and paired
proc freq        Used here for chi-squared tests
proc reg         Simple and multiple regression
proc glm         Used here for ANOVA models
proc logistic    Logistic regression
proc npar1way    Non-parametric analyses
proc univariate  Used here for signrank tests

2.0 Demonstration and explanation

2.1 Chi-squared test

Below we use proc freq to perform a chi-squared test and to show the expected frequencies used to compute the test statistic.

proc freq data='c:\sas_data\hs0';
  table prgtype*ses / chisq expected;
run; 
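
To see where the expected frequencies come from, the same call can be tried on a small table that is easy to check by hand. The sketch below is only an illustration: the dataset toy_counts and its variables grp, level and count are hypothetical and are not part of the hs0 data. Each expected frequency is the row total times the column total divided by the grand total, so for the cell (a, low) it is 30 * 40 / 100 = 12.

```sas
* Hypothetical toy counts, not taken from the hs0 data;
data toy_counts;
  input grp $ level $ count;
  datalines;
a low 10
a high 20
b low 30
b high 40
;
run;

* weight tells proc freq that each row holds a cell count;
proc freq data=toy_counts;
  weight count;
  tables grp*level / chisq expected;
run;
```

The weight statement is needed only because the toy data are already tabulated; with one row per student, as in hs0, it is omitted.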

2.2 t-tests

This is the one-sample t-test, testing whether the sample of writing scores was drawn from a population with a mean of 50.

proc ttest data='c:\sas_data\hs1' H0=50;
  var write;
run;

This is the paired t-test, testing whether or not the mean of write equals the mean of read.

proc ttest data='c:\sas_data\hs1';
  paired write*read;
run;
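
A paired t-test is equivalent to a one-sample t-test on the within-pair differences, tested against zero. The sketch below shows that equivalence; the dataset name hs1_diff and the variable diff are introduced here only for illustration.

```sas
* hs1_diff and diff are hypothetical names used for this sketch;
data hs1_diff;
  set 'c:\sas_data\hs1';
  diff = write - read;
run;

* one-sample t-test of the mean difference against zero;
proc ttest data=hs1_diff h0=0;
  var diff;
run;
```

The t statistic and p-value for diff should match those reported by the paired statement above.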

This is the two-sample independent t-test. The output includes the t-test under both equal variances (the pooled method) and unequal variances (the Satterthwaite method). The class statement is required; it names the variable whose two levels define the groups to be compared.

proc ttest data='c:\sas_data\hs1';
  class female;
  var write;
run;

2.3 ANOVA

SAS has a procedure called proc anova, but it is intended only for designs with an equal number of observations in each of the ANOVA cells (a balanced design). proc glm is a much more general procedure that works with both balanced and unbalanced designs (unbalanced meaning an unequal number of observations in each cell).

In this example we use proc glm to perform a one-way analysis of variance. The class statement indicates that prog is a categorical variable. The ss3 option restricts the output to the Type III sums of squares, which are the sums of squares appropriate for an unbalanced design.

proc glm data='c:\sas_data\hs1';
  class prog;
  model write=prog / ss3;
run;
quit;

Here proc glm performs an analysis of covariance (ANCOVA). In this example, prog is the categorical predictor and read is the continuous covariate.

proc glm data='c:\sas_data\hs1';
  class prog;
  model write = read prog / ss3;
run;
quit;
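
As a possible extension of the ANCOVA above, the sketch below requests both Type I and Type III sums of squares and the adjusted group means. Type I sums of squares are sequential, so they depend on the order of the terms in the model statement, while Type III sums of squares adjust each term for all the others. The lsmeans statement is an addition that does not appear in the original example; it reports the mean of write for each level of prog, adjusted for read.

```sas
proc glm data='c:\sas_data\hs1';
  class prog;
  * ss1 and ss3 print both sets of sums of squares for comparison;
  model write = read prog / ss1 ss3;
  * least squares means of write by prog, adjusted for read;
  lsmeans prog;
run;
quit;
```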

2.4 Regression

Plain old OLS regression. proc reg is a very powerful and versatile procedure. The following examples illustrate just a few of its many uses.

proc reg data='c:\sas_data\hs1';
  model write = female read;
run;
quit;

Specifying plots=diagnostics on the proc reg statement produces a number of diagnostic graphs. The output statement creates a new dataset, called temp, which includes the predicted values (requested with the p= option) and the residuals (requested with the r= option). The proc print step displays the values of selected variables for the first 20 observations of the temp dataset.

ods graphics on;
proc reg data='c:\sas_data\hs1' plots=diagnostics;
  model math = write socst;
  output out=temp p=predict r=resid;
run;
quit;
ods graphics off;
proc print data=temp (obs=20);
  var math predict resid;
run;

2.5 Logistic regression

To demonstrate logistic regression, we will create a dichotomous variable called honcomp (honors composition), which equals 1 when the logical test write >= 60 is true and 0 when it is not. This variable is created purely for illustrative purposes.

data hs2;
  set 'c:\sas_data\hs1';
  honcomp = (write >= 60);
run;
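
One caution about the logical expression: SAS treats a missing numeric value as smaller than any number, so write >= 60 evaluates to 0 when write is missing. If write could contain missing values, a guarded version such as the sketch below leaves honcomp missing in those rows. The dataset name hs2_checked is hypothetical, and whether hs1 actually contains missing values of write is not stated here.

```sas
* hs2_checked is a hypothetical name used for this sketch;
data hs2_checked;
  set 'c:\sas_data\hs1';
  * leave honcomp missing when write is missing;
  if not missing(write) then honcomp = (write >= 60);
run;

* check the range of write within each value of honcomp;
proc means data=hs2_checked n nmiss min max;
  class honcomp;
  var write;
run;
```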

proc logistic performs the logistic regression. By default, proc logistic models the probability of the lower-sorted response value, which here would be honcomp = 0. The descending option reverses that order so that the model describes the probability of honcomp = 1, the event of interest. Without it, the coefficients change sign and the odds ratios are inverted.

proc logistic data=hs2 descending;
  model honcomp = female read;
run; 
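
An alternative to descending is to name the modeled level directly with the event= response option. The sketch below should fit the same model as the call above; it assumes the hs2 dataset created earlier.

```sas
proc logistic data=hs2;
  * event= names the modeled response level explicitly;
  model honcomp(event='1') = female read;
run;
```

With either form, the note below the Response Profile table in the output states which level is being modeled, which makes it a useful place to confirm the direction of the model.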

2.6 Nonparametric Tests

The sign test is the nonparametric analog of the single-sample t-test. It appears in the "Tests for Location" table of the proc univariate output, alongside the Student's t and signed rank tests. The value being tested is specified by the mu0= option on the proc univariate statement.

proc univariate data='c:\sas_data\hs1' mu0=50;
  var write;
run;

The signrank (Wilcoxon signed rank) test is the nonparametric analog of the paired t-test. To obtain it, first compute the difference between the two variables in a separate data step, then test the new difference variable in proc univariate. Because mu0= defaults to 0, no option is needed on the proc univariate statement. The signrank test is found in the section of the output called "Tests for Location".

data hs1c;
  set 'c:\sas_data\hs1';
  diff = read - write;
run;

proc univariate data=hs1c;
  var diff;
run;

The ranksum test (the Wilcoxon rank-sum test, also known as the Mann-Whitney test) is the nonparametric analog of the independent two-sample t-test.

proc npar1way data='c:\sas_data\hs1';
  class female;
  var write;
run;

The Kruskal-Wallis test is the nonparametric analog of the one-way ANOVA. proc npar1way reports it as part of its Wilcoxon scores analysis.

proc npar1way data='c:\sas_data\hs1';
  class ses;
  var write;
run;
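
With no options, proc npar1way prints several different analyses, of which the Wilcoxon scores analysis is only one. As a sketch, the wilcoxon option on the proc npar1way statement limits the output to that analysis, which contains the rank-sum test for a two-level class variable and the Kruskal-Wallis test.

```sas
* wilcoxon restricts the output to the Wilcoxon scores analysis;
proc npar1way data='c:\sas_data\hs1' wilcoxon;
  class female;
  var write;
run;
```

The same option can be added to the call that uses ses as the class variable.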

3.0 For more information


These pages are Copyrighted (c) by UCLA Academic Technology Services


The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California.