| Procedure | Purpose |
|---|---|
| proc contents | Contents of a SAS dataset |
| proc print | Displays the data |
| proc means | Descriptive statistics |
| proc univariate | More descriptive statistics |
| proc sort | Sort a dataset |
| proc boxplot | Boxplots |
| proc freq | Frequency tables and crosstabs |
| proc chart | ASCII histogram |
| proc corr | Correlation matrix |
| proc reg | OLS regression |
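Before starting any statistical exploration, we look at the data using proc contents and proc print. The position option on proc contents also lists the variables in the order in which they are stored in the dataset, and (obs=20) limits the printout to the first 20 observations. Note that the variable prgtype is a string variable.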
options nocenter;
proc contents position data='c:\sas\hs0'; run;
proc print data='c:\sas\hs0' (obs=20); run;
proc print data='c:\sas\hs0'; var gender id race ses schtyp prgtype read; run;
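proc means produces descriptive statistics. With no var statement, as in the first example, it summarizes every numeric variable in the dataset. The later examples add the class statement to get descriptive statistics within groups and use proc univariate to display additional descriptive statistics.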
proc means data='c:\sas\hs0'; run;
With the var statement, we can specify which variables we want to analyze. The n, mean, median, std and var options on the proc means statement indicate which statistics we want computed: the number of observations, mean, median, standard deviation and variance.
proc means data='c:\sas\hs0' n mean median std var; var read math science write; run;
We use the where statement below to look at just those students with a reading score of 60 or higher.
proc means data='c:\sas\hs0' n mean median std var; where read>=60; var read math science write; run;
With the class statement, we get the descriptive statistics broken down by prgtype.
proc means data='c:\sas\hs0' n mean median std var; class prgtype; var read math science write; run;
We can use proc univariate to get detailed descriptive statistics for write along with a histogram with a normal overlay.
proc univariate plot data='c:\sas\hs0'; var write; histogram / normal; run;
We can use proc boxplot to get side-by-side boxplots for the variable write broken down by the levels of prgtype; however, this requires that we first sort the data using proc sort.
proc sort data='c:\sas\hs0'; by prgtype; run;
proc boxplot data='c:\sas\hs0'; plot write*prgtype / boxstyle=schematic boxwidth=10; run;
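Because proc sort without an out= option replaces the input dataset with its sorted version, it may be preferable to leave the original file untouched. The sketch below writes the sorted copy to a temporary dataset instead; the name hs0_sorted is hypothetical and chosen only for illustration.

```sas
/* Sort into a temporary WORK dataset (hypothetical name hs0_sorted)
   so the permanent file c:\sas\hs0 is not rewritten. */
proc sort data='c:\sas\hs0' out=hs0_sorted;
  by prgtype;
run;

/* Draw the side-by-side boxplots from the sorted copy. */
proc boxplot data=hs0_sorted;
  plot write*prgtype / boxstyle=schematic boxwidth=10;
run;
```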
Below proc freq is used to get a frequency table for ses and proc chart shows a bar chart of this distribution.
proc freq data='c:\sas\hs0'; table ses; run;
proc chart data='c:\sas\hs0'; vbar ses / discrete; run;
Next we use proc freq to get frequencies for write. Because every distinct score gets its own row, the table is long and hard to read, which illustrates why it can be undesirable to request frequencies for continuous variables.
proc freq data='c:\sas\hs0'; table write; run;
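One possible way to make a frequency table for a continuous variable readable is to group its values into ranges with a user-defined format. The following is a minimal sketch under assumptions: the format name writegrp and the cut points are hypothetical and not part of the original notes.

```sas
/* Hypothetical format: the name and the cut points are illustrative only. */
proc format;
  value writegrp
    low -< 40  = 'below 40'
    40  -< 50  = '40 to 49'
    50  -< 60  = '50 to 59'
    60  - high = '60 or above';
run;

/* proc freq tabulates the formatted values, so write is reported
   in four ranges instead of one row per distinct score. */
proc freq data='c:\sas\hs0';
  table write;
  format write writegrp.;
run;
```

The format is applied only for the duration of this proc freq step; the values stored in the dataset are unchanged.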
Here we use proc freq to get frequencies for gender, schtyp and prgtype, each table shown separately.
proc freq data='c:\sas\hs0'; table gender schtyp prgtype; run;
Below we show how to get a crosstab of prgtype by ses, and the next example shows how to include a chi square test and how to get the expected frequencies.
proc freq data='c:\sas\hs0'; table prgtype*ses; run;
proc freq data='c:\sas\hs0'; table prgtype*ses / chisq expected; run;
proc corr is used to get correlations among variables. By default, proc corr uses pairwise deletion for missing observations. If you use the nomiss option, proc corr uses listwise deletion and omits all observations with missing data on any of the named variables.
proc corr data='c:\sas\hs0'; var write read science; run;
proc corr data='c:\sas\hs0' nomiss; var write read science; run;
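The difference between pairwise and listwise deletion is easiest to see on a small dataset with a few missing values. The sketch below uses a made-up dataset named demo with hypothetical variables x, y and z; it is not part of hs0.

```sas
/* Made-up data for illustration: y is missing in row 2, z in row 3. */
data demo;
  input x y z;
  datalines;
1 2 3
2 . 5
3 6 .
4 8 9
5 9 11
;
run;

/* Pairwise deletion (default): each correlation uses every row
   in which both variables of that pair are present. */
proc corr data=demo;
  var x y z;
run;

/* Listwise deletion: only rows complete on x, y and z are used. */
proc corr data=demo nomiss;
  var x y z;
run;
```

In this toy dataset the pairwise run bases x with y on 4 rows, x with z on 4 rows and y with z on 3 rows, whereas the nomiss run bases every correlation on the same 3 complete rows.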
We conclude with proc reg showing a simple regression predicting write from read along with a scatterplot and regression line.
proc reg data='c:\sas\hs0'; model write=read; plot write*read; run; quit;
These pages are Copyrighted (c) by UCLA Academic Technology Services