### SAS Learning Module: Comparing SAS and Stata side by side


#### Introduction

The SAS and Stata programs below carry out the same analyses. You can copy and paste the SAS commands into the SAS program editor, and the Stata commands into Stata's do-file editor.

SAS program:

    /* SAS Program */
    options nocenter;

    data hsb200;
      infile 'hsb200.txt';
      input id gender $ race ses schtype $ prog
            read write math science socst;

      /* convert string variable to numeric */
      sex = .;
      if gender = 'm' then sex = 2;
      if gender = 'f' then sex = 1;

      /* create dichotomous dummy variable */
      /* assumes all write scores are non-missing */
      honor = (write ge 60);

      /* create difference variable for paired t-test */
      rminusw = read - write;
    run;

    /* descriptive statistics */
    proc means data=hsb200;
      var read write math;

    /* exploratory data analysis */
    proc univariate data=hsb200 plot;
      var write;

    /* frequency tables */
    proc freq data=hsb200;
      tables gender race ses;

    /* cross tabulation */
    proc freq data=hsb200;
      table gender*race / chisq;

    /* correlations */
    proc corr data=hsb200;
      var read write math sex;

    /* paired t-test */
    proc means n mean t stderr prt data=hsb200;
      var rminusw;

    /* independent t-test */
    proc ttest data=hsb200;
      class gender;
      var write;

    /* multiple regression */
    proc reg data=hsb200;
      model write = sex read math science;
    quit;

    /* logistic regression */
    proc logistic data=hsb200 descending;
      model honor = sex read math science;

    /* factorial anova */
    proc glm data=hsb200;
      class gender prog;
      model write = gender prog gender*prog;
    quit;
    run;

Stata program:

    /* Stata Program */
    infile id str1 gender race ses /*
        */ str3 schtype prog read write /*
        */ math science socst /*
        */ using hsb200.txt

    /* convert string variable to numeric */
    encode gender, generate(sex)

    /* create dichotomous dummy variable */
    /* assumes all write scores are non-missing */
    generate honor = (write >= 60)

    /* do not need to create difference variable for paired t-test */

    /* descriptive statistics */
    summarize read write math

    /* exploratory data analysis */
    summarize write, detail
    stem write
    graph box write
    pnorm write

    /* frequency tables */
    tab1 gender race ses

    /* cross tabulation */
    tabulate gender race, chi2

    /* correlations */
    correlate read write math sex

    /* paired t-test */
    ttest read = write

    /* independent t-test */
    ttest write, by(gender)

    /* multiple regression */
    regress write sex read math science

    /* logistic regression */
    logistic honor sex read math science
    logit

    /* factorial anova */
    anova write gender prog gender#prog

#### Additional Notes

• SAS treats numeric missing values as smaller than any number ("negative infinity"), while Stata treats them as larger than any number ("positive infinity"). For example, within Stata, values are ordered like this

all nonmissing numbers < . < .a < .b < ... < .z

In SAS, missing values sort below every number instead: ._ < . < .A < ... < .Z < all nonmissing numbers.
• Both SAS and Stata store dates the same way, as the number of days before (negative) or since (positive) January 1, 1960. So January 2, 1960 is stored as 1 in both SAS and Stata.
• SAS lets you work with multiple data files in the same session, while Stata traditionally holds only one data file in memory at a time (Stata 16 and later can hold several using frames).
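
The missing-value ordering matters for the honor variable created in the programs, which is why they assume that all write scores are non-missing. The do-file below is a minimal sketch of what happens when that assumption fails, and it also checks the date convention. The three-row dataset and the names honor_naive and honor_safe are made up for illustration and are not part of hsb200.

```stata
/* Illustrative only: tiny made-up dataset with one missing score */
clear
input write
52
65
.
end

/* Stata treats missing as larger than any number, so the comparison
   is true on the row where write is missing and honor_naive is 1 */
generate honor_naive = (write >= 60)

/* guarding with !missing() leaves honor_safe missing on that row */
generate honor_safe = (write >= 60) if !missing(write)
list

/* dates count days from January 1, 1960, so this displays 1 */
display td(02jan1960)
```

In SAS the same expression, (write ge 60), evaluates to 0 for a missing score because missing sorts below 60, so a missing score is silently coded as a non-honor student there. In either package it is safer to handle missing scores explicitly.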


## GIS and Visualization