UCLA Academic Technology Services

SAS Class Notes
Managing Data


1.0 SAS statements and procs in this unit

libname    Sets a library
keep       Keeps named variables
drop       Drops named variables
set        Reads in named file(s). If more than one is named, the files are combined (appended)
proc sort  Sorts cases in a dataset
merge      Merges files

2.0 Demonstration and explanation

2.1 Creating a library

Creating a library allows us to refer to a file in a specific directory (folder) without typing out the full file path. The libname statement creates a shortcut that refers back to a specified directory. The two proc print steps below show that you get the same results whether you refer to the file by the library name or by the full file path.
libname mylib "c:\sas_data\";

proc print data=mylib.hs1 (obs=10);
  var write read science;
run;
proc print data="c:\sas_data\hs1" (obs=10);
  var write read science;
run;
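
As a quick check that the library points where we expect, one option is to list the datasets it contains. This is a minimal sketch that assumes the mylib library from above has already been assigned; the nods option asks proc contents to show only the directory listing rather than the details of every dataset.

```sas
/* List the datasets stored in the mylib library */
proc contents data=mylib._all_ nods;
run;
```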

2.2 Selecting cases using where

Suppose we wish to analyze just a subset of the hs1 data file. Say we are studying "good readers" and want to focus on the students who had a reading score of 60 or higher. The following data step reads the hs1 dataset and stores a copy that contains only the students with reading scores of 60 or higher.

data mylib.goodread;
  set mylib.hs1;
  where (read >=60);
run;

proc means data=mylib.goodread;
  var read;
run;
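
If we only need the subset for a single analysis, it may not be necessary to store a copy at all. The sketch below is an alternative to the data step above, not a replacement for it: it applies the same condition through the where= dataset option directly on the original hs1 file.

```sas
/* Analyze only students with read >= 60, without creating a new dataset */
proc means data=mylib.hs1 (where=(read >= 60));
  var read;
run;
```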

2.3 Keeping variables

Further suppose that our data file had many variables, say 2000, but we care about only a handful of them: id, female, read and write. We can subset our data file to keep just those variables, as shown below.

data mylib.hskept;
  set mylib.goodread;
  keep id female read write;
run;

proc contents data=mylib.hskept;
run;

2.4 Dropping variables

Instead of keeping just a handful of variables, we may want to get rid of just a handful of variables in our data file. Below we show how to remove the variables ses and prog from the dataset.

data mylib.hsdropped;
  set mylib.goodread;
  drop ses prog;
run;

proc contents data=mylib.hsdropped;
run;
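
The keep and drop statements used above also exist as dataset options, keep= and drop=, which can be placed in parentheses after the file name on the set statement. The sketch below repeats the two examples in that form. The output names hskept2 and hsdropped2 are made up for illustration. With the option on the set statement, the unwanted variables are never read into the data step, which can help when a file has a very large number of variables.

```sas
/* Same result as the keep statement, written as a dataset option */
data mylib.hskept2;
  set mylib.goodread (keep=id female read write);
run;

/* Same result as the drop statement, written as a dataset option */
data mylib.hsdropped2;
  set mylib.goodread (drop=ses prog);
run;
```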

2.5 Appending datasets

In this example we start with two datasets, one for the males (called hsmale) and one for the females (called hsfemale). We need to combine these files to be able to analyze them together, as shown below. In this example we are adding cases, which is sometimes called "stacking" the data files. We do this by listing both data file names on the set statement in a data step.

proc freq data=mylib.hsmale;
  tables female;
run;

proc freq data=mylib.hsfemale;
  tables female;
run;

data mylib.hsmaster;
  set mylib.hsmale mylib.hsfemale;
run;

proc freq data=mylib.hsmaster;
  tables female;
run;
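
When stacking files it can be useful to record which file each case came from. The sketch below is one way to do that with the in= dataset option, which creates a temporary 0/1 variable (here called frommale) that is not written to the output file. The names hsmaster_src, frommale and source are made up for illustration.

```sas
data mylib.hsmaster_src;
  /* Set the length first so that "hsfemale" is not truncated */
  length source $ 8;
  set mylib.hsmale (in=frommale) mylib.hsfemale;
  if frommale then source = "hsmale";
  else source = "hsfemale";
run;

/* Cross-tabulate the source file against female as a check */
proc freq data=mylib.hsmaster_src;
  tables source*female;
run;
```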

2.6 Merging datasets

Again, we have been given two files. However, in this case, we have a file that has the demographic information (called hsdem) and a file with the test scores (called hstest), and we wish to merge these files together. To merge files together, each file must first be sorted by the same variable and then saved. Both the sorting and the saving can be done with proc sort.  Next, a data step with the merge and by statements is used to combine the datasets.

Before we begin, we should look at the data sets.

proc print data=mylib.hsdem (obs=10);
run;

proc print data=mylib.hstest (obs=10);
run;

Next, we will sort the data sets by the variable that identifies each case in both datasets, in this case, the variable id.

proc sort data=mylib.hsdem out=dem;
  by id;
run;

proc sort data=mylib.hstest out=test;
  by id;
run;

Now we can merge the files and look at the resulting data set.

data mylib.all;
  merge dem test;
  by id;
run;

proc contents data="c:\sas_data\all";
run;
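
After a merge it is often worth checking whether every id was found in both files. The sketch below is one possible check, assuming the sorted dem and test datasets created above still exist. It uses the in= dataset option to flag where each case came from and keeps only the ids that appear in just one of the two files. The names unmatched, indem and intest are made up for illustration. If every id appears in both files, the resulting dataset should have no observations.

```sas
/* unmatched is stored in the temporary WORK library */
data unmatched;
  merge dem (in=indem) test (in=intest);
  by id;
  /* Keep ids found in only one of the two files */
  if not (indem and intest);
  keep id;
run;

proc print data=unmatched;
run;
```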

These pages are Copyrighted (c) by UCLA Academic Technology Services


The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California.