SAS Class Notes
Entering Data


1.0 SAS statements and procs in this unit

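data           Begins a data step, which creates or manipulates data sets
infile         Identifies an external raw data file to read
input          Lists the variables to read from the input file
length         Sets the storage length of a variable
datalines      Indicates that the data lines follow within the program
libname        Assigns a library reference; used here with an engine to connect to Microsoft Excel files
set            Reads a SAS data set
proc import    Imports an external file (such as an Excel file) into a SAS data set
proc contents  Shows the contents of a data set
proc print     Prints observations of variables in a data set
proc copy      Copies SAS files from one location to another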

2.0 Demonstration and explanation

Import Wizard, Libnames and Proc import

We start by importing an Excel file into SAS through the SAS Import Wizard (File, then Import Data).

Below is the SAS syntax to import the same Excel file.

proc import datafile="c:\sas_data\hs0.xls" dbms = xls out=hs0;
run;

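Another way is to use the libname statement, which will be reintroduced in a later unit. Here it assigns the library reference xlsdata to the Excel workbook, so that each worksheet can be used like a SAS data set. The worksheet hs0 is referred to by the name literal "hs0$"n, and the final libname statement releases the workbook.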

libname xlsdata 'c:\sas_data\hs0.xls';
proc print data = xlsdata."hs0$"n (obs=10);
run;
libname xlsdata clear;

The Import Wizard and proc import (menus or syntax) also work for other file formats, such as comma-separated or tab-delimited files, and Stata or SPSS datasets. Once the data are in SAS, we can look at them or modify them if we want.
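
As a minimal sketch of the same proc import syntax applied to a comma-separated file, the step below assumes that c:\sas_data\hs0.csv exists and has no header row. The output name hs0_csv is hypothetical. With getnames=no, SAS assigns the default variable names VAR1, VAR2, and so on.

```sas
/* Assumption: c:\sas_data\hs0.csv exists and has no variable names on line 1 */
proc import datafile="c:\sas_data\hs0.csv" dbms=csv out=hs0_csv replace;
  getnames=no;   /* the first line is data, not variable names */
run;

/* List the first ten observations to check the import */
proc print data=hs0_csv (obs=10);
run;
```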

Data Steps

One of the more commonly used ASCII data formats is the comma-separated-values (.csv) format. Files of this type can be read in through the Import Wizard or proc import as shown above, or with a little programming. We will now show how to read a .csv file with a SAS data step. The following segment is the beginning of the hs0 file in .csv format. This data file does not have variable names on the first line. Also notice that the ninth line (id 84) has two consecutive commas near the end, which means that the value between them is missing. To read the data correctly, we use the dsd option in the infile statement, which treats two consecutive delimiters as enclosing a missing value.

0,70,4,1,1,general,57,52,41,47,57
1,121,4,2,1,vocational,68,59,53,63,61
0,86,4,3,1,general,44,33,54,58,31
0,141,4,3,1,vocational,63,44,47,53,56
0,172,4,2,1,academic,47,52,57,53,61
0,113,4,2,1,academic,44,52,51,63,61
0,50,3,2,1,general,50,59,42,53,61
0,11,1,2,1,academic,34,46,45,39,36
0,84,4,2,1,general,63,57,54,,51
0,48,3,2,1,academic,57,55,52,50,51
0,75,4,2,1,vocational,60,46,51,53,61
0,60,5,2,1,academic,57,65,51,63,61

The following data step reads the data file and creates a data set named temp. The infile statement tells SAS the location and name of the ASCII file. The input statement gives the names of the variables in the same order as they appear in the comma-separated file. The $ after prgtype tells SAS that prgtype is a string variable, that is, a variable that can contain letters as well as numbers.

The length statement tells SAS that prgtype is a string (as in the input statement, the $ indicates a string variable) that holds up to ten characters (indicated by the 10 following the $). By default, a string variable read this way holds 8 or fewer characters, so a longer value such as vocational would be cut off. If the string is to be longer, you have to tell SAS using the length statement, and the length statement must come before the input statement. Note that once the length statement has declared the variable a string, the $ after prgtype in the input statement is not necessary; however, including it is not a problem.

data temp;
  infile 'c:\sas_data\hs0.csv' delimiter=',' dsd;
  length prgtype $10;
  input gender id race ses schtyp prgtype $ read write math science socst ;
run;

Once we have entered the data, we can list the first ten observations to check that the inputting was successful. Note that proc print "prints" the data to the output window, not to a physical printer.

proc print data = temp (obs=10);
run;
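
The effect of dsd can be checked without an external file. The sketch below reads two of the lines shown above as in-stream data (infile datalines). The data set name dsd_demo is hypothetical. With dsd, science is missing on the second observation and socst is 51. Without dsd, the two commas would be treated as a single delimiter and 51 would be read into science.

```sas
/* Hypothetical data set name; the two records are copied from the listing above */
data dsd_demo;
  infile datalines delimiter=',' dsd;   /* dsd: consecutive commas mean a missing value */
  length prgtype $10;                   /* allow up to 10 characters for prgtype */
  input gender id race ses schtyp prgtype $ read write math science socst;
datalines;
0,70,4,1,1,general,57,52,41,47,57
0,84,4,2,1,general,63,57,54,,51
;
run;

proc print data=dsd_demo;
run;
```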

Another commonly used ASCII data format is fixed format, in which each variable occupies the same columns on every line. Because no delimiter separates the values, a codebook is needed to specify which columns correspond to which variable. Here is a small example of this type of data with its codebook.

        195  094951
        26386161941
        38780081841
        479700  870
        56878163690
        66487182960
        786  069  0
        88194193921
        98979090781
       107868180801

variable name   column number
id              1-2
a1              3-4
t1              5-6
gender          7
a2              8-9
t2              10-11
tgender         12

data fixed;
  infile "c:\sas_data\schdat.fix";
  input id 1-2 a1 3-4 t1 5-6 gender 7 a2 8-9 t2 10-11 tgender 12;
run;

proc print data = fixed;
run;
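
The column input used above can also be tried on in-stream data. The sketch below, with the hypothetical data set name fixed_demo, reads three of the lines from the listing, assuming that each line starts in column 1 so that the id occupies columns 1-2. Blank columns, such as columns 5-6 of the first line, are read as missing values.

```sas
/* Hypothetical data set name; each variable is followed by the columns it occupies */
data fixed_demo;
  input id 1-2 a1 3-4 t1 5-6 gender 7 a2 8-9 t2 10-11 tgender 12;
datalines;
 195  094951
 26386161941
107868180801
;
run;

proc print data=fixed_demo;
run;
```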

Sometimes we may want to enter data directly within a SAS program. The datalines statement marks the start of the data lines, which end at the line containing only a semicolon.

data hsb10;
  input id female race ses schtype $ prog
        read write math science socst;
datalines;
 147 1 1 3 pub 1 47  62  53  53  61
 108 0 1 2 pub 2 34  33  41  36  36
  18 0 3 2 pub 3 50  33  49  44  36
 153 0 1 2 pub 3 39  31  40  39  51
  50 0 2 2 pub 2 50  59  42  53  61
  51 1 2 1 pub 2 42  36  42  31  39
 102 0 1 1 pub 1 52  41  51  53  56
  57 1 1 2 pub 1 71  65  72  66  56
 160 1 1 2 pub 1 55  65  55  50  61
 136 0 1 2 pub 1 65  59  70  63  51
;
run;

proc print data=hsb10;
run;
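
proc contents, listed in section 1.0, complements proc print: it reports the variable names, types and lengths rather than the data values. A minimal sketch on a small in-stream data set (the name demo and its three variables are chosen only for illustration):

```sas
/* Hypothetical data set used only to illustrate proc contents */
data demo;
  length prgtype $10;          /* character variable of length 10 */
  input id prgtype $ read;
datalines;
70 general 57
121 vocational 68
;
run;

/* Lists each variable with its type (numeric or character) and length */
proc contents data=demo;
run;
```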

Saving SAS Data Files

So far, all the SAS data sets that we have created are temporary: they are stored in the work library, which is emptied when we quit SAS. To save a SAS data file to disk, we can use a data step. The example below saves the dataset temp from above as c:\sas_data\hs0 (SAS will automatically add the file extension .sas7bdat to the file name hs0).

data 'c:\sas_data\hs0';
  set temp;
run;

We can use permanent SAS data files by referring to them by their path and file name.

proc print data='c:\sas_data\hs0';
run;
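
A permanent data set can also be written through a library reference and a two-level name (libref.member). The sketch below is one possible way to do this, not the method used above. It assumes the folder c:\sas_data exists; the libref mylib and the data set scores are hypothetical names.

```sas
/* Assumption: the folder c:\sas_data exists; mylib is a hypothetical libref */
libname mylib 'c:\sas_data';

/* Small in-stream data set so the sketch does not depend on earlier steps */
data work.scores;
  input id read write;
datalines;
70 57 52
121 68 59
;
run;

/* The two-level name writes the file c:\sas_data\scores.sas7bdat */
data mylib.scores;
  set work.scores;
run;

proc print data=mylib.scores;
run;

/* Release the library reference when done */
libname mylib clear;
```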

To copy all the files from one library to another, we can use proc copy. For example, we can save all the SAS data files created so far in the work library to another location.

libname myout 'h:\sas_data\';
proc copy out=myout in=work;
run;
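
To copy only some of the data sets, proc copy accepts a select statement. The sketch below assumes that the data sets temp and hsb10 created earlier still exist in the work library and that the folder h:\sas_data\ exists.

```sas
/* Assumptions: work.temp and work.hsb10 exist; the folder h:\sas_data\ exists */
libname myout 'h:\sas_data\';

/* memtype=data restricts the copy to data sets; select names the members to copy */
proc copy out=myout in=work memtype=data;
  select temp hsb10;
run;
```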

3.0 For more information
