|data||Begins a data step which manipulates datasets|
|infile||Identifies an external raw data file to read|
|input||Lists variable names in the input file|
|datalines||Indicates internal data|
|libname||Assigns a library reference; used here with an engine that connects to Microsoft files|
|set||Reads a SAS data set|
|proc contents||Contents of a data set|
|proc print||Prints observations of variables in a data set|
|proc copy||Copies SAS files from one location to another|
Import Wizard, Libnames and Proc import
We will start by reading an Excel file into SAS with the SAS Import Wizard.
File > Import Data
Choose Excel .xls format (this is the default)
Click on Next
Click on Browse to select a file: c:\sas_data\hs0.xls
The default option is to read variable names from the first line; leave it as it is.
Click on Next
Enter a name (hs0) for the data set
Click on Finish
Below is the SAS syntax to import the same Excel file.
proc import datafile="c:\sas_data\hs0.xls" dbms=xls out=hs0;
run;
Another way is to use the libname statement, which will be reintroduced in a later unit.
libname xlsdata 'c:\sas_data\hs0.xls';
proc print data = xlsdata."hs0$"n (obs=10);
run;
libname xlsdata clear;
Both of the methods above (menus or syntax) work for other file formats, such as comma-separated or tab-delimited files, and Stata or SPSS datasets. Now we can look at the data or even modify them if we want.
Explorer > Libraries > Work
Double click on hs0
Edit > Edit Mode
Click on data to modify data
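The text above notes that proc import also handles comma-separated files. A minimal sketch follows, assuming a comma-separated version of the same data at c:\sas_data\hs0.csv with no variable names on the first line. The output name hs0_csv is hypothetical. With getnames=no, SAS assigns the default names VAR1, VAR2, and so on.

```sas
/* Sketch: import a headerless .csv file with proc import.        */
/* Assumptions: the file path and the output name hs0_csv are     */
/* illustrative; adjust them to your own setup.                   */
proc import datafile="c:\sas_data\hs0.csv"
    dbms=csv
    out=hs0_csv
    replace;          /* overwrite hs0_csv if it already exists   */
  getnames=no;        /* first line holds data, not variable names */
run;
```

The replace option is included only so that the step can be rerun without an error when the data set already exists.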
One of the more commonly used ASCII data formats is the comma-separated-values (.csv) format. Files of this type can be read in through the Import Wizard or proc import as shown above, or with a short data step, which we show now. The following segment is the beginning of the hs0 file in .csv format. This data file does not have variable names on the first line. Also notice that the ninth line (id 84) has two consecutive commas near the end, which means that the value between them is missing. To read the data correctly, we use the dsd option in the infile statement, which treats two consecutive delimiters as a missing value.
0,70,4,1,1,general,57,52,41,47,57
1,121,4,2,1,vocational,68,59,53,63,61
0,86,4,3,1,general,44,33,54,58,31
0,141,4,3,1,vocational,63,44,47,53,56
0,172,4,2,1,academic,47,52,57,53,61
0,113,4,2,1,academic,44,52,51,63,61
0,50,3,2,1,general,50,59,42,53,61
0,11,1,2,1,academic,34,46,45,39,36
0,84,4,2,1,general,63,57,54,,51
0,48,3,2,1,academic,57,55,52,50,51
0,75,4,2,1,vocational,60,46,51,53,61
0,60,5,2,1,academic,57,65,51,63,61
The following data step reads the data file and creates a data set named temp. The infile statement tells SAS the location and name of the ASCII file. The input statement gives the names of the variables in the same order as they appear in the comma-separated file. The $ after prgtype tells SAS that prgtype is a string variable, that is, a variable that can contain letters as well as numbers. The length statement also declares prgtype as a string (as in the input statement, the $ indicates a string variable) and sets its length to ten characters (indicated by the 10 following the $). By default, SAS allows a string variable to hold 8 or fewer characters. If the string is longer, you have to tell SAS with the length statement. Note that once the variable has been declared as a string in the length statement, the $ after prgtype in the input statement is no longer necessary; including it does no harm.
data temp;
  infile 'c:\sas_data\hs0.csv' delimiter=',' dsd;
  length prgtype $10;
  input gender id race ses schtyp prgtype $ read write math science socst;
run;
Once we have entered the data, we can list the first ten observations to check that the inputting was successful. Note that proc print "prints" the data to the output window, not to a physical printer.
proc print data = temp (obs=10);
run;
Another type of commonly used ASCII data format is fixed format. It always requires a codebook to specify which column corresponds to which variable. Here is a small example of this type of data with a codebook.
195 094951
26386161941
38780081841
479700 870
56878163690
66487182960
786 069 0
88194193921
98979090781
107868180801
variable name   column number
id              1-2
a1              3-4
t1              5-6
gender          7
a2              8-9
t2              10-11
tgender         12
data fixed;
  infile "c:\sas_data\schdat.fix";
  input id 1-2 a1 3-4 t1 5-6 gender 7 a2 8-9 t2 10-11 tgender 12;
run;
proc print data = fixed;
run;
Sometimes we may want to input data directly from within SAS and here is what to do.
data hsb10;
  input id female race ses schtype $ prog read write math science socst;
  datalines;
147 1 1 3 pub 1 47 62 53 53 61
108 0 1 2 pub 2 34 33 41 36 36
18 0 3 2 pub 3 50 33 49 44 36
153 0 1 2 pub 3 39 31 40 39 51
50 0 2 2 pub 2 50 59 42 53 61
51 1 2 1 pub 2 42 36 42 31 39
102 0 1 1 pub 1 52 41 51 53 56
57 1 1 2 pub 1 71 65 72 66 56
160 1 1 2 pub 1 55 65 55 50 61
136 0 1 2 pub 1 65 59 70 63 51
;
run;
proc print data=hsb10;
run;
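The command table at the top lists proc contents, which describes a data set rather than listing its rows. As a quick sketch, it could be run on the hsb10 data set created above to check the variable names, types, and lengths that SAS assigned:

```sas
/* Sketch: describe the hsb10 data set created by the data step above. */
/* The output lists each variable with its type and length.            */
proc contents data=hsb10;
run;
```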
Saving SAS Data Files
So far, all the SAS data sets that we have created are temporary. When we quit SAS, all temporary data sets will be gone. To save a SAS data file to disk we can use a data step. The example below saves the dataset temp from above as c:\sas_data\hs0 (SAS will automatically add the file extension .sas7bdat to the file name hs0).
data 'c:\sas_data\hs0';
  set temp;
run;
We can use permanent SAS data files by referring to them by their path and file name.
proc print data='c:\sas_data\hs0'; run;
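When several data sets live in the same folder, an alternative that may be more convenient is to assign a libref with libname and use a two-level name. The sketch below assumes a hypothetical libref named mydata that points at c:\sas_data. It is one possible approach, not the only one.

```sas
/* Sketch: save and reuse a permanent data set through a libref. */
/* Assumption: mydata is an illustrative libref name.            */
libname mydata 'c:\sas_data';

data mydata.hs0;        /* written to c:\sas_data\hs0.sas7bdat */
  set temp;
run;

proc print data=mydata.hs0 (obs=10);
run;

libname mydata clear;   /* release the libref when finished */
```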
To save all the files from one library to another, we can use proc copy. For example, we can copy all the SAS data files we have created so far in the work library to another location.
libname myout 'h:\sas_data\';
proc copy out=myout in=work;
run;
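If only some data sets should be copied, proc copy accepts a select statement. A sketch follows, assuming the temp and hsb10 data sets created earlier still exist in the work library and reusing the myout libref from above:

```sas
/* Sketch: copy only selected data sets from work to myout.       */
/* Assumption: temp and hsb10 exist in work from the steps above. */
libname myout 'h:\sas_data\';

proc copy in=work out=myout memtype=data;
  select temp hsb10;    /* copy just these two data sets */
run;
```

The memtype=data option restricts the copy to data sets, so other member types in work, such as catalogs, are left alone.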