Note: This page was written using SAS version 9.1.3.
It is quite easy to read a file that uses a comma as a delimiter using proc import in SAS. There are two slightly different ways of reading a comma delimited file using proc import. In SAS, a comma delimited file can be considered as a special type of external file with the file extension .csv, which stands for comma-separated values. We show here the first sample program making use of this feature. Let's say we have the following data stored in a file called comma.csv.
AMC,22,3,2930,0,11:11
AMC,17,3,3350,0,11:30
AMC,22,,2640,0,12:34
Audi,17,5,2830,1,13:20
Audi,23,3,2070,1,11:11
Then the following proc import statement will read it in and create a temporary data set called mydata.
proc import datafile="comma.csv" out=mydata dbms=csv replace;
getnames=no;
run;
proc print data=mydata;
run;
As you can see in the output below, the data was read properly. Also notice that SAS creates default variable names VAR1-VARn when variable names are not present in the raw data file.
Obs    VAR1    VAR2    VAR3    VAR4    VAR5    VAR6
1      AMC     22      3       2930    0       11:11
2      AMC     17      3       3350    0       11:30
3      AMC     22      .       2640    0       12:34
4      Audi    17      5       2830    1       13:20
5      Audi    23      3       2070    1       11:11
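Because proc import works out each variable's type from the data itself, it can be worth checking what it decided. The short sketch below is not part of the original example; it assumes the data set mydata created by the program above and simply lists its variables with their types and formats.

```sas
/* List the variables in mydata with the types, lengths, and formats */
/* that proc import assigned (assumes mydata was created as above).  */
proc contents data=mydata;
run;
```

If a variable you expected to be numeric is listed as character, the raw file probably contains a non-numeric value in that column.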
Your file might have the variable names on its first line, like the one below. With such a file, you would like SAS to use the variable names from the file (e.g., make, mpg, etc.).
make,mpg,rep78,weight,foreign,time
AMC,22,3,2930,0,11:11
AMC,17,3,3350,0,11:30
AMC,22,,2640,0,12:34
Audi,17,5,2830,1,13:20
Audi,23,3,2070,1,11:11
We can use the getnames=yes; statement to tell SAS we want it to read the variable names from the first line of the data file, as illustrated below.
proc import datafile="comma1.csv" out=mydata dbms=csv replace;
getnames=yes;
run;
proc print data=mydata;
run;
As you can see from the output of the proc print shown below, the data are read correctly.
Obs    make    mpg    rep78    weight    foreign    time
1      AMC     22     3        2930      0          11:11
2      AMC     17     3        3350      0          11:30
3      AMC     22     .        2640      0          12:34
4      Audi    17     5        2830      1          13:20
5      Audi    23     3        2070      1          11:11
Another way of reading a comma delimited file is to consider a comma as an ordinary delimiter. Here is a program that shows how to use the dbms=dlm option and the delimiter=","; statement to read a file just like we did above. Also notice that the external file doesn't have to have a .csv extension.
proc import datafile="comma1.txt" out=mydata dbms=dlm replace;
delimiter=",";
getnames=yes;
run;
You may want to create a permanent SAS data file using proc import. Suppose that we want to create a permanent SAS data file called mydata in the directory "c:\dissertation". We can do the following.
libname dis "c:\dissertation";
proc import datafile="comma1.txt" out=dis.mydata dbms=dlm replace;
delimiter=",";
getnames=yes;
run;
Another feature of proc import is that you can read in the input file starting from a specific row number using the datarow= statement. Let's say that we want to read from observation 4 of the text file comma1.txt. Since the variable names are on the first row of the raw data file, we have to use datarow=5.
proc import datafile="comma1.txt" out=mydata dbms=dlm replace;
delimiter=",";
getnames=yes;
datarow=5;
run;
proc print data=mydata;
run;
Now we can see from the output below that the data has been read correctly.
Obs    make    mpg    rep78    weight    foreign    time
1      Audi    17     5        2830      1          13:20
2      Audi    23     3        2070      1          11:11
On the other hand, if our variables don't have names in the raw file, we need to use getnames=no and datarow=4 as shown below.
proc import datafile="comma2.txt" out=mydata dbms=dlm replace;
delimiter=",";
getnames=no;
datarow=4;
run;
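For comparison, the same comma delimited file can also be read with a hand-written data step instead of proc import. The following is only a sketch under stated assumptions: the data set name mydata2, the length of 8 for make, and the time5. informat and format are choices made here for illustration, not something taken from the examples above. The firstobs=2 option skips the line of variable names, and the dsd option makes two consecutive commas count as a missing value.

```sas
/* A hand-written alternative to proc import for comma1.txt (a sketch). */
/* mydata2 is a hypothetical data set name chosen for this example.     */
data mydata2;
  /* dsd: consecutive commas mean a missing value; firstobs=2 skips */
  /* the header line; truncover tolerates short records.            */
  infile "comma1.txt" dsd dlm="," firstobs=2 truncover;
  /* The colon modifier reads each delimited field with the informat given. */
  input make :$8. mpg rep78 weight foreign time :time5.;
  format time time5.;
run;

proc print data=mydata2;
run;
```

Writing the data step yourself takes more typing, but it gives you direct control over variable names, types, and informats rather than leaving them to proc import.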
It is quite easy to read a file that uses a tab as a delimiter using proc import in SAS. There are two slightly different ways of reading a tab delimited file using proc import. In SAS, a tab delimited file can be considered as a special type of external file with file extension .txt. We show here the first sample program making use of this feature. Let's say we have the following data stored in a file called tab.txt.
AMC Concrod      22    2930    4099
AMC Pacer        17    3350    4749
AMC Sprint       22    2640    3799
Buick Century    22    3250    4816
Buick Electra    15    4080    7827
Then the following proc import statement will read it in and create a temporary data set called mydata.
proc import datafile="tab.txt" out=mydata dbms=tab replace;
getnames=no;
run;
proc print data=mydata;
run;
As you can see in the output below, the data was read properly. Also notice that SAS creates default variable names VAR1-VARn when variable names are not present in the raw data file.
Obs    VAR1             VAR2    VAR3    VAR4
1      AMC Concrod      22      2930    4099
2      AMC Pacer        17      3350    4749
3      AMC Sprint       22      2640    3799
4      Buick Century    22      3250    4816
5      Buick Electra    15      4080    7827
Your file might have the variable names on its first line, like the one below. With such a file, you would like SAS to use the variable names from the file (e.g., make, mpg, etc.).
MAKE             MPG    WEIGHT    PRICE
AMC Concord      22     2930      4099
AMC Pacer        17     3350      4749
AMC Spirit       22     2640      3799
Buick Century    20     3250      4816
Buick Electra    15     4080      7827
We can use the getnames=yes; statement to tell SAS we want it to read the variable names from the first line of the data file, as illustrated below.
proc import datafile="tab1.txt" out=mydata dbms=tab replace;
getnames=yes;
run;
proc print data=mydata;
run;
As you can see from the output of the proc print shown below, the data are read correctly.
OBS    MAKE             MPG    WEIGHT    PRICE
1      AMC Concord      22     2930      4099
2      AMC Pacer        17     3350      4749
3      AMC Spirit       22     2640      3799
4      Buick Century    20     3250      4816
5      Buick Electra    15     4080      7827
Another way of reading a tab delimited file is to consider a tab as an ordinary delimiter. Here is a program that shows how to use the delimiter statement to read a file just like we did above. The value '09'x is the hexadecimal code for the tab character.
proc import datafile="tab1.txt" out=mydata dbms=dlm replace;
delimiter='09'x;
getnames=yes;
run;
You may want to create a permanent SAS data file using proc import. Suppose that we want to create a permanent SAS data file called mydata in the directory "c:\dissertation". We can do the following.
libname dis "c:\dissertation";
proc import datafile="tab1.txt" out=dis.mydata dbms=dlm replace;
delimiter='09'x;
getnames=yes;
run;
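Once a permanent data set has been saved this way, a later SAS session can use it without importing the raw file again. The sketch below assumes the same directory and the libref dis used above; any valid libref would work as long as it points to that directory.

```sas
/* Point a libref at the directory where the data set was saved. */
libname dis "c:\dissertation";

/* Refer to the permanent data set by its two-level name. */
proc print data=dis.mydata;
run;
```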
It is very easy to read a file that uses a space as a delimiter to separate variables using proc import in SAS. Consider the following sample data file below.
AMC 22 2930 4099
AMC 17 3350 4749
AMC 22 2640 3799
Buick 20 3250 4816
Buick 15 4080 7827
Here is a sample program that reads the text file into SAS.
proc import datafile="space.txt" out=mydata dbms=dlm replace;
getnames=no;
run;
Now we can use proc print to see if the data file has been read correctly into SAS.
proc print data=mydata;
run;
Obs VAR1 VAR2 VAR3 VAR4
1 AMC 22 2930 4099
2 AMC 17 3350 4749
3 AMC 22 2640 3799
4 Buick 20 3250 4816
5 Buick 15 4080 7827
Notice that we use getnames=no because the variables don't have names in the raw data file. SAS will generate the variable names VAR1-VARn. If our raw file has the variable names on the first line, as shown below, then we need to use getnames=yes. For example, we have the following text file called space1.txt.
MAKE MPG WEIGHT PRICE
AMC 22 2930 4099
AMC 17 3350 4749
AMC 22 2640 3799
Buick 20 3250 4816
Buick 15 4080 7827
Then the following program reads the file in with the variable names.
proc import datafile="space1.txt" out=mydata dbms=dlm replace;
getnames=yes;
run;
What if we want the SAS data set created above to be permanent? Let's say we want to save the permanent file in the directory "c:\dissertation". The answer is to use a libname statement as shown below.
libname dis "c:\dissertation";
proc import datafile="space1.txt" out=dis.mydata dbms=dlm replace;
getnames=yes;
run;
Another feature of proc import is that you can read in the input file starting from a specific row number using the datarow= statement. Let's say that we want to read from observation 3 of the text file space1.txt. Since the variable names are on the first row of the raw data file, we have to use datarow=4.
proc import datafile="space1.txt" out=mydata dbms=dlm replace;
getnames=yes;
datarow=4;
run;
proc print data=mydata;
run;
Now we can see from the output below that the data has been read correctly.
Obs    MAKE     MPG    WEIGHT    PRICE
1      AMC      22     2640      3799
2      Buick    20     3250      4816
3      Buick    15     4080      7827
On the other hand, if our variables don't have names in the raw file, as in space.txt, we need to use getnames=no and datarow=3 as shown below.
proc import datafile="space.txt" out=mydata dbms=dlm replace;
getnames=no;
datarow=3;
run;
In a data step, you can use delimiter= on the infile statement to tell SAS what delimiter you are using to separate variables in your raw data file. For example, below we have a raw data file that uses exclamation points (!) to separate the variables in the file.
22!2930!4099
17!3350!4749
22!2640!3799
20!3250!4816
15!4080!7827
The example below shows how to read this file by using delimiter='!' on the infile statement.
DATA cars;
INFILE 'readdel1.txt' DELIMITER='!' ;
INPUT mpg weight price;
RUN;
PROC PRINT DATA=cars;
RUN;
As you can see in the output below, the data was read properly.
OBS    MPG    WEIGHT    PRICE
1      22     2930      4099
2      17     3350      4749
3      22     2640      3799
4      20     3250      4816
5      15     4080      7827
It is possible to use multiple delimiters. The example file below uses either exclamation points or plus signs as delimiters.
22!2930!4099
17+3350+4749
22!2640!3799
20+3250+4816
15+4080!7827
By using delimiter='!+' on the infile statement, SAS will recognize both of these as valid delimiters.
DATA cars;
INFILE 'readdel2.txt' DELIMITER='!+' ;
INPUT mpg weight price;
RUN;
PROC PRINT DATA=cars;
RUN;
As you can see in the output below, the data was read properly.
OBS    MPG    WEIGHT    PRICE
1      22     2930      4099
2      17     3350      4749
3      22     2640      3799
4      20     3250      4816
5      15     4080      7827
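One thing to keep in mind with delimiter= alone is that SAS treats a run of consecutive delimiters as a single delimiter, so an empty field is not read as a missing value. Adding the dsd option to the infile statement changes this. The sketch below uses made-up inline data (the data set name cars_missing and the two records are hypothetical, not one of the files above) in which the second record has no value for weight.

```sas
/* Hypothetical inline data: the second record has an empty weight field. */
data cars_missing;
  /* dsd makes two consecutive delimiters stand for a missing value. */
  infile datalines delimiter='!' dsd;
  input mpg weight price;
  datalines;
22!2930!4099
17!!4749
;
run;

proc print data=cars_missing;
run;
```

With dsd in place, weight should be read as missing for the second record instead of taking the value that belongs to price.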
These pages are Copyrighted (c) by UCLA Academic Technology Services