UCLA Academic Technology Services

What are the issues in reading an ASCII data file?

The easiest way to read an ASCII file is to convert it directly to LimDep format with Stat/Transfer. Another way is to convert the ASCII file to an Excel version 3 or version 4 file, which LimDep can read.

Sometimes you may have to read your ASCII file directly. You may be able to use the Import Variables option in the Project pull-down menu, but there are a few issues you need to be aware of. When they occur, the best approach is to use command-line syntax.

If your ASCII data file already has variable names in it, you will want to keep them when you read the data into LimDep. Be aware, though, that LimDep limits the number of characters on each line. Each data line must be 500 characters or fewer (255 in the mainframe version). Each line of variable names must be 80 characters or fewer, with the names separated by spaces and/or commas. If your file has lines longer than 500 characters, you may end up reading in fewer variables than the original data file contains, and LimDep does not issue a warning when this happens. When a data file has multiple lines of variable names, you cannot use the pull-down menu to import variables as shown in the FAQ page How to Create LimDep Program Files, since it assumes the default of a single line of variable names. Instead, use the READ command, where the ;Names=n option tells LimDep to read n lines of variable names from your data file.
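Because LimDep gives no warning when a line is too long, it can help to check the file before reading it. The following is a minimal Python sketch, not part of LimDep, that flags lines exceeding the limits quoted above. The function names and the names_lines parameter are hypothetical, and the sketch assumes the variable-name lines come first in the file.

```python
# Limits quoted in the text above (Windows version of LimDep).
DATA_LIMIT = 500   # maximum characters on a data line
NAMES_LIMIT = 80   # maximum characters on a line of variable names


def find_long_lines(lines, names_lines=1,
                    data_limit=DATA_LIMIT, names_limit=NAMES_LIMIT):
    """Return (line_number, length, limit) for each line over its limit.

    The first `names_lines` lines are assumed to hold variable names;
    all remaining lines are treated as data lines.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")  # do not count the line terminator
        limit = names_limit if number <= names_lines else data_limit
        if len(text) > limit:
            problems.append((number, len(text), limit))
    return problems


def check_file(path, names_lines=1):
    """Run the same check on a file stored on disk."""
    with open(path, "r", encoding="ascii", errors="replace") as handle:
        return find_long_lines(handle, names_lines=names_lines)
```

For the example file described in this FAQ, check_file(r"e:\limdep\data\r50.txt", names_lines=3) would return an empty list if every line is within the limits.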

For example, we have a comma-separated ASCII data set with 50 variables and 1000 observations. The variable names take up the first three lines of our data file, and each observation takes a single line. We put the following lines into the Command Window, highlighted them, and ran them (from the Run pull-down menu, select Run Selection).

Read ; Nobs=1000 ; Nvar=50 ; Names=3;
file = "e:\limdep\data\r50.txt" $
After running the command, we look at the Trace in the Output window, which says:
SAMPLE set to observations 1 to 1000
There are 50 variables in the data work area.
This indicates that reading the ASCII file worked, since there are no missing values or unrecognized variable names.
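If a single header line is longer than 80 characters, it has to be split across several lines before LimDep can read it. The sketch below is one way to do that in Python outside LimDep. The function name wrap_names is hypothetical, and the sketch assumes that a line break also separates names, which this page does not state explicitly.

```python
def wrap_names(names, width=80, sep=","):
    """Pack variable names into lines of at most `width` characters.

    Names are joined with `sep` and a new line is started whenever
    adding the next name would exceed `width`. A single name longer
    than `width` is placed on a line of its own and is not shortened.
    """
    lines = []
    current = ""
    for name in names:
        candidate = name if not current else current + sep + name
        if len(candidate) <= width:
            current = candidate
        else:
            if current:
                lines.append(current)
            current = name
    if current:
        lines.append(current)
    return lines
```

The number of lines returned, len(wrap_names(names)), is the value to supply for n in ;Names=n.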

What happens when your ASCII data file has missing values? LimDep treats any value that cannot be read as a number as missing and fills it in with the value -999. A blank, however, is normally not a missing value; it is just a blank. Let's say we have the following ASCII data set with 4 observations, some of whose values are missing:

id,female,race,ses,schtyp,prog,read,write,math,science,socst
70,     0,   4,  1,     1,    ,  57,   52,  41,     47,   57 
  ,     1,   4,  2,     1,   3,  68,   59,  53,     63,   61
86,      ,   4,  3,     1,   1,  44,   33,  54,     58,   31 
141,    0,   4,  3,     1,   3,  63,   44,  47,     53,
When you read it into LimDep, you will see from the Trace Window that only 2 observations are read, and these two observations are not right either. This is because LimDep treats the blanks between commas simply as blanks and ignores them completely. This is different from what SAS, Stata, and most other statistical packages do. What you have to do is replace those blanks with dots "." or simply with the word "missing" before reading the file. LimDep will then recognize them as missing values. Once again, it is easier to use Stat/Transfer to convert the file directly to a LimDep file and avoid any further problems.
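The replacement of blank fields with dots can be done by hand or with a short script. The following is a minimal Python sketch, assuming a comma-separated file with no quoted fields; the function name mark_missing is hypothetical. It also strips the padding around each value, which does not change the values themselves.

```python
def mark_missing(line, marker=".", sep=","):
    """Replace empty or all-blank fields in one delimited line with `marker`."""
    fields = line.rstrip("\r\n").split(sep)
    # An empty string after stripping means the field held only blanks.
    return sep.join(field.strip() or marker for field in fields)


def mark_missing_in_file(source_path, target_path, marker="."):
    """Write a copy of the file in which every blank field is `marker`."""
    with open(source_path, "r") as source, open(target_path, "w") as target:
        for line in source:
            target.write(mark_missing(line, marker=marker) + "\n")
```

Applied to the first data line of the example above, mark_missing returns 70,0,4,1,1,.,57,52,41,47,57. The header line passes through with its names unchanged.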
In the current world of LimDep, every variable is numeric. If you have a string variable in your ASCII data file, you will not be able to read it into LimDep, let alone use it in further statistical analysis. Therefore, if you intend to use any string variable in your data analysis, do the dummy coding outside LimDep first and then convert the data to LimDep. For example, we have the following data set.
id female race ses schtyp prog read write math science socst
70      0    4   1      1  aca   57    52   41      47    57 
121     1    4   2      1  aca   68    59   53      63    61
86      0    4   3      1  gen   44    33   54      58    31 
141     0    4   3      1  gen   63    44   47      53    56
172     0    4   2      1  aca   47    52   57      53    61
Using Import Variables from the Project pull-down menu, we read it into LimDep. The Trace Window shows a message saying that 5 missing values are converted to -999. This is what LimDep does when it encounters non-numeric values. Therefore, when we open the Data Editor, we see that prog is -999 for all five observations.
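The dummy coding recommended above has to happen before the file reaches LimDep. The following is a minimal Python sketch of one way to do it for a whitespace-separated file like the example. The function name dummy_code and the generated column names (such as prog_aca) are hypothetical choices, not LimDep conventions.

```python
def dummy_code(rows, column):
    """Replace a string column with one 0/1 indicator column per level.

    `rows` is a list of whitespace-separated text lines whose first
    non-blank line holds the variable names. Returns the recoded lines,
    with fields separated by single spaces.
    """
    header, *records = [row.split() for row in rows if row.strip()]
    idx = header.index(column)
    levels = sorted({record[idx] for record in records})

    # New header: the string column is replaced by one name per level.
    new_names = [column + "_" + level for level in levels]
    output = [header[:idx] + new_names + header[idx + 1:]]

    for record in records:
        dummies = ["1" if record[idx] == level else "0" for level in levels]
        output.append(record[:idx] + dummies + record[idx + 1:])
    return [" ".join(fields) for fields in output]
```

For the data set above, dummy_code(rows, "prog") replaces prog with the two numeric columns prog_aca and prog_gen, so the first observation becomes 70 0 4 1 1 1 0 57 52 41 47 57. In a regression you would normally keep only one of the two indicators.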


These pages are Copyrighted (c) by UCLA Academic Technology Services

