|
|
|
||||
|
|
|||||
Stata command odbc allows Stata to load, write, or view data from ODBC sources. ODBC, an acronym for Open DataBase Connectivity, is a standardized set of function calls that can be used to access data stored in database management systems. A dBASE file, an Excel file, or an ACCESS file are all examples of data files created by database systems. Before we start to load or view such files, let's list all the possible ODBC data sources that Stata can manage.
odbc listData Source Name Driver ------------------------------------------------------------------------------- Visual FoxPro Database Microsoft Visual FoxPro Driver Visual FoxPro Tables Microsoft Visual FoxPro Driver MQIS SQL Server MS Access Database Microsoft Access Driver (*.mdb) Excel Files Microsoft Excel Driver (*.xls) dBASE Files Microsoft dBase Driver (*.dbf) DeluxeCD Microsoft Access Driver (*.mdb) ECDCMusic Microsoft Access Driver (*.mdb) -------------------------------------------------------------------------------
The Data Source Name is required when we use Stata command odbc to access a file of a specific type, since we need to know it for the specific type of files that we are going to manage.
Now let's look at some of the examples on how to read, write or view a dBASE file, an Excel file or an ACCESS file with odbc command.
Example 1. Reading a dBASE file mostly through pointing and clicking
The easiest way of loading a dBASE file into Stata is via a single command:
odbc query "dBASE Files"
This will display all the possible dBASE files in the current working directory, since we have specified dBASE Files as our data source. Once we specify that the database name is "dBASE Files", the rest of the task will remain associated with dBASE files. Notice that the list of the file names is in blue, that means we can actually click on them. By clicking on a file name we get a description of the file. At the beginning of the description, we see the file name followed by (load) in blue. By clicking on it, we have just loaded the file into memory.
One thing that we need to notice is that this query will display any file with .dbf extension. Some of the files may not be dBASE files. For example, another file called hsb2_foxpro.dbf is also being displayed. But it is not a dBASE file, instead it is a FoxPro file.

descContains data obs: 200 vars: 11 size: 18,400 (98.2% of memory free) ------------------------------------------------------------------------------- storage display value variable name type format label variable label ------------------------------------------------------------------------------- ID double %10.0g FEMALE double %10.0g RACE double %10.0g SES double %10.0g SCHTYP double %10.0g PROG double %10.0g READ double %10.0g WRITE double %10.0g MATH double %10.0g SCIENCE double %10.0g SOCST double %10.0g ------------------------------------------------------------------------------- Sorted by: Note: dataset has changed since last saved
Example 2. Reading a dBASE file through command lines
From the example above we can already see the commands that we want to use if we want to read a dBASE file via command lines. Notice we don't need the dialog option anymore.
It seems a good idea to start with odbc query command, since this will set up the database type and will also display file names for loading. We add one option "lower" to command odbc load so variable names will be all in lower case.
clear odbc query "dBASE Files"DataSource: dBASE Files Path : d:\data\classdata ------------------------------------------------------------------------------- hsb2 hsb2_foxpro -------------------------------------------------------------------------------odbc desc "hsb2"DataSource: dBASE Files (query) Table: hsb2 (load) ------------------------------------------------------------------------------- Variable Name Variable Type ------------------------------------------------------------------------------- ID Numeric FEMALE Numeric RACE Numeric SES Numeric SCHTYP Numeric PROG Numeric READ Numeric WRITE Numeric MATH Numeric SCIENCE Numeric SOCST Numeric -------------------------------------------------------------------------------odbc load, table("hsb2") lowerdescContains data obs: 200 vars: 11 size: 18,400 (98.2% of memory free) ------------------------------------------------------------------------------- storage display value variable name type format label variable label ------------------------------------------------------------------------------- id double %10.0g ID female double %10.0g FEMALE race double %10.0g RACE ses double %10.0g SES schtyp double %10.0g SCHTYP prog double %10.0g PROG read double %10.0g READ write double %10.0g WRITE math double %10.0g MATH science double %10.0g SCIENCE socst double %10.0g SOCST ------------------------------------------------------------------------------- Sorted by: Note: dataset has changed since last saved
Example 3. Reading a Microsoft Excel file through command lines
An Excel file can be loaded into Stata using odbc directly. Here is an example. Let's say we have an Excel file test.xls located in folder d:\data. It has two sheets in it, s1 and s2. We want to read the first sheet into Stata. We can use command odbc load for this task as shown below. Notice that we have to specify in full the type of the data file and the location of the data file in dsn option and we use table("s1$") for reading in the first sheet. Both the quotation marks and the dollar sign are necessary here for the table option and both dsn and table options are needed to load the entire data. Also notice that we have to clear up the memory first.
clear odbc load, dsn("Excel Files;DBQ=d:\data\test.xls") table("s1$") descContains data obs: 5 vars: 2 size: 100 (99.9% of memory free) ------------------------------------------------------------------------------- storage display value variable name type format label variable label ------------------------------------------------------------------------------- a double %10.0g b double %10.0g ------------------------------------------------------------------------------- Sorted by: Note: dataset has changed since last saved
Example 4. Reading a Microsoft Access file through command lines
A file in an Access database can be loaded into Stata using odbc directly almost the same way as as Excel file. Here is an example. Let's say we have an Access database hsb2.mdb located in folder d:\data. It has a data set called hsb2 in it and we want to read it into Stata. We can use command odbc load for this task as shown below. Notice that we have to specify in full the type of the database and the location of the database file in dsn option and we use table("hsb2") to specify which data set in it we want to retrieve. Both the dsn and table options are needed in order to retrieve the entire data set. Also notice that we have to clear up the memory first.
clear odbc load, dsn("MS Access Database;DBQ=d:\data\hsb2.mdb") table("hsb2") desContains data obs: 200 vars: 11 size: 5,200 (99.5% of memory free) ------------------------------------------------------------------------------- storage display value variable name type format label variable label ------------------------------------------------------------------------------- ID int %8.0g FEMALE int %8.0g RACE int %8.0g SES int %8.0g SCHTYP int %8.0g PROG int %8.0g READ int %8.0g WRITE int %8.0g MATH int %8.0g SCIENCE int %8.0g SOCST int %8.0g ------------------------------------------------------------------------------- Sorted by: Note: dataset has changed since last saved
Another way of reading Excel or Access file into Stata is via Data Source. This might be handy an Excel file has many sheets or the Access file has many individual data sets in it since you will only have set up the association once. The key step here is to register your database with your computer's ODBC system. Here is an example showing how to do it for Windows XP. You may also want to read Stata manual on Data Management for more details. In this example, we have one Access file, db2.mdb. The same procedure works for both Excel and Access files.
Step 1: From the Start Menu, select Settings and then the Control Panel.
Step 2: Select Administrator Tools and then select Data Sources (ODBC). This will bring up a window like the following window:

Step 3: Click on Add... and this will bring up the following window.

Step 4: Select Microsoft Access Driver (*.mdb) and then click on Finish.

Step 5: The name for Data Source can be arbitrary. Here we will use test as our data source name.

Now, the last step is to select all the Access files that you want to be associated with this data source. This is done by "Select". Click on "Select", you will see something similar to the following:

Now you can select your .mdb files to be in the data source test that we are creating. After select your files, you can then click on OK to close this window and click on OK again to close previous window.

Notice that we have a new entry in the window above, that is our test data source.
After setting up our data source in previous example, we are ready to get access to it via odbc in Stata. After typing "odbc list", you can just follow the links to access to our data source test. The way to load it is the same as in the examples earlier and can also be done by following the link.
odbc listData Source Name Driver ------------------------------------------------------------------------------- MS Access Database Microsoft Access Driver (*.mdb) Excel Files Microsoft Excel Driver (*.xls) dBASE Files Microsoft dBase Driver (*.dbf) test Microsoft Access Driver (*.mdb) -------------------------------------------------------------------------------odbc query "test"DataSource: test Path : D:\data\other_format\db2 ------------------------------------------------------------------------------- Hsb2_small -------------------------------------------------------------------------------odbc desc "Hsb2_small", dialog(complete)DataSource: test (query) Table: Hsb2_small (load) ------------------------------------------------------------------------------- Variable Name Variable Type ------------------------------------------------------------------------------- ID COUNTER "id" INTEGER "female" INTEGER "race" INTEGER "ses" INTEGER "schtyp" INTEGER "prog" INTEGER "read" INTEGER "write" INTEGER "math" INTEGER "science" INTEGER "socst" INTEGER -------------------------------------------------------------------------------
UCLA Researchers are invited to our Statistical Consulting Services
We recommend others to our list of Other Resources for Statistical Computing Help
These pages are Copyrighted (c) by UCLA Academic Technology Services