UCLA Academic Technology Services

Stata Code Fragments
How can I read binary data in Stata?

In this example, we show how to read a binary file into Stata. The original binary file and its codebook are each available for download. A zip file containing the data file, the PDF of the codebook, and the Stata code for this example is also provided in case those downloads are unavailable.

The data file begins with 3520 bytes of header information in ASCII. Here is the beginning of the header.

CCSD3ZF0000100000001CCSD3VS00006PRODUCER
Product_File_Name =                      JA1_IGD_2PcP243_093;
Producer_Agency_Name = CNES;
Processing_Center = SSALTO;
File_Data_Type = IGDR;
Reference_Document =                     SMM-ST-M-EA-10879-CN Issue 4.0;
Reference_Software =     CMAV9.2_01/G5OS5;
Operating_System =            SunOS 5.9;
Product_Creation_Time = 2008-08-20T13:57:26.000000;
CCSD$$MARKERPRODUCERCCSD3KS00006PASSFILE
Mission_Name = Jason-1;
Altimeter_Sensor_Name = POSEIDON-2;
Radiometer_Sensor_Name = JMR;
DORIS_Sensor_Name = DORIS-2 GM;
Acquisition_Station_Name =      JTCCS          ;
Cycle_Number =   243;
Absolute_Revolution_Number = 30781;
Pass_Number =  93;
Absolute_Pass_Number = 61561;
Equator_Time = 2008-08-14T09:54:37.743000;
Equator_Longitude = +235.99<deg>;
First_Measurement_Time = 2008-08-14T10:00:01.008141;
Last_Measurement_Time = 2008-08-14T10:22:43.766073;
First_Measurement_Latitude = +15.81<deg>;
Last_Measurement_Latitude = +66.15<deg>;
First_Measurement_Longitude = +241.82<deg>;
Last_Measurement_Longitude = +318.85<deg>;
Pass_Data_Count =   765;
Ocean_Pass_Data_Count =   483;
Ocean_PCD =   0<%>;
Time_Epoch = 1958-01-01T00:00:00.000000;
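
A header like this can be inspected from within Stata before any binary reading is done. The sketch below is a minimal example, not part of the original code. It assumes the file is saved as test1.dat in the working directory (the name used in the code on this page) and that the header lines end in newline characters, which the codebook should confirm. The handle name h and the limit of 30 lines are arbitrary choices for illustration.

```stata
* Minimal sketch: print the first 30 lines of the ASCII header.
* Assumption: header lines are newline-terminated.
tempname h
local n = 0
file open `h' using test1.dat, read text
file read `h' line
while r(eof) == 0 & `n' < 30 {
    * macval() stops Stata from expanding $ or ` inside the line
    display _asis `"`macval(line)'"'
    local ++n
    file read `h' line
}
file close `h'
```

Stopping after a fixed number of lines keeps the loop from running into the binary records that follow the header. The macval() wrapper matters here because the CCSD marker lines contain $$, which Stata would otherwise try to expand as a global macro.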

The header includes information such as the operating system used, the number of records (Pass_Data_Count), and the time of the first measurement. The data set contains many variables; in this example, we read only the first eight. Because these eight fields occupy only the first few bytes of each record, we must skip the rest of the record before reading the next one, which gives us a chance to demonstrate the file seek command. Here is the code for reading the data into Stata.
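
Before that listing, it can help to confirm the record length that its file seek arithmetic relies on. The sketch below is an illustration under assumptions, not part of the original code: it assumes test1.dat is in the working directory, takes the header length (3520) and the record count (765) from the header as given, and relies on file seek leaving the current byte position in r(loc). The handle name fh and the local names are hypothetical.

```stata
* Minimal sketch: derive the bytes per record from the file size.
tempname fh
file open `fh' using test1.dat, read binary
* Move to the end of the file; the position there is the file size.
file seek `fh' eof
local filesize = r(loc)
file close `fh'

* Values taken from the header of this particular file.
local header  = 3520
local records = 765

local reclen = (`filesize' - `header') / `records'
display "file size:        `filesize' bytes"
display "bytes per record: `reclen'"

* A fractional record length would mean one of the inputs is wrong.
assert `reclen' == int(`reclen')
```

For the file described here (340120 bytes), this works out to (340120 - 3520) / 765 = 440 bytes per record, the step size used in the listing.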

clear
set mem 100m

/********************************************************
 A few numbers used below:
 The total size of this file is 340120 bytes.
 The header occupies 3520 bytes.
 The scientific records therefore occupy
 340120 - 3520 = 336600 bytes.
 The header reports 765 records (Pass_Data_Count = 765),
 so each record is 336600 / 765 = 440 bytes long.
********************************************************/

set obs 765

gen day =.
gen long time_sec = .
gen long time_ms = .
gen long latitude = .
gen long longitude = .
gen byte surface_type = .
gen byte alt_echo_type = .
gen byte rad_surf_type = .

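* Open the file in binary mode. The byte order is set to hilo
* (most significant byte first).
file open t using test1.dat, read binary
file set t byteorder hilo
* Skip the 3520-byte ASCII header.
file seek t 3520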

* One scalar name is enough; it is reused for every field.
tempname word1
quietly forvalues i = 1/765 {

  file read t %4bu `word1'
  replace day = `word1' in `i'

  file read t %4bu `word1'
  replace time_sec = `word1' in `i'

  file read t %4bu `word1'
  replace time_ms = `word1' in `i'

  file read t %4b `word1'
  replace latitude = `word1' in `i'

  file read t %4bu `word1'
  replace longitude = `word1' in `i'

  file read t %1bu `word1'
  replace surface_type = `word1' in `i'

  file read t %1bu `word1'
  replace alt_echo_type = `word1' in `i'

  file read t %1bu `word1'
  replace rad_surf_type = `word1' in `i'

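  * Jump to the start of the next record (440 bytes each),
  * skipping the fields that are not read in this example.
  local a = 440*`i' + 3520
  file seek t `a'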
}

file close t
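* day counts days from the file's epoch (1jan1958);
* Stata dates count from 1jan1960, so shift by the difference.
gen date = day - (d(1jan1960) - d(1jan1958))
format date %d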
clist date time_sec time_ms latitude longitude surface_type in 1/10

          date      time_sec       time_ms      latitude     longitude  surfac~e
  1. 14aug2008         36001          8141      15806591     241819737         0
  2. 14aug2008         36002         27717      15856153     241839314         0
  3. 14aug2008         36003         47293      15905712     241858902         0
  4. 14aug2008         36004         66870      15955268     241878502         0
  5. 14aug2008         36005         86444      16004821     241898113         0
  6. 14aug2008         36006        106023      16054371     241917737         0
  7. 14aug2008         36007        125598      16103918     241937372         0
  8. 14aug2008         36008        145174      16153462     241957019         0
  9. 14aug2008         36009        164750      16203004     241976678         0
 10. 14aug2008         36010        184327      16252542     241996348         0
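
The values in the listing are raw integers. Comparing the first row with the header (First_Measurement_Latitude = +15.81<deg>, First_Measurement_Longitude = +241.82<deg>, First_Measurement_Time = 2008-08-14T10:00:01.008141) suggests that latitude and longitude are stored in units of 1e-6 degree and that time_ms holds microseconds within the second. That scaling is inferred from these values only; the codebook remains the authority. Under that assumption, a minimal sketch of the conversion, using the first three rows of the listing so that it runs on its own (lat_deg, lon_deg, and stamp are hypothetical names):

```stata
clear
input long time_sec long time_ms long latitude long longitude
36001   8141 15806591 241819737
36002  27717 15856153 241839314
36003  47293 15905712 241858902
end

* All three rows fall on the date shown in the listing.
gen date = td(14aug2008)
format date %d

* Assumed scaling: 1e-6 degree per unit.
gen double lat_deg = latitude / 1e6
gen double lon_deg = longitude / 1e6
format lat_deg lon_deg %10.6f

* Stata %tc values count milliseconds since 01jan1960 00:00:00.
* Assumed: time_ms is in microseconds, hence the division by 1000.
gen double stamp = cofd(date) + 1000*time_sec + time_ms/1000
format stamp %tcDDmonCCYY_HH:MM:SS.sss

list stamp lat_deg lon_deg
```

The datetime is stored as a double because a float cannot hold millisecond values of this size exactly. A %tc display resolves only to the millisecond, so the microsecond digits are not shown in stamp.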

These pages are Copyrighted (c) by UCLA Academic Technology Services

