UCLA Academic Technology Services

R FAQ:
How can I write a binary data file by column?

Binary files offer an efficient and easy-to-recover way to store data. If you wish to convert a data frame in R to binary form, there are a few basics to learn that will allow you to both read and write binary files.

In a binary data file, your information is stored in groups of binary digits. Each binary digit is a zero or a one, and eight binary digits grouped together form a byte. To successfully read the binary file you write, you must keep in mind how you are parsing your information into binary. For example, if you are writing a matrix of data to a binary file, are you reading the matrix across the rows or down the columns? If your data consists of integers, how many bytes should represent one integer? On what platform are you working while writing the file?
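As a small illustration of why the byte-count and platform questions matter, the sketch below writes the same integer into an in-memory raw vector (by passing raw() as the connection argument) under different size and endian settings. The value 1L and the variable names are illustrative choices and are not part of the hsb2 example.

```r
# Write the integer 1 into raw vectors instead of a file so the bytes can be inspected.
four.little <- writeBin(1L, raw(), size = 4, endian = "little")
four.big    <- writeBin(1L, raw(), size = 4, endian = "big")
two.little  <- writeBin(1L, raw(), size = 2, endian = "little")

print(four.little)  # bytes 01 00 00 00
print(four.big)     # bytes 00 00 00 01
print(two.little)   # bytes 01 00

# The byte order R uses by default on the current platform.
print(.Platform$endian)
```

A reader that assumes a different size or byte order than the writer used will recover different numbers from the same bytes, which is why these choices need to be recorded.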

The binary file you write will be much easier to read if you can answer these questions.  This page provides an example of writing binary data by column.  If you wish to write binary data by row, see FAQ: How can I write a binary data file in R by row? If you are not sure of some of the answers, you can explore the available options in R and consider where you plan to read in the data later to decide which are most appropriate. These are the same options that will be available when reading binary data in R.

Suppose we have a dataset in R, hsb2, and we wish to write a subset of the variables in this dataset to a binary file.


hsb2 <- read.table("http://www.ats.ucla.edu/stat/R/notes/hsb2.csv", sep = ",", header = TRUE)
hsb2[1:5, ]

   id female  race    ses schtyp     prog read write math science socst
1  70   male white    low public  general   57    52   41      47    57
2 121 female white middle public vocation   68    59   53      63    61
3  86   male white   high public  general   44    33   54      58    31
4 141   male white   high public vocation   63    44   47      53    56
5 172   male white middle public academic   47    52   57      53    61

To get started, we establish a connection to a file and indicate that we will be using the connection to write binary data.  We do this with the file function, providing first the pathname and then "wb" for "writing binary".  For more details, see help(file) in R.

to.write <- file("C:/binfile.dat", "wb")
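The path above is a Windows-style location. A portable variant might look like the sketch below, which assumes a temporary file is acceptable for experimenting; bin.path, was.open, and open.mode are illustrative names that do not appear in the original example.

```r
# Hypothetical location: a temporary file instead of C:/binfile.dat.
bin.path <- tempfile(fileext = ".dat")

# Open the connection for writing in binary mode.
to.write <- file(bin.path, "wb")

# Confirm the connection is open and check the mode it was opened with.
was.open  <- isOpen(to.write)
open.mode <- summary(to.write)$mode
print(was.open)
print(open.mode)

# Close the connection when finished so the file is released.
close(to.write)
```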

If we wish to write a binary file containing the reading, writing, and math scores from the hsb2 dataset, there are several ways in which this can be done.  Keep in mind that we are essentially taking a matrix of information and flattening it into one long sequence of values, and there are several ways to do that. Will we include or omit variable names? If we include them, will we write a single variable name followed by all of the values for that variable, or will we write all of the variable names first and then the values, going through the matrix column by column?

For this example, we will list the variable names first, then all of the values for the first variable named followed by all of the values for the second variable named, and so on.  This is an arbitrary choice, but it's important to note that the choices you make in writing the binary file define the correct way to read the file.
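To make the by-column versus by-row choice concrete, here is a minimal sketch using a small stand-in matrix built from the first three read and write scores shown earlier. The object names (scores, by.column, by.row, rebuilt) are illustrative assumptions.

```r
# Stand-in matrix: first three read and write scores from the hsb2 listing.
scores <- matrix(c(57, 68, 44,
                   52, 59, 33),
                 nrow = 3,
                 dimnames = list(NULL, c("read", "write")))

# Flattening down the columns: all read values, then all write values.
by.column <- as.vector(scores)
print(by.column)  # 57 68 44 52 59 33

# Flattening across the rows: read and write for case 1, then case 2, and so on.
by.row <- as.vector(t(scores))
print(by.row)     # 57 52 68 59 44 33

# A reader that knows the values were stored by column can rebuild the matrix.
rebuilt <- matrix(by.column, nrow = 3, dimnames = list(NULL, c("read", "write")))
print(identical(rebuilt, scores))
```

Both orderings contain the same six values, so nothing in the file itself reveals which one was used; the reader has to know.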

To write information to the file we connected to, we will use the writeBin command.  The first argument we give writeBin is the integer/string/vector that we wish to write to the binary file.  The second argument we give writeBin is the open connection we established. In the command below, we are passing writeBin a vector containing three variable names.

writeBin(colnames(hsb2)[7:9], to.write)
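It can help to see what writeBin actually stores for a character vector. The sketch below writes the three names into a raw vector rather than a file; the literal names stand in for colnames(hsb2)[7:9], and name.bytes is an illustrative variable name.

```r
# The three variable names, written to an in-memory raw vector.
var.names  <- c("read", "write", "math")
name.bytes <- writeBin(var.names, raw())

# Each string is stored with a terminating zero byte, so the total is
# 13 characters plus 3 terminators.
print(length(name.bytes))              # 16
print(sum(name.bytes == as.raw(0)))    # 3
```

Because each name ends with a zero byte, a reader can recover the names without knowing their lengths in advance, but it does need to know how many names to expect.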

We can continue to write to the file. R will append additional information to what we have already written. These three variables all contain integer values, so by default writeBin stores each value in four bytes.

writeBin(hsb2$read, to.write)
writeBin(hsb2$write, to.write)
writeBin(hsb2$math, to.write)

We could have equivalently written the three sets of variable values with one writeBin statement where the first argument is a single concatenated vector of the variable values (c(hsb2$read, hsb2$write, hsb2$math)). Now that we have written all of our desired information to the binary file, we can close the connection.

close(to.write)
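The equivalence between three separate writeBin calls and one call on a concatenated vector can be checked without touching a file. This sketch uses the first five values of each score column from the listing above as stand-in data; the vector names are illustrative.

```r
# Stand-in data: first five read, write, and math scores from the hsb2 listing.
read.scores  <- c(57L, 68L, 44L, 63L, 47L)
write.scores <- c(52L, 59L, 33L, 44L, 52L)
math.scores  <- c(41L, 53L, 54L, 47L, 57L)

# Bytes produced by three separate calls, joined in order.
separate <- c(writeBin(read.scores, raw()),
              writeBin(write.scores, raw()),
              writeBin(math.scores, raw()))

# Bytes produced by a single call on the concatenated vector.
combined <- writeBin(c(read.scores, write.scores, math.scores), raw())

print(identical(separate, combined))  # TRUE
print(length(combined))               # 60: 15 integers at 4 bytes each
```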

To verify that you have successfully written the data to a binary file, try reading the file back in with readBin. For help with this, see R FAQ: How can I read binary data into R?
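A minimal round-trip sketch of that check follows. It assumes a small stand-in data frame (the first five rows of the three score columns) and a temporary file rather than the full hsb2 data and C:/binfile.dat; all object names are illustrative. The read step mirrors the write step: names first, then the values column by column.

```r
# Stand-in data: first five rows of read, write, and math from the hsb2 listing.
scores <- data.frame(read  = c(57L, 68L, 44L, 63L, 47L),
                     write = c(52L, 59L, 33L, 44L, 52L),
                     math  = c(41L, 53L, 54L, 47L, 57L))
bin.path <- tempfile(fileext = ".dat")  # hypothetical file location

# Write the variable names, then each column in turn.
to.write <- file(bin.path, "wb")
writeBin(colnames(scores), to.write)
for (v in colnames(scores)) writeBin(scores[[v]], to.write)
close(to.write)

# 16 bytes of names plus 15 integers at 4 bytes each.
print(file.size(bin.path))  # 76

# Read back in the same order: three names, then all of the integers.
to.read   <- file(bin.path, "rb")
var.names <- readBin(to.read, character(), n = 3)
values    <- readBin(to.read, integer(), n = 3 * nrow(scores))
close(to.read)

# Values were stored by column, so fill the matrix by column (the default).
recovered <- matrix(values, ncol = 3, dimnames = list(NULL, var.names))
print(recovered)
```

The reader has to supply the number of names and the number of values itself, since this layout does not record them in the file.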


These pages are Copyrighted (c) by UCLA Academic Technology Services

