You may at times wish to read a set of data files into R. The code below demonstrates how to do so by looping over the names of the files to be read in. We first show how to do this for all of the files in a single folder. The datasets read in here are Stata files, so we use the foreign package.
library(foreign)
setwd("D:/data/Stata data")
a<-list.files()
a
[1] "charity.dta" "spelling.dta"
for (x in a) {
  # read one Stata file, then summarize its first three variables
  u <- read.dta(x)
  usum <- summary(u[, 1:3])
  print(usum)
}
      ta1              ta2              ta3
 Min.   : 0.000   Min.   : 0.000   Min.   : 0.000
 1st Qu.: 1.000   1st Qu.: 1.000   1st Qu.: 1.000
 Median : 1.000   Median : 2.000   Median : 1.000
 Mean   : 1.102   Mean   : 1.418   Mean   : 1.068
 3rd Qu.: 2.000   3rd Qu.: 2.000   3rd Qu.: 1.000
 Max.   : 3.000   Max.   : 3.000   Max.   : 3.000
 NA's   :64.000   NA's   :37.000   NA's   :20.000
      male           i1               i2
 Min.   :0.0   Min.   :0.0000   Min.   :0.0000
 1st Qu.:0.0   1st Qu.:0.0000   1st Qu.:0.0000
 Median :0.5   Median :1.0000   Median :1.0000
 Mean   :0.5   Mean   :0.5333   Mean   :0.5333
 3rd Qu.:1.0   3rd Qu.:1.0000   3rd Qu.:1.0000
 Max.   :1.0   Max.   :1.0000   Max.   :1.0000
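Note that list.files() with no arguments returns every file in the folder, so any file that is not a Stata dataset would also be passed to read.dta. A minimal sketch of one way to restrict the listing by file extension follows. The temporary folder and the empty placeholder files in it are made up for illustration; only the pattern and full.names arguments of list.files are the point.

```r
# Hypothetical folder holding a mix of file types (empty placeholder files,
# created only so the listing has something to show).
data_dir <- file.path(tempdir(), "stata_demo")
dir.create(data_dir, showWarnings = FALSE)
invisible(file.create(file.path(data_dir,
                                c("charity.dta", "spelling.dta", "notes.txt"))))

# pattern is a regular expression: keep only names ending in ".dta".
dta_names <- list.files(data_dir, pattern = "\\.dta$")
print(dta_names)

# full.names = TRUE returns complete paths, which could be passed to
# read.dta() without calling setwd() first.
dta_paths <- list.files(data_dir, pattern = "\\.dta$", full.names = TRUE)
print(basename(dta_paths))
```

With real data, the loop could then run over dta_paths in place of a.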
The code above allows you to read in multiple files without typing in their names, but assumes you want all of the files in the working directory folder. The code below allows you to list out the files you want to read.
library(foreign)
setwd("D:/data/Stata data")
for (x in c("charity.dta", "spelling.dta")) {
  # read the named Stata file, then summarize its first three variables
  u <- read.dta(x)
  usum <- summary(u[, 1:3])
  print(usum)
}
      ta1              ta2              ta3
 Min.   : 0.000   Min.   : 0.000   Min.   : 0.000
 1st Qu.: 1.000   1st Qu.: 1.000   1st Qu.: 1.000
 Median : 1.000   Median : 2.000   Median : 1.000
 Mean   : 1.102   Mean   : 1.418   Mean   : 1.068
 3rd Qu.: 2.000   3rd Qu.: 2.000   3rd Qu.: 1.000
 Max.   : 3.000   Max.   : 3.000   Max.   : 3.000
 NA's   :64.000   NA's   :37.000   NA's   :20.000
      male           i1               i2
 Min.   :0.0   Min.   :0.0000   Min.   :0.0000
 1st Qu.:0.0   1st Qu.:0.0000   1st Qu.:0.0000
 Median :0.5   Median :1.0000   Median :1.0000
 Mean   :0.5   Mean   :0.5333   Mean   :0.5333
 3rd Qu.:1.0   3rd Qu.:1.0000   3rd Qu.:1.0000
 Max.   :1.0   Max.   :1.0000   Max.   :1.0000
Very similar code can be used for reading in multiple .csv files.
setwd("D:/data/")
for (x in c("diet.csv", "lsat.csv")) {
  # read one comma-separated file whose first row holds the variable names
  u <- read.table(x, header = TRUE, sep = ",")
  usum <- summary(u[, 1:3])
  print(usum)
}
       id              t               c
 Min.   :  1.0   Min.   :1.000   Min.   :7.470
 1st Qu.:106.8   1st Qu.:1.000   1st Qu.:7.840
 Median :208.5   Median :1.000   Median :7.940
 Mean   :189.8   Mean   :1.185   Mean   :7.938
 3rd Qu.:274.2   3rd Qu.:1.000   3rd Qu.:8.040
 Max.   :337.0   Max.   :2.000   Max.   :8.390
       id             item        wt2
 Min.   : 1.00   Min.   :1   Min.   :  0.00
 1st Qu.: 8.25   1st Qu.:2   1st Qu.:  3.00
 Median :16.50   Median :3   Median : 11.00
 Mean   :16.41   Mean   :3   Mean   : 30.52
 3rd Qu.:24.00   3rd Qu.:4   3rd Qu.: 28.00
 Max.   :32.00   Max.   :5   Max.   :298.00
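The loops shown so far overwrite u on every pass, so only the last dataset remains in the workspace once the loop finishes. If the datasets are needed later, one option is to collect them in a named list. The sketch below is an illustration under stated assumptions, not part of the original example: it writes two small made-up .csv files (a.csv and b.csv, hypothetical names and contents) to a temporary folder so that it can be run as-is.

```r
# Write two small made-up CSV files to a temporary folder (illustrative data only).
data_dir <- file.path(tempdir(), "csv_demo")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(id = 1:3, wt = c(60.5, 72.1, 68.0)),
          file.path(data_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(id = 1:2, score = c(10, 12)),
          file.path(data_dir, "b.csv"), row.names = FALSE)

files <- c("a.csv", "b.csv")

# Read each file into a data frame and collect the results in a list.
datasets <- lapply(file.path(data_dir, files), read.csv)

# Name the list elements after the files so each dataset can be looked up by name.
names(datasets) <- files

print(names(datasets))
print(summary(datasets[["a.csv"]]))
```

Each dataset stays available afterwards, for example as datasets[["b.csv"]].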
If you have multiple datasets with the same set of variables and you wish to append them into one long dataset, you can use the code below. Inside the loop, we create a variable indicating which dataset a given record came from.
setwd("D:/data/")
diet <- c()
for (x in c("diet1.csv", "diet2.csv")) {
  u <- read.table(x, header = TRUE, sep = ",")
  # record which file each row came from
  u$dataset <- x
  # append this file's rows to the combined dataset
  diet <- rbind(diet, u)
}
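Growing a data frame with rbind inside a loop works well for a handful of files, but it copies the accumulated data on every pass. An alternative, sketched here as one possible variation rather than the method used above, is to read all of the files first and stack them with a single rbind call. The file names part1.csv and part2.csv, their contents, and the helper read_one are hypothetical and exist only to make the sketch runnable.

```r
# Write two small made-up CSV files with the same variables (illustrative data only).
data_dir <- file.path(tempdir(), "append_demo")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, wt = c(60, 72)),
          file.path(data_dir, "part1.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:5, wt = c(68, 81, 77)),
          file.path(data_dir, "part2.csv"), row.names = FALSE)

files <- c("part1.csv", "part2.csv")

# Hypothetical helper: read one file and tag every row with its source file name.
read_one <- function(f) {
  u <- read.csv(file.path(data_dir, f))
  u$dataset <- f
  u
}

# Read all files into a list, then stack them with a single rbind call.
combined <- do.call(rbind, lapply(files, read_one))

# Count how many rows came from each file.
print(table(combined$dataset))
```

As with the loop version, this assumes every file has the same variable names; rbind stops with an error if the names differ.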
These pages are Copyrighted (c) by UCLA Academic Technology Services