How to read CSV files inside a folder in R?

I am working in a directory, but the data I want to read is in a subdirectory. I get an error when I try to read the CSV files. My code is the following:
setwd("~/Documents/")
files <- list.files(path = "data/")
f <- list()
for (i in 1:length(files)) {
f[[i]] <- read.csv(files[i], header = T, sep = ";")
}
And the error I get is:
Error in file(file, "rt"): cannot open the connection
What am I doing wrong?

The following will work, assuming you have correctly specified the other read.csv parameters.
setwd("~/Documents/")
files <- list.files(path = "data/")
f <- list()
for (i in 1:length(files)) {
f[[i]] <- read.csv(paste0("data/",files[i]), header = T, sep = ";")
}
Alternatively, you could drop the paste0 and simply set your working directory to ~/Documents/data/ in the first place.
setwd("~/Documents/data/")
files <- list.files() #No parameter necessary now since you're in the proper directory
f <- list()
for (i in 1:length(files)) {
f[[i]] <- read.csv(files[i], header = T, sep = ";")
}
If you need to be in ~/Documents/ at the end of this loop, then finish it up by adding the following after the loop.
setwd("~/Documents/")
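Another option avoids both paste0 and setwd by asking list.files for full paths. This is a minimal sketch, assuming the files are semicolon-separated and end in .csv as in the question; the helper name read_folder_csv is hypothetical.

```r
# Hypothetical helper: read every CSV in a folder into a named list.
read_folder_csv <- function(dir, sep = ";") {
  # full.names = TRUE returns paths that already include the folder
  paths <- list.files(path = dir, pattern = "\\.csv$", full.names = TRUE)
  out <- lapply(paths, read.csv, header = TRUE, sep = sep)
  # name each element after its file so it is easy to look up later
  names(out) <- basename(paths)
  out
}

# Usage, assuming ~/Documents/data exists:
# f <- read_folder_csv("~/Documents/data")
```

Because the paths are complete, the working directory never has to change, so nothing needs to be restored after the loop.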

Related

How to write out multiple files to the environment after apply a function?

I would like to convert 96 .txt files to matrices in R with data.matrix.
Here is part of the input data in one file:
Domain Phylum Class Order
OTU10001 Fungi Ascomycota Dothideomycetes Capnodiales
OTU10004 Fungi Ascomycota Dothideomycetes Pleosporales
And the code for a single file:
BC76_OTU <- data.matrix(BC76.frequencytable)
I am trying to process all the files with data.matrix and write out each file to the environment with the following code:
Feature_to_matrix <- function(x) {
x <- as.matrix (files)
return(x)
}
files <- list.files(path="path to directory", pattern="*.txt", full.names=TRUE, recursive=FALSE)
lapply(files, function(Feature_to_matrix) {
t <- read.table(Feature_to_matrix, header=TRUE, row.names=1, sep="")
out <- t
})
But this code doesn't generate output files to the R environment.
Any suggestions?
Thanks!
I also tried to write a loop for it:
temp = list.files(pattern="*.txt")
for (i in 1:length(temp)) {
sample[i] <- read.csv(temp[i], header = TRUE,row.names=1,sep = "")
write.matrix(sample[i])
}
but I get the following error:
Error in sample[i] <- read.csv(temp[i], header = TRUE, row.names = 1, : object of type 'closure' is not subsettable
Could anyone suggest how to modify the code?
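The "closure" error appears because sample is the name of a base R function and no object called sample was created first, so sample[i] tries to subset the function itself. One way around both problems is sketched below, assuming each file has a header row and row names in its first column as in the question; the helper name files_to_matrices is hypothetical.

```r
# Hypothetical helper: read whitespace-separated tables and convert each
# one to a matrix, returning a list named after the files.
files_to_matrices <- function(files) {
  mats <- lapply(files, function(f) {
    tab <- read.table(f, header = TRUE, row.names = 1, sep = "")
    data.matrix(tab)
  })
  # name each matrix after its file, without the .txt extension
  names(mats) <- tools::file_path_sans_ext(basename(files))
  mats
}

# Usage (the directory is a placeholder, as in the question):
# files <- list.files(path = "path to directory", pattern = "\\.txt$", full.names = TRUE)
# mats <- files_to_matrices(files)
# list2env(mats, envir = .GlobalEnv)  # optional: one object per file
```

Keeping the matrices in a named list is usually easier to work with than 96 separate objects, but list2env is there if separate objects in the environment are really wanted.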

Dynamic output file name in R

I am so close to getting my code to work, but cannot seem to figure out how to get a dynamic file name. Here is what I've got:
require(ncdf)
require(raster)
require(rgdal)
## For multiple files, use a for loop
## Input directory
dir.nc <- 'inputdirectory'
files.nc <- list.files(dir.nc, full.names = T, recursive = T)
## Output directory
dir.output <- 'outputdirectory'
## For simplicity, I use "i" as the file name, but would like to have a dynamic one
for (i in 1:length(files.nc)) {
r.nc <- raster(files.nc[i], varname = "precipitation")
writeRaster(r.nc, paste(dir.output, i, '.tiff', sep = ''), format = 'GTiff', prj = T, overwrite = T)
}
## END
I appreciate any help. So close!!
You can do this in different ways, but I think it is generally easiest to first create all the output filenames (and check if they are correct) and then use these in the loop.
So something like this:
library(raster)
infiles <- list.files('inputpath', full.names=TRUE)
ff <- extension(basename(infiles), '.tif')
outpath <- 'outputpath'
outfiles <- file.path(outpath, ff)
To ensure that you are writing to an existing folder, you can create it first.
dir.create(outpath, showWarnings=FALSE, recursive=TRUE)
And then loop over the files:
for (i in 1:length(infiles)) {
r <- raster(infiles[i])
writeRaster(r, outfiles[i], overwrite = TRUE)
}
You might also use something along these lines:
outfiles <- gsub('in', 'out', infiles)
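The same output names can also be built with base R alone, which makes it easy to check them before any raster is written. This is a small sketch; the helper name make_outfiles and the example file names are hypothetical placeholders.

```r
# Hypothetical helper: map input paths to output paths with a new extension.
make_outfiles <- function(infiles, outpath, ext = ".tif") {
  # drop the folder and the old extension, keep the file stem
  stems <- tools::file_path_sans_ext(basename(infiles))
  file.path(outpath, paste0(stems, ext))
}

infiles <- c("in/rain_2001.nc", "in/rain_2002.nc")  # placeholder names
outfiles <- make_outfiles(infiles, "out")
print(outfiles)
```

Printing outfiles before the loop is a cheap way to confirm that every input maps to the output name you expect.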
Here is the code that finally worked:
# Imports
library(raster)
# Set source files
infiles <- list.files('infilepath', full.names=TRUE)
# Create dynamic file names; inspect outfiles to check the list
ff <- extension(basename(infiles), '.tif')
outpath <- 'outfilepath'
outfiles <- file.path(outpath, ff)
# Run the loop
for (i in 1:length(infiles)) {
r <- raster(infiles[i])
writeRaster(r, paste(outfiles[i]), format ='GTiff', overwrite = T)
}
## END

Looping through files using dynamic name variable in R

I have a large number of files to import which are all saved as zip files.
From reading other posts it seems I need to pass the zip file name and then the name of the file I want to open. Since I have a lot of them I thought I could loop through all the files and import them one by one.
Is there a way to pass the name dynamically or is there an easier way to do this?
Here is what I have so far:
Temp_Data <- NULL
Master_Data <- NULL
file.names <- c("f1.zip", "f2.zip", "f3.zip", "f4.zip", "f5.zip")
for (i in 1:length(file.names)) {
zipFile <- file.names[i]
dataFile <- sub(".zip", ".csv", zipFile)
Temp_Data <- read.table(unz(zipFile,
dataFile), sep = ",")
Master_Data <- rbind(Master_Data, Temp_Data)
}
I get the following error:
In open.connection(file, "rt") :
I can import them manually using:
dt <- read.table(unz("D:/f1.zip", "f1.csv"), sep = ",")
I can create the string dynamically, but it feels long-winded, and it doesn't work when I wrap it with read.table(unz(...)). It seems it can't find the file name and so throws an error:
cat(paste(toString(shQuote(paste("D:/",zipFile, sep = ""))),",",
toString(shQuote(dataFile)), sep = ""), "\n")
But if I then print this to the console I get:
"D:/f1.zip","f1.csv"
I can then paste this into read.table(unz(....)) and it works, so I feel like I am close.
I've tagged data.table since this is what I almost always use, so if it can be done with 'fread' that would be great.
Any help is appreciated.
You can use the list.files command here.
First, set your working directory to the folder where all your files are stored:
setwd("C:/Users/...")
Then list the zip files:
file.names <- list.files(pattern = "\\.zip$", recursive = FALSE)
Then your for loop will be:
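library(readr) # provides write_delim
Master_Data <- NULL
for (i in seq_along(file.names)) {
# open the file
zipFile <- file.names[i]
dataFile <- sub("\\.zip$", ".csv", zipFile)
Temp_Data <- read.table(unz(zipFile, dataFile), sep = ",")
# your function for the opened file
Master_Data <- rbind(Master_Data, Temp_Data)
# finally, write the data to a new tab-delimited file; the .txt name is an
# assumption so that the output does not overwrite the zip archive
write_delim(Master_Data, sub("\\.zip$", ".txt", file.names[i]), delim = "\t",
col_names = TRUE)
}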
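The loop can also be written with lapply, collecting the pieces in a list and binding them once at the end. This is a minimal sketch, assuming each archive f1.zip contains a single f1.csv as in the question; the helper names inner_csv_name and read_zipped_csvs are hypothetical.

```r
# Hypothetical helper: derive the CSV name stored inside each zip archive,
# assuming f1.zip contains f1.csv.
inner_csv_name <- function(zip_files) {
  sub("\\.zip$", ".csv", basename(zip_files))
}

# Hypothetical helper: read one CSV from each archive and stack the rows.
read_zipped_csvs <- function(zip_files) {
  parts <- lapply(zip_files, function(z) {
    # unz() takes the path to the archive and the file name inside it
    read.table(unz(z, inner_csv_name(z)), sep = ",")
  })
  do.call(rbind, parts)
}

# Usage, assuming the archives sit in D:/ as in the question:
# Master_Data <- read_zipped_csvs(file.path("D:", c("f1.zip", "f2.zip")))
```

Using basename for the inner name means the archive can be given with its full path, which is the piece the original loop was missing when the zip files were not in the working directory.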

Generating new output filenames in for-loop

I want to write many raster files using a for loop.
path <- "D:/FolderA/FolderB/FolderC/FolderD/"
files1 <- c("FolderE1/raster.tif",
"FolderE2/raster.tif",
"FolderE3/raster.tif")
files2 <- c("FolderF1/raster.tif",
"FolderF2/raster.tif",
"FolderF3/raster.tif")
for (i in 1:length(files1)) {
raster1 <- raster(paste(path, files1[i], sep = ""), band = 1)
is.na(raster1[[0]])
raster2 <- raster(paste(path, files2[i], sep = ""), band = 1)
is.na(raster2[[0]])
mosaicraster <- mosaic(raster1, raster2, fun = mean)
NAvalue(mosaicraster) <- 0
outputfile <- paste(path, "mosaics/", files1[i], sep = "")
writeRaster(mosaicraster, outputfile , type = "GeoTIFF", datatype = "INT1U", overwrite = TRUE)
print(c(i, "of", length(files1)))
}
How do I create, for each file, a new folder within "D:/FolderA/FolderB/FolderC/FolderD/mosaics/" that includes FolderE1/, FolderE2/, etc. plus the file name, e.g. mosaic.tif?
outputfile <- paste(path, "mosaics/", files1[i], sep = "")
Does not give a satisfying result.
Just to demonstrate one method of making folders within a loop: if you have the directory names in an object, you can loop over the elements of that object.
folders1 <- c("FolderE1",
"FolderE2",
"FolderE3")
for(i in folders1)
{
dir.create(i) #creates a dir named after the current element of folders1
setwd(i) #goes into that directory
tiff('raster.tif') #opens a TIFF graphics device writing to raster.tif
plot(rnorm(10,rnorm(10))) #draws a placeholder picture
dev.off() #closes the device, which finishes the file
setwd('../') #goes back out to the original folder
}
Just a warning: this is all a bit dangerous because mistakes can make a big mess.
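A variant that avoids changing the working directory is to build the full output path and create its folder with dir.create(..., recursive = TRUE). This is a sketch under the assumption that the output should go to mosaics/FolderE1/mosaic.tif and so on, as described in the question; the helper name prepare_output is hypothetical.

```r
# Hypothetical helper: build an output path under <root>/mosaics/<folder>/
# and create that folder if it does not exist yet.
prepare_output <- function(root, infile, filename = "mosaic.tif") {
  root <- sub("/+$", "", root)  # drop a trailing slash such as in "FolderD/"
  out_dir <- file.path(root, "mosaics", dirname(infile))
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  file.path(out_dir, filename)
}

# Usage inside the loop from the question:
# outputfile <- prepare_output(path, files1[i])
# writeRaster(mosaicraster, outputfile, overwrite = TRUE)
```

Since the working directory never changes, an error halfway through the loop cannot leave the session stranded in the wrong folder.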

Looping a function over multiple files

I wrote a simple function:
myfunction <- function(fileName, stringsAsFactors=TRUE,
check.names=FALSE,
skip =1,...) {
Data <- read.delim(fileName, skip = skip,
stringsAsFactors=stringsAsFactors,
check.names = check.names, ...)
cb <- list()
Index <- as.numeric(as.factor(Data[,1]))
cb <- cbind(Data, Index)
return(cb)
}
This function reads the file into Data, creates an Index from its first column, and then binds Data and the index together with cbind.
The function will be applied to files named myfile_00.txt, myfile_01.txt, and so on. For a single file, the call looks like:
myfunction (fileName = "myfile_00.txt")
myfunction (fileName = "myfile_01.txt")
.......
I have around 1000 files, so I suppose the loop could look like this one, taken from another post:
mytxt <- dir(pattern=".txt")
n <- length(mytxt)
mylist <- vector("list", n)
for(i in 1:n) {
mylist[[i]] <- read.delim(mytxt[i], header = F, skip = 1)
}
then:
d <- lapply(mylist, myfunction)
Unfortunately it does not work... When using lapply an error occurs:
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
'file' must be a character string or connection
Since I'm new to R, I'm probably making mistakes that I can't figure out.
As @Arun pointed out, you are trying to run your function twice: once on the files and once on the data frames you have created. Instead, your code should look like this:
files <- list.files(pattern = ".txt")
mylist <- lapply(files, myfunction)
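To keep track of which result came from which file, the list can be named after the files. The sketch below assumes tab-delimited files with one line to skip, as in the question; add_index is a hypothetical, simplified stand-in for myfunction, and apply_to_files is likewise a hypothetical helper.

```r
# Hypothetical stand-in for myfunction: read a tab-delimited file, skipping
# the first line, and append an Index built from the first column.
add_index <- function(fileName, skip = 1, ...) {
  Data <- read.delim(fileName, skip = skip, ...)
  Index <- as.numeric(as.factor(Data[, 1]))
  cbind(Data, Index)
}

# Hypothetical helper: apply a file-reading function to every matching file
# in a folder and return the results as a list named after the files.
apply_to_files <- function(fun, dir = ".", pattern = "\\.txt$") {
  files <- list.files(path = dir, pattern = pattern, full.names = TRUE)
  lapply(setNames(files, basename(files)), fun)
}

# Usage, assuming the files are in the working directory:
# results <- apply_to_files(add_index)
# results[["myfile_00.txt"]]
```

The key point is the same as above: the function receives file names, not data frames that were already read.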
