I am trying to create a large RasterStack in R. I have 255 .nc files in a directory. So far I have the following code:
files = list.files(pattern = "*.nc")
st<- stack()
for (i in 1:length(files)) {
r<-raster(files[i], level = 1, crs = newproj, varname = "SWE" )
st<- addLayer(r)
}
When I run the code outside of a for loop with only one file, it works fine, but when I run it with the for loop (trying to add every file to the stack), I get this error:
Error in sapply(x, fromDisk) & sapply(x, inMemory) :
operations are possible only for numeric, logical or complex types
If someone could explain the error to me and where I am going wrong, that would be awesome!
Try this: replace st <- addLayer(r) with st <- addLayer(st, r). addLayer() takes the object you are adding to as its first argument, so it needs the existing stack st as well as the new layer r; otherwise the stack you have built so far is discarded on every iteration.
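For reference, the corrected loop looks like this (a sketch, assuming `newproj` is defined in your session and the 255 files share the same extent and resolution):

```r
library(raster)

files <- list.files(pattern = "\\.nc$")
st <- stack()
for (i in seq_along(files)) {
  r <- raster(files[i], level = 1, crs = newproj, varname = "SWE")
  st <- addLayer(st, r)  # pass the existing stack first, then the new layer
}
```

Note that stack() also accepts a vector of file names directly, so if the files align you may be able to skip the loop entirely with `st <- stack(files, varname = "SWE")`.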
Okay, so I needed to split a larger file into a bunch of CSVs to run through a non-R program. I used this loop to do it:
for(k in 1:14){
inner_edge = 2000000L*(k-1) + 1
outter_edge = 2000000L*(k)
part <- slice(nc_tu_CLEAN, inner_edge:outter_edge)
out_name = paste0("geo/geo_CLEAN",(k),".csv")
write_csv(part,out_name)
Sys.time()
}
which worked great. Except I'm having a problem in this other program, and need to read a bunch of these back in to troubleshoot. I tried to write this loop for it, and get the following error:
for(k in 1:6){
csv_name <- paste0("geo_CLEAN",(k),".csv")
geo_CLEAN_(k) <- fread(file= csv_name)
}
|--------------------------------------------------|
|==================================================|
Error in geo_CLEAN_(k) <- fread(file = csv_name) :
could not find function "geo_CLEAN_<-"
I know I could do this line by line, but I'd like to have it be a loop if possible. What I want is for geo_CLEAN_1 to hold the result of fread on geo_CLEAN1.csv, geo_CLEAN_2 the result for geo_CLEAN2.csv, etc.
We need assign if we want to create objects from character strings:
for(k in 1:6){
csv_name <- paste0("geo_CLEAN",(k),".csv")
assign(sub("\\.csv", "", csv_name), fread(file= csv_name))
}
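assign works, but reading everything into one named list is usually easier to manage downstream (no need to know the object names in advance). A self-contained sketch of that alternative, using throwaway files in tempdir() and base read.csv in place of fread:

```r
# create a few small CSVs standing in for geo_CLEAN1.csv, geo_CLEAN2.csv, ...
dir <- tempdir()
for (k in 1:3) {
  write.csv(data.frame(id = 1:2, part = k),
            file.path(dir, paste0("geo_CLEAN", k, ".csv")),
            row.names = FALSE)
}

# read them all into one named list instead of separate objects
csv_names <- paste0("geo_CLEAN", 1:3, ".csv")
geo <- lapply(file.path(dir, csv_names), read.csv)
names(geo) <- sub("\\.csv$", "", csv_names)

geo$geo_CLEAN2$part  # each element holds one file's data
```

With the data in a list, looping over all pieces later (e.g. with lapply) is straightforward.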
I am trying to create a loop where I select one file name from a list of file names, and use that one file to run read.capthist and subsequently discretize, fit, derived, and save the outputs using save. The list contains 10 files of identical rows and columns, the only difference between them are the geographical coordinates in each row.
The issue I am running into is that capt needs to be a single file (in the secr package they are 'captfile' types), but I don't know how to select a single file from this list and get my loop to recognize it as a single entity.
This is the error I get when I try and select only one file:
Error in read.capthist(female[[i]], simtraps, fmt = "XY", detector = "polygon") :
requires single 'captfile'
I am not a programmer by training; I've learned R on my own and used Stack Overflow a lot for solving my issues, but I haven't been able to figure this out. Here is the code I've come up with so far:
library(secr)
setwd("./")
files = list.files(pattern = "female*")
lst <- vector("list", length(files))
names(lst) <- files
for (i in 1:length(lst)) {
capt <- lst[i]
femsimCH <- read.capthist(capt, simtraps, fmt = 'XY', detector = "polygon")
femsimdiscCH <- discretize(femsimCH, spacing = 2500, outputdetector = 'proximity')
fit <- secr.fit(femsimdiscCH, buffer = 15000, detectfn = 'HEX', method = 'BFGS', trace = FALSE, CL = TRUE)
save(fit, file="C:/temp/fit.Rdata")
D.fit <- derived(fit)
save(D.fit, file="C:/temp/D.fit.Rdata")
}
simtraps is a list of coordinates.
Ideally I would also like my outputs to have unique identifiers; since I am simulating data and will have to compare all the results, I don't want each iteration to overwrite the previous output.
I know I can bring in each file and run this separately (the code works for non-simulation runs on a couple of data sets), but as I'm hoping to run 100 simulations, that would be laborious and prone to mistakes.
Any tips would be greatly appreciated for an R novice!
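Two things seem to be going on here: read.capthist wants a single file name, but lst[i] is an (empty) list element rather than a character string, so capt <- files[i] is likely what the loop needs. The separate overwriting concern can be handled by building a distinct output path from the loop index. A minimal self-contained sketch of that naming pattern, with a placeholder fit object standing in for the secr.fit result and tempdir() standing in for "C:/temp":

```r
out_dir <- tempdir()  # stand-in for "C:/temp"
for (i in 1:3) {
  fit <- list(iteration = i)  # placeholder for the real secr.fit result
  out_file <- file.path(out_dir, paste0("fit_", i, ".Rdata"))
  save(fit, file = out_file)  # one file per iteration, nothing overwritten
}
length(list.files(out_dir, pattern = "^fit_.*\\.Rdata$"))  # 3
```

The same paste0 trick works for the D.fit saves (e.g. paste0("D.fit_", i, ".Rdata")), so each simulation leaves its own pair of output files.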
I have an issue that really bugs me: I've tried to convert to Rproj lately, because I would like to make my data and scripts available at some point. But with one of them, I get an error that, I think, should not occur. Here is the tiny code that gives me so much trouble, the R.proj being available at: https://github.com/fredlm/mockup.
library(readxl)
list <- list.files(path = "data", pattern = "file.*.xls") #List excel files
#Aggregate all excel files
df <- lapply(list, read_excel)
for (i in 1:length(df)){
df[[i]] <- cbind(df[[i]], list[i])
}
df <- do.call("rbind", df)
It gives me the following error right after "df <- lapply(list, read_excel)":
Error in read_fun(path = path, sheet = sheet, limits = limits, shim =
shim, : path[1]="file_1.xls": No such file or directory
Do you know why? When I do it old school, i.e. using setwd before creating 'list', everything works just fine. So it looks like lapply does not know where to look for the files when used in an Rproj, which seems very odd...
What did I miss?
Thanks :)
Thanks to a non-stackoverflower, a solution was found. It's silly, but 'list' was missing a directory, so lapply couldn't aggregate the data. The following works just fine:
list <- paste("data/", list.files(path = "data", pattern = "file.*.xls"), sep = "") # List excel files
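An equivalent and slightly tidier fix is to let list.files build the paths itself via full.names = TRUE, rather than pasting the directory back on. A self-contained sketch with throwaway files in tempdir():

```r
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)
file.create(file.path(data_dir, c("file_1.xls", "file_2.xls")))

# full.names = TRUE returns "data/file_1.xls"-style paths, not bare file names
paths <- list.files(path = data_dir, pattern = "file.*\\.xls", full.names = TRUE)
all(file.exists(paths))  # TRUE: these can be handed straight to read_excel
```

This is why the original lapply failed: read_excel received "file_1.xls" with no directory and looked for it in the project root.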
So, I'm creating a loop in R that reads through multiple CSV files in a directory called "specdata" and then tells you the mean of a particular column common to those files. The function is shown below; the arguments you specify are the directory the files are located in, the column you want means calculated for, and an id sequence that selects how many files to read via subsetting with [].
HERE IS THE FUNCTION:
pollutantmean <- function(directory,pollutant,id) {
for (i in id) {archivo <- list.files(directory)[i]
file(archivo[i])
datapollution <- read.csv(archivo[i],header = TRUE)
datamatrix <- data.matrix(datapollution)
mean(datamatrix[pollutant],na.rm = TRUE)}}
The problem is that when the function is called:
pollutantmean("specdata",sulfurate,1:15)
it gives the following error message:
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
The interesting part is that the error does not occur when you call the offending part of the function independently, like this:
file(list.files("specdata")[2])
In this case it gives the desired connection, and reading that file with read.csv works perfectly as well.
So here is my question: what am I missing? It should be connecting to and reading all the files the same way it does when the subsetting is [2], just replacing the 2 with the respective i as the loop runs. Why does it give an error here but not when subsetting on 2 directly?
I have read somewhere that I may need rbind, but either way that would come after creating the connection and reading the listed files; I need to solve this error first (and I'm not sure how I would do it afterwards anyway...).
Yep, I'm from Coursera, sorry to be a cliché, but I'm a really nice guy. PLEASE HELP :)
pollutantmean <- function(directory, pollutant, id) {
  # full.names = TRUE returns paths that include the directory,
  # so read.csv can open them from any working directory
  files <- list.files(directory, full.names = TRUE, pattern = "\\.csv$")
  for (i in id) {
    datapollution <- read.csv(files[i], header = TRUE, stringsAsFactors = FALSE)
    datamatrix <- data.matrix(datapollution)
    mean(datamatrix[, pollutant], na.rm = TRUE)
  }
}
pollutantmean("specdata", "sulfurate", 1:15)
So it worked: adding full.names = TRUE, eliminating the file() call, and eliminating the i in the subsetting of list.files did the trick.
pollutantmean <- function(directory, pollutant, id) {
  for (i in id) {
    archivo <- list.files(directory, full.names = TRUE)
    datapollution <- read.csv(archivo[i], header = TRUE)
    datamatrix <- data.matrix(datapollution)
    resultmean <- mean(datamatrix[pollutant], na.rm = TRUE)
  }
  print(resultmean)
}
I would like to understand, though:
What does the full.names = TRUE argument to list.files actually do?
Why is no file() call needed? Is the connection created automatically by list.files()?
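To those two follow-ups: full.names = TRUE makes list.files return each name joined with its directory, so the result is an openable path rather than a bare file name; and no file() call is needed because read.csv opens (and closes) its own connection internally. A quick self-contained demonstration of the first point, using a throwaway directory and an assumed column named sulfate:

```r
d <- file.path(tempdir(), "specdata")
dir.create(d, showWarnings = FALSE)
write.csv(data.frame(sulfate = c(1, 2, NA)),
          file.path(d, "001.csv"), row.names = FALSE)

list.files(d)                           # "001.csv" - bare name, unusable elsewhere
f <- list.files(d, full.names = TRUE)   # ".../specdata/001.csv" - openable path
dat <- read.csv(f[1])                   # no explicit file() connection needed
mean(dat$sulfate, na.rm = TRUE)         # 1.5
```

Without full.names, read.csv("001.csv") only works if the working directory happens to be specdata itself, which is exactly why the loop failed while the interactive test seemed fine.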
I am trying to loop through all the subfolders of my wd, list their names, open 'data.csv' in each of them and extract the second and last value from that csv file.
The df would look like this :
Name_folder_1 2nd value Last value
Name_folder_2 2nd value Last value
Name_folder_3 2nd value Last value
For now, I managed to list the subfolders and each of the files (thanks to this thread: read multiple text files from multiple folders), but I struggle to implement (what I'm guessing should be) a nested loop to read and extract data from the csv files.
parent.folder <- "C:/Users/Desktop/test"
setwd(parent.folder)
sub.folders1 <- list.dirs(parent.folder, recursive = FALSE)
r.scripts <- file.path(sub.folders1)
files.v <- list()
for (j in seq_along(r.scripts)) {
files.v[j] <- dir(r.scripts[j],"data$")
}
Any hints would be greatly appreciated !
EDIT :
I'm trying the solution detailed below but there must be something I'm missing, as it runs smoothly but does not produce anything. It might be something very silly; I'm new to R and the learning curve is making me dizzy :p
lapply(files, function(f) {
dat <- fread(f) # faster
dat2 <- c(basename(dirname(f)), head(dat$time, 1), tail(dat$time, 1))
write.csv(dat2, file = "test.csv")
})
Not easy to reproduce, but here is my suggestion:
library(data.table)
files <- list.files("PARENTDIR", full.names = TRUE, recursive = TRUE, pattern = "\\.csv$")
lapply(files, function(f) {
dat <- fread(f) # faster
# Do whatever, get the subfolder name for example
basename(dirname(f))
})
You can simply look recursively for all CSV files in your parent directory and still get their corresponding parent folder.
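Building on that, the wanted three-column table (folder name, 2nd value, last value) can be assembled by returning one small data frame per file and binding them. A self-contained sketch with fabricated subfolders, using base read.csv in place of fread and assuming the column of interest is called time as in the edit above:

```r
# fabricate two subfolders, each containing a data.csv
parent <- file.path(tempdir(), "test")
for (nm in c("Name_folder_1", "Name_folder_2")) {
  dir.create(file.path(parent, nm), recursive = TRUE, showWarnings = FALSE)
  write.csv(data.frame(time = 1:5),
            file.path(parent, nm, "data.csv"), row.names = FALSE)
}

files <- list.files(parent, full.names = TRUE, recursive = TRUE,
                    pattern = "\\.csv$")
rows <- lapply(files, function(f) {
  dat <- read.csv(f)
  data.frame(folder = basename(dirname(f)),  # subfolder name
             second = dat$time[2],           # 2nd value
             last   = tail(dat$time, 1))     # last value
})
result <- do.call(rbind, rows)  # one row per subfolder
```

Writing result out once at the end (write.csv(result, "test.csv")) also avoids the problem in the edit above, where write.csv inside lapply overwrote the same file on every iteration.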