Clarity on issue with nested functions and environments in R - r

I created a function to save data to a specific location, which is loaded as part of a package when launching RStudio:
save_data <- function(fileName, file_name){
  file_path <- paste0("~/", file_name)
  save(list=deparse(substitute(fileName)), file=file_path)
}
but when I call this function from inside another function it fails with an "Error saving the following file: ...".
To reproduce the issue I created a savetestfunction:
savetest <- function(fileName){
  data1 <- fileName * 10
  save_data(data1, file_name = "test.RData")
  data1
}
and a small savetestscript:
source("savetestfunction.R")
x <- c(1:10)
data1 <- savetest(x)
I spent a long time assuming the issue stemmed from environments and what was visible from each (e.g. the save_data function loaded on startup couldn't see the temporary environment created when calling savetest), but as a test I added a print(fileName) to the save_data function and, to my surprise, it could in fact see what fileName was at that point.
The Fix: I updated the function so that it assigns fileName within the environment created by save_data, and it now functions as intended...
save_data <- function(fileName, file_name){
  file_path <- paste0("//placeholder//", file_name)
  assign(deparse(substitute(fileName)), fileName)
  save(list=deparse(substitute(fileName)), file=file_path)
}
The Confusion: This led me to believe that fileName was perhaps still a promise when it was being saved; however, save() has an argument eval.promises which is TRUE by default, so it can't be that!
The Question: What was causing this issue? (Confusion related to environments? Promises?) What could have been done to avoid this?
Edit: I tried saveRDS but didn't have success with it. I'm more interested in why the save doesn't work in the first place than in an actual fix, as the assign within save_data already makes it work.
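A minimal sketch that seems to reproduce the lookup behaviour (function and variable names are illustrative): exists(), with its default inherits = TRUE, follows the same name-resolution rule that save() applies from its default envir = parent.frame(), i.e. it searches the function's own frame and then its enclosing (lexical) environments, never the caller's frame:

```r
save_data_demo <- function(obj) {
  nm <- deparse(substitute(obj))     # recovers the caller's name, e.g. "data1"
  # Look the name up in this frame and then the enclosing (lexical) chain,
  # which here leads to the global environment -- not to the caller's frame.
  exists(nm, envir = environment())
}

wrapper <- function() {
  data1 <- 1:10                      # lives only in wrapper's frame
  save_data_demo(data1)
}

wrapper()                  # FALSE: wrapper's frame is not on the lexical chain
data1 <- 1:10
save_data_demo(data1)      # TRUE: found in the global (enclosing) environment
```

If this reading is right, the assign() fix works because it creates the binding in save_data's own frame, which is the first place save() looks.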

Related

NetCDF: HDF error only inside a loop in R

I have a script that loops through a selection of NetCDF files. The files are opened, the data extracted, then closed again. I have used this many times before and it works with no issue. I was recently sent a new selection of files to run through the same code. I can check the files individually using the ncdf4 package and the nc_open() function; the files look fine and are not corrupt. However, when I run the loop the function will not open the files and I get this error:
Error in R_nc4_open: NetCDF: HDF error
When I run through the loop manually to check, all is fine and the file opens. It just cannot open in the loop. There is no issue with the code.
Has anyone come across this before, with non-corrupt NetCDF files getting this error only on occasion? Even outside the loop I can run the code and get the error the first time, then run it again without changing anything and the connection works.
Not sure how to troubleshoot this one, so I'm just looking for advice as to why this might be happening.
Code snippet:
targetYear <- '2005-2019'
variables <- c('CHL','SSH')
ncNam <- list.files(folderdir, '.nc', recursive = TRUE)
for(v in 1:length(variables))
{
  varNam <- unlist(unique(variables))[v]
  # Get names corresponding to variable
  varLs <- ncNam[grep(varNam, basename(ncNam))]
  varLs <- varLs[grep(targetYear, varLs)]
  varLs <- varLs[1]
  export <- paste0(exportdir, varNam, '/')
  dir.create(export, recursive = TRUE)
  if(varNam == 'Proximity1km' | varNam == 'Proximity200m' | varNam == 'ProximityCoast' | varNam == 'Bathymetry'){
    fileNam <- varLs
    ncfilename <- paste0(folderdir, fileNam)
    print(ncfilename)
    # Read ncfile
    ncfile <- nc_open(ncfilename)
    nc_close(ncfile)
    gc()
  } else {
    fileNam <- varLs
    ncfilename <- paste0(folderdir, fileNam)
    print(ncfilename)
    # Read ncfile
    ncfile <- nc_open(ncfilename)
    nc_close(ncfile)
    gc()
  }
}
I figured out the issue: it was to do with the error-detection filter in the .nc files.
I removed the filter and the files work fine inside the loop. Still a bit strange.
Perhaps the ncdf4 package is not up to date with this filtering.
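Until the root cause is clear, one way to cope with intermittent nc_open() failures like this ("run it again without changing anything and the connection works") is a small retry wrapper. This is only a sketch in base R; with_retry and its arguments are illustrative helpers, not part of ncdf4:

```r
with_retry <- function(f, attempts = 3, wait = 1) {
  for (k in seq_len(attempts)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) return(result)   # success: hand it back
    if (k < attempts) Sys.sleep(wait)                # brief pause, then retry
  }
  stop(result)                                       # re-raise the last error
}

# e.g. ncfile <- with_retry(function() nc_open(ncfilename))
```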

Read many .sas7bdat files via a loop in R

I am a noob in R and I experience a lot of trouble with the following:
I have to read in over 200 datasets and I want to do this automatically. I wrote some code that works perfectly for .Rdata extensions, but if I try it for SAS files it always blocks...
path <- "road"
# I make a list of the paths of all the files in my folder
File_pathnames <- list.files(path = path, pattern = "\\.sas7bdat$", full.names = TRUE)
# I create an empty list
list.data <- list()
# I try to run a loop to load all the SAS files:
for (i in 1:length(File_pathnames))
{
  list.data[[i]] <- read_sas(File_pathnames[i])
}
Problem: it does not load the tables into my global environment (when I used the .Rdata files I used the load() function and all the data appeared in the global environment). How can I solve this?
Many thanks!
Actually, your data ARE in the global environment, as elements of list.data (check list.data[[1]], list.data[[2]], ...)
The issue you have is linked to the fact that load() restores an object into the environment under the name it had when it was saved. As an example:
x <- 10
save(x, file='tmp')
rm(x)
x
load('tmp')
x
save() stores x and load() restores it under the same name, while read_sas() only returns the data, which you then have to assign to a variable yourself.
If you want to assign each data set specifically, you have to define a name for each of them and assign the data. Your loop would look like:
for (i in 1:length(File_pathnames))
{
  namei <- paste0("name", i)
  data <- read_sas(File_pathnames[i])
  assign(namei, data)
}
and your data would be stored in "name1", "name2", ...
You could then assign each SAS file read from File_pathnames[i] to an object named FilenamesS[i] (assuming FilenamesS is a vector holding your desired object names). Try:
for (i in 1:length(File_pathnames))
{
  data <- read_sas(File_pathnames[i])
  assign(FilenamesS[i], data)
}

R not remembering objects written within functions

I'm struggling to clearly explain this problem.
Essentially, something seems to have happened within the R environment: none of the code I write inside my functions is working, and no data is being saved. If I type a command directly into the console it works (i.e. Monkey <- 0), but if I type it within a function, it doesn't store it when I run the function.
It could be that I'm missing a glaring error in the code, but I noticed the problem when I accidentally clicked on the debugger and tried to exit out of the browser[1] prompt which appeared.
Any ideas? This is driving me nuts.
corr <- function(directory, threshold=0) {
  directory <- paste(getwd(), "/", directory, "/", sep="")
  file.list <- list.files(directory)
  number <- 1:length(file.list)
  monkey <- c()
  for (i in number) {
    x <- paste(directory, file.list[i], sep="")
    y <- read.csv(x)
    t <- sum(complete.cases(y))
    if (t >= threshold) {
      correl <- cor(y$sulfate, y$nitrate, use='pairwise.complete.obs')
      monkey <- append(monkey, correl)
    }
  }
  #correl <- cor(newdata$sulfate, newdata$nitrate, use='pairwise.complete.obs')
  #summary(correl)
}
corr('specdata', 150)
monkey
It's a scoping issue. Functions create their own environment, which isn't necessarily the global environment.
Using <- will assign in the local environment. To save an object to the global environment, use <<-
Here's some information on R environments.
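A minimal sketch of the difference (variable and function names are illustrative):

```r
counter <- 0
bump_local  <- function() { counter <- counter + 1 }    # rebinds a local copy only
bump_global <- function() { counter <<- counter + 1 }   # rebinds the global counter

bump_local();  counter    # still 0: the local binding died with the call
bump_global(); counter    # now 1: <<- walked up to the global environment
```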
I suggest you give a look at some tutorial on using functions in R.
Briefly (and sorry for my horrible explanation): objects that you define within a function will ONLY be defined within that function, unless you explicitly export them using (as one of the possible approaches) the return() function.
browser() is indeed used for debugging; it keeps you inside the function and allows you to access objects created inside the function.
In addition, to increase the probability of getting useful answers, I suggest you post a self-contained, working piece of code that quickly reproduces the issue. Here you are reading some files we have no access to.
It seems to me you have to store the output yourself when you run your script (note that corr() should return monkey as its last expression for this to work):
corr_out <- corr('specdata', 150)
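A minimal illustration of the point (names are made up): a binding created inside a function disappears with the function's frame, but its value survives if the function returns it:

```r
f <- function() {
  monkey <- 1:3   # exists only while f() is running
  monkey          # the last expression becomes the return value
}

out <- f()
exists("monkey")  # FALSE: the local binding is gone
out               # 1 2 3: but the value was captured via the return
```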

Can convert a string to an object but can't save() it -- why? [duplicate]

I am repeatedly applying a function to read and process a bunch of csv files. Each time it runs, the function creates a data frame (this.csv.data) and uses save() to write it to a .RData file with a unique name. Problem is, later when I read these .RData files using load(), the loaded variable names are not unique, because each one loads with the name this.csv.data....
I'd like to save them with unique tags so that they come out properly named when I load() them. I've created the following code to illustrate:
this.csv.data = list(data=c(1:9), unique_tag = "some_unique_tag")
assign(this.csv.data$unique_tag,this.csv.data$data)
# I want to save the data,
# with variable name of <unique_tag>,
# at a file named <unique_tag>.dat
saved_file_name <- paste(this.csv.data$unique_tag,"RData",sep=".")
save(get(this.csv.data$unique_tag), saved_file_name)
but the last line returns:
"Error in save(get(this_unique_tag), file = data_tag) :
object ‘get(this_unique_tag)’ not found"
even though the following returns the data just fine:
get(this.csv.data$unique_tag)
Just name the arguments you use. With your code the following works fine:
save(list = this.csv.data$unique_tag, file=saved_file_name)
My preference is to avoid the name in the RData file on load:
obj = local(get(load('myfile.RData')))
This way you can load various RData files and name the objects whatever you want, or store them in a list etc.
You really should use saveRDS/readRDS to serialize single objects.
save and load are meant for saving whole sets of named objects (up to entire environments).
saveRDS(this.csv.data, saved_file_name)
# later
mydata <- readRDS(saved_file_name)
you can use
save.image("myfile.RData")
This worked for me:
env <- new.env()
env[[varname]] <- object_to_save
save(list=c(varname), envir=env, file='out.Rda')
You could probably do it without a new env (but I didn't try this):
.GlobalEnv[[varname]] <- object_to_save
save(list=c(varname), envir=.GlobalEnv, file='out.Rda')
You might even be able to remove the envir variable.
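A quick self-contained check of the new-environment pattern above, using a throwaway tempfile() so nothing outside the sketch is touched:

```r
varname <- "mydata"
object_to_save <- 1:5

env <- new.env()
env[[varname]] <- object_to_save
path <- tempfile(fileext = ".Rda")
save(list = varname, envir = env, file = path)   # writes env$mydata

e2 <- new.env()
load(path, envir = e2)         # restores it under its saved name
identical(e2$mydata, 1:5)      # TRUE
```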

Trying to save everything in R environment to disk

I need to save items in my environment in R to disk. I can't figure out why the following code doesn't work :
op <- function(){
  for(i in 1:length(ls())){
    file <- paste0(ls()[i], ".Rds")
    saveRDS(file, file)
  }
}
There are actually a couple of things wrong here:
I suspect you want to save the contents of .GlobalEnv, not op's own environment. However, the calls to ls() list objects in op's environment (which contains only i by the time you call ls inside the loop). If you want to list objects in .GlobalEnv, call ls(.GlobalEnv).
Also, when you call saveRDS(file, file), you are telling it to save the string stored in file to the path stored in file, so you are essentially only saving the path. Instead you need to get the object from .GlobalEnv.
So one of correct ways to do it would be:
op <- function(){
  obj_names <- ls(.GlobalEnv)
  for(i in 1:length(obj_names)){
    file <- paste0(obj_names[i], ".Rds")
    saveRDS(get(obj_names[i], envir = .GlobalEnv), file)
  }
}
Or a bit more idiomatic,
op <- function()
  sapply(ls(.GlobalEnv), function(x) saveRDS(get(x, envir = .GlobalEnv), paste0(x, ".Rds")))
The save() function might also be useful if you don't mind saving all objects in one file. More at ?save.
The code you wrote only saves a list of files with names identical to the names in your function's environment (i.e. a single file "i.Rds").
If you want to save the contents of an environment to a file, you might want to try the save() or save.image() functions, which do exactly what you are looking for.
For more information try ?save. Here is some code:
a <- 1
b <- 2
save(list=ls(), file="myfile.rda")
rm(list=ls())
load(file="myfile.rda")
ls()
yielding:
[1] "a" "b"
