I'm trying to override save() in R so that it creates any missing directories before saving an object. I'm having trouble passing an object through one function to another via the ellipsis (...) argument.
My example:
save <- function(..., file) { # Overridden save()
  target.dir <- dirname(file) # Extract the target directory
  if (!file.exists(target.dir)) {
    # Create the target directory if it doesn't exist.
    dir.create(target.dir, showWarnings = T, recursive = T)
  }
  base::save(..., file = file.path(target.dir, basename(file)))
}
fun1 <- function(obj) {
  obj1 <- obj + 1
  save(obj1, file = "~/test/obj.RData")
}
fun1(obj = 1)
The code above results in this error:
Error in base::save(..., file = file.path(target.dir, basename(file))) :
object ‘obj1’ not found
I realize that the problem is that the object 'obj1' doesn't exist inside my custom save() function, but I haven't yet figured out how to pass it from fun1 to base::save.
I have tried:
base::save(parent.frame()$..., file = file.path(target.dir, basename(file)))
and:
base::save(list = list(...), file = file.path(target.dir, basename(file)))
with no success.
Any suggestions?
You need to pass the caller's environment to 'base::save':
save <- function(..., file) { # Overridden save()
  target.dir <- dirname(file) # Extract the target directory
  if (!file.exists(target.dir)) {
    # Create the target directory if it doesn't exist.
    dir.create(target.dir, showWarnings = T, recursive = T)
  }
  base::save(..., file = file.path(target.dir, basename(file)), envir = parent.frame())
}
Note the envir parameter added to the base::save call.
fun1 <- function(obj) {
  obj1 <- obj + 1
  save(obj1, file = "~/test/obj.RData")
}
In addition, use '=' rather than '<-' to specify parameter names in the call:
fun1(obj = 1)
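As a quick check of the fix (assuming ~/test is writable), the saved object can be restored under its original name:
fun1(obj = 1)
load("~/test/obj.RData") # load() restores obj1 into the current environment
obj1
#> [1] 2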
Context:
I'm trying to set up some mock data testing and I have saved some mock data down as csvs in a directory called 'test/'.
Each file relates to a get_data type function e.g. the mock data for get_energy() is stored as test/get_energy.csv.
I'd like to use list.files() to assign functions to my environment, each of which reads its corresponding csv.
files <- list.files('test/')
for (name in substr(files, 1, nchar(files) - 4)) {
  assign(name, function() { read.csv(eval(parse(text = paste0('test/', name, '.csv')))) })
}
but when I try to view the source code for the function get_energy by running
get_energy
it returns
function(){read.csv(eval(parse(text=paste0('test/',name,'.csv'))))}
whereas I need it to evaluate the string expression so that it returns
function(){read.csv('test/get_energy.csv')}
Put the whole function definition in the string like so:
files <- list.files("test/", pattern = "csv$")
for (name in substr(files, 1, nchar(files) - 4)) {
  assign(name, eval(parse(text = paste0('function(){ read.csv("test/', name, '.csv")}'))))
}
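An alternative sketch that avoids building code in strings: capture the current value of name in each closure via local(). Printing get_energy will then show function() read.csv(fname) rather than the literal path, but each function still reads its own file:
files <- list.files("test/", pattern = "csv$")
for (name in substr(files, 1, nchar(files) - 4)) {
  assign(name, local({
    fname <- paste0("test/", name, ".csv") # value of name captured here, per iteration
    function() read.csv(fname)
  }))
}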
I am trying to run something on a very large dataset. Basically, I want to loop through all files in a folder and run the function fromJSON on each one. However, I want it to skip over files that produce an error. I have built a function using tryCatch; however, it only works when I use lapply and not parLapply.
Here is my code for my exception handling function:
readJson <- function(file) {
  require(jsonlite)
  dat <- tryCatch(
    {
      fromJSON(file, flatten = TRUE)
    },
    error = function(cond) {
      message(cond)
      return(NA)
    },
    warning = function(cond) {
      message(cond)
      return(NULL)
    }
  )
  return(dat)
}
and then I call parLapply on a character vector files which contains the full paths to the JSON files:
dat <- parLapply(cl, files, readJson)
That produces an error when it reaches a file that doesn't end properly, and it never creates the list dat by skipping over the problematic file, which is what the readJson function was supposed to mitigate.
When I use regular lapply, however, it works perfectly fine: it prints the errors, but it still creates the list by skipping over the erroneous files.
Any ideas on how I could use exception handling with parLapply so that it skips over the problematic files and still generates the list?
In your error handler function, cond is an error condition. Calling message(cond) signals this condition again, which is caught on the workers and transmitted as an error to the master. Either remove the message calls or replace them with something like
message(conditionMessage(cond))
You won't see anything on the master though, so removing is probably best.
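Applied to your function, a minimal corrected version could look like this (the handlers simply return sentinel values instead of re-signalling the condition):
readJson <- function(file) {
  require(jsonlite)
  tryCatch(
    fromJSON(file, flatten = TRUE),
    error = function(cond) NA,      # malformed file: return NA and move on
    warning = function(cond) NULL
  )
}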
What you could do is something like this (a different, reproducible example):
test1 <- function(i) {
  dat <- NA
  try({
    if (runif(1) < 0.8) {
      dat <- rnorm(i)
    } else {
      stop("Error!")
    }
  })
  return(dat)
}
cl <- parallel::makeCluster(3)
dat <- parallel::parLapply(cl, 1:100, test1)
See this related question for other solutions. I think using foreach with .errorhandling = "pass" would be another good solution.
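A rough sketch of that foreach route, assuming a doParallel backend and the same files vector of paths; with .errorhandling = "pass", the error object itself is stored in place of the failed element, so the loop never aborts:
library(foreach)
library(doParallel)

cl <- parallel::makeCluster(3)
registerDoParallel(cl)

dat <- foreach(f = files, .errorhandling = "pass",
               .packages = "jsonlite") %dopar% {
  fromJSON(f, flatten = TRUE)
}

parallel::stopCluster(cl)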
How do I create a set of R functions that all access the same private variable?
Let's say I want to create readSetting(key) and writeSetting(key,value) functions that both operate on the same hidden list settings. If I try it like so...
local({
  settings <- list()
  readSetting <<- function(key) settings[[key]]
  writeSetting <<- function(key, value) settings[[key]] <<- value
})
...then readSetting and writeSetting are not visible outside of the local call. If I want them to be visible there, I have to first assign
readSetting <- writeSetting <- NULL
outside the local call. There must be a better way, because my code isn't DRY if I have to say in two different ways which variables are public.
(The context of this work is that I'm developing an R package, and this code will be in an auxiliary file loaded into the main file via source.)
This question is related to How to limit the scope of the variables used in a script? but the answers there do not solve my problem.
You can simulate something like that using the R6 package and the following very rough code:
library(R6)

Privates <- R6Class("Privates",
  public = list(
    readSetting = function(key) {
      private$settings[[key]]
    },
    writeSetting = function(key, value) {
      private$settings[[key]] <- value
    }
  ),
  private = list(
    settings = list()
  )
)
a <- Privates$new()
a$writeSetting("a",4)
a$readSetting("a")
Directly reading or setting a$settings would not work.
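If you'd rather avoid the dependency, a base-R sketch of the same idea is to build the closures inside local() and export them in a single list, so only one name is public:
settings_api <- local({
  settings <- list()
  list(
    readSetting  = function(key) settings[[key]],
    writeSetting = function(key, value) settings[[key]] <<- value # writes to the enclosing settings
  )
})

settings_api$writeSetting("a", 4)
settings_api$readSetting("a")
#> [1] 4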
I need to create a function in R that reads all the files in a folder (let's assume all files are tables in tab-delimited format) and creates objects with the same names in the global environment. I did something similar to this (see code below): I was able to write a function that reads all the files in the folder, makes some changes in the first column of each file, and writes it back to the folder. But I couldn't figure out how to assign the files I read to objects that stay in the global environment.
changeCol1 <- function() {
  filesInfolder <- list.files()
  for (i in 1:length(filesInfolder)) {
    wrkngFile <- read.table(filesInfolder[i])
    wrkngFile[,1] <- gsub(0, 1, wrkngFile[,1])
    write.table(wrkngFile, file = filesInfolder[i], quote = F, sep = "\t")
  }
}
You are much better off assigning them all to elements of a named list (and it's pretty easy to do, too):
changeCol1 <- function() {
  filesInfolder <- list.files()
  lapply(filesInfolder, function(fname) {
    wrkngFile <- read.table(fname)
    wrkngFile[,1] <- gsub(0, 1, wrkngFile[,1])
    write.table(wrkngFile, file = fname, quote = FALSE, sep = "\t")
    wrkngFile
  }) -> data
  names(data) <- filesInfolder
  data
}
a_list_full_of_data <- changeCol1()
Also, F will come back to haunt you some day (it's not protected, whereas FALSE and TRUE are reserved words).
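To see why, note that F is just an ordinary variable that happens to start out bound to FALSE:
F <- "oops"         # legal: silently rebinds F
identical(F, FALSE)
#> [1] FALSE
# FALSE <- "oops"   # error: FALSE is a reserved word and cannot be reassigned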
Add this to your loop after making the changes:
assign(filesInfolder[i], wrkngFile, envir=globalenv())
If you want to put them into a list, one way would be, outside your loop, declare a list:
mylist = list()
Then, within your loop, do like so:
mylist[[filesInfolder[i]]] = wrkngFile
And then you can access each object by looking at:
mylist[[filename]]
from the global env.
I'm working through an R tutorial. I've been working on a function and one of the parts of the function is to take an argument and use it to define a directory in which to find data. It must then load that data.
As it stands the following works:
getmonitor <- function(id, directory) {
  csvfile <- function(id) {
    if (id < 10) {
      paste0(0, 0, id, ".csv")
    } else if (id < 100) {
      paste0(0, id, ".csv")
    } else paste0(id, ".csv")
  }
  foo <- read.csv(csvfile(id))
}
Fine. But I now have to use the "directory" parameter to define the directory where the csv file must be read from. I've tried various things here to no avail.
Currently, the code works only if the data are in the working directory. I need to say "go to the directory called (directory), then read.csv".
The directory with all of the data files is called "specdata" and the parameter for directory is thus "specdata".
I tried the following:
getmonitor <- function(id, directory) {
  csvfile <- function(id) {
    if (id < 10) {
      paste0(0, 0, id, ".csv")
    } else if (id < 100) {
      paste0(0, id, ".csv")
    } else paste0(id, ".csv")
  }
  filepath <- append(directory, "/", csvfile(id))
  foo <- read.csv(filepath)
}
But then I received this error message: "Error in !after : invalid argument type".
I have tried a few other things, but if I pasted all the code here it would probably be more mess than help.
What would be a logical way to do this? Am I on the right track with append? What else should I use if not? I need to take the parameter "directory" and then load data from that directory.
getmonitor <- function(id, directory = getwd(), ...) {
  csvfile <- sprintf("%03d.csv", id)        # zero-pads the id to three digits
  filepath <- file.path(directory, csvfile) # builds the path portably, no string pasting
  foo <- read.csv(filepath, ...)
  foo
}
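For example, assuming a folder specdata/ containing files named like 001.csv:
dat <- getmonitor(1, "specdata") # reads specdata/001.csv
head(dat)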