Understanding element wise clearing of R's workspace - r

I am trying to find a way to clear the workspace in R using lists.
According to the documentation, I could simply create a vector with all my workspace objects: WS=c(ls()). But nothing happens when I try element-wise deletion with rm(c(ls())) or rm(WS).
I know I can use the command rm(list=ls()). I am just trying to figure out how R works. Where did I err in my thinking in applying the rm() function on a vector with the list of the objects?
Specifically, I'm trying to create a function similar to the clc function in MATLAB, but I am having trouble getting it to work. Here's the function that I've written:
clc <- function() { rm(list = ls()) }

From ?rm, "Details" section:
Earlier versions of R incorrectly claimed that supplying a character vector in ... removed the objects named in the character vector, but it removed the character vector. Use the list argument to specify objects via a character vector.
Your attempt should have been:
rm(list = WS)
HOWEVER, this will still leave you with an object (a character vector) named "WS" in your workspace: ls() was evaluated before WS existed, so "WS" is not among the names it holds. To actually get rid of the "WS" object as well, you would have had to use rm(WS, list = WS). :-)
How does it work? If you look at the code for rm, the first few lines of the function capture any objects that have been named individually, whether quoted or unquoted. Towards the end of the function, you will find the line list <- .Primitive("c")(list, names), which combines the individually named objects and the names in the character vector supplied to the "list" argument into a single character vector.
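A minimal sketch of the difference, run in a throwaway environment so that nothing in the real workspace is touched. The names e, a and b are made up for illustration:

```r
# Scratch environment standing in for the workspace (hypothetical names).
e <- new.env()
assign("a", 1, envir = e)
assign("b", 2, envir = e)

WS <- ls(e)                # character vector of object names
rm(list = WS, envir = e)   # removes the objects *named in* WS
remaining <- ls(e)         # nothing is left in e

ws_before <- exists("WS", inherits = FALSE)  # WS itself is still around
rm(WS)                     # unquoted name in ...: removes only WS itself
ws_after <- exists("WS", inherits = FALSE)
```

Passing the vector through list = is what makes rm() treat its contents as names; passing it unquoted makes rm() treat WS as the one object to delete.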
Update
Based on your comment, it sounds like you're trying to write a function like:
.clc <- function() {
  rm(list = ls(.GlobalEnv), envir = .GlobalEnv)
}
I think it's a little bit of a dangerous function, but let's test it out:
ls()
# character(0)
for (i in 1:5) assign(letters[i], i)
ls()
# [1] "a" "b" "c" "d" "e" "i"
.clc()
ls()
# character(0)
Note: FYI, I've named the function .clc (with a dot) so that it doesn't get removed when the function is run. If you wanted to write a version of the function without the ., you would probably do better to put the function in a package and load that at startup to have the function available.
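For context on why the clc from the question appears to do nothing: inside a function, ls() with no arguments lists the function's own (empty) frame, and rm() removes from that same frame. The sketch below shows this, along with a variant that takes the target environment as an argument. Both show_local and clear_env are hypothetical names, and the demo uses a scratch environment rather than the global one:

```r
# Inside a function, ls() with no arguments lists the function's own frame.
show_local <- function() {
  tmp <- 1
  ls()
}
local_names <- show_local()   # only the local variable is visible

# Hypothetical helper: clear whichever environment is passed in.
clear_env <- function(envir) {
  rm(list = ls(envir), envir = envir)
  invisible(NULL)
}

scratch <- new.env()
assign("a", 1, envir = scratch)
assign("b", 2, envir = scratch)
clear_env(scratch)
left_over <- ls(scratch)      # scratch is now empty
```

Calling clear_env(.GlobalEnv) would behave like .clc() above, with the same caveat about it being a dangerous function.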

Related

R parLapply: How to (or Can we) access an object within the parallel code

I am trying to use parLapply to run a custom function. Since my actual code and data are not very reader friendly, I am creating pseudocode for reference. I do the following:
a) First, I create a custom function. This function takes an argument say "Argument1". Argument1 is a list object which is what I use to run the parLapply on later.
b) Inside the function, based on Argument1, I create a subset called subset_data (subsetting on the full dataset which is supplied while calling parLapply).
c) After getting subset_data, I obtain a list of unique items for Variable2 and then further subset it depending on the number of unique items in Variable2.
d) Finally I run a function (SomeOtherFunction) which takes subset_data2 as the argument.
SomeCustomFunction = function(Argument1){
  subset_data = OriginalData[which(OriginalData$Variable1==Argument1),]
  some_other_variable = unique(subset_data$Variable2)
  for (object in some_other_variable){
    subset_data2 = subset_data[which(subset_data$Variable2 == object),]
    FinalOutput = SomeOtherFunction(subset_data2)
  }
  return(SomeOutput)
}
SomeOtherFunction=function(subset_data2){
  #Do Some computation here
}
Next I can create clusters in this way:
cl=parallel::makeCluster(2,type="PSOCK")
registerDoParallel(cl)
And supply the objects Argument1, OriginalData by calling clusterExport and then finally run parLapply by supplying SomeCustomFunction and a list for Argument1 (suppose Argument1_list).
clusterExport(cl=cl, list("Argument1","OriginalData"),envir=environment())
zz=parLapply(cl=cl,fun=SomeCustomFunction,Argument1=Argument1_list)
However, in this case, when I run parLapply, I get an error saying
Error in get(name, envir = envir) : object 'subset_data2' not found
In this case, I was assuming that since subset_data2 is being created within the first function, the object subset_data2 will get supplied automatically. Clearly this is not happening.
Is there a way for me to supply this second subset (subset_data2) within the function SomeCustomFunction without passing it to the cluster when calling clusterExport?
If the question is not clear, please let me know and I can modify it accordingly. Thanks in advance.
P.S. I read this question: using parallel's parLapply: unable to access variables within parallel code, but in my case I do not call parLapply inside my function.
In the related question you mention, the top answer passes clusterExport a character vector of variable names, whereas you pass a list. Also, help(clusterExport) reveals: "varlist: character vector of names of objects to export".
Also, you're missing a " after Argument1 here: list("Argument1,"OriginalData, but I'm guessing that's only the sample code you posted, not in your real code.
PS: It's a step in the right direction that you put some code, but your question will get more responses if you put sample data and code that can be directly pasted and run to reproduce the error.
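A minimal sketch of the setup with the export written as a character vector, assuming made-up sample data and a made-up body for SomeOtherFunction (neither comes from the question). Objects created inside SomeCustomFunction on a worker, such as subset_data2, are ordinary local variables there and do not need to be exported; only the globals the function looks up do. Note also that parLapply takes the values to iterate over as its X argument:

```r
library(parallel)

# Hypothetical stand-ins for the question's data and helper function.
OriginalData <- data.frame(
  Variable1 = c("a", "a", "b"),
  Variable2 = c(1, 2, 1),
  value     = c(10, 20, 30)
)
SomeOtherFunction <- function(subset_data2) sum(subset_data2$value)

SomeCustomFunction <- function(Argument1) {
  subset_data <- OriginalData[OriginalData$Variable1 == Argument1, ]
  # subset_data2 is created and used locally on the worker
  sapply(unique(subset_data$Variable2), function(v) {
    subset_data2 <- subset_data[subset_data$Variable2 == v, ]
    SomeOtherFunction(subset_data2)
  })
}

cl <- makeCluster(2, type = "PSOCK")
# varlist is a character vector of object names, not a list
clusterExport(cl, varlist = c("OriginalData", "SomeOtherFunction"),
              envir = environment())
zz <- parLapply(cl, X = list("a", "b"), fun = SomeCustomFunction)
stopCluster(cl)
```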

R function - Error argument is missing, with no default

I am testing a simple function in R that should transform a time series object into a data frame.
The code works fine outside the function, but inside the function it gives me an error about the object.
>fx<-function(AMts) {
  x<-as.data.frame(AMts)
  return(x)
}
>fx()
I expected to have the data.frame x in my environment, but I got
Error in as.data.frame(AMts) : argument "AMts" is missing, with no default
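If you want the function itself to create x in your workspace, you need "<<-" as the assignment operator instead of the usual "<-". A plain <- creates x only inside the function, where it disappears once the function returns; <<- assigns in the enclosing environment, which for a function defined at top level is the global environment.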
>fx<-function(AMts) {
x<<-as.data.frame(AMts) # "<<-" is what saves "x" in your environment
return(x) # remove this line; this prints data frame "x" to the console, but it doesn't save it
}
>fx(AMts)
EDIT: As the commenters have already pointed out, you aren't including any parameters in your function. Above I made it fx(AMts) to make it clear you need to pass in AMts to the function too.
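As an alternative to <<-, a common pattern is to have the function return the data frame and assign the result at the call site. A minimal sketch, assuming a small made-up time series since the question does not show the real AMts:

```r
# Return the data frame instead of assigning it globally.
fx <- function(AMts) {
  as.data.frame(AMts)
}

# Hypothetical sample series standing in for the real data.
AMts <- ts(c(1, 2, 3, 4), start = 2020, frequency = 1)

x <- fx(AMts)   # x now lives in the calling environment

# Calling fx() with no argument reproduces the error from the question.
res <- tryCatch(fx(), error = function(e) e)
```

Here res holds the error condition raised because AMts has no default value.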

Saving workspace (in a particular frame) for post-mortem debugging in R

While debugging some R code, I'd like to save the workspace (i.e. all present objects) in some particular frame so that I can use those objects outside of the debugging browser. Following the example given in this answer:
x <- 1:5
y <- x + rnorm(length(x),0,1)
f <- function(x,y) {
y <- c(y,1)
lm(y~x)
}
Setting options(error = recover) and running f(x,y) allows us to pick which frame to enter. Here I'll pick 2 and check my workspace with ls() like so:
Browse[1]> ls()
[1] "cl" "contrasts" "data" "formula" "m" "method" "mf" "model" "na.action" "offset" "qr"
[12] "ret.x" "ret.y" "singular.ok" "subset" "weights" "x" "y"
I'd like to be able to save all of these objects to use them later. Using save.image() in the browser, or inserting it into the relevant function, saves the environment f(x,y) was originally called from. I can use dump.frames() and call debugger() on the resulting dump.frames classed object, but I still have to work interactively from within the debugging browser. All I really want is an .RData file containing the 18 above listed objects.
The point of all this is to reproduce certain errors within an R Markdown document. If anyone has an idea for that particular application it would be appreciated.
save(list=ls(), file="mylocals.Rda")
The hurdle I had to get over to realize this was the way forward was the name of that argument in save. Why did the authors use the argument name, "list", when it was a character vector (and not a list)? Same whine applies to the rm function argument names.
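A minimal sketch of the same idea outside the browser, assuming a hypothetical function f that saves its own frame to a temporary file (the path argument is added here purely for illustration). The environment is passed explicitly so it is clear which frame is being saved:

```r
# Hypothetical function that dumps its local objects to an .RData file.
f <- function(x, y, path) {
  y <- c(y, 1)
  # save every object in this function's frame, by name
  save(list = ls(environment()), envir = environment(), file = path)
  invisible(path)
}

path <- tempfile(fileext = ".RData")
f(1:5, 1:5, path)

# Load the saved objects into a fresh environment for later inspection.
restored <- new.env()
loaded_names <- load(path, envir = restored)
```

At a Browse[n]> prompt the one-liner from the answer is enough, since ls() and save() then default to the frame being browsed.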

modify the body text of existing function objects

I have some .Rdata files that contain saved functions as defined by approxfun().
Some of the save files pre-date the change to approxfun from package "base" to "stats", and so the body has
PACKAGE = "base"
and the wrong package causes the function to fail. I can fix(myfun) and simply replace "base" with "stats", but I want a neater automatic way.
Can I do this with gsub() and body() somehow?
I can get the body text and substitute there with
as.character(body(myfun))
but I don't know how to turn that back into a "call" and replace the definition.
(I know that a better solution is to have saved the data originally used by approxfun and simply recreate the function, but I wonder if there's a sensible way to modify the existing one.)
Edit: I found it here
What ways are there to edit a function in R?
Use the substitute function.
For example:
myfun <- function(x,y) {
  result <- list(x+y,x*y)
  return(result)
}
Using body, treat myfun as a list to select what you would like to change in the function:
> body(myfun)[[2]][[3]][[2]]
x + y
When you change this, you must use the substitute function so you replace the part of the function with a call or name object, as appropriate. Replacing with character strings doesn't work since functions are not stored as or operated on as character strings.
body(myfun)[[2]][[3]][[2]] <- substitute(2*x)
Now the selected piece of the function has been replaced:
> myfun
function (x, y)
{
    result <- list(2 * x, x * y)
    return(result)
}
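For the gsub() route the question asks about, one possible approach is to deparse the body to text, edit the text, and parse it back into a call. This is a sketch on a made-up function rather than a real approxfun() result, and a deparse/parse round trip may not reproduce every body faithfully, so treat it as an assumption to verify on the real objects:

```r
# Hypothetical function whose body contains the string to be replaced.
myfun <- function(x) {
  paste("pkg:", "base", x)
}

# Body as text, one element per line of code.
body_text <- deparse(body(myfun))

# Replace the quoted string, then parse the text back into a call.
new_text <- gsub('"base"', '"stats"', body_text, fixed = TRUE)
body(myfun) <- parse(text = new_text)[[1]]

out <- myfun("a")
```

Assigning through body<- keeps the function's formals and environment, so data captured in the closure is left in place.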
