How can I write a generic R function that cleans the current workspace apart from some self-defined variables? For sure, I can achieve this in a single script with the following code:
prj = '/path/to/project'
src = 'string'
data_to_clean = head(iris)
rm(list = ls()[ !ls() %in% c('prj', 'src') ] )
# only prj and src remain
However, I want this to be a function, so that I can use it in multiple scripts and change, in one place, which variables are kept. Is this possible?
If you wrap this in a function, keep in mind that a function creates its own environment when it is executed. You therefore need to specify the environment explicitly in every call (in ls as well as in rm). You probably want to remove the objects from .GlobalEnv.
clean_workspace <- function(not_to_be_removed) {
rm(list =
setdiff(ls(envir = .GlobalEnv), c("clean_workspace", not_to_be_removed)),
envir = .GlobalEnv)
}
prj = '/path/to/project'
src = 'string'
data_to_clean = head(iris)
clean_workspace(c('prj', 'src'))
In order not to remove the function itself, it should be added to the values not to be removed.
If you want to read more about environments, have a look at this overview.
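To make the point about function environments concrete, here is a minimal sketch. The helper name scope_demo is hypothetical and exists only for illustration; it shows that a bare ls() inside a function sees the function's own variables, while ls(envir = parent.frame()) sees the caller's.

```r
# Hypothetical helper, for illustration only.
scope_demo <- function() {
  tmp <- 1  # lives in the function's own environment
  list(
    inside = ls(),                       # names in the function environment
    caller = ls(envir = parent.frame())  # names where the function was called
  )
}

res <- scope_demo()
res$inside
# [1] "tmp"
```

This is why the answer passes envir = .GlobalEnv to both ls and rm: without it, both calls would operate on the function's own, nearly empty environment.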
I think you want to remove the function itself. The important bit is to tell rm the environment where to remove these objects from:
clean_workspace <- function(not_to_be_removed, envir = globalenv()) {
objs <- ls(envir = envir)
rm(list = objs[ !objs %in% not_to_be_removed], envir = envir)
}
prj = '/path/to/project'
src = 'string'
data_to_clean = head(iris)
clean_workspace(c('prj', 'src'))
ls()
#> [1] "prj" "src"
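Because the environment is an argument, the function can be tried out on a scratch environment before pointing it at the real workspace. The following is a sketch under that assumption; the scratch environment and its contents are made up for the example, and setdiff is used in place of the %in% filter.

```r
clean_workspace <- function(not_to_be_removed, envir = globalenv()) {
  objs <- ls(envir = envir)
  rm(list = setdiff(objs, not_to_be_removed), envir = envir)
}

# A throwaway environment standing in for the global workspace.
scratch <- new.env()
assign("prj", "/path/to/project", envir = scratch)
assign("src", "string", envir = scratch)
assign("data_to_clean", head(iris), envir = scratch)

clean_workspace(c("prj", "src"), envir = scratch)
ls(scratch)
# [1] "prj" "src"
```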
I have these files in my global environment:
x <- sapply(sapply(ls(), get), is.data.frame)
n = names(x)[(x==TRUE)]
n
[1] "sample_1" "sample_10" "sample_2" "sample_3" "sample_4" "sample_5" "sample_6" "sample_7" "sample_8" "sample_9" "table_i"
I want to remove all files that start with "samp". I found this code that can do it (How do I clear only a few specific objects from the workspace?):
rm(list = apropos("samp_"))
Now, I want to learn how to do the same thing using a different way. I found another way to find out all files in the global environment that start with "samp":
nn = grep("samp", n, value = TRUE)
[1] "sample_1" "sample_10" "sample_2" "sample_3" "sample_4" "sample_5" "sample_6" "sample_7" "sample_8" "sample_9"
Then, I tried to delete these files:
for (file in nn){
nn[i] <- NULL
}
do.call(file.remove, list(nn))
I think I am missing something here - can someone please show me how to correct this?
Thank you!
You can use the pattern argument of ls inside the rm call:
rm(list = ls(pattern = "^samp"))
Or using grep:
rm(list = grep("^samp", ls(), value = TRUE))
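If only data frames should be removed, as in the question, the name match can be combined with a type check. This is a sketch run against a scratch environment (the object names are invented for the example) so that nothing in the real workspace is touched.

```r
e <- new.env()
for (nm in c("sample_1", "sample_2", "table_i")) {
  assign(nm, data.frame(id = 1:2), envir = e)
}
assign("sample_note", "not a data frame", envir = e)

# Names starting with "samp", then keep only those that are data frames.
candidates <- ls(envir = e, pattern = "^samp")
is_df <- vapply(mget(candidates, envir = e), is.data.frame, logical(1))
rm(list = candidates[is_df], envir = e)

ls(envir = e)
# [1] "sample_note" "table_i"
```

Note that rm works on object names in an environment. file.remove, which the question tried, deletes files on disk and is not the right tool here.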
I need to access (i.e., read and save) the items of the environment I'm working in. I have written the following function to save all objects in my (global) environment:
save_vars <- function(list.of.vars = NULL,
prefix = "StatusQuo",
path = "data") {
if(is.null(list.of.vars)) list.of.vars <- ls()
date_time <- Sys.time()
if (!is.null(path))
path <- paste0(path, "/")
file_name <- paste0(path, prefix, "_", date_time, ".RData")
save(list = list.of.vars, file = file_name)
}
The idea was that if no list.of.vars argument is passed to the function, using ls(), the function accesses the variables of the environment calling save_vars. However, it only saves the variables within the scope of the function itself. I know I can call the function as save_vars(ls()) to do the job, but is there a neater way around it?
Probably cleanest to pass the environment:
fun <- function(envir = parent.frame()) ls(envir = envir)
fun()
This lists the objects in the caller but also lets the user change which environment is used. For example, they could force the global environment to be used:
fun(.GlobalEnv)
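Applied to save_vars from the question, the same idea might look like the sketch below. The envir argument is the addition; the timestamp format is an assumption on my part (it avoids spaces and colons in the file name), and the directory given in path is assumed to exist already.

```r
save_vars <- function(list.of.vars = NULL, prefix = "StatusQuo",
                      path = "data", envir = parent.frame()) {
  force(envir)  # resolve the caller's environment up front
  if (is.null(list.of.vars)) list.of.vars <- ls(envir = envir)
  stamp <- format(Sys.time(), "%Y%m%d_%H%M%S")
  file_name <- paste0(prefix, "_", stamp, ".RData")
  if (!is.null(path)) file_name <- file.path(path, file_name)
  # save() also needs to be told where to look the names up.
  save(list = list.of.vars, file = file_name, envir = envir)
  invisible(file_name)
}

# Usage on a small demo environment, writing to the session temp directory.
demo_env <- new.env()
assign("a", 1, envir = demo_env)
saved <- save_vars(path = tempdir(), envir = demo_env)
```

Called as save_vars() from the top level, this saves the objects of the global environment; called inside another function, it saves that function's local variables.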
It's obviously not something to advise in an ideal workflow but sometimes it can be useful.
Can it be done easily?
I wrote the following functions. They put a temp file in your home folder and, by default, delete it once it has been fetched:
shoot <- function(..., list = character(), rm = FALSE){
path <- file.path(path.expand("~"),"temp_object.RData")
save(..., list = list, file = path)
if(rm) rm(list = c(list,as.character(substitute(alist(...))[-1])),
envir = parent.frame())
invisible(NULL)
}
loot <- function(rm = TRUE){
path <- file.path(path.expand("~"),"temp_object.RData")
if(file.exists(path)){
load(path,envir = parent.frame())
if(rm) file.remove(path)
} else {
stop("nothing to loot!")
}
invisible(NULL)
}
test <- "abcd"
shoot(test)
rm(test)
loot() # in practice from another session
test
# [1] "abcd"
This is useful in my case when one RStudio session has a bug and I can't plot, so I can send the object to another session.
With a simple change to the default path, it can also be used on a network to pass data between colleagues, for example.
Thanks to @MrFlick for the suggestions.
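As a rough sketch of the "change the default path" idea, the path can be made an argument. The names shoot_to and loot_from are hypothetical variants of the functions above, not part of the original answer, and a temp file stands in here for a shared network location.

```r
# Hypothetical variants of shoot()/loot() with an explicit path.
shoot_to <- function(..., path) {
  caller <- parent.frame()  # look the objects up where shoot_to() was called
  save(..., file = path, envir = caller)
  invisible(path)
}

loot_from <- function(path, rm = TRUE) {
  caller <- parent.frame()
  if (!file.exists(path)) stop("nothing to loot!")
  load(path, envir = caller)
  if (rm) file.remove(path)
  invisible(NULL)
}

shared <- tempfile(fileext = ".RData")  # stand-in for a shared location
test_obj <- "abcd"
shoot_to(test_obj, path = shared)
rm(test_obj)
loot_from(shared)
test_obj
# [1] "abcd"
```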
I have a variable in my global environment called myList. I have a function that modifies myList and re-assigns it to the global environment called myFunction. I only want myList to be modified by myFunction. Is there a way to prevent any other function from modifying myList?
For background, I am building a general tool for R users. I don't want users of the tool to be able to define their own function to modify myList. I also don't want to be able to modify myList myself with a function I may write in the future.
I have a potential solution, but I don't like it. When the tool is executed, I could examine the text of every function defined by a user and search for the text that will assign myList to the global environment. I don't like the fact that I need to search over all functions.
Does anyone know if what I am looking for is implementable in R? Thanks for any help that can be provided.
As a reproducible example, I need code that will make the following possible:
assign('myList', list(), envir = globalenv())
myFunction <- function() {
myList <- c(myList, 'test')
assign('myList', myList, envir = globalenv())
}
userFunction <- function() {
myList <- c(myList, 'test')
assign('myList', myList, envir = globalenv())
}
myFunction() # I need some code that will allow this function to run successfully
userFunction() # and cause an error when this function runs
Sounds like you need the modules package.
Basically, each unit of code has its own scope.
e.g.
# install.packages("modules")
# Load library
library("modules")
# Create a basic module
m <- module({
.myList <- list()
myFunction <- function() {
.myList <<- c(.myList, 'test')
}
get <- function() .myList
})
# Accessor
m$get()
# list()
# Your function
m$myFunction()
# Modification
m$get()
# [[1]]
# [1] "test"
Note that we tweaked the example slightly by changing the variable name from myList to .myList. So we need to update that in userFunction():
userFunction <- function() {
.myList <- c(.myList, 'test')
}
Running this, we now get:
userFunction()
# Error in userFunction() : object '.myList' not found
As desired.
For more detailed examples see modules vignette.
The alternative is you can define an environment (new.env()) and then lock it after you have loaded myList.
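A minimal sketch of that alternative, assuming a dedicated environment named store (a made-up name): lockEnvironment with bindings = TRUE locks both the set of names and their values.

```r
store <- new.env()
assign("myList", list(), envir = store)

# Lock the environment and every binding currently in it.
lockEnvironment(store, bindings = TRUE)

# Any later attempt to change myList now raises an error.
res <- tryCatch({
  assign("myList", list("changed"), envir = store)
  "modified"
}, error = function(e) "blocked")

res
# [1] "blocked"
```

Once locked this way, the environment cannot be unlocked again from R code, although individual bindings can still be unlocked with unlockBinding.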
This is an all-around bad idea, from assigning into the global environment (I'd never use a package that does this) to surprising your users. You should probably just use S4 or reference classes.
Anyway, you can lock the bindings (or environment if you followed better practices). You wouldn't stop an advanced user with that, but they would at least know that you don't want them to change the object.
createLocked <- function(x, name, env) {
assign(name, x, envir = env)
lockBinding(name, env)
invisible(NULL)
}
createLocked(list(), "myList", globalenv())
myFunction <- function() {
unlockBinding("myList", globalenv())
myList <- c(myList, 'test')
assign('myList', myList, envir = globalenv())
lockBinding("myList", globalenv())
invisible(NULL)
}
userFunction <- function() {
myList <- c(myList, 'test')
assign('myList', myList, envir = globalenv())
}
myFunction() # runs successfully
userFunction()
#Error in assign("myList", myList, envir = globalenv()) :
# cannot change value of locked binding for 'myList'
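One possible refinement, sketched here with a dedicated environment instead of the global one: re-lock the binding in on.exit, so it is locked again even if the modification fails part-way. The names store and append_locked are hypothetical.

```r
store <- new.env()
assign("myList", list(), envir = store)
lockBinding("myList", store)

# Hypothetical updater: unlock, modify, and always re-lock on exit.
append_locked <- function(value, env = store) {
  unlockBinding("myList", env)
  on.exit(lockBinding("myList", env), add = TRUE)
  assign("myList", c(get("myList", envir = env), value), envir = env)
  invisible(NULL)
}

append_locked("test")
bindingIsLocked("myList", store)
# [1] TRUE
```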
I am fairly new to R
and will try my best to make myself understood.
Suppose I have an existing rdata file with multiple objects.
Now I want to add a data frame to it. How do I do that?
I tried the following:
write.data.loc <- 'users/Jim/Objects'
rdataPath <- 'users/Jim/Objects.Rda'
myFile<- read.csv("myFile.csv")
loadObjects <- load(rdataPath)
save(loadObjects,myFile,file=paste(write.data.loc,".Rda",sep=""))
But this does not seem to work?
I'm not certain of your actual use-case, but if you must "append" a new object to an rda file, here is one method. This tries to be clever by loading all of the objects from the rda file into a new environment (there are many tutorials and guides that discuss the use and relevance of environments, Hadley's "Advanced R" is one that does a good job, I think).
This first step loads all of the objects into a new (empty) environment. It's useful to use an otherwise-empty environment so that we can get all of the objects from it rather easily using ls.
e <- new.env(parent = emptyenv())
load("path/to/.rda", envir = e)
The object you want to add should be loaded into a variable within the environment. Note that the dollar-sign access looks the same as lists, which makes it both (1) easy to confuse the two, and (2) easy to understand the named indexing that $ provides.
e$myFile <- read.csv("yourFile.csv")
This last piece, re-saving the rda file, is an indirect method. The ls(envir = e) returns the variable names of all objects within the environment. This is good, because save can deal with objects or with their names.
do.call("save", c(ls(envir = e), list(envir = e, file = "newsave.rda")))
Realize that this is not technically appending the data.frame to the rda file; it overwrites the rda file with a new one that happens to contain all the previous objects plus the new data.frame.
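Here is the whole round trip as a self-contained sketch, using a temp file and made-up objects so it can be run anywhere. It passes the names through save's list argument instead of do.call, which should be equivalent for this purpose.

```r
rda <- tempfile(fileext = ".rda")

# An existing file with two objects.
a <- 1
b <- "two"
save(a, b, file = rda)

# Load into an otherwise-empty environment, add the new object, re-save.
e <- new.env(parent = emptyenv())
load(rda, envir = e)
e$myFile <- data.frame(x = 1:3)
save(list = ls(envir = e), envir = e, file = rda)

# Check what the file contains now.
check <- new.env()
sort(load(rda, envir = check))
# [1] "a"      "b"      "myFile"
```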
I wrote this solution, which can add data frames, matrices or lists. By default it keeps an existing object of the same name; pass overwrite = TRUE to replace it with the new one.
add_object_to_rda <- function(obj, rda_file, overwrite = FALSE) {
.dummy <- NULL
if (!file.exists(rda_file)) save(.dummy, file = rda_file)
old_e <- new.env()
new_e <- new.env()
load(file = rda_file, envir = old_e)
name_obj <- deparse(substitute(obj)) # get the name of the object
# new_e[[name_obj]] <- get(name_obj) # use this only outside a function
new_e[[name_obj]] <- obj
# merge object from old environment with the new environment
# ls(old_e) is a character vector of the object names
if (overwrite) {
# the new object takes precedence over a stored object of the same name
invisible(sapply(ls(new_e), function(x)
assign(x, get(x, envir = new_e), envir = old_e)))
# And finally we save the variables in the environment
save(list = ls(old_e), file = rda_file, envir = old_e)
}
else {
invisible(sapply(ls(old_e), function(x)
assign(x, get(x, envir = old_e), envir = new_e)))
# And finally we save the variables in the environment
save(list = ls(new_e), file = rda_file, envir = new_e)
}
}