Hide data in global environment - r

Is there a way to hide data in the global environment while still being able to access it?
For example, I have a long script of 100+ lines, and my global environment is getting messy; there is so much in it that it strains my brain to find what is necessary.
I have searched for similar questions, but the answers involve creating a package, and quite frankly I don't have time to learn that at the moment.

If you name all the objects you don't want to appear in the global environment starting with a dot (.), for example:
.foo <- 'bar'
then the object will be accessible but hidden in the global environment and in any plain ls() call:
> .foo <- 'bar'
> .foo
[1] "bar"
> ls()
character(0)
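Note that dot-prefixed objects are only hidden by default; ls() with all.names = TRUE still reveals them (continuing the session above):
> ls(all.names = TRUE)
[1] ".foo"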

Possible solutions would be:
remove objects once they are no longer needed
put related variables into a list (hello lapply, sapply)
move them into a separate custom environment with new.env() (see the sketch after this list)
simplify the script so it does not use as many objects
run the script in batch mode and skip looking at the environment altogether
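For illustration, a minimal sketch of the new.env() approach (the object names and toy data here are made up):
scratch <- new.env()                 # holds intermediate objects
scratch$raw <- data.frame(x = 1:10)  # stand-in for real data
scratch$clean <- subset(scratch$raw, x > 5)
ls()            # only "scratch" appears in the global environment
ls(scratch)     # "clean" "raw"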

This sounds like an XY problem; 100 lines is quite small, and it's likely that you are either using too many temporary variables or numbering objects that should be in a list.
You also don't mention why you don't like your environment to be "messy"; my guess is that you don't like the output of ls()?
If so, you may be happy to learn about the pattern argument of ls(), which allows you to filter the result. It is mostly useful with prefixes or suffixes, as in the following examples:
something <- 1
some_var <- 2
another_var <- 3
ls(pattern = "^some")
#> [1] "some_var" "something"
ls(pattern = "var$")
#> [1] "another_var" "some_var"
Created on 2019-11-17 by the reprex package (v0.3.0)

Related

I want to have a list of all the functions in my R workspace that works like the objects() command

I love working with R functions, so my workspace accumulates a lot of functions.
However, the "objects()" command seems to return strings that name my objects instead of the objects themselves. So when I have a function named "barchart00", it shows up with the objects() command, and if I test its type, it is detectable as a function, as the following code shows:
> is.function(barchart00)
[1] TRUE
> objects()
[1] "barchart00"
> OL <- objects()
> OL
[1] "barchart00"
> is.function(OL[1])
[1] FALSE
This wouldn't be a problem if I had just one or two or three functions. But in practice I have dozens of functions, AND dozens of objects that are not functions, and I want to get a list of functions that's just as convenient as the list of objects returned by objects().
My first thought was that if objects() returned a list of the actual objects, I could just go through that list and test each one for function status. But in fact, objects() returns a character vector of the names of my objects, not the objects themselves.
Any constructive advice would be greatly appreciated. Thanks.
Hong Ooi answered the question, but I can't mark it as answered for another eight hours.
lsf.str() is the syntax I was looking for.
All credit should go to Hong Ooi.
https://stackoverflow.com/users/474349/hong-ooi
Thanks, Hong Ooi.
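For example, in a workspace containing both a function and a data object, lsf.str() reports only the function:
barchart00 <- function(x) barplot(table(x))
dat <- data.frame(a = 1:3)
lsf.str()
## barchart00 : function (x)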
lsf.str looks like a fine answer. If you'd like a more general tool, here's one from my (horn-tooting here) cgwtools package. You can get a list of any particular type of object in your environment (not just closures).
lstype <- function(type = 'closure') {
  # simple command to get only one type of object in the current environment
  # Note: if you foolishly create variables named 'c', 'q', 't' or the like,
  # this will fail because typeof finds the builtin function first
  inlist <- ls(.GlobalEnv)
  if (type == 'function') type <- 'closure'
  typelist <- sapply(sapply(inlist, get), typeof)
  return(names(typelist[typelist == type]))
}
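Example usage, with a couple of throwaway objects:
sq <- function(n) n^2
vals <- 1:10
lstype()            # "lstype" "sq" (all closures in the workspace)
lstype('integer')   # "vals"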

Examining contents of .rdata file by attaching into a new environment - possible?

I am interested in listing objects in an RDATA file and loading only selected objects, rather than the whole set (in case some may be big or may already exist in the environment). I'm not quite clear on how to do this when there are conflicts in names, as attach() doesn't work as nicely.
1: For examining the contents of an R data file without loading it: this question is similar to, but different from, the one asked at listing contents of an R data file without loading
In that case, the solution offered was:
attach(filename)
ls(pos = 2)
detach()
If there are naming conflicts between objects in the file and those in the global environment, this warning appears:
The following object(s) are masked _by_ '.GlobalEnv':
I tried creating a new environment, but I cannot seem to attach into that.
For instance, this produces the same error:
lsfile <- function(filename) {
  tmpEnv <- new.env()
  evalq(attach(filename), envir = tmpEnv)
  tmpls <- ls(pos = 2)
  detach()
  return(tmpls)
}
lsfile(filename)
Maybe I've made a mess of things with evalq (or eval). Is there some other way to avoid the naming conflict?
2: If I want to access an object and there are no naming conflicts, I can just work with the one from the .rdat file, or copy it to a new name. If there are conflicts, how does one access the object in the file's namespace?
For instance, if my file is "sample.rdat", and the object is surveyData, and a surveyData object already exists in the global environment, then how can I access the one from the file:sample.rdat namespace?
I currently solve this problem by loading everything into a temporary environment, and then copy out what's needed, but this is inefficient.
Since this question has just been referenced, let's clarify two things:
attach() simply calls load() so there is really no point in using it instead of load
if you want selective access to prevent masking it's much easier to simply load the file into a new environment:
e = local({load("foo.RData"); environment()})
You can then use ls(e) and access contents like e$x. You can still use attach on the environment if you really want it on the search path.
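A quick illustration of that pattern:
x <- 1:5
save(x, file = "foo.RData")
x <- "masked"                                   # the name is now reused
e <- local({load("foo.RData"); environment()})
e$x   # the saved copy: 1 2 3 4 5
x     # the global copy is untouched: "masked"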
FWIW .RData files have no index (the objects are stored in one big pairlist), so you can't list the contained objects without loading the file. If you want convenient access, convert it to the lazy-load format instead, which simply adds an index so that each object can be loaded separately (see Get specific object from Rdata file).
I just use an env= argument to load():
> x <- 1; y <- 2; z <- "foo"
> save(x, y, z, file="/tmp/foo.RData")
> ne <- new.env()
> load(file="/tmp/foo.RData", env=ne)
> ls(env=ne)
[1] "x" "y" "z"
> ne$z
[1] "foo"
The cost of this approach is that you do read the whole RData file; on the other hand, that seems unavoidable anyway, as no other method appears to offer a list of the 'contents' of such a file.
You can suppress the warning by setting warn.conflicts=FALSE on the call to attach. If an object is masked by one in the global environment, you can use get to retrieve it from your attached data.
x <- 1:10
save(x, file="x.rData")
#attach("x.rData", pos=2, warn.conflicts=FALSE)
attach("x.rData", pos=2)
(x <- 1)
# [1] 1
(x <- get("x", pos=2))
# [1] 1 2 3 4 5 6 7 8 9 10
Thanks to #Dirk and #Joshua.
I had an epiphany. The foreach package, with an SMP or MC backend, seems to produce environments that only inherit from, but do not conflict with, the global environment.
# requires the foreach package and a registered parallel backend
lsfile <- function(list_files) {
  aggregate_ls <- foreach(ix = seq_along(list_files)) %dopar% {
    attach(list_files[ix])
    tmpls <- ls(pos = 2)
    tmpls
  }
  return(aggregate_ls)
}
lsfile("f1.rdat")
lsfile(dir(pattern = "*rdat"))
This is useful to me because I can now parallelize it. This is a bare-bones version, and I will modify it to give more detailed information, but so far it seems to be the only way to avoid conflicts, even without suppressing the warnings.
So, question #1 can be resolved by either ignoring the warnings (as #Joshua suggested) or by using whatever magic foreach summons.
For part 2, loading an object, I think #Joshua has the right idea - "get" will do.
The foreach magic can also work, by using the .noexport option. However, this has risks: whatever isn't specifically excluded will be inherited/exported from the global environment (I could do ls(), but there's always the possibility of attached datasets). For safety, this means that get() must still be used to avoid the risk of a naming conflict. Loading into a subenvironment avoids the naming conflict, but doesn't avoid the loading of unnecessary objects.
#Joshua's answer is far simpler than my foreach detour.

Cleaning up the workspace "hiding" objects [duplicate]

It's getting to the point where I have about 100 or so personal functions that I use for line by line data analysis. I generally use f.<mnemonic> nomenclature for my functions, but I'm finding that they're starting to get in the way of my work. Is there any way to hide them from the workspace? Such that ls() doesn't show them, but I can still use them?
If you have that many functions which you use on a repeated basis, consider putting them into a package. They can then live in their own namespace, which removes ls() clutter and also allows you to remove the f. prefix.
You can also put the function definitions into a separate environment, and then attach() that environment. (This is similar to Hong Ooi's suggestion, without the added step of making that into a loadable package.) I have this code in my .Rprofile file to set up some utility functions I commonly use:
my.fns <- new.env()   # the environment must exist before local() can use it
local(env = my.fns, { # all variables created below go into this env
  foo <- function(bar) {
    # whatever foo does
  }
  # put as many function definitions here as you want
})
attach(my.fns)
All the functions inside my.fns are now available at the commandline, but the only thing that shows up in ls() is my.fns itself.
Try this to leave out the "f-dots":
fless <- function() ls(envir = .GlobalEnv)[!grepl("^f\\.", ls(envir = .GlobalEnv))]
The ls() function looks at objects in an environment. If you only used (as I initially did):
fless <- function() ls()[!grepl("^f\\.", ls())]
you get ... nothing, because ls() inside a function lists that function's own (empty) local environment. Adding .GlobalEnv moves the focus of ls() out to the usual workspace. The indexing is pretty straightforward: you are removing (with the ! operator) anything that starts with "f.", and since "." is a special character in regex patterns it must be escaped, ... and since "\" is also a special character, the escape needs to be doubled.
A couple of options not already mentioned are
objects with names beginning with . are not shown by ls() (by default; you can turn this on with argument all.names = TRUE in the ls() call), so you could rename everything to .f.<mnemonic> in the source files.
In a similar vein to #Aaron's answer, but using sys.source() to source directly into an environment.
An example using sys.source() is shown below:
env <- attach(NULL, name = "myenv")
for (f in fnames) sys.source(f, envir = env)
where fnames is a character vector of file names/paths from which to read your functions (sys.source() takes one file at a time, hence the loop).
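For instance, fnames could be built like this (the directory path is a placeholder):
fnames <- list.files("~/R/functions", pattern = "\\.R$", full.names = TRUE)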

hiding personal functions in R

I have a few convenience functions in my .Rprofile, such as this handy function for returning the size of objects in memory. Sometimes I like to clean out my workspace without restarting, and I do this with rm(list=ls()), which deletes all my user-created objects AND my custom functions. I'd really like to not blow up my custom functions.
One way around this seems to be creating a package with my custom functions so that my functions end up in their own namespace. That's not particularly hard, but is there an easier way to ensure custom functions don't get killed by rm()?
Combine attach and sys.source to source into an environment and attach that environment. Here I have two functions in file my_fun.R:
foo <- function(x) {
  mean(x)
}

bar <- function(x) {
  sd(x)
}
Before I load these functions, they are obviously not found:
> foo(1:10)
Error: could not find function "foo"
> bar(1:10)
Error: could not find function "bar"
Create an environment and source the file into it:
> myEnv <- new.env()
> sys.source("my_fun.R", envir = myEnv)
Still not visible, as we haven't attached anything:
> foo(1:10)
Error: could not find function "foo"
> bar(1:10)
Error: could not find function "bar"
and when we do so, they are visible; because we have attached a copy of the environment to the search path, the functions survive being rm()-ed:
> attach(myEnv)
> foo(1:10)
[1] 5.5
> bar(1:10)
[1] 3.027650
> rm(list = ls())
> foo(1:10)
[1] 5.5
I still think you would be better off with your own personal package, but the above might suffice in the meantime. Just remember the copy on the search path is just that, a copy. If the functions are fairly stable and you're not editing them then the above might be useful but it is probably more hassle than it is worth if you are developing the functions and modifying them.
A second option is to just name them all .foo rather than foo as ls() will not return objects named like that unless argument all = TRUE is set:
> .foo <- function(x) mean(x)
> ls()
character(0)
> ls(all = TRUE)
[1] ".foo" ".Random.seed"
Here are two ways:
1) Have each of your function names start with a dot, e.g. .f instead of f. ls will not list such functions unless you use ls(all.names = TRUE), so they won't be passed to your rm command.
or,
2) Put this in your .Rprofile
attach(list(
  f = function(x) x,
  g = function(x) x*x
), name = "MyFunctions")
The functions will appear as a component named "MyFunctions" on your search list rather than in your workspace, and they will be accessible almost exactly as if they were in your workspace. search() will display your search list, and ls("MyFunctions") will list the names of the functions you attached. Since they are not in your workspace, your usual rm command won't remove them. If you do wish to remove them, use detach("MyFunctions").
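For example, with the attach() call above in place:
ls("MyFunctions")
## [1] "f" "g"
f(3)
## [1] 3
detach("MyFunctions")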
Gavin's answer is wonderful, and I just upvoted it. Merely for completeness, let me toss in another one:
R> q("no")
followed by
M-x R
to create a new session---which re-reads the .Rprofile. Easy, fast, and cheap.
Other than that, private packages are the way in my book.
Another alternative: keep the functions in a separate file which is sourced within .RProfile. You can re-source the contents directly from within R at your leisure.
I find that often my R environment gets cluttered with various objects when I'm creating or debugging a function. I wanted a way to efficiently keep the environment free of these objects while retaining personal functions.
The simple function below was my solution. It does 2 things:
1) deletes all non-function objects that do not begin with a capital letter and then
2) saves the environment as an RData file
(requires the R.oo package)
cleanup <- function(filename = "C:/mymainR.RData") {
  library(R.oo)
  # create a data frame listing all personal objects
  everything <- ll(envir = 1)
  # get the objects that are not functions
  nonfunction <- as.vector(everything[everything$data.class != "function", 1])
  # non-function objects that do not begin with a capital letter should be deleted
  trash <- grep("^[[:upper:]]", nonfunction, value = TRUE, invert = TRUE)
  remove(list = trash, pos = 1)
  # save the R environment
  save.image(filename)
  print(paste("New, CLEAN R environment saved in", filename))
}
In order to use this function, 3 rules must always be followed:
1) Keep all data external to R.
2) Use names that begin with a capital letter for non-function objects that I want to keep permanently available.
3) Obsolete functions must be removed manually with rm.
Obviously this isn't a general solution for everyone...and potentially disastrous if you don't live by rules #1 and #2. But it does have numerous advantages: a) fear of my data getting nuked by cleanup() keeps me disciplined about using R exclusively as a processor and not a database, b) my main R environment is so small I can backup as an email attachment, c) new functions are automatically saved (I don't have to manually manage a list of personal functions) and d) all modifications to preexisting functions are retained. Of course the best advantage is the most obvious one...I don't have to spend time doing ls() and reviewing objects to decide whether they should be rm'd.
Even if you don't care for the specifics of my system, the "ll" function in R.oo is very useful for this kind of thing. It can be used to implement just about any set of clean up rules that fit your personal programming style.
Patrick Mohr
An nth, quick-and-dirty option would be to use lsf.str() when calling rm(), to exclude all the functions in the current workspace ... and it lets you name the functions as you wish.
pattern <- paste0('*', lsf.str(), '$', collapse = "|")
rm(list = ls()[-grep(pattern, ls())])
I agree, it may not be the best practice, but it gets the job done! (and I have to selectively clean after myself anyway...)
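An equivalent formulation of the same idea, without the regular expression (a sketch, not part of the original answer):
rm(list = setdiff(ls(), lsf.str()))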
Similar to Gavin's answer, the following loads a file of functions without leaving an extra environment object lying around:
if ('my_namespace' %in% search()) detach('my_namespace')
source('my_functions.R', attach(NULL, name = 'my_namespace'))
This removes the old version of the namespace if it was attached (useful during development), then attaches an empty new environment called my_namespace and sources my_functions.R into it. If you don't remove the old version, you will build up multiple attached environments with the same name.
Should you wish to see which functions have been loaded, look at the output of
ls('my_namespace')
To unload, use
detach('my_namespace')
These attached functions, like a package, will not be deleted by rm(list=ls()).

Protecting function names in R

Is it possible in R to protect function names (or variables in general) so that they cannot be masked?
I recently spotted that this can be a problem when creating a data frame with the name "new", which masked a function used by lmer and thus stopped it working. (Recovery is easy once you know what the problem is; here rm(new) did it.)
There is an easy workaround for your problem, without worrying about protecting variable names (though playing with lockBinding does look fun). If a function becomes masked, as in your example, it is still possible to call the masked version, with the help of the :: operator.
In general, the syntax is packagename::variablename.
(If the function you want has not been exported from the package, then you need three colons instead, :::. This shouldn't apply in this case however.)
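For instance (the choice of masked function here is illustrative):
filter <- data.frame(x = 1)     # masks stats::filter
stats::filter(1:5, rep(1, 2))   # the package version remains reachable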
Maybe use environments! This is a great way to separate namespaces. For example:
> a <- new.env()
> assign('printer', function(x) print(x), envir=a)
> get('printer', envir=a)('test!')
[1] "test!"
#hdallazuanna recommends (via Twitter):
new <- 1
lockBinding('new', globalenv())
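Any subsequent attempt to reassign then fails:
new <- 2
## Error: cannot change value of locked binding for 'new'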
This makes sense when the variable is user-created, but it does not, of course, prevent a package function from being overwritten.
I had the reverse problem from the OP: I wanted to prevent my custom functions in .Rprofile from being overridden when I defined a variable with the same name as a function. I ended up putting my functions in ~/.R.R and adding these lines to .Rprofile:
if("myfuns"%in%search())detach("myfuns")
source("~/.R.R",attach(NULL,name="myfuns"))
From the help page of attach:
One useful ‘trick’ is to use ‘what = NULL’ (or equivalently a
length-zero list) to create a new environment on the search path
into which objects can be assigned by assign or load or
sys.source.
...
## create an environment on the search path and populate it
sys.source("myfuns.R", envir = attach(NULL, name = "myfuns"))
