Function hiding in R

Consider the following file.r:
foo = function(){}
bar = function(){}
useful = function() {foo(); bar()}
foo and bar are meant only for internal use by useful: they are not reusable at all, because they require a very specific data layout, have embedded constants, do something obscure that no one else will ever need, etc.
I don't want to define them inside useful, because then it would become too long (>10 LOC).
A client could do the following to import only useful into their namespace, though even then I am not sure whether it would work with foo and bar out of sight:
# Source a single function from a source file.
# Example use
# max.a.posteriori <- source1( "file.r","useful" )
source1 <- function( path, fun )
{
source( path, local=TRUE )
get( fun )
}
How can I properly do this on the file.r side i.e. export only specific functions?
Furthermore, there is the problem of ordering of functions, which I feel is related to the above. Let us have
douglas = function() { adams() }
adams = function() { douglas() }
How do I handle circular dependencies?
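(A note on the ordering question: R resolves the names inside a function body at call time, not at definition time, so mutually recursive definitions work in either order, as long as both exist before the first call. A minimal sketch, with a termination condition added so the recursion actually stops:)

```r
# Definition order does not matter: names are looked up when the call happens.
douglas <- function(n) if (n > 0) adams(n - 1) else "done"
adams   <- function(n) douglas(n)  # refers back to douglas

douglas(3)  # "done" -- both names exist by the time anything is called
```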

You can achieve this by setting the binding environment of your useful function, as in the code listed below. This is similar to what packages do and if your project gets bigger, I would really recommend creating a package using the great devtools package.
If the functions foo and bar are not used by any other functions, I would just define them inside useful. Since they are quite independent pieces of code, this does not make the code harder to understand, even if the line count of useful increases (unless, of course, some guideline forces you to keep the line count short).
For more on environments see: http://adv-r.had.co.nz/Environments.html
# define new environment
myenv <- new.env()
# define functions in this environment
myenv$foo <- function(){}
myenv$bar <- function(){}
# define useful in global environment
useful <- function(){
foo()
bar()
}
# useful does not find the called functions so far
useful()
# neither can they be found in the globalenv
foo()
# but of course in myenv
myenv$foo()
# set the binding environment of useful to myenv
environment(useful) <- myenv
# everything works now
useful()
foo()

My recommendation is to use packages. They were created for exactly such situations. Even so, you cannot truly hide the functions themselves in plain R.
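Short of a full package, a local() block gives a lightweight form of hiding: the helpers live only in the environment that the returned function closes over, equivalent to the new.env() approach above. A minimal sketch (the helper bodies are placeholders):

```r
# foo and bar exist only in the environment that useful closes over
useful <- local({
  foo <- function() "foo result"
  bar <- function() "bar result"
  function() c(foo(), bar())
})

useful()        # finds foo and bar through its enclosing environment
exists("foo")   # FALSE in a fresh session: foo is not in the global env
```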

In order to encapsulate foo and bar you need to implement a class. The easiest way to do that in R, in my opinion, is through R6 classes: https://cran.r-project.org/web/packages/R6/vignettes/Introduction.html#private-members. There you can find an example of how to hide the length function.
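For illustration, a minimal sketch of the private-member pattern (the class name and helper bodies here are hypothetical, not taken from the vignette):

```r
library(R6)  # assumes the R6 package is installed

Useful <- R6Class("Useful",
  public = list(
    run = function() c(private$foo(), private$bar())
  ),
  private = list(
    foo = function() "internal foo",
    bar = function() "internal bar"
  )
)

u <- Useful$new()
u$run()   # works: the public method reaches the helpers via private$
u$foo     # NULL: private members are not visible from outside
```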

Related

How can I source specific functions in an R script?

I have a script with my most commonly used functions which I source at the top of most scripts. Sometimes I only want to get one of the functions in that script, but I don't know how to indicate that I only want one specific function. I'm looking for a function that is similar to the :: used to get a function inside a package. A reproducible example:
# file a.R
foo <- function() cat("Hello!\n")
bar <- function() cat("Goodbye!\n")
# End of file a.R
# file b.R
# Can't just delete all functions
fun <- function(x) print(x)
fun("It's so late!")
source("a.R")
foo()
fun("See you next time")
# End of file
I read the "source" help and it was unhelpful to me. The solution I currently have is to assign a variable at the start of the script with the functions loaded before, then set the difference with what was there after:
list_before <- lsf.str()
# content of file b.R
new_funcs <- setdiff(lsf.str(),list_before)
Then I can use rm(list=new_funcs[-1]) to keep only the function I wanted. This is, however, a very convoluted way of doing it, and I was hoping to find an easier solution.
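(For comparison, the same effect can be had without any bookkeeping by sourcing the file into a throwaway environment and pulling out just the function you want; sys.source() keeps everything out of the global environment. A sketch, with a made-up helper name:)

```r
# Source a file into a private environment and return one function from it.
import_one <- function(path, name) {
  e <- new.env(parent = globalenv())
  sys.source(path, envir = e)
  get(name, envir = e)
}

# foo <- import_one("a.R", "foo")
# Note: the returned function keeps e as its environment, so it can still
# call the other helpers defined in a.R even though they stay hidden.
```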
A good way would be to write a package but it requires more knowledge (not there myself).
A good alternative I found is the box package, which allows you to import functions from an R script as a module.
You can import all functions or specific functions.
To set up a function as a module, you would use the roxygen2 documentation syntax as such:
#' This is a function to calculate a sum
#' @export
my_sum <- function(x, y){
x + y
}
#' This is a function to calculate a difference
#' @export
my_diff <- function(x, y){
x - y
}
Save the file as an R script "my_module.R"
The @export tag in the roxygen2 comment tells box that the function is part of the module's interface. Then you can call box to reach a specific function in the module named "my_module".
Let's say your project directory has a script folder that contains your scripts and modules, you would import functions as such:
box::use(script/my_module)
my_module$my_sum(x, y)
box::use() creates an environment that contains all the functions found inside the module.
You can also import single functions, as follows. Let's assume your directory is a bit more complex as well, with modules inside a box folder inside script.
box::use(./script/box/my_module[my_sum])
my_sum(x, y)
You can use box to fetch functions from packages as well. In a sense, it is better than calling library(), which imports all of a package's functions.
Using box, you can organize scripts by objective or whatever organization you have in place.
I have a script for dealing with strings, from which I fetch functions that work with strings.
I have a script for the plot functions I use in my projects, etc.
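The same bracket syntax works for installed packages. For example, assuming stringr is installed:

```r
box::use(stringr[str_pad])          # attach only str_pad, not the whole package
str_pad("7", width = 3, pad = "0")  # "007"
```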
insertSource() would help.
In your example, let's presume we need to import foo() from a.R :
# file b.R
foo <- function(){}
insertSource("a.R", functions = "foo", force = TRUE)
foo <- foo@.Data

How to check if a function has been called from the console?

I am trying to track the number of times certain functions are called from the console.
My plan is to add a simple function such as trackFunction to each function, which can check whether it was called from the console or from another function.
Even though the problem sounds straightforward, I can't find a good solution, as my knowledge of functional programming is limited. I've been looking at the call stack and rlang::trace_back, but without finding a good solution.
Any help is appreciated.
Thanks
A simple approach would be to see on which level the current frame lies. That is, if a function is called directly in the interpreter, then sys.nframe() returns 1, otherwise 2 or higher.
Related:
Rscript detect if R script is being called/sourced from another script
myfunc <- function(...) {
if (sys.nframe() == 1) {
message("called from the console")
} else {
message("called from elsewhere")
}
}
myfunc()
# called from the console
g <- function() myfunc()
g()
# called from elsewhere
Unfortunately, this may not always be intuitive:
ign <- lapply(1, myfunc)
# called from elsewhere
for (ign in 1) myfunc()
# called from the console
While the lapply family and for loops are similar for many purposes, they behave differently here. If this is a problem, perhaps the only way to mitigate it is to analyze/parse the call stack and "ignore" certain functions. If this is what you need, then perhaps this is more appropriate:
R How to check that a custom function is called within a specific function from a certain package
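As a rough sketch of that call-stack approach (the list of wrappers to ignore is made up and almost certainly incomplete; lapply, for example, invokes its function argument under the name FUN):

```r
# Treat a call as "from the console" if every enclosing frame is a
# known iteration wrapper; anything else counts as "from elsewhere".
called_from_console <- function(ignore = c("lapply", "sapply", "vapply", "FUN")) {
  calls <- sys.calls()
  # drop the frames for this helper itself and for the function that asked
  outer <- calls[seq_len(max(length(calls) - 2, 0))]
  heads <- vapply(outer, function(cl) deparse(cl[[1]])[1], character(1))
  all(heads %in% ignore)
}
```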

Keep user-defined functions in global environment, during removal of objects

Question: How can I control the deletion (and saving) of user-defined function?
What I have tried so far:
I've gotten a recommendation to add a dot (.) at the beginning of every function name, being told that such functions would not be deleted. When tested, the functions are deleted despite starting with a dot.
Requirements:
All non-function objects should be handled by rm.
Due to automation, the procedure needs to be triggerable from base R in a terminal. It is not enough that the solution works only in RStudio.
Global environment to be used, due to keeping the solution standardized.
If possible, one should be able to define which function to keep/delete.
Expected outcome:
None of the functions in the example should be deleted.
Below you find the example code:
# Create 3 object variables.
a <- 1
b <- 2
c <- 3
# Create 3 functions.
myFunction1 <- function() {}
myFunction2 <- function() {}
myFunction3 <- function() {}
# Remove all from global.env.
# Keep the ones specified below.
rm(list = ls()[! ls() %in% c(
  "a",
  "c"
)])
You can use ls.str to specify a mode of object to find. With this you can exclude functions from the rm list.
rm(list=setdiff(ls(),ls.str(mode="function")))
ls()
[1] "myFunction1" "myFunction2" "myFunction3"
However, you might be better off formalising your functions in a package and then you would not need to worry about deleting them with rm.
I strongly recommend a different approach. Don’t partially remove objects, use proper scope instead. That is, don’t define objects in the global environment that don’t need to be defined there, define them inside functions or local scopes instead.
Going one step further, your functions.r file also shouldn’t define functions in the global environment. Instead, as suggested in a comment, it should define them inside a dedicated environment which you may attach, if convenient. This is in fact what R packages solve. If you feel that R packages are too heavy for your purpose, I suggest you write modules using my ‘box’ package: it cleanly implements file-based code modules.
If you use scoping as it was designed, there’s no need to call rm on temporary variables, and hence your problem won’t arise.
If you really want a clean slate, restart R and re-execute your script: this is the only way to consistently reset the state of the R session; all other ways are error-prone hacks because they only perform a partial cleanup.
A note on what you wrote:
When tested, the functions are deleted despite starting with a dot.
They’re not — they’re just invisible; that’s what the leading dot does. However, this recommendation also strikes me as bad practice: it’s an unnecessary hack.
Easy. Don't use the global environment.
myenv <- new.env()
with(myenv,
{
# Create 3 object variables.
a <- 1
b <- 2
c <- 3
}
)
myenv$a
#[1] 1
# Create 3 functions.
myFunction1 <- function() {}
myFunction2 <- function() {}
myFunction3 <- function() {}
# Remove all from env.
# Keep the ones specified below.
rm(list = ls(envir = myenv)[! ls(envir = myenv) %in% c(
"a",
"c"
)
], envir = myenv
)
ls(envir = myenv)
#[1] "a" "c"

When does a package need to use ::: for its own objects

Consider this R package with two functions, one exported and the other internal
hello.R
#' @export
hello <- function() {
internalFunctions:::hello_internal()
}
hello_internal.R
hello_internal <- function(x){
print("hello world")
}
NAMESPACE
# Generated by roxygen2 (4.1.1): do not edit by hand
export(hello)
When this is checked (devtools::check()) it returns the NOTE
There are ::: calls to the package's namespace in its code. A package
almost never needs to use ::: for its own objects:
‘hello_internal’
Question
Given the NOTE says almost never, under what circumstances will a package need to use ::: for its own objects?
Extra
I have a very similar related question where I do require the ::: for an internal function, but I don't know why it's required. Hopefully having an answer to this one will solve that one. I have a suspicion that unlocking the environment is doing something I'm not expecting, and thus having to use ::: on an internal function.
If they are considered duplicates of each other I'll delete the other one.
You should never need this in ordinary circumstances. You may need it if you are calling the parent function in an unusual way (for example, you've manually changed its environment, or you're calling it from another process where the package isn't attached).
Here is a pseudo-code example, where I think using ::: is the only viable solution:
# R-package with an internal function FInternal() that is called in a foreach loop
FInternal <- function(i) {...}
#' Exported function containing a foreach loop
#' @export
ParallelLoop <- function(is, <other-variables>) {
foreach(i = is) %dopar% {
# This fails, because it cannot locate FInternal, unless it is exported.
FInternal(i)
# This works but causes a note:
PackageName:::FInternal(i)
}
}
I think the problem here is that the body of the foreach loop is not defined as a function of the package. Hence, when executed on a worker process, it is not treated as code belonging to the package and does not have access to the package's internal objects. I would be glad if someone could suggest an elegant solution for this specific case.
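One workaround that avoids both the NOTE and exporting FInternal (a sketch, not tested against every parallel backend): bind the internal function to a local variable inside the exported function. foreach's automatic variable export then ships that binding to the workers along with the loop body, so neither ::: nor export() is needed:

```r
# assumes the foreach package; %dopar% made available at the top level
`%dopar%` <- foreach::`%dopar%`

FInternal <- function(i) i^2   # internal helper, not exported

ParallelLoop <- function(is) {
  f_internal <- FInternal      # local binding, captured with the loop body
  foreach::foreach(i = is) %dopar% {
    f_internal(i)              # resolved through the closure, no ::: needed
  }
}
```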

Determining current environment, or maybe I'm going about this all wrong?

I'm attempting to use environments to keep specialized constants out of the global namespace and for potentially masking each other. This is resulting in a slew of warnings along the lines of The following object(s) are masked from ....
I have:
foo <- new.env()
with(foo, {
  # Define variables pertaining to foo.
})
bar <- new.env()
with(bar, {
  # Define variables pertaining to bar.
})
Now it gets interesting. I have various functions that need to access the items in foo and bar. For example:
fooFunc1 <- function (args) {
attach(foo)
on.exit(detach(foo))
## Do foo things.
fooFunc2()
}
Now, fooFunc2 is defined similarly, with an attach() statement at the top. This results in a warning that everything defined in foo has been masked. Which makes sense, because we're already in foo. The answer would appear to be for each function to check whether it's already in the correct environment and only attach() if not. But I'm not seeing a way to name an environment to work with environmentName().
So how do people actually effect encapsulation and hiding in R? Having to type foo$fooVar1, foo$fooVar2, etc. seems absurd. Same with wrapping every statement in with(). What am I missing?
You could use some thing like:
if (!"foo" %in% search()) {attach(foo); on.exit(detach(foo))}
Or alternatively, use local:
fooFunc1 <- local(function(args) {
  ## Do foo things
  fooFunc2()
}, envir = foo)
I would use with again. For example:
foo <- new.env()
with(foo,{x=1;y=2})
fooFunc1 <- function(){
xx <- with(foo,{
x^2+1/2
})
}
You could just turn off the conflict warnings with attach(foo, warn.conflicts=FALSE). Alternatively, if you want to keep redundancies out of your searchpath, you could do something like this instead:
try(detach(foo), silent=TRUE)
attach(foo)
on.exit(try(detach(foo), silent=TRUE))
I think the best way, though, is to define the functions with the environment you want to run them in.
f <- function(...) {print(...)}
environment(f) <- foo
or equivalently,
f <- local({
  function(...) {print(...)}
}, envir = foo)
Functions in R are all closures, meaning they're all bundled with a reference to the environment that they are supposed to run in. By default each function's environment is the environment in which it is created, but you can change this using the environment or local functions to any environment you want.
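A tiny demonstration of that last point:

```r
foo <- new.env()
foo$secret <- 42

f <- function() secret   # 'secret' is a free variable here
environment(f) <- foo    # free variables now resolve in foo
f()                      # 42
```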
