How to check if a function has been called from the console? - r

I am trying to track the number of times certain functions are called from the console.
My plan is to add a simple function such as "trackFunction" in each function that can check whether they have been called from the console or as underlying functions.
Even though the problem sounds straightforward, I can't find a good solution, as my knowledge of function programming is limited. I've been looking at the call stack and rlang::trace_back, but without finding a good solution.
Any help is appreciated.
Thanks

A simple approach would be to see on which level the current frame lies. That is, if a function is called directly in the interpreter, then sys.nframe() returns 1, otherwise 2 or higher.
Related:
Rscript detect if R script is being called/sourced from another script
myfunc <- function(...) {
if (sys.nframe() == 1) {
message("called from the console")
} else {
message("called from elsewhere")
}
}
myfunc()
# called from the console
g <- function() myfunc()
g()
# called from elsewhere
Unfortunately, this may not always be intuitive:
ign <- lapply(1, myfunc)
# called from elsewhere
for (ign in 1) myfunc()
# called from the console
While for many things the lapply-family and for loops are similar, they behave differently here: lapply adds frames of its own to the call stack, whereas a for loop does not. If this is a problem, perhaps the only way to mitigate it is to analyze/parse the call stack and "ignore" certain functions. If this is what you need, then perhaps this is more appropriate:
R How to check that a custom function is called within a specific function from a certain package
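As a rough sketch of the "ignore certain functions" idea, one could walk sys.calls() and treat the call as direct when every caller on the stack is in an ignore list. The helper name called_directly and the default ignore list are assumptions made for illustration, not an established API, and the list would need extending for other wrappers (Map, purrr, do.call, and so on):

```r
# Hypothetical helper: TRUE when every function above the checked function
# on the call stack is listed in `ignore` (an empty stack counts as direct).
called_directly <- function(ignore = c("lapply", "sapply", "vapply", "FUN")) {
  calls <- as.list(sys.calls())
  # Drop the last two frames: called_directly() itself and the function being checked.
  callers <- calls[seq_len(max(length(calls) - 2L, 0L))]
  # Name of the function invoked in each remaining frame.
  caller_names <- vapply(callers, function(cl) deparse(cl[[1L]])[1L], character(1))
  all(caller_names %in% ignore)
}

myfunc <- function(...) {
  if (called_directly()) {
    message("called from the console")
  } else {
    message("called from elsewhere")
  }
  invisible(NULL)
}

g <- function() myfunc()
```

At an interactive prompt, myfunc() and ign <- lapply(1, myfunc) should both report the console, while g() reports "elsewhere". Under source() or knitr the extra frames of those tools would have to be added to the ignore list as well.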

Related

R: Packaging a function with trace

We have some internal R packages with a very large number of functions. As part of an effort to eliminate unused code I looked into covr and codetools::checkUsage and both are insufficient - so we opted to hook all functions with trace that would record activity somewhere. Toy example with no technical details:
> f <- function() { print("Doing very very important work") }
> trace(f, tracer=substitute(print("recording call")))
[1] "f"
> f()
Tracing f() on entry
[1] "recording call"
[1] "Doing very very important work"
The tracer operation does not significantly delay the work, but tracing all package functions (~35K) takes ~3 minutes, and I'm looking for ways to shorten it.
Is there some way to package the functions with the trace, so it won't have to be added in a separate post-load stage? Is there another direction I didn't think of?
You can put the trace() calls into the source for your package. Just make sure the trace() call happens after the function definition, either by putting it later in the same source file, or by putting it in a separate file that collates after all your function definitions.
For example, if your package has a file R/fun.R containing this source,
fun <- function(x) {
print('this is fun!')
}
then simply add another line to R/fun.R so it looks like this instead:
fun <- function(x) {
print('this is fun!')
}
trace(fun, tracer=substitute(print("recording call")))
This works because of the way R installs and traces things:
trace modifies functions to insert the tracing.
installing executes all of the source files in the R directory, and saves the results.
So putting a trace call in your source will modify the function before it is saved, and it will stay modified for any user of that package.

r, cache, safety and *apply

I would like to improve a R package of mine with caching for subfunctions without changing the results (only optimization).
I've seen that caching packages such as memoise exist. It also seems possible to simply store some calls with a closure inside the package; the simplest example would be something like
multiplyby3_generator <- function() {
save <- c(0,0)
function(value) {
if (value != save[1]) {
save <<- c(value,3*value)
warning("Recalculation !")
}
return(save[2])
}
}
multiplyby3 <- multiplyby3_generator()
multiplyby3(3)
# [1] 9
# Warning message:
# In multiplyby3(3) : Recalculation !
multiplyby3(3)
# [1] 9
However, how can you make it safe and useful to use that kind of things inside a package ?
As far as I understand :
If I did something like my example, I would have to forbid every kind of parallel computing on the mother function. Otherwise there would be a risk of the reference save being changed between the if (value != save[1]) and the return(save[2]). Can you forbid parallelization, and would that be enough to make it safe? Or is there another way to make it safe?
For a cache to be useful, one has to know how the *apply functions behave. The code invisible(lapply(1:100,function(i) print(i))) tends to show that the evaluation order of lapply is the order of the list. Is that always the case?
Thanks

Global Variables as Function Parameters

In an R project, we have a global dataframe df that is to be used inside a function my_func(). The dataframe will not be changed, but it will be used as a "read-only" table.
Could you please advise which of the following is best practice:
Include the dataframe in the parameters of the function, as in
my_func <- function(df)
{
a <- df[1,2]
}
OR
Not include it in the parameters, just use it (read it) in the function body, as in
my_func <- function()
{
a <- df[1,2]
}
In an ideal world, data enters a function as an argument and leaves it as a return value. That is a good principle. Besides, it is preferable for code reuse. Right now you may be convinced that you will only ever call this code on df, but that may change (df is a bad name, by the way, as there is already a function called df in R, and that can lead to terrible error messages).
The only exception to this rule, and the reason why <<- exists (*), may in rare cases be performance.
However, in the read-only case there are no performance gains, as R behaves cleverly: an object is not copied as long as it is only read.
You will need to install the microbenchmark package for the following code to run:
expl <- data.frame(a = rep("Hello world.", 1e8),
b = rep(1, 1e8))
fun1 <- function(dataframe) return(sum(dataframe$b))
fun2 <- function() return(sum(expl$b))
microbenchmark::microbenchmark(fun1(expl), fun2())
Try it and you will see that there is no performance gain in fun2 over fun1, even though the dataframe has considerable size.
Edit:
(*) As I have learned from Konrad Rudolph's comment below, <<- can be useful when giving data to the parent, not necessarily the global namespace. Very interesting read, even if not strictly on topic here: http://adv-r.had.co.nz/Functional-programming.html#mutable-state
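To illustrate the footnote, here is a minimal sketch of <<- writing to the enclosing (parent) environment of a closure rather than to the global environment. The names make_counter and count are made up for this example:

```r
# Each call to make_counter() creates a fresh environment holding `count`.
make_counter <- function() {
  count <- 0
  function() {
    # <<- finds `count` in the enclosing environment and updates it there;
    # nothing is created in the global environment.
    count <<- count + 1
    count
  }
}

counter <- make_counter()
counter()  # 1
counter()  # 2
```

The state lives with the closure, so two counters created by separate make_counter() calls do not interfere with each other.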

In R, do an operation temporarily using a setting such as working directory

I'm almost certain I've read somewhere how to do this. Instead of having to save the current option (say working directory) to a variable, change the w.d, do an operation, and then revert back to what it was, doing this inside a function akin to "with" relative to attach/detach. A solution just for working directory is what I need now, but there might be a more generic function that does that sort of things? Or ain't it?
So to illustrate... The way it is now:
curdir <- getwd()
setwd("../some/place")
# some operation
setwd(curdir)
The way it is in my wildest dreams:
with.dir("../some/place", # some operation)
I know I could write a function for this, I just have the impression there's something more readily available and generalizable to other parameters too.
Thanks
There is an idiom for this in some of R's base plotting functions
op <- par(no.readonly = TRUE)
# par(blah = stuff)
# plot(stuff)
par(op)
that is so unbelievably crude as to be fully portable to options() and setwd().
Fortunately it's also easy to implement a crude wrapper:
with_dir <- function(dir, expr) {
old_wd <- getwd()
setwd(dir)
result <- evalq(expr)
setwd(old_wd)
result
}
I'm no wizard with nonstandard evaluation so evalq could be unstable somehow. More on NSE in an old write-up by Lumley and also in Wickham's Advanced R, but it's dense stuff and I haven't wrapped my head around it all yet.
edit: as per Ben Bolker's comment, it's probably better to use on.exit for this:
with_dir <- function(dir, expr) {
old_wd <- getwd()
on.exit(setwd(old_wd))
setwd(dir)
evalq(expr)
}
From the R docs:
on.exit records the expression given as its argument as needing to be executed when the current function exits (either naturally or as the result of an error). This is useful for resetting graphical parameters or performing other cleanup actions.
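For reference, a slightly tightened sketch of the same wrapper. It relies on two things I am fairly confident about: setwd() invisibly returns the previous directory, and function arguments are evaluated lazily, so expr only runs once it is forced, which here happens after the directory change. If you would rather not maintain this yourself, the withr package ships a with_dir() built on the same idea.

```r
with_dir <- function(dir, expr) {
  # setwd() invisibly returns the directory we are leaving.
  old_wd <- setwd(dir)
  # Restore it on exit, even if expr throws an error;
  # add = TRUE keeps any other on.exit handlers intact.
  on.exit(setwd(old_wd), add = TRUE)
  # expr is a promise; forcing it here evaluates it in the caller's
  # environment, but with the new working directory in effect.
  force(expr)
}

# Usage: list the files in the session's temporary directory,
# then land back where we started.
files <- with_dir(tempdir(), list.files())
```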
What you're describing depends upon two things: detecting when you enter and leave a particular lexical scope, and defining a behavior to perform on entrance and on exit. Python has these, called "Context Managers". This was a big deal when it was released, and many parts of Python's standard library now behave like context managers, and have to define the "enter" and "exit" behavior explicitly, or by leveraging some clever inheritance scheme.
with.default
function (data, expr, ...)
eval(substitute(expr), data, enclos = parent.frame())
<bytecode: 0x07d02ccc>
<environment: namespace:base>
R's with function works sort of like a context manager, because it can pass scopes around easily. That said, this doesn't give you the "enter" and "exit" operations for free. Especially consider that the current working directory isn't an entry in the current scope, but a state of the R interpreter, which can only be queried or changed by function calls behind the .Internal shield.
You can easily define your own object types to have methods that are context manager-like for the with generic function, as well as writing and registering methods for other types you commonly use, but it is not part of the base R language.
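A minimal sketch of such a method, assuming a made-up S3 class called working_dir (neither the class nor its constructor exists in base R). The "enter" step is the setwd() call and the "exit" step is registered with on.exit:

```r
# Hypothetical constructor for a tiny "context" object.
working_dir <- function(path) {
  structure(list(path = path), class = "working_dir")
}

# S3 method for the base generic with(); mirrors with.default by
# evaluating the unevaluated expression in the caller's frame.
with.working_dir <- function(data, expr, ...) {
  old_wd <- setwd(data$path)           # "enter"
  on.exit(setwd(old_wd), add = TRUE)   # "exit", also runs on error
  eval(substitute(expr), envir = parent.frame())
}

# Usage: the expression sees the temporary directory as its working directory.
where <- with(working_dir(tempdir()), getwd())
```

Inside a package the method would also need to be registered in the NAMESPACE with an S3method directive.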

Changing defaults in a function inside a locked package [duplicate]

I am developing my first package and it is aimed at users who are new to R, so I am trying to minimize the amount of R skills required to use the package. As a result I want a function that changes defaults in other functions within my package. But I get the following error "cannot add bindings to a locked environment", which means the environment of the package is locked and I am not allowed to change the default values of its functions.
Here is an example that throws a similar error:
library(ggplot2)
assign(formals(geom_point)$position, "somethingelse", pos="package:ggplot2")
When I try assignInNamespace I get:
Error in bindingIsLocked(x, ns) : no binding for "identity"
assignInNamespace(formals(geom_point)$position,"somethingelse", pos = "package:ggplot2")
Here is an example of what I hope to achieve.
default <- function(x=c("A", "B", "C")){
x
}
default()
change.default <- function(x){
formals(default)$x <<- x # Notice the global assign
}
change.default(1:3)
default()
I am aware that this is far from the recommended approach, but I am willing to cut corners to improve the learning curve of the package. Is there a way to achieve this?
This question has been marked as a duplicate of Setting Function Defaults R on a Project Specific Basis. This is a different situation, as this question concerns how to allow the user in an interactive session to change the defaults of a function, not how to actually do it. The old question could not have been solved with the options() function and it is therefore a different question.
I think the canonical way to achieve what you want is via options, and packages do in fact do so, e.g., lattice (although they use special options) or ascii.
Furthermore, this is also done in base R, e.g., the famous and notorious default for stringsAsFactors.
If you look at ?read.table or ?data.frame you get: stringsAsFactors = default.stringsAsFactors(). Inspecting this reveals:
> default.stringsAsFactors
function ()
{
val <- getOption("stringsAsFactors")
if (is.null(val))
val <- TRUE
if (!is.logical(val) || is.na(val) || length(val) != 1L)
stop("options(\"stringsAsFactors\") not set to TRUE or FALSE")
val
}
<bytecode: 0x000000000b068478>
<environment: namespace:base>
The relevant part here is getOption("stringsAsFactors") which produces:
> getOption("stringsAsFactors")
[1] TRUE
Changing it is achieved like this:
> options(stringsAsFactors = FALSE)
> getOption("stringsAsFactors")
[1] FALSE
To do what you want, your package would need to set an option, and the functions would take their default values from the options. Another function could then change the options:
options(foo=c("A", "B", "C"))
default <- function(x=getOption("foo")){
x
}
default()
change.default <- function(x){
options(foo=x)
}
change.default(1:3)
default()
If you want your package to set the options when loaded, you need to create a .onAttach or .onLoad function in zzz.R. My afex package e.g., does this and changes the default contrasts. In your case it could look like the following:
.onAttach <- function(libname, pkgname) {
options(foo=c("A", "B", "C"))
}
ascii does it via .onLoad (I don't remember what is the exact difference, but Writing R Extensions will help).
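As far as I know, the difference is that .onLoad runs whenever the namespace is loaded (including via pkg::fun), while .onAttach only runs when the package is attached with library(). A sketch of an .onLoad that sets defaults without overwriting anything the user has already set; the option name mypkg.foo is purely hypothetical:

```r
.onLoad <- function(libname, pkgname) {
  # Package defaults; prefixing with the package name avoids clashes
  # with options set by other packages.
  defaults <- list(mypkg.foo = c("A", "B", "C"))
  # Only set the options the user has not defined already.
  to_set <- !(names(defaults) %in% names(options()))
  if (any(to_set)) options(defaults[to_set])
  invisible(NULL)
}

# A package function then reads its default from the option.
default <- function(x = getOption("mypkg.foo")) {
  x
}
```

This way a user who puts options(mypkg.foo = ...) in their .Rprofile keeps that value when the package loads.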
Preferably, a function has the following things:
Input arguments
A function body which does something with those arguments
Output arguments
So in your situation, where you want to change something about the behavior of a function, changing the input arguments is the best way to go. See for example my answer to another post.
You could also use an option to save some global settings (e.g. which font to use, or the PATH where the packages you use are stored); see the answer of @James in the question I linked above. But use these things sparingly, as they make the code hard to read. I would primarily use them read-only, i.e. set them once (either by the package or the user) and not allow functions to change them.
The unreadability stems from the fact that the behavior of the function is not solely determined locally (i.e. by the code directly working with it), but also by settings far away. This makes it hard to determine what a function does purely by looking at the code calling it; you have to dig through much more code to fully understand what is going on. In addition, other functions might change those options, making it even harder to predict what a given function will do, as that then depends on the history of function calls. This is where my earlier recommendation of read-only options comes back into play: if the options are read-only, some of the readability problems are lessened.
