In R I have two functions that pretty much do the same thing except they have a different set of default variables.
Say I have function1 <- function(a=1,b=2,c=3){...}. What I have right now is function2 calling function1, just with a different set of defaults: function2 <- function(a=3,b=4,c=5){function1(a=a,b=b,c=c)}
Obviously this is not optimal, and I was wondering if there is a better way to write these two functions (maybe have a common function and make the other two aliases with different default values?).
You can modify the default arguments with formals<-.
> f1 <- function(a = 1) a
> f2 <- f1
> formals(f2)$a <- 2
>
> f1
function(a = 1) a
> f2
function (a = 2)
a
>
> f1()
[1] 1
> f2()
[1] 2
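Applied to the three-argument functions from the question, the same idea might look like the sketch below. The body of function1 is a placeholder (the question elides it with ...), chosen only so the results are easy to check.

```r
# Placeholder body: the question does not show what function1 actually does
function1 <- function(a = 1, b = 2, c = 3) a + b + c

# Copy the function, then overwrite the defaults on the copy only
function2 <- function1
formals(function2)$a <- 3
formals(function2)$b <- 4
formals(function2)$c <- 5

function1()       # uses a = 1, b = 2, c = 3
function2()       # uses a = 3, b = 4, c = 5
function2(a = 0)  # explicit arguments still override the new defaults
```

Since function2 is a modified copy, later changes to function1 are not picked up by function2.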
I suppose you could just add another argument to the original function that acts as a flag to indicate which set of defaults to use:
function1 <- function(a=1, b=2, c=3, altDefaults = FALSE){
  if (altDefaults){
    a <- 3; b <- 4; c <- 5
  }
  # ... rest of the function body, using a, b and c
}
One could expand this I suppose to incorporate multiple sets of defaults but it might get cumbersome.
Look at this wiki on first order functions by Hadley. One of the functions discussed is Curry, which lets you define variants of a function, just like what you mentioned in your question.
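To give a rough idea of that style without any extra package, here is a base-R sketch. The helper with_defaults is a hypothetical name of mine, not the Curry function from the wiki, and the body of function1 is a placeholder. It only handles named arguments.

```r
# Placeholder body: the question does not show what function1 actually does
function1 <- function(a = 1, b = 2, c = 3) a + b + c

# Hypothetical helper: returns a variant of f with a different set of defaults
with_defaults <- function(f, ...) {
  defaults <- list(...)
  function(...) {
    # Named arguments supplied by the caller replace the stored defaults
    args <- utils::modifyList(defaults, list(...))
    do.call(f, args)
  }
}

function2 <- with_defaults(function1, a = 3, b = 4, c = 5)
function2()       # called with a = 3, b = 4, c = 5
function2(c = 0)  # called with a = 3, b = 4, c = 0
```

Unlike the formals<- approach, function2 here keeps calling function1, so changes to function1's body carry over.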
Into the R console, type:
#First code snippet
x <- 0
x <- x+1
x
You'll get '1'. That makes sense: the idea is that the 'x' in 'x+1' is the current value of x, namely 0, and this is used to compute the value of x+1, namely 1, which is then shoveled into the container x. So far, so good.
Now type:
#Second code snippet
f <- function(n) {n^2}
f <- function(n) {if (n >= 1) {n*f(n-1)} else {1}}
f(5)
You'll get '120', which is 5 factorial.
I find this perplexing. Following the logic of the first code snippet, we might expect the 'f' in the expression
if (n >= 1) {n*f(n-1)} else {1}
to be interpreted as the current value of f, namely
function(n) {n^2}
Following this reasoning, the value of f(5) should be 5*(5-1)^2 = 80. But that's not what we get.
Question. What's really going on here? How does R know not to use the old 'f'?
we might expect the 'f' in the expression
if (n >= 1) {n*f(n-1)} else {1}
to be interpreted as the current value of f
— Yes, we might expect that. And we would be correct.
But what is the “current value of f”? Or, more precisely, what is “current”?
“Current” is when the function is executed, not when it is defined. That is, by the time you execute f(5), it has already been redefined. So now the execution enters the function, looks up inside the function what f refers to — and also finds the current (= new) definition, not the old one.
In other words: the objects associated with names are looked up when they are actually accessed. And inside a function this means that names are accessed when the function is executed, not when it’s defined.
The same is true for all objects. Let’s say f is using a global object that’s not a function:
n = 5
f = function() n ^ 2
n = 1
f() # = 1
To understand the difference between your first and second example, consider the following case, which involves functions yet behaves like your first case (i.e. it uses the “old” value of f).
To make the example work, we need a little helper: a function that modifies other functions. In the following, twice is a function which takes a function as an argument and returns a new function. That new function is the same as the old function, only it runs twice when invoked:
twice = function (original_function) {
  force(original_function)
  function (...) {
    original_function(original_function(...))
  }
}
To illustrate what twice does, let’s invoke it on an example function:
plus1 = function (n) n + 1
plus2 = twice(plus1)
plus2(3) # = 5
Neat — R allows us to handle functions like any other object!
Now let’s modify your f:
f = function(n) {n^2}
f = twice(f)
f(5) # 625
… and here we have it: in the statement f = twice(f), the second f refers to the current (= old) definition. Only after that line does f refer to the new, modified function.
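If you actually wanted the behaviour the question expected (the new f calling the old f), one way is to keep the old definition under a second name before redefining f. A minimal sketch, where old_f is just a name I made up:

```r
f <- function(n) n^2
old_f <- f   # capture the current definition of f under a second name

# The new f refers to old_f, which is never redefined afterwards
f <- function(n) if (n >= 1) n * old_f(n - 1) else 1

f(5)  # 5 * (5 - 1)^2 = 80
```

The name old_f is still looked up only when f runs; it gives 80 simply because nothing reassigns old_f in the meantime.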
Here's a simple example illustrating my comment on Konrad's excellent answer:
> a <- 2
> f <- function() a*b
> e <- new.env()
> assign("b",5,e)
> environment(f) <- e
> f()
[1] 10
> b <- 10
> f()
[1] 10
So we've manually altered the environment for f so that it always looks in e for b first. In theory, one could even lock that binding (see ?lockBinding) so that any later attempt to change it throws an error.
This sort of thing could get complicated, though, as in general you'd want to make sure that you set the parent environment of e correctly based on where the function f is actually being created. In this example f is created in the global environment, but if f were being created inside another function, you'd want e's parent environment to reflect that.
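As a small sketch of the locking idea (the tryCatch wrapper is only there so the example runs to completion; the variable outcome is my own name):

```r
e <- new.env()
assign("b", 5, envir = e)
lockBinding("b", e)   # from now on, b in e cannot be reassigned

# Trying to overwrite the locked binding raises an error, which is caught here
outcome <- tryCatch({
  assign("b", 6, envir = e)
  "changed"
}, error = function(err) "locked")

outcome              # "locked"
get("b", envir = e)  # still 5
```

unlockBinding("b", e) would make the binding writable again.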
If I make an environment with a list in it, and want to assign values to that list, why does the following fail when using get and assign?
res <- new.env()
res$calls <- vector("list", 100)
res$counter <- 1
## works fine
res$calls[[1]] <- 1
## Fails, why?
get("calls", envir=res)[[get("counter", envir=res)]] <- 2
## doesn't make the assignment
val <- get("calls", envir=res)[[get("counter", envir=res)]]
assign("val", 2, envir=res)
I think the following will address your issue:
get("calls", envir=res)[[get("counter", envir=res)]] <- 2 fails because get has no replacement form. On the other hand, res$calls[[1]] <- 1 works because the assignment is carried out by a replacement function, which you can see if you type help('[[<-'). I think the reason get has no replacement counterpart (i.e. get<-) is that there is already a specific function for this, called assign (as per @TheTime's comment).
For the second case, val <- get("calls", envir=res)[[get("counter", envir=res)]] creates val in the global environment. When you then use assign("val", 2, envir=res), a separate val variable is created inside the res environment, as you can see below:
> res$val
[1] 2
However, val in the global environment remains 1:
> val
[1] 1
So, you probably won't be able to do the assignment with either get or assign alone. get won't allow it because it is not a replacement function, and ?assign mentions:
assign does not dispatch assignment methods, so it cannot be used to set elements of vectors, names, attributes, etc.
So, you can just use the normal [[<- assignment method. @Frank in the comments provides a nice way to do it:
res[[ "calls" ]][[ res[["counter"]] ]] <- 2
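If you really want to stick with get and assign, I think it has to be a three-step process: fetch the whole list, modify the local copy, and write the whole list back. A sketch (the local name calls is arbitrary):

```r
res <- new.env()
res$calls <- vector("list", 100)
res$counter <- 1

# 1. fetch a copy of the list, 2. modify the copy, 3. write it back
calls <- get("calls", envir = res)
calls[[get("counter", envir = res)]] <- 2
assign("calls", calls, envir = res)

res$calls[[1]]  # 2
```

This replaces the whole calls object in res, which is what the one-line [[<- form does behind the scenes anyway.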
I just finished reading about scoping in the R intro, and am very curious about the <<- assignment.
The manual showed one (very interesting) example for <<-, which I feel I understood. What I am still missing is the context of when this can be useful.
So what I would love to read from you are examples (or links to examples) of when the use of <<- can be interesting/useful. What might be the dangers of using it (it looks easy to lose track of), and any tips you might feel like sharing.
<<- is most useful in conjunction with closures to maintain state. Here's a section from a recent paper of mine:
A closure is a function written by another function. Closures are
so-called because they enclose the environment of the parent
function, and can access all variables and parameters in that
function. This is useful because it allows us to have two levels of
parameters. One level of parameters (the parent) controls how the
function works. The other level (the child) does the work. The
following example shows how we can use this idea to generate a family of
power functions. The parent function (power) creates child functions
(square and cube) that actually do the hard work.
power <- function(exponent) {
  function(x) x ^ exponent
}
square <- power(2)
square(2) # -> [1] 4
square(4) # -> [1] 16
cube <- power(3)
cube(2) # -> [1] 8
cube(4) # -> [1] 64
The ability to manage variables at two levels also makes it possible to maintain the state across function invocations by allowing a function to modify variables in the environment of its parent. The key to managing variables at different levels is the double arrow assignment operator <<-. Unlike the usual single arrow assignment (<-) that always works on the current level, the double arrow operator can modify variables in parent levels.
This makes it possible to maintain a counter that records how many times a function has been called, as the following example shows. Each time new_counter is run, it creates an environment, initialises the counter i in this environment, and then creates a new function.
new_counter <- function() {
  i <- 0
  function() {
    # do something useful, then ...
    i <<- i + 1
    i
  }
}
The new function is a closure, and its environment is the enclosing environment. When the closures counter_one and counter_two are run, each one modifies the counter in its enclosing environment and then returns the current count.
counter_one <- new_counter()
counter_two <- new_counter()
counter_one() # -> [1] 1
counter_one() # -> [1] 2
counter_two() # -> [1] 1
It helps to think of <<- as equivalent to assign (if you set the inherits parameter in that function to TRUE). The benefit of assign is that it allows you to specify more parameters (e.g. the environment), so I prefer to use assign over <<- in most cases.
Using <<- and assign(x, value, inherits=TRUE) means that "enclosing environments of the supplied environment are searched until the variable 'x' is encountered." In other words, it will keep going through the environments in order until it finds a variable with that name, and it will assign it to that. This can be within the scope of a function, or in the global environment.
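A small sketch of the two forms side by side (all names here are made up for illustration). One difference worth knowing: <<- starts its search in the enclosing environment, whereas assign with inherits = TRUE checks the current environment first. In this example neither inner function has its own total, so both end up updating the same variable.

```r
count_twice <- function() {
  total <- 0
  bump_arrow  <- function() total <<- total + 1
  bump_assign <- function() assign("total", total + 1, inherits = TRUE)
  bump_arrow()    # total in count_twice becomes 1
  bump_assign()   # total in count_twice becomes 2
  total
}

count_twice()  # 2
```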
In order to understand what these functions do, you need to also understand R environments (e.g. using search).
I regularly use these functions when I'm running a large simulation and I want to save intermediate results. This allows you to create the object outside the scope of the given function or apply loop. That's very helpful, especially if you have any concern about a large loop ending unexpectedly (e.g. a database disconnection), in which case you could lose everything in the process. This would be equivalent to writing your results out to a database or file during a long running process, except that it's storing the results within the R environment instead.
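Roughly what I mean, as a toy sketch: partial_results and run_simulation are illustrative names, step^2 stands in for the expensive computation, and the stop() call simulates the unexpected failure.

```r
partial_results <- list()

run_simulation <- function(n_steps) {
  for (step in seq_len(n_steps)) {
    value <- step^2                     # stand-in for an expensive computation
    partial_results[[step]] <<- value   # stored outside the function as we go
    if (step == 3) stop("simulated failure")
  }
  invisible(NULL)
}

try(run_simulation(5), silent = TRUE)
length(partial_results)  # 3: the results up to the failure survive
```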
My primary warning with this: be careful, because you're now working with global variables, especially when using <<-. That means you can end up with situations where a function uses an object's value from the environment when you expected it to use one that was supplied as a parameter. This is one of the main things that functional programming tries to avoid (see side effects). I avoid this problem by assigning my values to unique variable names (using paste with a set of unique parameters) that are never used within the function, but are just used for caching and in case I need to recover later on (or do some meta-analysis on the intermediate results).
One place where I used <<- was in simple GUIs using tcl/tk. Some of the initial examples have it, as you need to make a distinction between local and global variables for statefulness. See for example
library(tcltk)
demo(tkdensity)
which uses <<-. Otherwise I concur with Marek :) -- a Google search can help.
On this subject I'd like to point out that the <<- operator will behave strangely when applied (incorrectly) within a for loop (there may be other cases too). Given the following code:
fortest <- function() {
  mySum <- 0
  for (i in c(1, 2, 3)) {
    mySum <<- mySum + i
  }
  mySum
}
you might expect that the function would return the sum, 6, but instead it returns 0, with a global variable mySum being created and assigned the value 3. The body of a for loop is not a new scope 'level', so the loop itself is not the cause. Instead, <<- skips the function's own environment and starts looking in the enclosing one, which here is the global environment; R can't find a mySum variable there to assign to, so it creates one and assigns the value 1 the first time through the loop. On subsequent iterations, the RHS of the assignment still refers to the (unchanged) local mySum variable, whereas the LHS refers to the global variable. Therefore each iteration overwrites the global variable with that iteration's value of i, hence it has the value 3 on exit from the function.
Hope this helps someone - this stumped me for a couple of hours today! (BTW, just replace <<- with <- and the function works as expected).
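For contrast, here is a sketch of the fixed version next to a variant where <<- is the right tool, because the update happens inside a nested function (fortest_fixed, fortest_nested and add are names I made up):

```r
fortest_fixed <- function() {
  mySum <- 0
  for (i in c(1, 2, 3)) {
    mySum <- mySum + i   # plain <- because mySum is local to this function
  }
  mySum
}

fortest_nested <- function() {
  mySum <- 0
  # add() has its own environment, so <<- is needed to reach the mySum above
  add <- function(i) mySum <<- mySum + i
  for (i in c(1, 2, 3)) add(i)
  mySum
}

fortest_fixed()   # 6
fortest_nested()  # 6
```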
f <- function(n, x0) {
  x <- x0
  replicate(n, (function() {x <<- x + rnorm(1)})())
}
plot(f(1000, 0), type = "l")
The <<- operator can also be useful for Reference Classes when writing Reference Methods. For example:
myRFclass <- setRefClass(Class = "RF",
                         fields = list(A = "numeric",
                                       B = "numeric",
                                       C = function() A + B))
myRFclass$methods(show = function() cat("A =", A, "B =", B, "C =",C))
myRFclass$methods(changeA = function() A <<- A*B) # note the <<-
obj1 <- myRFclass(A = 2, B = 3)
obj1
# A = 2 B = 3 C = 5
obj1$changeA()
obj1
# A = 6 B = 3 C = 9
I use it to modify an object in the global environment from inside purrr's map().
a = c(1,0,0,1,0,0,0,0)
Say I want to obtain the vector c(1,2,3,1,2,3,4,5); that is, wherever there is a 1, keep it as 1, and otherwise add 1 to the previous element until the next 1.
library(purrr)
map(
  .x = seq_along(a),
  .f = function(x) {
    a[x] <<- ifelse(a[x]==1, a[x], a[x-1]+1)
  })
a
[1] 1 2 3 1 2 3 4 5
I'm looking for a simple function to speed up my ability to write and debug R functions. Consider the following blocks of code:
# Part A:
myfun = function(a, b = 5, out = "hello"){
  if(a>b) print(out)
  return(a-b)
}
# Part B:
b = 5
out = "hello"
# Part C:
do.args = function(f){
  #initialize the arguments of myfun in the parent environment
  ???
}
The function myfun is a trivial example of a bigger problem: I often have a complicated function with many arguments. To efficiently write and debug such a function, I find it useful to initialize the arguments of the function, and 'step through' the function line-by-line. Initializing the arguments, as in Part B above, is something of a hassle when there are lots of arguments, and I would prefer to have a function as in Part C, which takes only the string myfun as its argument and produces the same effect as running Part B in the current environment.
This only works for functions where all the arguments have default values. In other words, myfun has to have a default value for a defined in the function.
some.func <- function(infunc){
  forms <- formals(infunc)
  for(i in seq_along(forms)){
    assign(names(forms)[i],forms[[i]],envir=globalenv())
  }
}
You could add a qualifier to deal with the arguments that do not have default values, but it may not work in all cases. In this example I set all arguments without a default to NA; you could use a different value. Note: assigning NULL to the missing arguments will not work.
some.func <- function(infunc){
  forms <- formals(infunc)
  for(i in seq_along(forms)){
    if(class(forms[[i]])=="name") forms[[i]] <- NA
    assign(names(forms)[i],forms[[i]],envir=globalenv())
  }
}
You could also adjust the function and simply skip assigning the missing variables by using next after the if statement rather than defining the missing variables to NA, or some other value. The next example:
some.func <- function(infunc){
  forms <- formals(infunc)
  for(i in seq_along(forms)){
    if(class(forms[[i]])=="name") next
    assign(names(forms)[i],forms[[i]],envir=globalenv())
  }
}
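A possible variation, sketched under a few assumptions of mine: the helper is called init_args (a made-up name), it writes into an environment you pass in rather than always into the global environment, and it evaluates each default instead of storing the raw expression. A default that refers to an argument without a default would still fail.

```r
init_args <- function(f, envir = parent.frame()) {
  forms <- formals(f)
  for (nm in names(forms)) {
    # An argument without a default is stored as the empty symbol; skip it
    if (identical(forms[[nm]], quote(expr = ))) next
    # Defaults are stored unevaluated, so evaluate them in the target environment
    assign(nm, eval(forms[[nm]], envir), envir = envir)
  }
  invisible(NULL)
}

myfun <- function(a, b = 5, out = "hello") {
  if (a > b) print(out)
  return(a - b)
}

scratch <- new.env()
init_args(myfun, envir = scratch)
ls(scratch)  # "b" and "out"; a is skipped because it has no default
```

Calling init_args(myfun) with no envir at the console would create b and out in the global environment, like Part B in the question.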
If you want to reassign formal arguments, there is a formals<- function. By default the function keeps the environment in which it was created, but that can be changed (see the envir argument). See ?formals and ?alist
formals(myfun) <- alist(a=,b=4, out="not awake")
myfun
#------------------
function (a, b = 4, out = "not awake")
{
    if (a > b)
        print(out)
    return(a - b)
}
You need to use alist with an argument of the form a= if you want the default to be missing.
When debugging, I would often like to know the value of a variable used in a function that has completed executing. How can that be achieved?
Example:
I have function:
MyFunction <- function() {
  x <- rnorm(10) # assign 10 random normal numbers to x
  return(mean(x))
}
I would like to know the values stored in x which are not available to me after the function is done executing and the function's environment is cleaned up.
You mentioned debugging, so I assume the values are not needed later in the script and you just want to check what is happening. In that case, what I always do is use browser:
MyFunction <- function() {
  browser()
  x <- rnorm(10) # assign 10 random normal numbers to x
  return(mean(x))
}
This drops you into an interactive console inside the scope of the function, allowing you to inspect what is happening inside.
For general information about debugging in R I suggest this SO post.
MyFunction <- function() {
  x <- rnorm(10) # assign 10 random normal numbers to x
  return(list(x,mean(x)))
}
This will return a list whose first element is x and whose second element is its mean.
You have many options here. The easiest is to use the <<- operator when you assign to x. It's also the most likely to get you into trouble.
> test <- function() x <- runif(1)
> x <- NA
> test()
> x
[1] NA
> test <- function() x <<- runif(1)
> test()
> x
[1] 0.7753325
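A somewhat tamer variation on the same idea, as a sketch: instead of writing into the global environment with <<-, hand the function a dedicated environment to copy values into. The store argument and debug_store are names I invented for this example.

```r
debug_store <- new.env()

MyFunction <- function(store = NULL) {
  x <- rnorm(10)
  # Only copy x out when the caller asked for it
  if (!is.null(store)) assign("x", x, envir = store)
  mean(x)
}

result <- MyFunction(store = debug_store)
debug_store$x   # the ten values that were averaged
```

Called as MyFunction(), nothing is stored and no outside variable is touched.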
Edit
@PaulHeimstra points out that you'd like this for debugging. Here's a pointer to some general tricks:
General suggestions for debugging in R
I'd recommend either setting options(error=recover) or using trace() in combination with browser().
There are already some good solutions; I'd like to add one more possibility. I want to stress that you asked for the value of a variable used in a function that has completed executing. So there may be no need to assign those values anywhere, and you don't (a priori) want to stop execution. The solution is simply to use print. So that nothing is printed by default but only when you want to debug, the choice of whether to print can be passed as a function argument:
MyFunction <- function(x, y, verbose = FALSE) {
  a <- x * y
  if (verbose) print(a)
  b <- x - y
  if (verbose) print(b)
  return(a * b)
}
In general, you would run your function like this: MyFunction(10, 4) but when you want to see those intermediate results, do MyFunction(10, 4, verbose = TRUE).