I seem to be deeply confused about what constitutes good practice in R. Suppose that I have the following R code:
f<-function()
{
g<-function(s)
{
b<-b+1
s<-s+5
}
b<-10
g(2)
return(b)
}
In any typical language, this will always return b=10 and to the best of my awareness, the typical way to get f to recognise that g is modifying b would be to use global variables. However, as best as I can tell, it seems to be common practice in R to avoid global variables wherever possible. This leads me to ask, how am I supposed to modify f so that it outputs b=11 without making any use of global variables? I seem to either have a very deep misunderstanding or to be ignorant of a very important idea.
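R has several assignment operators: <-, =, and <<- (plus the rightward forms -> and ->>). The third form, <<-, does not create a local variable. It searches the enclosing environments, starting with the parent of the current one, for an existing variable of that name and assigns to it there; if none is found, it assigns in the global environment.
If you use this form in the g() function, it increments the b that belongs to f() by 1.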
f<-function()
{
g<-function(s)
{
# use <<- form instead of <-
b<<-b+1
s<-s+5
}
b<-10
g(2)
return(b)
}
...and the output:
> f()
[1] 11
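If the goal is to avoid reaching into another scope at all, a common alternative is to have g return the new value and let f rebind b itself. This is only a sketch: passing b to g as an extra argument is an assumption that is not part of the original code.

```r
f <- function() {
  # g receives the current value and returns the updated one;
  # it does not modify anything outside its own frame
  g <- function(b, s) {
    s <- s + 5   # local to g, discarded when g returns
    b + 1
  }
  b <- 10
  b <- g(b, 2)   # rebind b inside f with the value g returned
  b
}
f()
# [1] 11
```

With this shape, everything g changes is visible in its call, so no <<- is needed.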
I would like to exploit the associativity of a binary operator by applying it to a vector of arbitrary length. One can do that in Mathematica. I found a way to do it using recursive functions. Here is an example:
insert.binary.operator <- function(x, bin.op) {
recur.op <- function(n) {
if (n == 2) {
return(bin.op(x[2], x[1]))
} else {
bin.op(x[n], recur.op(n-1))
}
}
recur.op(length(x))
}
It works!
> insert.binary.operator(1:10, `+`)
[1] 55
My question is, is there something in the R language that does the same thing? If there is, it certainly would be faster and use less memory than the solution above.
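For what it is worth, base R ships a fold function, Reduce(), which applies a binary function across a vector without hand-written recursion. The lines below are a minimal sketch; the variable x is just an illustrative name, and the last call is written to mirror the nesting order of the recursive version above.

```r
x <- 1:10

# Left fold: ((1 + 2) + 3) + ... + 10
Reduce(`+`, x)
# [1] 55

# accumulate = TRUE keeps the intermediate results
Reduce(`+`, 1:4, accumulate = TRUE)
# [1]  1  3  6 10

# Same nesting as the recursive version: op(x[n], op(x[n-1], ... op(x[2], x[1])))
Reduce(paste0, rev(c("a", "b", "c")), right = TRUE)
# [1] "cba"
```

For an operator that is associative and commutative, such as +, the order does not matter and the plain Reduce(`+`, x) call is enough.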
All I can find is how to write to global variables, but not how to read them.
Example of incorrect code:
v = 0;
test <- function(v) {
v ->> global_v;
v <<- global_v + v;
}
test(1);
print(v);
This yields 2 because v ->> global_v treats v as the local variable v which is equal to 1. What can I replace that line with for global_v to get the 0 from the global v?
I'm asking of course about solutions different to "use different variable names".
You can use with(globalenv(), v) to evaluate v in the global environment rather than in the function. with() evaluates its second argument using its first argument (a list, data frame, or environment) as the place to look up names, and globalenv() returns the global environment. Putting those together, your function would become this:
test <- function(v) {
v <<- with(globalenv(), v) + v;
}
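An equivalent way to read the global value is get() with an explicit envir argument, which some readers find easier to follow than with(). This is a sketch rather than the answerer's method; the intermediate name global_v is taken from the question, and assign() is used in place of <<- only to make the target environment explicit.

```r
# Equivalent to typing `v <- 0` at the top level
assign("v", 0, envir = globalenv())

test <- function(v) {
  # Look v up in the global environment, bypassing the argument named v
  global_v <- get("v", envir = globalenv())
  # Write the sum back to the global v
  assign("v", global_v + v, envir = globalenv())
  invisible(NULL)
}

test(1)
get("v", envir = globalenv())
# [1] 1
```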
I have one function inside another like this:
func2 <- function(x=1) {ko+x+1}
func3= function(l=1){
ko=2
func2(2)+l
}
func3(1)
It shows the error: Error in func2(2) : object 'ko' not found. Basically I want to use the object ko in func2, which will not be defined until func3 is called. Is there any fix for this?
Yes, it can be fixed:
func2 <- function(x=1) {ko+x+1}
func3= function(l=1){
ko=2
assign("ko", ko, environment(func2))
res <- func2(2)+l
rm("ko", envir = environment(func2))
res
}
func3(1)
#[1] 6
As you see this is pretty complicated. That's often a sign that you are not following good practice. Good practice would be to pass ko as a parameter:
func2 <- function(x=1, ko) {ko+x+1}
func3= function(l=1){
ko=2
func2(2, ko)+l
}
func3(1)
#[1] 6
You don't really have one function "inside" the other currently (you are just calling a function within a different function). If you did move the one function inside the other function, then this would work:
func3 <- function(l=1) {
func2 <- function(x=1) {ko+x+1}
ko <- 2
func2(2)+l
}
func3(1)
Functions retain information about the environment in which they were defined. This is called "lexical scoping" and it's how R operates.
But in general I agree with @Roland that it's better to write functions that have explicit arguments.
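A small sketch may make the lexical scoping point concrete. Here func2 is defined at the top level, so it looks ko up where it was defined, not where it is called from. The top-level value ko <- 100 is an assumption added purely for illustration.

```r
ko <- 100                                   # visible from where func2 is defined

func2 <- function(x = 1) { ko + x + 1 }     # defined at the top level

func3 <- function(l = 1) {
  ko <- 2        # local to func3; func2 never sees this binding
  func2(2) + l
}

func3(1)
# [1] 104
```

The result is 100 + 2 + 1 + 1, not 2 + 2 + 1 + 1, because the ko inside func3 is not on func2's lookup path.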
This is a good case for learning about closures and using a factory.
func3_factory <- function (y) {
ko <- y
func2 <- function (x = 1) { ko + x + 1 }
function (l = 1) { func2(2) + l }
}
ko <- 1
func3_ko_1 <- func3_factory(ko)
ko <- 7
func3_ko_7 <- func3_factory(ko)
# each function stores its own value for ko
func3_ko_1(1) # 5
func3_ko_7(1) # 11
# changing ko in the global scope doesn't affect the internal ko values in the closures
ko <- 100
func3_ko_1(1) # 5
func3_ko_7(1) # 11
When func3_factory returns a function, that new function is coupled with the environment in which it was created, which in this case includes a variable named ko which keeps whatever value was passed into the factory and a function named func2 which can also access that fixed value for ko. This combination of a function and the environment it was defined in is called a closure. Anything that happens inside the returned function can access these values, and they stay the same even if that ko variable is changed outside the closure.
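The environment attached to a closure can be inspected directly with environment(), which is a quick way to confirm what the factory captured. A minimal sketch, repeating the factory from above; the names f1 and env are illustrative only.

```r
func3_factory <- function (y) {
  ko <- y
  func2 <- function (x = 1) { ko + x + 1 }
  function (l = 1) { func2(2) + l }
}

f1 <- func3_factory(1)

# The environment the returned function was created in
env <- environment(f1)

ls(env)
# [1] "func2" "ko"    "y"

env$ko
# [1] 1

f1(1)
# [1] 5
```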
In my package, I define the %+% operator as a shortcut for string concatenation. As it may already be defined by previously loaded packages, I want to execute my custom code only when both arguments are suitable (e.g. character), and otherwise try to call the code from previously loaded packages. Here is my solution for that:
# helper function to find environment of the package
getEnvByName <- function(inpEnv=.GlobalEnv, lookFor){
e <- inpEnv;
while (environmentName(e) != 'R_EmptyEnv' & environmentName(e)!=lookFor) e <- parent.env(e);
if (environmentName(e) != lookFor) return(NULL);
return(e);
}
"%+%" <- function(arg1, arg2){
if (is.character(arg1) & is.character(arg2)) {
paste0(arg1, arg2);
} else {
e <- parent.env(getEnvByName(.GlobalEnv,'package:mypackagename'));
if (exists('%+%', envir = e)) get('%+%',envir = e)(arg1,arg2);
}
}
My questions are:
1) Is this a good way to handle such situations?
2) Why is it not common practice to do similar things in other packages? For example, in the ggplot2 package, the %+% operator is defined as follows:
"%+%" <- function (e1, e2)
{
e2name <- deparse(substitute(e2))
if (is.theme(e1)) add_theme(e1, e2, e2name)
else if (is.ggplot(e1)) add_ggplot(e1, e2, e2name)
}
As you can see, their code breaks any previously defined %+% for all arguments, whereas they could override it only for theme or ggplot arguments and leave all other cases alone. I could suggest that the authors implement this kind of check, but I assume there is some reason they don't do it...
UPD. just a little modification of my code: instead of defining everything in one function, I split it with UseMethod() - I'm wondering if it makes any difference:
`%+%` <- function(...) UseMethod("%+%")
`%+%.character` <- paste0
`%+%.default` <- function (arg1, arg2){
e <- parent.env(getEnvByName(.GlobalEnv,'package:mypackagename'));
get('%+%',envir = e)(arg1,arg2);
}
First of all, I don't think it is good practice to reimplement functions that already exist in a widely used package (I refer to the previously mentioned %s+% from stringi).
As for your question, I think the best way is this:
'%+%' <- function(arg1, arg2){
if (is.character(arg1) & is.character(arg2)) {
paste0(arg1, arg2)
} else {
old.func <- get('%+%',
envir = parent.env(.GlobalEnv),
inherits = TRUE)
old.func(arg1, arg2)
}
}
With the option inherits = TRUE (which is the default, by the way), get performs the same kind of search through the enclosing environments as the helper in your code.
The method with UseMethod will work differently, because in that case %+% checks only the first argument for the type "character", not both arguments.
As for ggplot2's %+%, I think it was intended to return NULL when the arguments are of an unsuitable type. It might possibly be a flaw in the code.
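The UseMethod point above can be checked with a small stand-alone sketch. It does not reproduce the package lookup from the question: the default method here simply raises an error, which is an assumption made to keep the example self-contained, and the argument names e1 and e2 are illustrative.

```r
`%+%` <- function(e1, e2) UseMethod("%+%")
`%+%.character` <- function(e1, e2) paste0(e1, e2)
# Stand-in for "fall back to a previously defined %+%"
`%+%.default` <- function(e1, e2) stop("no %+% method for this first argument")

"a" %+% "b"
# [1] "ab"

# Dispatch looks only at the class of the first argument,
# so the character method runs even though e2 is numeric
"a" %+% 1
# [1] "a1"

# A numeric first argument goes to the default method
tryCatch(1 %+% "b", error = function(e) conditionMessage(e))
# [1] "no %+% method for this first argument"
```

If both arguments must be character, the is.character() check on each argument from the first version is still needed inside the method.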
I have a function f that takes two parameters (p1 and p2):
If no value was passed to the function for the parameter p2, the value of p1^2 should be used instead. But how can I find out, within the function, whether a value was given or not? The problem is that the variable p2 is not initialized if there was no value. Thus I can't test for p2 being NULL.
f <- function(p1, p2) {
if(is.null(p2)) {
p2=p1^2
}
p1-p2
}
Is it somehow possible to check if a value for p2 was passed to the function or not? (I could not find an isset() - function or similar things.)
You use the function missing() for that.
f <- function(p1, p2) {
if(missing(p2)) {
p2=p1^2
}
p1-p2
}
Alternatively, you can set the value of p2 to NULL by default. I sometimes prefer that solution, as it allows for passing arguments to nested functions.
f <- function(p1, p2=NULL) {
if(is.null(p2)) {
p2=p1^2
}
p1-p2
}
f.wrapper <-function(p1,p2=NULL){
p1 <- 2*p1
f(p1,p2)
}
> f.wrapper(1)
[1] -2
> f.wrapper(1,3)
[1] -1
EDIT: you could do this technically with missing() as well, but then you would have to include a missing() statement in f.wrapper as well.
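For comparison, the missing() variant of the wrapper might look like the sketch below, with an explicit branch in f.wrapper as the EDIT describes. This is one possible arrangement, not necessarily the one the answerer had in mind.

```r
f <- function(p1, p2) {
  if (missing(p2)) {
    p2 <- p1^2
  }
  p1 - p2
}

f.wrapper <- function(p1, p2) {
  p1 <- 2 * p1
  # Only forward p2 when the caller actually supplied it
  if (missing(p2)) f(p1) else f(p1, p2)
}

f.wrapper(1)
# [1] -2
f.wrapper(1, 3)
# [1] -1
```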
I think '?missing' should do it.
In a case like this you can also use something like this:
f <- function(p1, p2 = p1 ^ 2) {
p1-p2
}
See the part on Lazy evaluation at http://adv-r.had.co.nz/Functions.html
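One consequence of lazy evaluation is worth seeing once: a default such as p2 = p1^2 is evaluated when p2 is first used, not when the function is called, so it picks up any change made to p1 before that point. The modification of p1 below is an assumption added only to make the effect visible.

```r
f <- function(p1, p2 = p1^2) {
  p1 <- p1 + 1   # runs before p2 has been touched
  p1 - p2        # the default for p2 is evaluated here, using the updated p1
}

f(2)
# [1] -6

f(2, 1)
# [1] 2
```

In the first call p1 becomes 3 before the default is computed, so p2 is 9 rather than 4.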