We can use purrr::partial to create partial functions:
f <- function(x, y) {
print(x)
print(y)
return(invisible())
}
ff <- purrr::partial(f, y = 1)
ff(2)
#> [1] 2
#> [1] 1
Created on 2020-02-19 by the reprex package (v0.3.0)
This can often be quite useful, but has the unfortunate side-effect that the partialized function loses it's signature, which is replaced with an elipsis:
ff
#> <partialised>
#> function (...)
#> f(y = 1, ...)
While programatically irrelevant, this leads to worse code legibility during development, where RStudio's "intellisense" can no longer aid us in remembering the names and/or order of arguments. So is there some other means of partializing which keeps the original signature (minus the partialized-away arguments), as below?
ff
#> <partialised>
#> function (x)
#> f(y = 1, x)
Now, obviously this can be done manually, by defining a new function ff which is simply a wrapper around f with the desired arguments.
ff <- function(x) f(x, y = 1)
But this means any modifications to the signature of f need to be replicated to ff. So is there a "cleaner" way of partializing while keeping the signature?
One option is to use rlang::fn_fmls() (or base::formals() equivalent) to explicitly give default values to the function arguments:
# If desired, create a copy of the function first: ff <- f
rlang::fn_fmls(f) <- purrr::list_modify( rlang::fn_fmls(f), y=1 )
args(f)
# function (x, y = 1)
f(2)
# [1] 2
# [1] 1
Related
I am trying to use furrr::future_pmap in R to replace purrr::pmap in a function call within another function.
Presently I have it set up so pmap is passing other arguments using the ellipsis ... however when I try and do this using future_pmap I get unused argument errors (see example below). I know from comments in here passing ellipsis arguments to map function purrr package, R and other previous research that for the ellipsis to work with pmap you need to use function(x,y,z) blah(x,y,z,...) instead of ~blah(..1,..2,..3) but the same approach doesn't seem to work for future_map. Is there some other secret to making this work?
I've created a very simple reprex, obviously my real functions make a lot more sense to run in future_pmap
library(purrr)
library(furrr)
#> Loading required package: future
plan(multiprocess)
xd <- list(1, 10, 100)
yd <- list(1, 2, 3)
zd <- list(5, 50, 500)
sumfun <- function(indata, otherdata){
out <- sum(c(indata, otherdata))
return(out)
}
test_fun_pmap_tilde <- function(ind, ...){
return( pmap(ind, ~sumfun(c(..1,..2,..3), ...)))
}
test_fun_pmap <- function(ind, ...){
return( pmap(ind, function(x,y,z) sumfun(c(x,y,z), ...)))
}
test_fun_future_pmap <- function(ind, ...){
return( future_pmap(ind, function(x,y,z) sumfun(c(x,y,z), ...)))
}
#doesn't work as need to use function(x,y,z) instead of tildes
test_fun_pmap_tilde(list(xd, yd, zd), otherdata = c(100,1000))
#> Error in sumfun(c(..1, ..2, ..3), ...): unused arguments (.l[[2]][[i]], .l[[3]][[i]])
#this one works
test_fun_pmap(list(xd, yd, zd), otherdata = c(100,1000))
#> [[1]]
#> [1] 1107
#>
#> [[2]]
#> [1] 1162
#>
#> [[3]]
#> [1] 1703
#but using future_pmap it doesn't work
test_fun_future_pmap(list(xd, yd, zd), otherdata = c(100,1000))
#> Error in (function (x, y, z) : unused argument (otherdata = c(100, 1000))
Created on 2020-08-31 by the reprex package (v0.3.0)
Okay I have found a way for it to work. Apparently I need 3 sets of ellipsis instead of just 1.
test_fun_future_pmap <- function(ind, ...){
return( future_pmap(ind, function(x,y,z,...) sumfun(c(x,y,z), ...),...))
}
I have a function that has been saved as an Rds object and I'm wondering if there is any possibility after reading the function to debug() it or to see the code inside the function?
Example
library(purrr)
some_function <- function(x){
avg <- mean(x)
std <- sd(x)
return(c(avg, std))
}
safe_function <- safely(some_function)
saveRDS(safe_function, 'safe_function.rds')
rm(safe_function)
# How can I debug the function or make changes to it after I've loaded it?
safe_function <- readRDS('safe_function.rds')
Here's one way to do it:
Execute debug(safe_function) in the console and then call your function, say, safe_function(c(1, 2))
At this time, you will be in debug mode:
In your console, execute debugonce(.f) and then hit either 'next' or 'continue' (alternatively, type n or c in the console)
You will now be within the body of your some_function and will be able to see the code:
The short answer is that it's easy to extract the underlying function, with a single line of code:
extracted_function <- environment(safe_function)$.f
extracted_function
#> function(x){
#>
#>
#> avg <- mean(x)
#> std <- sd(x)
#>
#> return(c(avg, std))
#>
#> }
#> <bytecode: 0x1951c3e0>
You can debug this function however you like, then if you want to overwrite it but keep it within its safely wrapper, you can overwrite it like this:
debugged_function <- function(x) c(mean(x, na.rm = TRUE), sd(x, na.rm = TRUE))
environment(safe_function)$.f <- debugged_function
safe_function(1:10)
#> $result
#> [1] 5.50000 3.02765
#>
#> $error
#> NULL
I'll give a quick explanation of why this works:
If you examine your safe_function, you will notice a couple of unusual things about it. Although you have loaded it into the global environment, it is actually wrapped in its own unnamed environment (in this case <environment: 0x095704c0>), which is integral to the way that it works:
safe_function
#> function (...)
#> capture_error(.f(...), otherwise, quiet)
#> <bytecode: 0x0956fd50>
#> <environment: 0x095704c0>
The other odd thing you'll notice is that safe_function calls the function .f, which is neither a built-in function, nor a function exported from purrr. That's because .f is a copy of your original function that is kept inside this special environment.
We can look at the complete contents of the unnamed environment by doing:
ls(environment(safe_function), all.names = TRUE)
#> [1] ".f" "otherwise" "quiet"
Now, if you look at what .f is, you will find it is just a copy of your original some_function:
#> function(x){
#>
#>
#> avg <- mean(x)
#> std <- sd(x)
#>
#> return(c(avg, std))
#>
#> }
#> <bytecode: 0x1951c3e0>
So this is where your wrapped function is "hiding". It remains accessible as a member of this unnamed environment, which can be accessed via environment(safe_function) so is easy to modify if desired.
What factors should I consider when deciding whether or not to remove a variable that will not be used again in a function?
Here's a noddy example:
DivideByLower <- function (a, b) {
if (a > b) {
tmp <- a
a <- b
b <- tmp
remove(tmp) # When should I include this line?
}
# Return:
a / b
}
I understand that tmp will be removed when the function finishes executing, but should I ever be concerned about removing it earlier?
From Hadley Wickham's advanced R :
In some languages, you have to explicitly delete unused objects for
their memory to be returned. R uses an alternative approach: garbage
collection (or GC for short). GC automatically releases memory when an
object is no longer used. It does this by tracking how many names
point to each object, and when there are no names pointing to an
object, it deletes that object.
In the case you're describing garbage collection will release the memory.
In case the output of your function is another function, in which case Hadley names these functions respectively the function factory and the manufactured function, the variables created in the body of the function factory will be available in the enclosing environment of the manufactured function, and memory won't be freed.
More info, still in Hadley's book, can be found in the chapter about function factories.
function_factory <- function(x){
force(x)
y <- "bar"
fun <- function(z){
sprintf("x, y, and z are all accessible and their values are '%s', '%s', and '%s'",
x, y, z)
}
fun
}
manufactured_function <- function_factory("foo")
manufactured_function("baz")
#> [1] "x, y, and z are all accessible and their values are 'foo', 'bar', and 'baz'"
Created on 2019-07-08 by the reprex package (v0.3.0)
In this case, if you want to control which variables are available in the enclosing environment, or be sure you don't clutter your memory, you might want to remove unnecessary objects, either by using rm / remove as you did, or as I tend to prefer, wrapped in an on.exit statement.
Another case in which I might use rm is if I want to access variables from a parent environment without risk of them being overriden inside of the function, but in that case it's often possible and cleaner to use eval.parent.
y <- 2
z <- 3
test0 <- function(x, var){
y <- 1
x + eval(substitute(var))
}
# opps, the value of y is the one defined in the body
test0(0, y)
#> [1] 1
test0(0, z)
#> [1] 3
# but it will work using eval.parent :
test1 <- function(x, var){
y <- 1
x + eval.parent(substitute(var))
}
test1(0, y)
#> [1] 2
test1(0, z)
#> [1] 3
# in some cases (better avoided), it can be easier/quick and dirty to do something like :
test2 <- function(x, var){
y <- 1
# whatever code using y
rm(y)
x + eval(substitute(var))
}
test2(0, y)
#> [1] 2
test2(0, z)
#> [1] 3
Created on 2019-07-08 by the reprex package (v0.3.0)
When trying to create a list of similar functions using lapply, I find that all the functions in the list are identical and equal to what the final element should be.
Consider the following:
pow <- function(x,y) x^y
pl <- lapply(1:3,function(y) function(x) pow(x,y))
pl
[[1]]
function (x)
pow(x, y)
<environment: 0x09ccd5f8>
[[2]]
function (x)
pow(x, y)
<environment: 0x09ccd6bc>
[[3]]
function (x)
pow(x, y)
<environment: 0x09ccd780>
When you try to evaluate these functions you get identical results:
pl[[1]](2)
[1] 8
pl[[2]](2)
[1] 8
pl[[3]](2)
[1] 8
What is going on here, and how can I get the result I desire (the correct functions in the list)?
R passes promises, not the values themselves. The promise is forced when it is first evaluated, not when it is passed, and by that time the index has changed if one uses the code in the question. The code can be written as follows to force the promise at the time the outer anonymous function is called and to make it clear to the reader:
pl <- lapply(1:3, function(y) { force(y); function(x) pow(x,y) } )
This is no longer true as of R 3.2.0!
The corresponding line in the change log reads:
Higher order functions such as the apply functions and Reduce() now
force arguments to the functions they apply in order to eliminate
undesirable interactions between lazy evaluation and variable capture
in closures.
And indeed:
pow <- function(x,y) x^y
pl <- lapply(1:3,function(y) function(x) pow(x,y))
pl[[1]](2)
# [1] 2
pl[[2]](2)
# [1] 4
pl[[3]](2)
# [1] 8
I searched for a reference to learn about replacement functions in R, but I haven't found any yet. I'm trying to understand the concept of the replacement functions in R. I have the code below but I don't understand it:
"cutoff<-" <- function(x, value){
x[x > value] <- Inf
x
}
and then we call cutoff with:
cutoff(x) <- 65
Could anyone explain what a replacement function is in R?
When you call
cutoff(x) <- 65
you are in effect calling
x <- "cutoff<-"(x = x, value = 65)
The name of the function has to be quoted as it is a syntactically valid but non-standard name and the parser would interpret <- as the operator not as part of the function name if it weren't quoted.
"cutoff<-"() is just like any other function (albeit with a weird name); it makes a change to its input argument on the basis of value (in this case it is setting any value in x greater than 65 to Inf (infinite)).
The magic is really being done when you call the function like this
cutoff(x) <- 65
because R is parsing that and pulling out the various bits to make the real call shown above.
More generically we have
FUN(obj) <- value
R finds function "FUN<-"() and sets up the call by passing obj and value into "FUN<-"() and arranges for the result of "FUN<-"() to be assigned back to obj, hence it calls:
obj <- "FUN<-"(obj, value)
A useful reference for this information is the R Language Definition Section 3.4.4: Subset assignment ; the discussion is a bit oblique, but seems to be the most official reference there is (replacement functions are mentioned in passing in the R FAQ (differences between R and S-PLUS), and in the R language reference (various technical issues), but I haven't found any further discussion in official documentation).
Gavin provides an excellent discussion of the interpretation of the replacement function. I wanted to provide a reference since you also asked for that: R Language Definition Section 3.4.4: Subset assignment.
As a complement to the accepted answer I would like to note that replacement functions can be defined also for non standard functions, namely operators (see ?Syntax) and control flow constructs. (see ?Control).
Note also that it is perfectly acceptable to design a generic and associated methods for replacement functions.
operators
When defining a new class it is common to define S3 methods for $<-, [[<- and [<-, some examples are data.table:::`$<-.data.table`, data.table:::`[<-.data.table`, or tibble:::`$.tbl_df`.
However for any other operator we can write a replacement function, some examples :
`!<-` <- function(x, value) !value
x <- NULL # x needs to exist before replacement functions are used!
!x <- TRUE
x
#> [1] FALSE
`==<-` <- function(e1, e2, value) replace(e1, e1 == e2, value)
x <- 1:3
x == 2 <- 200
x
#> [1] 1 200 3
`(<-` <- function(x, value) sapply(x, value, USE.NAMES = FALSE)
x <- c("foo", "bar")
(x) <- toupper
x
#> [1] "FOO" "BAR"
`%chrtr%<-` <- function(e1, e2, value) {
chartr(e2, value, e1)
}
x <- "woot"
x %chrtr% "o" <- "a"
x
#> [1] "waat"
we can even define <-<-, but the parser will prevent its usage if we call x <- y <- z, so we need to use the left to right assignment symbol
`<-<-` <- function(e1, e2, value){
paste(e2, e1, value)
}
x <- "b"
"a" -> x <- "c"
x
#> [1] "a b c"
Fun fact, <<- can have a double role
x <- 1:3
x < 2 <- NA # this fails but `<<-` was called!
#> Error in x < 2 <- NA: incorrect number of arguments to "<<-"
# ok let's define it then!
`<<-` <- function(x, y, value){
if (missing(value)) {
eval.parent(substitute(.Primitive("<<-")(x, y)))
} else {
replace(x, x < y, value)
}
}
x < 2 <- NA
x
#> [1] NA 2 3
x <<- "still works"
x
#> [1] "still works"
control flow constructs
These are in practice seldom encountered (in fact I'm responsible for the only practical use I know, in defining for<- for my package pbfor), but R is flexible enough, or crazy enough, to allow us to define them. However to actually use them, due to the way control flow constructs are parsed, we need to use the left to right assignment ->.
`repeat<-` <- function(x, value) replicate(value, x)
x <- "foo"
3 -> repeat x
x
#> [1] "foo" "foo" "foo"
function<-
function<- can be defined in principle but to the extent of my knowledge we can't do anything with it.
`function<-` <- function(x,value){NULL}
3 -> function(arg) {}
#> Error in function(arg) {: target of assignment expands to non-language object
Remember, in R everything operation is a function call (therefore also the assignment operations) and everything that exists is an object.
Replacement functions act as if they modify their arguments in place such as in
colnames(d) <- c("Input", "Output")
They have the identifier <- at the end of their name and return a modified copy of the argument object (non-primitive replacement functions) or the same object (primitive replacement functions)
At the R prompt, the following will not work:
> `second` <- function(x, value) {
+ x[2] <- value
+ x
+ }
> x <- 1:10
> x
[1] 1 2 3 4 5 6 7 8 9 10
> second(x) <- 9
Error in second(x) <- 9: couldn't find function "second<-"
As you can see, R is searching the environment not for second but for second<-.
So lets do the same thing but using such a function identifier instead:
> `second<-` <- function(x, value) {
+ x[2] <- value
+ x
+ }
Now, the assignment at the second position of the vector works:
> second(x) <- 9
> x
[1] 1 9 3 4 5 6 7 8 9 10
I also wrote a simple script to list all replacement functions in R base package, find it here.