I'm trying to do anonymous recursion in R and also playing with pipes to learn. The code below works well
sorttt <- function(list){
if (length(list) == 0) c() else c(max(list), Recall(list[list < max(list)]))
}
example %>% sorttt
But this code errors out with the error: Error in example %>% function(list) { : invalid formal argument list for "function"
example %>% function(list){if (length(list) == 0) c() else c(max(list), Recall(list[list < max(list)]))}
Does anyone know why these two might act differently? These seem to be logically the same thing.
You need to wrap anonymous functions in parentheses for them to work with pipes.
## doesn't work
1:10 %>% function(x) {mean(x)}
# Error in 1:10 %>% function(x) { :
# invalid formal argument list for "function"
## works with parens
1:10 %>% (function(x) {mean(x)})
# [1] 5.5
Same thing for your function:
1:3 %>%
(function(list){if (length(list) == 0) c() else c(max(list), Recall(list[list < max(list)]))})
# [1] 3 2 1
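This is because function is itself a function. A %>% function(x){...} is interpreted as function(A, x){...}: the pipe inserts A as the first argument of the call to function, which is why R complains about an invalid formal argument list. The parentheses make sure the whole function definition is evaluated before the pipe inserts an argument.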
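A minimal base-R sketch of the point above. It only quotes the two forms, so nothing is evaluated and magrittr does not need to be loaded; the variable names are just illustrative. The bare right-hand side is a call to function, whereas the parenthesised one is a call to `(`.

```r
# quote() only parses, so `%>%` does not have to be defined here.
bare    <- quote(1:10 %>% function(x) mean(x))
wrapped <- quote(1:10 %>% (function(x) mean(x)))

# Element 3 of each quoted call is the right-hand side of the pipe.
rhs_bare    <- bare[[3]]
rhs_wrapped <- wrapped[[3]]

# The bare RHS is headed by the symbol `function`, i.e. it is a call that
# the pipe will try to add an argument to. The wrapped RHS is headed by `(`.
print(rhs_bare[[1]])
print(rhs_wrapped[[1]])
```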
Access result later in pipe
I am trying to create functions which print the number of rows excluded in a dataset at each step in a pipe.
Something like this:
iris %>%
function_which_save_nrows_and_return_the_data() %>%
filter(exclude some rows) %>%
function_which_prints_difference_in_rows_before_after_exlusion_and_returns_data %>%
function_which_save_nrows_and_return_the_data() %>%
function_which_prints_difference_in_rows_before_after_exlusion_and_returns_data ...etc
These are the functions I have attempted:
n_before = function(x) {assign("rows", nrow(x), .GlobalEnv); return(x)}
n_excluded = function(x) {
print(rows - nrow(x))
return(x)
}
This successfully saves the object rows:
But if I add two more links, the object is NOT saved:
So how can I create and access the rows object later in the pipe?
This is due to R's lazy evaluation. It occurs even if pipes are not used. See code below. In that code the argument to n_excluded is filter(n_before(iris), Species != 'setosa'). At the point where rows is used in the print statement, that argument has not yet been referenced from within n_excluded, so it has not been evaluated, n_before has not run, and rows does not yet exist.
if (exists("rows")) rm(rows) # ensure rows does not exist
n_excluded(filter(n_before(iris), Species != 'setosa'))
## Error in h(simpleError(msg, call)) :
## error in evaluating the argument 'x' in selecting a method for function
## 'print': object 'rows' not found
To fix this
1) we can force x before the print statement.
n_excluded = function(x) {
force(x)
print(rows - nrow(x))
return(x)
}
2) Alternatively, we can use the magrittr sequential pipe, which guarantees that the legs are run in order. magrittr makes it available as a function but does not provide an operator for it, so we assign it to an operator ourselves like this.
`%s>%` <- magrittr::pipe_eager_lexical
iris %>%
n_before() %>%
filter(Species != 'setosa') %s>% # note use of %s>% on this line
n_excluded()
The magrittr developer has stated that he will add it as an operator if there is sufficient demand for it, so you might want to add such a request to magrittr issue #247 on GitHub.
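For what it's worth, the lazy-evaluation effect and the force() fix can be reproduced without dplyr or any pipe. This is only a sketch: store, remember_length, check_lazy and check_forced are made-up names, and a private environment stands in for .GlobalEnv.

```r
# A private environment used instead of .GlobalEnv (assumption for the demo).
store <- new.env()

# Plays the role of n_before(): records something as a side effect.
remember_length <- function(x) {
  store$n <- length(x)
  x
}

# Looks for store$n without ever touching x, so the argument
# (and therefore remember_length) is never evaluated.
check_lazy <- function(x) {
  exists("n", envir = store, inherits = FALSE)
}

# Same check, but force(x) evaluates the argument first.
check_forced <- function(x) {
  force(x)
  exists("n", envir = store, inherits = FALSE)
}

lazy_result   <- check_lazy(remember_length(1:5))
forced_result <- check_forced(remember_length(1:5))

print(lazy_result)    # the side effect has not happened yet
print(forced_result)  # the side effect has happened
```

The first call never runs remember_length at all, which is the same reason rows was missing in n_excluded.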
You can also use the extended capabilities of pipeR.
library(dplyr)
library(pipeR)
n_excluded = function(x) {
print(rows - nrow(x))
return(x)
}
p <- iris %>>%
(~rows=nrow(.)) %>>%
filter(Species != "setosa") %>>%
n_excluded()
This is my function. Basically, I want to include an if/else statement inside it, controlled by the number of arguments that I pass in ...:
This is what I've tried so far, and it is wrong:
soma_mtcars<-function(data,...){
if(length(...) < 2){
sum_df<- data %>% group_by() %>% summarise(total = sum(disp))
}
else(
sum_df<- data %>% group_by() %>% summarise(total = sum(disp))
)
}
Of course the problem is in length(...) < 2. How can I deal with it?
And I would like to have, for example, outputs for: soma_mtcars(mtcars,cyl) and soma_mtcars(mtcars, cyl, disp)
You can use nargs(), which gives you the total number of arguments (i.e. including your data argument):
soma_mtcars <- function (data, ...) {
if (nargs() < 3L) { …
}
… or you can pass ... to list(), and get the length of its result:
soma_mtcars <- function (data, ...) {
if (length(list(...)) < 2L) { …
}
Either of these will return the length of dot dot dot. The first one does it without evaluating dot dot dot.
len_noeval <- function(...) ...length()
len_eval <- function(...) length(list(...))
# test
len_noeval(11, print(12), 13)
## [1] 3
len_eval(11, print(12), 13)
## [1] 12
## [1] 3
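To tie the two answers together, here is a small sketch comparing nargs() with ...length() on the same call. count_args is a hypothetical helper, and ...length() assumes R 3.5.0 or later. Neither function evaluates the arguments, so bare column names such as cyl can be passed even though no such object exists.

```r
# Hypothetical helper: reports both counts for one call.
count_args <- function(data, ...) {
  c(nargs = nargs(), dots = ...length())
}

# cyl and disp are never evaluated, so they need not exist as objects.
one_dot  <- count_args(mtcars, cyl)
two_dots <- count_args(mtcars, cyl, disp)

print(one_dot)
print(two_dots)
```

nargs() is always one larger here because it also counts data, which is why the nargs() version of the test compares against 3 rather than 2.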
I am trying to catch the name of a variable passed to a function, the right way. The name of interest noi is a data frame column or a vector. Below is my minimum working example. Ideally, I would like to receive a character vector which contains only "noi"
library(dplyr)
df <- data.frame(noi = seq(1:3))
example_fun <- function( x ){
deparse(substitute(x))
}
The result depends on the way I structure my input. Now I have an idea why this happens, but how would I do it correctly to have the desired result, regardless of how I call the function.
# Base
example_fun(df$noi)
[1] "df$noi"
# Pipe
df$noi %>% example_fun()
[1] "."
# Mutate
df %>% mutate(example_fun(noi))
noi example_fun(noi)
1 1 noi
2 2 noi
3 3 noi
Thanks in advance!
Perhaps decorate that variable with a "comment" attribute in another function? Note that the variable you want to decorate has to be wrapped directly in the decoration function z; otherwise, an error is raised (by design and for robustness).
example_fun <- function(x){
attr(x, "comment")
}
z <- function(x) {
nm <- substitute(x)
nm <- as.character(
if (is.symbol(nm) && !identical(nm, quote(.))) {
nm
} else if (length(nm) > 1L && (identical(nm[[1L]], quote(`[[`)) || identical(nm[[1L]], quote(`$`)))) {
tail(nm, 1L)
} else {
stop("not a valid symbol or extract operator.", call. = match.call())
}
)
`comment<-`(x, nm)
}
Output
> example_fun(z(df$noi))
[1] "noi"
> z(df$noi) %>% (function(x) x + 1) %>% example_fun()
[1] "noi"
> df %>% mutate(example_fun(z(noi)))
noi example_fun(z(noi))
1 1 noi
2 2 noi
3 3 noi
> z(df[["noi"]]) %>% example_fun()
[1] "noi"
> with(df, z(noi)) %>% example_fun()
[1] "noi"
> z(with(df, noi)) %>% example_fun()
Error in z(with(df, noi)) : not a valid symbol or extract operator.
> df$noi %>% z()
Error in z(.) : not a valid symbol or extract operator.
... but this may not be a robust method. It is extremely difficult to achieve what you want in a robust way, especially when a pipeline is involved. I think you should read Hadley's Advanced R and learn more about how bindings and environments work.
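If the attribute is not needed, the name-extraction step can be looked at in isolation. The following is a reduced sketch, not the z() function above: head_is and name_of are hypothetical names, and it shares the limitation that a piped value arrives as "." because substitute() only sees the expression written at the call site.

```r
# TRUE if expr is a call whose function is the symbol named by op.
head_is <- function(expr, op) {
  is.call(expr) && identical(expr[[1]], as.name(op))
}

# Return the last component of `a$b` or `a[["b"]]`, or the symbol itself.
name_of <- function(x) {
  expr <- substitute(x)  # the unevaluated argument expression
  if (head_is(expr, "$") || head_is(expr, "[[")) {
    expr <- expr[[3]]    # the part after `$` or inside `[[ ]]`
  }
  if (is.symbol(expr) || is.character(expr)) {
    as.character(expr)
  } else {
    deparse(expr)        # fall back to the full expression text
  }
}

df  <- data.frame(noi = 1:3)
noi <- df$noi

print(name_of(df$noi))
print(name_of(df[["noi"]]))
print(name_of(noi))
```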
I'm searching R scripts and not sure why as.character() drops the `[`. Is there some way to get the code back correctly as a vector of strings?
Notice the `[`(. < 5) turns into (. < 5)[]
Note: I'm not looking for better ways to make this call as this isn't my code.
code <-
"1:10 %>% `[`(. < 5) %>% mean()
a <- 1:3"
# fine
parse(text = code)
#> expression(1:10 %>% `[`(. < 5) %>% mean(), a <- 1:3)
# not fine
as.character(parse(text = code))
#> [1] "1:10 %>% (. < 5)[] %>% mean()"
#> [2] "a <- 1:3"
Created on 2020-07-01 by the reprex package (v0.3.0)
The issue is that I need to substitute out parts of the code so that my function works. The function finds reactive commands and changes them to functions for the user to access in their environment. It's part of my shinyobjects package (shameless plug).
a <- reactive({
input$n * 100
})
and converts it to
a <- function() {
input$n *100
}
The methods I have been using have been fine until this edge case with the `[`.
The solution should be able to return each expression as something I can manipulate. This is a more complex example and should return a string vector of length 5. I'm also happy to take this discussion offline as I'm open to a better method overall for this functionality. You can find my contact here
code <-
'library(tidyverse)
library(shiny)
1:10 %>% `[`(. < 5) %>% mean()
df <- reactive({
mpg %>%
filter(cty > input$cty)
})
renderPlot(
ggplot(df(), aes(class)) +
geom_bar()
)'
(I've edited this a bit to explain the behaviour some more):
The problem is that magrittr's pipe operator uses non-standard evaluation inconsistently.
The expression
`[`(. < 5)
is legal R code that is equivalent to what was deparsed:
(. < 5)[]
However, it's a weird enough expression that magrittr gets confused by it, and doesn't transform
1:10 %>% (. < 5)[]
the same way it would transform
1:10 %>% `[`(. < 5)
I wouldn't call this a bug in magrittr (it's documented behaviour, if you look closely enough), but it's certainly an inconvenience caused by the inconsistent handling of dots. Normally if you put a dot in a term in a magrittr chain, that's the only place that the previous result is put in. For example, this doesn't print "foobar" twice:
"foobar" %>% cat("arg1", ., "arg3")
However, if the dot is in a function call in the chain, it is also inserted at the start:
"foobar" %>% cat("arg1", identity(.), "arg3")
does print it twice.
magrittr is evaluating 1:10 %>% `[`(. < 5) as
`[`(1:10, 1:10 < 5)
i.e.
(1:10)[1:10 < 5]
Really for consistency it would require you to type
1:10 %>% `[`(., . < 5)
but it is trying to be helpful, which is what is so unhelpful for what you want to do.
I suppose you could write a function to detect these cases yourself, and insert the extra dot explicitly.
Edited to add: Here's such a function:
explicitDots <- function(expr) {
nestedDot <- function(lang) {
if (is.call(lang)) {
for (i in seq_along(lang)) {
if (nestedDot(lang[[i]]))
return(TRUE)
}
return(FALSE)
} else
identical(lang, quote(.))
}
fixLang <- function(lang) {
if (is.call(lang)) {
fn <- lang[[1]]
if (as.character(fn) == "%>%") {
lang[[2]] <- fixLang(lang[[2]])
lang[[3]] <- fixLang(lang[[3]])
} else {
hasTopLevelDot <- FALSE
hasNestedDot <- FALSE
for (i in seq_along(lang)[-1]) {
if (identical(lang[[i]], quote(.))) {
hasTopLevelDot <- TRUE
break
}
hasNestedDot <- hasNestedDot || nestedDot(lang[[i]])
}
if (hasNestedDot && !hasTopLevelDot) {
# Insert a dot in position 2
lang <- lang[c(1,seq_along(lang))]
lang[[2]] <- quote(.)
}
}
}
lang
}
expr <- removeSource(expr)
for (i in seq_along(expr)) {
expr[[i]] <- fixLang(expr[[i]])
}
expr
}
And here's an example using it:
code <-
"1:10 %>% `[`(. < 5) %>% mean()
a <- 1:3"
p <- parse(text = code)
explicitDots(p)
which produced this output:
expression(1:10 %>% .[. < 5] %>% mean(), a <- 1:3)
If we need to get a vector of strings, one option is strsplit on the newline character followed by zero or more spaces
out <- strsplit(code, "\n\\s*")[[1]]
out
#[1] "1:10 %>% `[`(. < 5) %>% mean()"
#[2] "a <- 1:3"
sapply(out, function(x) eval(parse(text = x)))
#$`1:10 %>% `[`(. < 5) %>% mean()`
#[1] 2.5
#$`a <- 1:3`
#[1] 1 2 3
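Splitting on newlines breaks up expressions that span several lines, such as the reactive({ ... }) example in the question. One possible alternative, sketched here under the assumption that the code is only parsed and never evaluated, is to ask parse() to keep the source references and read the original text back from them. The input string and the name src_text are illustrative.

```r
code <- "1:10 %>% `[`(. < 5) %>% mean()
df <- reactive({
  input$n * 100
})
a <- 1:3"

# keep.source = TRUE attaches a srcref to every top-level expression.
parsed <- parse(text = code, keep.source = TRUE)

# as.character() on a srcref returns the original source lines, so the
# text is not re-generated by the deparser.
src_text <- vapply(
  attr(parsed, "srcref"),
  function(ref) paste(as.character(ref), collapse = "\n"),
  character(1)
)

print(src_text)
```

This yields one string per top-level expression, with multi-line expressions kept together and the `[` call left as it was typed.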
First of all, you should be aware of the "never use parse" rule -- there are always better ways.
Next, what you get back is an expression, not an object which tells you about an expression.
Similarly, you're explicitly barred from trying to pull this trick with a closure:
bar <- as.character(function(x) x+3)
Error in as.character(function(x) x + 3) :
cannot coerce type 'closure' to vector of type 'character'
Now,
foo <- parse(text = code)
as.character(deparse(foo))
[1] "structure(expression(1:10 %>% (. < 5)[] %>% mean(), a <- 1:3), srcfile = <environment>, wholeSrcref = structure(c(1L, "
[2] "0L, 3L, 0L, 0L, 0L, 1L, 3L), srcfile = <environment>, class = \"srcref\"))"
Should give you a feel for what you're dealing with.
But it's not clear why you want/need to take a string, parse it, and then try to get it back again. Either follow akrun's approach or take another route to turn poorly structured text into executable commands.
I'm looking for a simple way to move on to the next iteration in a for loop in R if the operation inside the for loop errors.
I've recreated a simple case below:
for(i in c(1, 3)) {
test <- try(i+1, silent=TRUE)
calc <- if(class(test) %in% 'try-error') {next} else {i+1}
print(calc)
}
This correctly gives me the following calc values.
[1] 2
[1] 4
However once I change the vector in i to include a non-numeric value:
for(i in c(1, "a", 3)) {
test <- try(i+1, silent=TRUE)
calc <- if(class(test) %in% 'try-error') {next} else {i+1}
print(calc)
}
This for loop doesn't work. I was hoping for the same calc values as above with the vector excluding the non-numeric value in i.
I tried using tryCatch as the following:
for(i in c(1, "a", 3)) {
calc <- tryCatch({i+1}, error = function(e) {next})
print(calc)
}
However, I get the following error:
Error in value[[3L]](cond) : no loop for break/next, jumping to top level
Could someone please help me understand how I could achieve this using a for loop in R?
As Dason noted, an atomic vector really is not the best way of storing mixed data types. Lists are for that. Consider the following:
l = list(1, "sunflower", 3)
for(i in seq_along(l)) {
this.e = l[[i]]
test <- try(this.e + 1, silent=TRUE)
calc <- if(class(test) %in% 'try-error') {next} else {this.e + 1}
print(calc)
}
[1] 2
[1] 4
In other words, your former loop "worked". Because c(1, "a", 3) is coerced to a character vector, i + 1 failed on every element, so the loop always went on to the next iteration.
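As for the tryCatch attempt in the question: next cannot be called from inside the error handler, because the handler is a separate function with no enclosing loop. A minimal sketch of one way around this is to have the handler return a sentinel (NULL here, which is an assumption that works only if NULL is never a legitimate result) and call next from the loop body. The names items and results are illustrative.

```r
items   <- list(1, "a", 3)
results <- numeric(0)

for (item in items) {
  # The handler only returns a sentinel; it does not try to control the loop.
  calc <- tryCatch(item + 1, error = function(e) NULL)

  # The skip happens here, inside the loop itself.
  if (is.null(calc)) next

  results <- c(results, calc)
}

print(results)
```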
Here is a solution using the "purrr" package that might be helpful.
It goes through your list or vector and returns the elements that will cause errors
library(purrr) # provides safely(), map(), transpose(), flatten() and re-exports %>%
#Wrap the function you want to use in the adverb "safely"
safetest <- safely(function(x){ifelse(is.na(as.numeric(x)),
x+1,
as.numeric(x)+1)})
myvect<-c(1,"crumbs",3) #change to list if you want a list
#Use the safe version to find where the errors occur
check <- myvect %>%
map(safetest) %>%
transpose %>% .$result %>%
map_lgl(is_null)
myvect[check]
#This returns the results that did not throw an error:
#first remove NULL elements, then flatten to double.
#The two flatten expressions can be replaced by a single unlist
myvect %>%
map(safetest) %>%
transpose %>% .$result %>%
flatten()%>%flatten_dbl()
see https://blog.rstudio.org/2016/01/06/purrr-0-2-0/ for the original example.