Is there an elegant way to simplify this call?
a <- list(1, 2, 3)
b <- list(4, 5)
conditional = TRUE
if (conditional) {
x <- a
} else {
x <- b
}
x
# [1, 2, 3]
I've tried x <- ifelse(TRUE, a, b), but it treats the condition as a vector to iterate over, so with a length-one condition it returns only the first element (here, 1).
dplyr's if_else, on the other hand, requires the lists to be of equal length. Even if they were, it would also iterate over the condition and return only the single value 1.
So, is there some clean way of solving this or is the simple if{}else{} the way to go?
Here's a simple one-liner using switch.
x <- switch(TRUE + 1, b, a)
x
[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3
x <- switch(FALSE + 1, b, a)
x
[[1]]
[1] 4
[[2]]
[1] 5
This relies on how switch behaves when EXPR is numeric, as described in the documentation:
switch works in two distinct ways depending whether the first argument
evaluates to a character string or a number.
If the value of EXPR is not a character string it is coerced to
integer. Note that this also happens for factors, with a warning, as
typically the character level is meant. If the integer is between 1
and nargs()-1 then the corresponding element of ... is evaluated and
the result returned: thus if the first argument is 3 then the fourth
argument is evaluated and returned.
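In R, if is itself an expression that returns the value of the branch it evaluates, so the assignment can also be written on one line without switch. A minimal sketch, reusing a, b and conditional from the question:

```r
a <- list(1, 2, 3)
b <- list(4, 5)
conditional <- TRUE

# `if` returns the value of the chosen branch, so it can sit on the
# right-hand side of an assignment; the list is returned whole.
x <- if (conditional) a else b
length(x)  # 3

# Same selection with the numeric form of switch: TRUE + 1 is 2,
# which picks the second alternative (a).
y <- switch(conditional + 1, b, a)
identical(x, y)  # TRUE
```

Unlike ifelse, neither form iterates over the condition, so the lists do not need to have equal lengths.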
I have a two-argument function. Its first input is a triple of pairs of numbers, written as a character string of the form "(a, b)(c, d)(e, f)"; its second input is a single pair, also written as a character string, of the form "(a, b)". It returns a logical stating whether the pair (the second argument) is one of the three pairs in the triple (the first argument). I actually wrote two versions:
version1 <- function(x, y){#x is a triple of pairs, y is a pair
pairsfromthistriple <- paste(c("", "(", "("), strsplit(x, split = ")(", fixed = T)[[1]], c(")", ")", ""), sep = "")
y %in% pairsfromthistriple
}
version2 <- function(x, y){#x is triple of pairs, y is pair
y == substr(x, 1, 6) | y == substr(x, 7, 12) | y == substr(x, 13, 18)
}
I want to apply this function to every triple of pairs from a vector of triples and every pair from a vector of pairs, using outer. Here I'll use the following very short vectors:
triples <- c("(1, 2)(3, 4)(5, 6)", "(1, 2)(3, 5)(4, 6)")
names(triples) <- triples
pairs <- c("(5, 6)", "(3, 5)")
names(pairs) <- pairs
So here we go:
test1 <- outer(X = triples, Y = pairs, FUN = version1)
test2 <- outer(X = triples, Y = pairs, FUN = version2)
test2 evaluates to exactly what you would expect, but test1 gives nonsensical output:
> test1
(5, 6) (3, 5)
(1, 2)(3, 4)(5, 6) TRUE FALSE
(1, 2)(3, 5)(4, 6) TRUE FALSE
> test2
(5, 6) (3, 5)
(1, 2)(3, 4)(5, 6) TRUE FALSE
(1, 2)(3, 5)(4, 6) FALSE TRUE
The natural conclusion is that there is an error in version1, but it is not as simple as that. 'Manually' computing the terms in the matrix using version1 gives:
> version1(triples[1], pairs[1])
[1] TRUE
> version1(triples[1], pairs[2])
[1] FALSE
> version1(triples[2], pairs[1])
[1] FALSE
> version1(triples[2], pairs[2])
[1] TRUE
exactly as it should! So at least part of the fault is with the function outer. In fact what happens (in this small example it is not so clear, but this is very visible in larger examples) is that outer correctly computes the first row of its output matrix, but then copies this first row over and over to make up the subsequent rows. Obviously this is not what I want. If I only wanted to compute version1(x, y) for all y in some vector but just one single x, I would have used sapply rather than outer.
What is going on here?
Note this detail from the documentation for ?outer:
X and Y must be suitable arguments for FUN. Each will be extended by rep to length the products of the lengths of X and Y before FUN is called.
FUN is called with these two extended vectors as arguments (plus any arguments in ...). It must be a vectorized function (or the name of one) expecting at least two arguments and returning a value with the same length as the first (and the second).
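To see what "extended by rep" means in practice, the sketch below records the arguments that outer actually hands to FUN. The variable seen and the anonymous function are illustrative only and are not part of the question.

```r
seen <- NULL
res <- outer(1:2, 1:3, function(x, y) {
  seen <<- list(x = x, y = y)  # capture the extended vectors
  10 * x + y                   # vectorized: one result per (x, y) pair
})
seen$x    # 1 2 1 2 1 2
seen$y    # 1 1 2 2 3 3
dim(res)  # 2 3
```

FUN is called once with two length-6 vectors, not six times with scalars, which is why a function that only inspects the first element of x cannot work here.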
Your version1 function is not vectorized the way version2 is. You can see this by calling both on the original triples and pairs vectors: each pair occurs in its corresponding triple, so both results should be TRUE TRUE.
version1(triples, pairs)
#> [1] TRUE FALSE
version2(triples, pairs)
#> (5, 6) (3, 5)
#> TRUE TRUE
Your version1 function seems designed to handle one triple at a time, for example through apply(), because you retrieve a list from strsplit() but then take only its first element. If you want to keep the splitting approach, you would have to use the apply family of functions. Without them, splitting turns the triples (the x vector) into something longer than y, and you can't do an element-wise comparison.
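However, I would just use something very simple. stringr::str_detect is already vectorized over string and pattern, so you can use it directly. Note that str_detect treats the pattern as a regular expression by default, so the parentheses act as a group rather than as literal characters; that still gives the right answer for these inputs.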
library(stringr)
outer(X = triples, Y = pairs, FUN = str_detect)
#> (5, 6) (3, 5)
#> (1, 2)(3, 4)(5, 6) TRUE FALSE
#> (1, 2)(3, 5)(4, 6) FALSE TRUE
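If you would rather keep version1 unchanged, one base R option is to wrap it in Vectorize(), which calls it once per (x, y) pair via mapply. This is a sketch rather than the fastest approach, since it gives up true vectorization:

```r
version1 <- function(x, y) {
  # split one triple into its three pairs and restore the parentheses
  parts <- strsplit(x, split = ")(", fixed = TRUE)[[1]]
  y %in% paste(c("", "(", "("), parts, c(")", ")", ""), sep = "")
}
triples <- c("(1, 2)(3, 4)(5, 6)", "(1, 2)(3, 5)(4, 6)")
names(triples) <- triples
pairs <- c("(5, 6)", "(3, 5)")
names(pairs) <- pairs

# Vectorize() makes the extended vectors from outer be processed
# element by element, so each triple is split separately.
test1 <- outer(X = triples, Y = pairs, FUN = Vectorize(version1))
test1  # same matrix as test2 in the question
```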
Those two functions should give similar results, shouldn't they?
f1 <- function(x, y) {
if (missing(y)) {
out <- x
} else {
out <- c(x, y)
}
return(out)
}
f2 <- function(x, y) ifelse(missing(y), x, c(x, y))
Results:
> f1(1, 2)
[1] 1 2
> f2(1, 2)
[1] 1
This is not related to missing, but rather to a misuse of ifelse. From help("ifelse"):
ifelse returns a value with the same shape as test which is filled with elements selected from either yes or no depending on whether the element of test is TRUE or FALSE.
The "shape" of your test is a length-one vector. Thus, a length-one vector is returned. ifelse is not just different syntax for if and else.
The same result occurs outside of the function:
> ifelse(FALSE, 1, c(1, 2))
[1] 1
The function ifelse is designed for use with vectorised arguments. For each element of test, it returns the corresponding element of yes if that element is TRUE and the corresponding element of no otherwise. Here test has length one, so only the first element of no is returned and its trailing elements are ignored. That first element happens to equal the yes value, which is the confusing part. It is clearer what is going on with different arguments:
> ifelse(FALSE, 1, c(2, 3))
[1] 2
> ifelse(c(FALSE, FALSE), 1, c(2,3))
[1] 2 3
It is important to remember that everything (even length 1) is a vector in R, and that some functions deal with each element individually ('vectorised' functions) and some with the vector as a whole.
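A short sketch of the difference, using made-up vectors yes and no, followed by f2 from the question rewritten with if/else (which is an expression in R and returns the selected branch whole):

```r
yes <- c(10, 20, 30)
no  <- c(-1, -2, -3)

# ifelse: the result has the shape of the test, element by element
ifelse(c(TRUE, FALSE, TRUE), yes, no)  # 10 -2 30
ifelse(TRUE, yes, no)                  # 10

# if/else: a single condition picks a whole object
if (TRUE) yes else no                  # 10 20 30

# f2 from the question, rewritten with if/else
f2 <- function(x, y) if (missing(y)) x else c(x, y)
f2(1, 2)  # 1 2
f2(1)     # 1
```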
I want to create a function that transforms its object.
I have tried to transform the variable as you would normally, but within the function.
This works:
vec <- c(1, 2, 3, 3)
vec <- (-1*vec)+1+max(vec, na.rm = T)
[1] 3 2 1 1
This doesn't work:
vec <- c(1, 2, 3, 3)
func <- function(x){
x <- (-1*x)+1+max(x, na.rm = T)
}
func(vec)
vec
[1] 1 2 3 3
R is functional so normally one returns the output. If you want to change
the value of the input variable to take on the output value then it is normally done by the caller, not within the function. Using func from the question it would normally be done like this:
vec <- func(vec)
Furthermore, while you can overwrite variables it is, in general, not a good
idea. It makes debugging difficult. Is the current value of vec the
input or output and if it is the output what is the value of the input? We
don't know since we have overwritten it.
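A minimal sketch of that style, keeping input and output in separate variables. The name vec_rev is hypothetical, chosen only for this example:

```r
func <- function(x) {
  # the value of the last expression is returned; nothing outside is modified
  (-1 * x) + 1 + max(x, na.rm = TRUE)
}

vec <- c(1, 2, 3, 3)
vec_rev <- func(vec)
vec      # 1 2 3 3 (input untouched)
vec_rev  # 3 2 1 1
```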
func_overwrite
That said if you really want to do this despite the comments above then:
# works but not recommended
func_overwrite <- function(x) eval.parent(substitute({
x <- (-1*x)+1+max(x, na.rm = TRUE)
}))
# test
v <- c(1, 2, 3, 3)
func_overwrite(v)
v
## [1] 3 2 1 1
Replacement functions
Despite R's functional nature, it does provide one facility for overwriting: replacement functions. The function in the question is not really a good candidate for this, so let us change the example to a function incr that increments the input variable by a given value. That is, it does this:
x <- x + b
We can write this in R as:
`incr<-` <- function(x, value) x + value
# test
xx <- 3
incr(xx) <- 10
xx
## [1] 13
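For illustration, incr(xx) <- 10 is roughly shorthand for calling the function named incr<- and assigning its result back to xx, with R doing that reassignment in the caller. A sketch of the equivalent explicit call, where yy is a hypothetical second variable used for comparison:

```r
`incr<-` <- function(x, value) x + value

xx <- 3
incr(xx) <- 10                  # replacement syntax

yy <- 3
yy <- `incr<-`(yy, value = 10)  # explicit equivalent

identical(xx, yy)  # TRUE
```

The last argument of a replacement function must be named value for the replacement syntax to work.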
T vs. TRUE
One other comment. Do not use T for true. Always write it out. TRUE is a reserved word in R, but T is an ordinary variable name that can be reassigned, so it can lead to hard-to-find errors, such as when someone uses T for temperature.
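A small sketch of how this can go wrong. The value 0 stands in for some unrelated quantity stored in T:

```r
T <- 0                            # e.g. someone stores a temperature in T
mean(c(1, NA, 3), na.rm = T)      # NA: na.rm is now 0, which counts as FALSE
mean(c(1, NA, 3), na.rm = TRUE)   # 2
rm(T)                             # remove the mask; T is TRUE again

# TRUE itself cannot be reassigned: the assignment fails with an error
res <- try(TRUE <- 0, silent = TRUE)
inherits(res, "try-error")  # TRUE
```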
I have to create a function ans(x) which returns the value 2*abs(x) if x is
negative, and the value x otherwise. What command could I use?
Thanks
ans <- function(x){
ifelse(x < 0, 2*abs(x), x)
}
will do.
> ans(2)
[1] 2
> ans(-2)
[1] 4
Explanation:
We can use the built-in base R function ifelse(). The logic is pretty simple:
ifelse(condition, output if condition is TRUE, output if condition is FALSE)
Therefore, ifelse(x < 0, 2*abs(x), x) will do the following:
evaluate whether value x is negative (<0)
if TRUE, return 2*abs(x)
if FALSE, return x
The advantage of ifelse() over a plain if() is vectorization: if() can only handle a single condition value, whereas ifelse() evaluates every element of the vector given as input.
Comparison:
ans_if <- function(x){
if(x < 0){2*abs(x)}else{x}
}
This is the same function, using a traditional if() structure. Giving a single value as input will result in the same output for both functions:
> ans(-2)
[1] 4
> ans_if(-2)
[1] 4
But if you want to input multiple values, let's say
test <- c(-1, -2, 3, -4)
the ifelse() variant will evaluate every element of the vector and generate the correct output as a vector of the same length:
> ans(test)
[1] 2 4 3 8
whereas the if() variant will throw a warning (in R versions before 4.2.0; since R 4.2.0 a condition of length greater than one is an error)
> ans_if(test)
[1] 2 4 6 8
Warning message:
In if (x < 0) { :
the condition has length > 1 and only the first element will be used
and return the wrong output, as only the first value was used for evaluation (-1) and the operation over the whole vector was based on this evaluation.
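If ifelse() is not wanted, the same vectorized result can be had with logical indexing. A sketch; the name ans_index is made up for this example:

```r
ans_index <- function(x) {
  out <- x
  neg <- !is.na(x) & x < 0     # positions of negative values (NA left alone)
  out[neg] <- 2 * abs(x[neg])  # only those positions are replaced
  out
}

test <- c(-1, -2, 3, -4)
ans_index(test)  # 2 4 3 8
```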
How does lapply extract sub-elements from a list? More specifically, how does lapply extract sub-elements from a list of lists versus a list of vectors? Even more specifically, suppose I have the following:
my_list_of_lists <- list(list(a = 1, b = 2), list(a = 2, c = 3), list(b = 4, c = 5))
my_list_of_lists[[1]][["a"]] # just checking
# [1] 1
# that's what I expected
and apply the following:
lapply(my_list_of_lists, function(x) x[["a"]])
# [[1]]
# [1] 1
#
# [[2]]
# [1] 2
#
# [[3]]
# NULL
So lapply extracts the a element from each of the 3 sublists, returning each in its own list, contained in the length=3 list. At this point, my mental model is the following: lapply applies FUN to each element of my_list, returning FUN(my_list[[i]]) for i in 1:3. Great! So I expect my mental model should work for lists of vectors as well. For example,
my_list_of_vecs <- list(c(a = 1, b = 2), c(a = 2, c = 3), c(b = 4, c = 5))
my_list_of_vecs[[1]][["a"]] # Just checking
# [1] 1
# that's what I expected
and apply the following:
lapply(my_list_of_vecs, function(x) x[["a"]])
# Error in x[["a"]] : subscript out of bounds
# Wait...What!?
What's going on here!? Shouldn't this just work? I found a section in help(lapply) which might be relevant:
For historical reasons, the calls created by lapply are unevaluated,
and code has been written (e.g., bquote) that relies on this. This
means that the recorded call is always of the form FUN(X[[i]], ...),
with i replaced by the current (integer or double) index. This is not
normally a problem, but it can be if FUN uses sys.call or match.call
or if it is a primitive function that makes use of the call. This
means that it is often safer to call primitive functions with a
wrapper, so that e.g. lapply(ll, function(x) is.numeric(x)) is
required to ensure that method dispatch for is.numeric occurs
correctly.
I really don't know how to make sense of this.
I think it's related to the fact that you can use both [[ and [ to extract single elements from a vector, but ONLY [ to extract ranges of elements. For example,
my_list_of_vecs[[1]][1:2]
# a b
# 1 2
my_list_of_vecs[[1]][[1:2]]
# Error in my_list_of_vecs[[1]][[1:2]] :
# attempt to select more than one element in vectorIndex
So under the hood, lapply must be using function(x) x[["a"]] over a range. Is that right?
Debugging doesn't help me here since these functions rely on .Internal functions.
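A quick check outside lapply suggests a simpler explanation worth ruling out first: [[ with a name that is not present returns NULL on a list but is an error on an atomic vector, and the third vector, c(b = 4, c = 5), has no element named a. The sketch below assumes that is the cause; get_a is a hypothetical helper.

```r
my_list_of_vecs <- list(c(a = 1, b = 2), c(a = 2, c = 3), c(b = 4, c = 5))

# On a list, [[ with an absent name returns NULL ...
list(b = 4, c = 5)[["a"]]

# ... but on an atomic vector it is an error, with or without lapply.
msg <- tryCatch(c(b = 4, c = 5)[["a"]], error = function(e) conditionMessage(e))
msg

# Single-bracket indexing returns NA for an absent name instead of failing.
get_a <- function(x) unname(x["a"])
res <- lapply(my_list_of_vecs, get_a)
res
```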