If I have a symmetric binary operator that I want to apply over the pairs of elements from a list, is there an easy way I can do this in R? I tried:
A <- list(1,2,3)
mapply(function(x,y) x+y, A,A)
but this only gives x[n]+y[n] for n=1..N, whereas I want x[n]+y[m] for every pair m < n (n=1..N), returned as a list. outer(..) computes all m=1..N, n=1..N, which involves redundant computation that I want to avoid.
Note that I don't want a solution to this simple example only; I need a general solution that works for non-numeric input as well. What I'm actually trying to do is something like:
mapply(function(set_1, set_2) setequal(intersect(set_1, set_2), set_3), list_of_sets, list_of_sets)
In both cases the operation (addition, intersection) is symmetric. In the first example, I expect list(3,4,5) from list(1+2,1+3,2+3). For the second case my input list_of_sets is:
> list_of_sets
[[1]]
numeric(0)
[[2]]
[1] 1
[[3]]
[1] 2
[[4]]
[1] 1 2
[[5]]
[1] 3
[[6]]
[1] 1 3
[[7]]
[1] 2 3
[[8]]
[1] 1 2 3
and set_3 being c(1,2) as a simple example.
You may use outer -
values <- c(1, 2, 3)
outer(values, values, `+`)
# [,1] [,2] [,3]
#[1,] 2 3 4
#[2,] 3 4 5
#[3,] 4 5 6
outer also works for non-numeric input. If the function that you want to apply is not vectorised you can use Vectorize. Since OP did not provide an example I have created one of my own.
list_of_sets_1 <- list(c('a', 'b', 'c'), c('a'))
list_of_sets_2 <- list(c('a', 'c'), c('a', 'b'))
fun <- function(x, y) intersect(x, y)
result <- outer(list_of_sets_1, list_of_sets_2, Vectorize(fun))
result
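If computing the full outer() matrix is acceptable, each unordered pair can still be extracted once afterwards. A minimal sketch, assuming a symmetric function so that only the entries above the diagonal are needed; note that this discards the redundant results rather than avoiding their computation:

```r
values <- c(1, 2, 3)
m <- outer(values, values, `+`)

# Keep only the entries above the diagonal, i.e. each unordered pair once.
# upper.tri() selects in column-major order: (1,2), (1,3), (2,3).
pair_sums <- m[upper.tri(m)]
pair_sums
```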
We need combn to do pairwise computation without redundancy
combn(A, 2, FUN = function(x) x[[1]] + x[[2]], simplify = FALSE)
-output
[[1]]
[1] 3
[[2]]
[1] 4
[[3]]
[1] 5
This will also work with non-numeric elements
list_of_sets <- list(c('a', 'b', 'c'), "a", c("a", "c"))
combn(list_of_sets, 2, FUN = function(x) Reduce(intersect, x), simplify = FALSE)
-output
[[1]]
[1] "a"
[[2]]
[1] "a" "c"
[[3]]
[1] "a"
We may also do
combn(list_of_sets, 2, FUN = function(x)
setequal(intersect(x[[1]], x[[2]]), set_3), simplify = FALSE)
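The combn calls above can be wrapped so that the symmetric function receives its two arguments directly and each result is labelled with the pair it came from. This is only a sketch: pairwise_apply is a hypothetical helper name, and it assumes the input list has at least two elements.

```r
# Hypothetical helper: apply a symmetric function f to every unordered pair
# of elements of the list xs, without computing any pair twice.
pairwise_apply <- function(xs, f) {
  idx <- combn(seq_along(xs), 2)   # 2 x choose(n, 2) matrix of index pairs
  res <- lapply(seq_len(ncol(idx)), function(k) {
    f(xs[[idx[1, k]]], xs[[idx[2, k]]])
  })
  # Label each result with the indices of the pair that produced it.
  names(res) <- paste(idx[1, ], idx[2, ], sep = "-")
  res
}

A <- list(1, 2, 3)
sums <- pairwise_apply(A, function(x, y) x + y)
sums
```

Working with index pairs rather than with the elements themselves keeps track of which two inputs produced each result, which helps when the outputs are logical flags such as those from setequal.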
Related
Say I have the following function
power<-function (number){
number^2
}
I also have a vector -z
z<- c(1:3, "a", 7:9)
I would like to apply the power function over the elements of the vector. If everything is a number, the function works well using this code, which creates a list as I want:
q<-lapply(z, FUN=power)
How do I make the function skip, if it does not find a valid argument? In this case skip "a". Let's say removing the odd argument is not an option for my task. I might also have cases when the function does not find the argument at all (e.g. empty space, missing tag on a web page). Would be nice if the solution could work for these instances as well. Thanks.
Consider creating a list instead of a vector. A list can hold elements of different types, whereas an atomic vector has a single type; if even one element is a character, the whole vector is coerced to character.
z <- list(1:3, "a", 7:9)
lapply(Filter(is.numeric, z), FUN = power)
Or with map_if
library(purrr)
map_if(z, .p = is.numeric, .f = power)
-output
[[1]]
[1] 1 4 9
[[2]]
[1] "a"
[[3]]
[1] 49 64 81
The following will try to coerce the elements of the supplied vector to numeric. Values that cannot be coerced come back as NA (with a coercion warning). Note that your input vector z may not be what you intended, i.e. it resolves to a character vector c("1", "2", "3", "a", ...) and not c(1, 2, 3, "a", 7, 8, 9).
power <- function(number) {
  # coerce first; non-numeric strings become NA
  as.numeric(number)^2
}
z <- c(1:3, "a", 7:9)
lapply(z, power)
[[1]]
[1] 1
[[2]]
[1] 4
[[3]]
[1] 9
[[4]]
[1] NA
[[5]]
[1] 49
[[6]]
[1] 64
[[7]]
[1] 81
We can also write a custom function that wraps power inside lapply. Basically an equivalent of map_if:
z <- list(1:3, "a", 7:9)
lapply(z, function(x) {
  # is.numeric() already returns a single TRUE/FALSE for the whole element
  if (is.numeric(x)) {
    power(x)
  } else {
    x
  }
})
[[1]]
[1] 1 4 9
[[2]]
[1] "a"
[[3]]
[1] 49 64 81
Try the code below
power <- Vectorize(function(number) {
ifelse(is.numeric(number), number^2, number)
})
z <- list(1:3, "a", 7:9)
lapply(z, power)
which gives
> lapply(z, power)
[[1]]
[1] 1 4 9
[[2]]
a
"a"
[[3]]
[1] 49 64 81
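For inputs that fail in ways a type check does not anticipate, another option is to catch the error and return a placeholder. A minimal sketch, assuming NA is an acceptable placeholder for the task; safe_power is a hypothetical name, not part of any answer above:

```r
power <- function(number) {
  number^2
}

# Hypothetical wrapper: return NA instead of stopping when power() errors.
safe_power <- function(x) {
  tryCatch(power(x), error = function(e) NA)
}

z <- list(1:3, "a", 7:9)
res <- lapply(z, safe_power)
res
```

Unlike Filter, this keeps the positions of the original list intact, so the result can still be matched element by element against the input.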
I have a list with known expected values. Some of these values may be missing in the list I have. How can I update the list to return NULL for the elements that are missing to give me what I want?
expected <- c('a', 'b', 'c', 'd')
have <- list(a=1, b=3, d=5)
want <- list(a=1, b=3, c=NULL, d=5)
I can do it like this but this seems a little hacky in that I have to rename the NAs.
x <- have[expected]
names(x) <- expected
x
## $a
## [1] 1
##
## $b
## [1] 3
##
## $c
## NULL
##
## $d
## [1] 5
I want to keep the names for easy indexing later.
One option without creating an object is setNames
setNames(have[expected], expected)
#$a
#[1] 1
#$b
#[1] 3
#$c
#NULL
#$d
#[1] 5
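An alternative that avoids the renaming step is to look up each expected name individually. This sketch relies on `[[` returning NULL when a list has no element of the given name; filled is just an illustrative variable name:

```r
expected <- c("a", "b", "c", "d")
have <- list(a = 1, b = 3, d = 5)

# sapply() over a character vector names the result after its input;
# have[[nm]] is NULL when the list has no element called nm.
filled <- sapply(expected, function(nm) have[[nm]], simplify = FALSE)
filled
```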
Consider the following list:
temp <- list(id = 1, grade = "a", alive = TRUE)
We can use sapply to replicate the list:
> ts <- sapply(1:5, function(x) temp)
> ts
[,1] [,2] [,3] [,4] [,5]
id 1 1 1 1 1
grade "a" "a" "a" "a" "a"
alive TRUE TRUE TRUE TRUE TRUE
If I inspect the result using typeof, I obtain list. However, if I inspect it with sapply, I get this:
> sapply(ts, function(x) print(x))
[1] 1
[1] "a"
[1] TRUE
[1] 1
[1] "a"
[1] TRUE
[1] 1
[1] "a"
[1] TRUE
[1] 1
[1] "a"
[1] TRUE
[1] 1
[1] "a"
[1] TRUE
That is, when I inspect the same result with sapply, this vector of lists is treated as a matrix. Is there any workaround, or does R disallow a vector of lists in general? If the latter is the case, why do I get "list" from typeof?
PS: For my specific question, I understand the obvious solution of using lapply to switch to a list of lists. I am just curious and confused by R’s behavior.
The return value of sapply(ts, function(x) print(x)) is still a list: a list of 15 elements, because the 3 members of temp were simplified into a 3 x 5 structure (3 items times 5 iterations). If you want lapply-like output, try:
> ts <- sapply(1:5, function(x) temp, simplify = FALSE)
> ts
#[[1]]
#[[1]][[1]]
#[1] 1
#
#[[1]][[2]]
#[1] "a"
#
#[[1]][[3]]
#[1] TRUE
#.......
#.......
Or you can even try:
>ts <- sapply(1:5, function(x) as.data.frame(temp))
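The object printed as a matrix in the question is a list that carries a dim attribute, which is why typeof reports "list" while sapply visits 15 individual elements. A small sketch to check this, assuming temp carries the names id, grade and alive, as the printed row labels suggest:

```r
temp <- list(id = 1, grade = "a", alive = TRUE)
ts <- sapply(1:5, function(x) temp)

typeof(ts)   # the storage type is still a list
dim(ts)      # but it has matrix dimensions: 3 rows, 5 columns
length(ts)   # 15 underlying list elements
ts[[2, 2]]   # matrix-style indexing reaches a single element
```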
In Python we can do this:
numbers = [1, 2, 3]
characters = ['foo', 'bar', 'baz']
for item in zip(numbers, characters):
print(item[0], item[1])
(1, 'foo')
(2, 'bar')
(3, 'baz')
We can also unpack the tuple rather than using the index.
for num, char in zip(numbers, characters):
print(num, char)
(1, 'foo')
(2, 'bar')
(3, 'baz')
How can we do the same using base R?
To do something like this in an R-native way, you'd use the idea of a data frame. A data frame has multiple variables which can be of different types, and each row is an observation of each variable.
d <- data.frame(numbers = c(1, 2, 3),
characters = c('foo', 'bar', 'baz'))
d
## numbers characters
## 1 1 foo
## 2 2 bar
## 3 3 baz
You then access each row using matrix notation, where leaving an index blank includes everything.
d[1,]
## numbers characters
## 1 1 foo
You can then loop over the rows of the data frame to do whatever you want; presumably you want to do something more interesting than printing.
for(i in seq_len(nrow(d))) {
print(d[i,])
}
## numbers characters
## 1 1 foo
## numbers characters
## 2 2 bar
## numbers characters
## 3 3 baz
For another option, how about mapply, which is the closest analog to zip I can think of in R. Here I'm using the c function to make a new vector, but you could use any function you'd like:
numbers<- c(1, 2, 3)
characters<- c('foo', 'bar', 'baz')
mapply(c,numbers, characters, SIMPLIFY = FALSE)
[[1]]
[1] "1" "foo"
[[2]]
[1] "2" "bar"
[[3]]
[1] "3" "baz"
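Note that c() coerces each pair to a common type, which is why the numbers come back as strings in the output above. A sketch of the same call with list in place of c, which keeps the original type of each element:

```r
numbers <- c(1, 2, 3)
characters <- c("foo", "bar", "baz")

# list() does not coerce, so each pair keeps a numeric and a character value.
pairs <- mapply(list, numbers, characters, SIMPLIFY = FALSE)

for (p in pairs) {
  cat(p[[1]], p[[2]], "\n")
}
```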
Which way is of most use depends on what you want to do with your output, but as the other answers mention, a dataframe is the most natural approach in R (and pandas dataframe probably in python).
To index a vector x in R, use x[1], which returns the first element. R indexing starts at 1, in contrast to Python, which starts at 0.
For this problem it would be:
x <- seq(1, 10)
j <- seq(11, 20)
for (i in seq_along(x)) {
  print(c(x[i], j[i]))
}
Many functions in R are vectorized and don't require loops:
numbers = c(1, 2, 3)
characters = c('foo', 'bar', 'baz')
myList <- list(numbers, characters)
myDF <- data.frame(numbers,characters, stringsAsFactors = F)
print(myList)
print(myDF)
This is the conceptual equivalent (the output shown assumes numbers <- 1:5 and characters <- c('a', 'b', 'c', 'd', 'e')):
for (item in Map(list,numbers,characters)){
print(item[c(1,2)])
}
# [[1]]
# [1] 1
#
# [[2]]
# [1] "a"
#
# [[1]]
# [1] 2
#
# [[2]]
# [1] "b"
#
# [[1]]
# [1] 3
#
# [[2]]
# [1] "c"
#
# [[1]]
# [1] 4
#
# [[2]]
# [1] "d"
#
# [[1]]
# [1] 5
#
# [[2]]
# [1] "e"
Though most of the time you would actually do all your work inside Map and do something like this:
Map(function(nu,ch){print(data.frame(nu,ch))},numbers,characters)
This is the closest I could get to a clone:
zip <- function(...){ Map(list,...)}
print2 <- function(...){do.call(cat,c(list(...),"\n"))}
for (item in zip(numbers,characters)){
print2(item[[1]],item[[2]])
}
# 1 a
# 2 b
# 3 c
# 4 d
# 5 e
To be able to call items by their names (indices still work):
zip <- function(...){
names <- sapply(substitute(list(...))[-1],deparse)
Map(function(...){setNames(list(...),names)}, ...)
}
for (item in zip(numbers,characters)){
print2(item[["numbers"]],item[["characters"]])
}
The tidyverse solution would be to use purrr::map2 function. Ex:
numbers <- c(1, 2, 3)
characters <- c('foo', 'bar', 'baz')
map2(numbers, characters, ~paste0(.x, ',', .y))
#[[1]]
#[1] "1,foo"
#[[2]]
#[1] "2,bar"
#[[3]]
#[1] "3,baz"
See the purrr documentation for map2 for details.
Other scalable alternatives: store the vectors in a list and iterate over the indices.
vect1 <- c(1, 2, 3)
vect2 <- c('foo', 'bar', 'baz')
vect3 <- c('a', 'b', 'c')
idx_list <- list(vect1, vect2, vect3)
idx_vect <- seq_along(idx_list[[1]])
for(i in idx_vect){
  x <- idx_list[[1]][i]
  j <- idx_list[[2]][i]
  k <- idx_list[[3]][i]
  print(c(i, x, j, k))
}
I can make this list by hand:
list( list(n=1) , list(n=2), list(n=3) )
But how do I automate this, for instance if I want n to go up to 10? I tried as.list(1:10), which firstly is a different type of data structure, and secondly I couldn't work out how to specify n.
I'm hoping the answer can be expanded to multiple element lists, e.g. all combinations of 1:3 and c('A','B'):
list( list(n=1,z='A') , list(n=2,z='A'), list(n=3,z='A'),
list(n=1,z='B') , list(n=2,z='B'), list(n=3,z='B') )
Background: I'll be using it along the lines of: lapply( outer_list, function(params) do.call(FUN,params) )
UPDATE:
It was difficult to choose which answer to give the tick to. I went with the expand.grid approach as it can scale to more than two parameters more easily; the use of mapply as shown in the comment makes the two examples above look reasonably compact and readable:
outer_list=with( expand.grid(n=1:10,stringsAsFactors=F),
mapply(list, n=n, SIMPLIFY=F)
)
outer_list=with( expand.grid(n=1:3,z=c('A','B'), stringsAsFactors=F),
mapply(list, n=n, z=z, SIMPLIFY=F)
)
They violate the DRY principle, by repeating the parameter names in the mapply() call, which bothers me a little. So, when it bothers me enough I will use the alply call as shown in Sebastian's answer.
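One way to avoid repeating the parameter names without plyr is to hand the columns of the grid to Map programmatically. A sketch using base R only; grid is an illustrative variable name:

```r
grid <- expand.grid(n = 1:3, z = c("A", "B"),
                    KEEP.OUT.ATTRS = FALSE, stringsAsFactors = FALSE)

# Each column of the grid becomes a named argument to list(), so the
# parameter names are written only once, in expand.grid().
outer_list <- do.call(Map, c(list(f = list), grid))
outer_list[[4]]
```

Adding a further parameter then only requires another argument to expand.grid.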
You don't need to expand using expand.grid.
L <- mapply(function(x, y) list("n"=x,"z"=y),
rep(1:10, each=10), LETTERS[1:10],
SIMPLIFY=FALSE)
EDIT (see comment below)
L <- mapply(function(x, y) list("n"=x,"z"=y),
rep(1:10, each=length(LETTERS[1:10])), LETTERS[1:10],
SIMPLIFY=FALSE)
vals <- expand.grid(n=1:3, z=c("A", "B"),
KEEP.OUT.ATTRS=FALSE, stringsAsFactors=FALSE)
library(plyr)
alply(vals, 1, as.list)
$`1`
$`1`$n
[1] 1
$`1`$z
[1] "A"
$`2`
$`2`$n
[1] 2
$`2`$z
[1] "A"
$`3`
$`3`$n
[1] 3
$`3`$z
[1] "A"
$`4`
$`4`$n
[1] 1
$`4`$z
[1] "B"
$`5`
$`5`$n
[1] 2
$`5`$z
[1] "B"
$`6`
$`6`$n
[1] 3
$`6`$z
[1] "B"
attr(,"split_type")
[1] "array"
attr(,"split_labels")
n z
1 1 A
2 2 A
3 3 A
4 1 B
5 2 B
6 3 B