Create a list containing a variable number of lists - r

I need to create a list from rows of a dataframe in the following format:
df <- data.frame(y1 = c("a", "d"), y2 = c("b", "e"), y3 = c("c", "f"))
df$y1 <- as.character(df$y1)
df$y2 <- as.character(df$y2)
df$y3 <- as.character(df$y3)
x <- list(
list(y1 = df$y1[1],
y2 = df$y2[1],
y3 = df$y3[1]),
list(y1 = df$y1[2],
y2 = df$y2[2],
y3 = df$y3[2])
)
> x
[[1]]
[[1]]$`y1`
[1] "a"
[[1]]$y2
[1] "b"
[[1]]$y3
[1] "c"
[[2]]
[[2]]$`y1`
[1] "d"
[[2]]$y2
[1] "e"
[[2]]$y3
[1] "f"
This is an example when there are two rows in the dataframe. How can I achieve this when the number of rows in the dataframe is variable? So for every row in the dataframe, there should be a list.

We may also use apply by going over the rows and applying as.list to each:
apply(df, 1, as.list)
[[1]]
[[1]]$y1
[1] "a"
[[1]]$y2
[1] "b"
[[1]]$y3
[1] "c"
[[2]]
[[2]]$y1
[1] "d"
[[2]]$y2
[1] "e"
[[2]]$y3
[1] "f"
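One caveat with this approach, sketched below under the assumption of a data frame with mixed column types (the `df2` object is made up for illustration and is not from the question): apply first converts the data frame to a matrix, so every value ends up with the same type. With all-character columns, as in the question, this makes no difference.

```r
# Hypothetical data frame with one integer and one character column
df2 <- data.frame(id = 1:2, name = c("a", "d"), stringsAsFactors = FALSE)

# apply() coerces df2 to a character matrix before visiting the rows
via_apply <- apply(df2, 1, as.list)

# Splitting by row keeps each column's original type
via_split <- lapply(split(df2, seq_len(nrow(df2))), as.list)

class(via_apply[[1]]$id)  # "character"
class(via_split[[1]]$id)  # "integer"
```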

We first split the dataframe into one-row pieces, then convert each piece into a list of its elements using as.list
lapply(split(df, 1:nrow(df)), as.list)
#$`1`
#$`1`$y1
#[1] "a"
#$`1`$y2
#[1] "b"
#$`1`$y3
#[1] "c"
#$`2`
#$`2`$y1
#[1] "d"
#$`2`$y2
#[1] "e"
#$`2`$y3
#[1] "f"
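If the `1`, `2` names are not wanted, or the dataframe may have zero rows, a small wrapper is one way to go. This is only a sketch: the name `rows_to_lists` is made up here, and it indexes rows with seq_len instead of 1:nrow(df) so that an empty dataframe yields an empty list.

```r
# Hypothetical helper: one unnamed list per row of a data frame
rows_to_lists <- function(df) {
  lapply(seq_len(nrow(df)), function(i) as.list(df[i, , drop = FALSE]))
}

df <- data.frame(y1 = c("a", "d"), y2 = c("b", "e"), y3 = c("c", "f"),
                 stringsAsFactors = FALSE)

# The list written out by hand in the question
x <- list(
  list(y1 = "a", y2 = "b", y3 = "c"),
  list(y1 = "d", y2 = "e", y3 = "f")
)

identical(rows_to_lists(df), x)
length(rows_to_lists(df[0, ]))  # 0: no rows, no lists
```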

We can use transpose from purrr
library(purrr)
transpose(df)
#[[1]]
#[[1]]$y1
#[1] "a"
#[[1]]$y2
#[1] "b"
#[[1]]$y3
#[1] "c"
#[[2]]
#[[2]]$y1
#[1] "d"
#[[2]]$y2
#[1] "e"
#[[2]]$y3
#[1] "f"

Related

R use mapply on nested list

Using base R, I'd like to use the mapply function on a nested list. For example, in the code below, I'm trying to remove the letter "a" from each element of a nested list. I'd like to replace the last two lines with just a single line of code.
mylist <- list(
list(c("a", "b", "c"), c("d", "e", "f")),
list(c("a", "v", "w"), c("x", "y"), c("c", "b", "a"))
)
mylist
not_a <- lapply(mylist, lapply, `!=`, "a")
not_a
mylist[[1]] <- mapply(`[`, mylist[[1]], not_a[[1]], SIMPLIFY = FALSE)
mylist[[2]] <- mapply(`[`, mylist[[2]], not_a[[2]], SIMPLIFY = FALSE)
One option could be:
rapply(mylist, how = "replace", function(x) x[x != "a"])
[[1]]
[[1]][[1]]
[1] "b" "c"
[[1]][[2]]
[1] "d" "e" "f"
[[2]]
[[2]][[1]]
[1] "v" "w"
[[2]][[2]]
[1] "x" "y"
[[2]][[3]]
[1] "c" "b"
Or using map2
library(purrr)
map2(mylist, not_a, ~ map2(.x, .y, `[`))
Or using map_depth (if the OP is interested only in the final outcome)
map_depth(mylist, 2, ~ .x[.x != 'a'])
#[[1]]
#[[1]][[1]]
#[1] "b" "c"
#[[1]][[2]]
#[1] "d" "e" "f"
#[[2]]
#[[2]][[1]]
#[1] "v" "w"
#[[2]][[2]]
#[1] "x" "y"
#[[2]][[3]]
#[1] "c" "b"
Or more compactly
map_depth(mylist, 2, setdiff, 'a')
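One difference worth keeping in mind with the setdiff variant, shown with a made-up vector `v`: setdiff treats its input as a set, so it also drops repeated values, whereas logical subsetting keeps them. For the example list in the question both give the same result because no vector contains repeats.

```r
v <- c("b", "a", "b")  # hypothetical vector with a repeated value

v[v != "a"]      # "b" "b": repeats are kept
setdiff(v, "a")  # "b":     repeats are collapsed
```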
A double loop Map/mapply will do what the question asks for.
Map(function(i) mapply(`[`, mylist[[i]], not_a[[i]], SIMPLIFY = FALSE), seq_along(mylist))
Simpler:
Map(function(x, y) Map(`[`, x, y), mylist, not_a)
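As a quick sanity check, the nested Map call and the rapply one-liner can be compared on the question's data. This is a self-contained sketch; the names `via_map` and `via_rapply` are introduced here only for the comparison.

```r
mylist <- list(
  list(c("a", "b", "c"), c("d", "e", "f")),
  list(c("a", "v", "w"), c("x", "y"), c("c", "b", "a"))
)

# Logical masks with the same nesting as mylist
not_a <- lapply(mylist, lapply, `!=`, "a")

# Outer Map pairs each sub-list with its masks, inner Map subsets each vector
via_map <- Map(function(x, y) Map(`[`, x, y), mylist, not_a)

# rapply visits every leaf vector and keeps the list structure
via_rapply <- rapply(mylist, function(x) x[x != "a"], how = "replace")

identical(via_map, via_rapply)
```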

Generate all combinations (and their sum) of a vector of characters in R

Suppose that I have a vector of length n and I need to generate all possible combinations and their sums. For example:
If n=3, we have:
myVec <- c("a", "b", "c")
Output =
"a"
"b"
"c"
"a+b"
"a+c"
"b+c"
"a+b+c"
Note that a+b and b+a count as the same combination, so only one of them needs to be kept.
Another example if n=4,
myVec <- c("a", "b", "c", "d")
Output:
"a"
"b"
"c"
"d"
"a+b"
"a+c"
"a+d"
"b+c"
"b+d"
"c+d"
"a+b+c"
"a+b+d"
"a+c+d"
"b+c+d"
"a+b+c+d"
We can use sapply over the combination sizes, calling combn for each size with paste as the function to apply.
sapply(seq_along(myVec), function(n) combn(myVec, n, paste, collapse = "+"))
#[[1]]
#[1] "a" "b" "c"
#[[2]]
#[1] "a+b" "a+c" "b+c"
#[[3]]
#[1] "a+b+c"
myVec <- c("a", "b", "c", "d")
sapply(seq_along(myVec), function(n) combn(myVec, n, paste, collapse = "+"))
#[[1]]
#[1] "a" "b" "c" "d"
#[[2]]
#[1] "a+b" "a+c" "a+d" "b+c" "b+d" "c+d"
#[[3]]
#[1] "a+b+c" "a+b+d" "a+c+d" "b+c+d"
#[[4]]
#[1] "a+b+c+d"
We can unlist if we need output as single vector.
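A minimal sketch of that last step, using lapply in place of sapply so the intermediate result is always a list (the name `all_sums` is introduced here for illustration):

```r
myVec <- c("a", "b", "c")

# One character vector per combination size, then flattened into one vector
all_sums <- unlist(lapply(seq_along(myVec), function(n) {
  combn(myVec, n, paste, collapse = "+")
}))

all_sums
# [1] "a"     "b"     "c"     "a+b"   "a+c"   "b+c"   "a+b+c"
```

A vector of length n gives 2^n - 1 non-empty combinations, which is a handy check on the result's length.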

Concat named list of vectors

I'm trying to figure out how to transform a named list whose values are vectors into a named list where each value is the concatenation of the elements of the corresponding vector.
I'm not sure I've explained this clearly, so here is an example.
x <- list(A = c("e", "f", "g"), B = c("a", "b", "c"), C = c("m", "l", "w"))
#$A
#[1] "e" "f" "g"
#$B
#[1] "a" "b" "c"
#$C
#[1] "m" "l" "w"
named_list_concat <- function(data){ ... }
named_list_concat(x)
#$A
#[1] "efg"
#$B
#[1] "abc"
#$C
#[1] "mlw"
One base possibility:
lapply(x, function(x) paste(x, collapse = ""))
$A
[1] "efg"
$B
[1] "abc"
$C
[1] "mlw"
Or the same thing in a shortened form:
lapply(x, paste, collapse = "")
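Wrapped up as a function, this might look like the sketch below. The name `concat_named_list` is chosen here for illustration. The vapply line is an optional variant for when a named character vector is more convenient than a list.

```r
# Hypothetical wrapper around the lapply answer above
concat_named_list <- function(data) {
  lapply(data, paste, collapse = "")
}

x <- list(A = c("e", "f", "g"), B = c("a", "b", "c"), C = c("m", "l", "w"))

res_list <- concat_named_list(x)

# Variant: vapply returns a named character vector and checks that each
# result is a single string
res_vec <- vapply(x, paste, character(1), collapse = "")

res_list$A  # "efg"
res_vec
```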

R - How do I check if an element is in a list of vectors?

Ok, my question might be a bit weirder than what the title suggests.
I have this list:
x <- list(
c("a", "d"),
c("a", "c"),
c("d", "e"),
c("e", "f"),
c("b", "c"),
c("f", "c"), # row 6
c("c", "e"),
c("f", "b"),
c("b", "a")
)
And I need to copy this stuff in another list called T. The only condition is that both letters of the pair must not be in T already. If one of them is already in T and the other isn't it's fine.
Basically in this example I would take the first 5 positions and copy them in T one after another because either one or both letters are new to T.
Then I would skip the 6th position because the letter "f" was already in the 4th position of T and the letter "c" is already in the 2nd and 5th positions of T.
Then I would skip the remaining 3 positions for the same reason (the letters "c", "e", "f", "b", "a" are already in T at this point)
I tried doing this
for(i in 1:length(T)){
if (!( *first letter* %in% T && *second letter* %in% T)) {
T[[i]] <- c(*first letter*, *second letter*)
}
}
But it's like the "if" isn't even there, and I'm pretty sure I'm using %in% in the wrong way.
Any suggestions? I hope what I wrote makes sense, I'm new to R and to this site in general.
Thanks for your time
Effectively, for each element of the list, you want to lose it if both of its elements exist in earlier elements. A logical index is helpful here.
# Make a logical vector the length of x.
lose <- logical(length(x))
Now you can run a loop over the length of lose and compare each element of x against all previous elements of x. Using seq_len saves us the headache of having to guard against the special case of i = 1: seq_len(0) returns a zero-length integer vector, whereas 1:0 would count down and return c(1, 0).
for (i in seq_along(lose)){
lose[i] <- all(x[[i]] %in% unique(unlist(x[seq_len(i - 1)])))
}
Now let's use the logical vector to subset x to T
T <- x[!lose]
T
#> [[1]]
#> [1] "a" "d"
#>
#> [[2]]
#> [1] "a" "c"
#>
#> [[3]]
#> [1] "d" "e"
#>
#> [[4]]
#> [1] "e" "f"
#>
#> [[5]]
#> [1] "b" "c"
# Created on 2018-07-19 by the [reprex package](http://reprex.tidyverse.org) (v0.2.0).
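The seq_len point can be checked directly. The snippet below is a small sketch with a shortened, made-up list `pairs`; it shows what each indexing style selects on the first iteration.

```r
pairs <- list(c("a", "d"), c("a", "c"))  # hypothetical two-element list
i <- 1

seq_len(i - 1)  # integer(0): selects nothing
1:(i - 1)       # 1 0: counts down instead of being empty

length(pairs[seq_len(i - 1)])  # 0: no earlier elements, as intended
length(pairs[1:(i - 1)])       # 1: index 0 is dropped, so element 1 is selected

# With the empty selection, the first pair is never marked for removal
all(pairs[[i]] %in% unique(unlist(pairs[seq_len(i - 1)])))  # FALSE
```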
You can put the set of all previous elements in a list cum.sets, then use Map to check if all elements of the current vector are in the lagged cumulative set.
cum.sets <- lapply(seq_along(x), function(y) unlist(x[1:y]))
keep <- unlist(
Map(function(x, y) !all(x %in% y)
, x
, c(NA, cum.sets[-length(cum.sets)])))
x[keep]
# [[1]]
# [1] "a" "d"
#
# [[2]]
# [1] "a" "c"
#
# [[3]]
# [1] "d" "e"
#
# [[4]]
# [1] "e" "f"
#
# [[5]]
# [1] "b" "c"
tidyverse version (same output)
library(tidyverse)
cum.sets <- imap(x, ~ unlist(x[1:.y]))
keep <- map2_lgl(x, lag(cum.sets), ~!all(.x %in% .y))
x[keep]
You can use Reduce. In this case, if at least one of the new values is not in the list already, the pair is appended to the list; otherwise it is dropped. The initial value is the first element of the list:
Reduce(function(i, y) c(i, if(!all(y %in% unlist(i))) list(y)), x[-1],init = x[1])
[[1]]
[1] "a" "d"
[[2]]
[1] "a" "c"
[[3]]
[1] "d" "e"
[[4]]
[1] "e" "f"
[[5]]
[1] "b" "c"
The most straightforward option is that you could store unique entries in another vector as you're looping through your input data.
Here's a solution that does not consider the positions (1 or 2) of the letters in your output list or the order of your input list.
dat <- list(c('a','d'),c('a','c'),c('d','e'),c('e','f'),c('b','c'),
c('f','c'),c('c','e'),c('f','b'),c('b','a'))
Dat <- list()
idx <- list()
for(i in dat){
if(!all(i %in% idx)){
Dat <- append(Dat, list(i))
## append to idx if not previously observed
if(! i[1] %in% idx) idx <- append(idx, i[1])
if(! i[2] %in% idx) idx <- append(idx, i[2])
}
}
print(Dat)
#> [[1]]
#> [1] "a" "d"
#>
#> [[2]]
#> [1] "a" "c"
#>
#> [[3]]
#> [1] "d" "e"
#>
#> [[4]]
#> [1] "e" "f"
#>
#> [[5]]
#> [1] "b" "c"
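The same idea can be sketched with a plain character vector and union, which avoids the two separate membership checks. The names `pairs`, `kept` and `seen` are introduced here for illustration. As in the loop above, only letters from pairs that were actually kept count as seen.

```r
pairs <- list(c("a", "d"), c("a", "c"), c("d", "e"), c("e", "f"), c("b", "c"),
              c("f", "c"), c("c", "e"), c("f", "b"), c("b", "a"))

kept <- list()          # pairs copied so far
seen <- character(0)    # letters that occur in the kept pairs

for (p in pairs) {
  if (!all(p %in% seen)) {
    kept[[length(kept) + 1]] <- p
    seen <- union(seen, p)   # union() adds only the letters not yet recorded
  }
}

length(kept)  # 5
```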
On another note, I'd advise against using T as your vector name as it's used as TRUE in R.
We can unlist, check duplicated values with duplicated, reformat as a matrix and filter out pairs of TRUE values:
x[colSums(matrix(duplicated(unlist(x)), nrow = 2)) != 2]
# [[1]]
# [1] "a" "d"
#
# [[2]]
# [1] "a" "c"
#
# [[3]]
# [1] "d" "e"
#
# [[4]]
# [1] "e" "f"
#
# [[5]]
# [1] "b" "c"
#
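The one-liner can be unpacked step by step. This is a sketch with intermediate names (`seen_before`, `flags`, `keep`) that are not part of the original answer, and it assumes every vector in x has exactly two elements, since the matrix is built with two rows.

```r
x <- list(c("a", "d"), c("a", "c"), c("d", "e"), c("e", "f"), c("b", "c"),
          c("f", "c"), c("c", "e"), c("f", "b"), c("b", "a"))

# TRUE wherever a letter has already appeared earlier in the flattened list
seen_before <- duplicated(unlist(x))

# Two rows, one column per pair: each column holds the flags for one pair
flags <- matrix(seen_before, nrow = 2)

# A pair is dropped only when both of its letters were seen before
keep <- colSums(flags) != 2

which(keep)  # 1 2 3 4 5
```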
And I recommend you don't use T as a variable name: it means TRUE by default (though using it as such is discouraged), so overwriting it could lead to unpleasant debugging.
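A short illustration of why, as a sketch: unlike TRUE, which is a reserved word, T is an ordinary variable and can be silently overwritten.

```r
isTRUE(T)         # TRUE while T is untouched

T <- c("a", "d")  # allowed, because T is just a variable
isTRUE(T)         # FALSE: any code relying on T meaning TRUE now misbehaves

rm(T)             # removing the user-defined T makes base R's T visible again
isTRUE(T)         # TRUE
```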

R remove an object from a list of vectors

I have a list of vectors and I would like to remove a specific object. Any ideas how to achieve that?
Let's say I would like to remove the object F. How can I do that?
blocks <- list(
c("A", "B"),
c("C"),
c("D","E", "F")
)
We could also use setdiff with Map
Map(setdiff, blocks, 'F')
#[[1]]
#[1] "A" "B"
#[[2]]
#[1] "C"
#[[3]]
#[1] "D" "E"
or with lapply
lapply(blocks, setdiff, 'F')
#[[1]]
#[1] "A" "B"
#[[2]]
#[1] "C"
#[[3]]
#[1] "D" "E"
If you wanted to remove the third element of the third element of your list, you could try:
blocks[[3]] <- blocks[[3]][-3]
blocks
# [[1]]
# [1] "A" "B"
#
# [[2]]
# [1] "C"
#
# [[3]]
# [1] "D" "E"
If you wanted to remove all elements equal to "F", you could use lapply and a user-defined function to process each vector in the list, removing all "F" elements.
lapply(blocks, function(x) x[x != "F"])
# [[1]]
# [1] "A" "B"
#
# [[2]]
# [1] "C"
#
# [[3]]
# [1] "D" "E"
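The same pattern extends to removing several values at once with %in%. The sketch below goes one step beyond the question: the values to drop (`drop_values`) are chosen arbitrarily, and the Filter step for discarding vectors that end up empty is optional.

```r
blocks <- list(
  c("A", "B"),
  c("C"),
  c("D", "E", "F")
)

drop_values <- c("C", "F")  # hypothetical set of values to remove

# Keep only the elements that are not in drop_values
cleaned <- lapply(blocks, function(x) x[!x %in% drop_values])

# Optionally discard vectors that became empty (length 0 is treated as FALSE)
non_empty <- Filter(length, cleaned)

length(cleaned)    # 3: the second vector is now character(0)
length(non_empty)  # 2
```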

Resources