unlist to produce a vector of same length - r

I have a list like this:
lst <- list(a = c("y"), b = c("A", "B", "C"), c = c("x1", "x2"))
> lst
$a
[1] "y"
$b
[1] "A" "B" "C"
$c
[1] "x1" "x2"
If I unlist it, I get:
> unlist(lst)
a b1 b2 b3 c1 c2
"y" "A" "B" "C" "x1" "x2"
How can I get a vector like:
a b c
"y" "A, B, C" "x1, x2"
Edit:
A similar question, Convert a list of lists to a character vector, was answered previously. The answer proposed there by #42_, sapply(l, paste0, collapse = ""), could be used with a small modification: sapply(l, paste0, collapse = ", "). Ronak Shah's answer to my question, sapply(lst, toString), is a little more intuitive.

We can use toString to collapse all the elements of each list component into a single comma-separated string.
sapply(lst, toString)
# a b c
# "y" "A, B, C" "x1, x2"
This is the same as using paste with the collapse argument set to ", ":
sapply(lst, paste, collapse = ", ")
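As a quick check of that equivalence, here is a minimal sketch. It relies on toString() joining with a comma followed by a space. vapply() is used instead of sapply() only to guarantee a character vector; that choice is an assumption of this example, not part of the answer above.

```r
lst <- list(a = c("y"), b = c("A", "B", "C"), c = c("x1", "x2"))

# toString() collapses each component with ", " as the separator
via_toString <- vapply(lst, toString, character(1))

# The same result spelled out with paste()
via_paste <- vapply(lst, paste, character(1), collapse = ", ")

identical(via_toString, via_paste)
# [1] TRUE
```

Both results keep the names a, b and c, so the vector has the same length as the list.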

You can also do
unlist(Map(function(x) paste0(x,collapse = ","),lst))
Or
unlist(lapply(lst,function(x) paste0(x,collapse = ",")))
Or use purrr package
purrr::map_chr(lst,paste0,collapse = ",")

We can use map_chr from purrr together with str_c from stringr:
library(purrr)
library(stringr)
map_chr(lst, str_c, collapse=",")


Names of nested list containing dots (e.g. "c.2")

How can I get the names of the leaves of a nested list (containing a data frame)
p <- list(a=1,b=list(b1=2,b2=3),c=list(c1=list(c11='a',c12='x'),c.2=data.frame("t"=1)))
into a vector format:
[[1]]
[1] "a"
[[2]]
[1] "b" "b1"
[[3]]
[1] "b" "b2"
[[4]]
[1] "c" "c1" "c11"
[[5]]
[1] "c" "c1" "c12"
[[6]]
[1] "c" "c.2"
The problem is that my list contains names with a dot (e.g. "c.2"). With unlist, one gets "c.c.2", and neither I nor strsplit can tell whether a dot is a delimiter added by unlist or part of a name. That is the difference from this question.
It should not descend into data.frames. My approach so far is adapted from here, but it struggles with the dots created by unlist:
listNames = function(l, maxDepth = 2) {
  n = 0
  listNames_rec = function(l, n) {
    if(!is.list(l) | is.data.frame(l) | n>=maxDepth) TRUE
    else {
      n = n + 1
      # print(n)
      lapply(l, listNames_rec, n)
    }
  }
  n = names(unlist(listNames_rec(l, n)))
  return(n)
}
listNames(p, maxDepth = 3)
[1] "a" "b.b1" "b.b2" "c.c1.c11" "c.c1.c12" "c.c.2"
Like this?
subnames <- function(L, s) {
  if (!is.list(L) || is.data.frame(L)) return(L)
  names(L) <- gsub(".", s, names(L), fixed = TRUE)
  lapply(L, subnames, s)
}
res <- listNames(subnames(p, ":"), maxDepth = 3)
gsub(":", ".",
  gsub(".", "$", res, fixed = TRUE),
  fixed = TRUE
)
#[1] "a" "b$b1" "b$b2" "c$c1$c11" "c$c1$c12" "c$c.2"
Not a full answer but I imagine rrapply package could help you here?
One option could be to extract all names:
library(rrapply)
library(dplyr)
rrapply(p, how = "melt") %>%
select(-value)
# L1 L2 L3
# 1 a <NA> <NA>
# 2 b b1 <NA>
# 3 b b2 <NA>
# 4 c c1 c11
# 5 c c1 c12
# 6 c c.2 t
The problem here is that the data.frame's column names (here "t") are included too, so you could extract the data.frame names separately:
#extract data frame name
rrapply(p, classes = "data.frame", how = "melt") %>%
select(-value)
# L1 L2
# 1 c c.2
Then you could play around with these two data sets, perhaps removing duplicates while keeping the data.frame names:
rrapply(p, how = "melt") %>%
bind_rows(rrapply(p, classes = "data.frame", how = "melt"))
#then filter etc...
A way might be:
listNames = function(l, n, N) {
  if(!is.list(l) | is.data.frame(l) | n<1) list(rev(N))
  else unlist(Map(listNames, l, n=n-1, N=lapply(names(l), c, N)), FALSE, FALSE)
}
listNames(p, 3, NULL)
#[[1]]
#[1] "a"
#
#[[2]]
#[1] "b" "b1"
#
#[[3]]
#[1] "b" "b2"
#
#[[4]]
#[1] "c" "c1" "c11"
#
#[[5]]
#[1] "c" "c1" "c12"
#
#[[6]]
#[1] "c" "c.2"
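The same idea can be written out more verbosely. The sketch below walks the list recursively and carries the path of names along, so no separator ever has to be split again. The helper name leaf_paths and its arguments are hypothetical, chosen for this illustration, and it assumes every list element is named.

```r
p <- list(a = 1, b = list(b1 = 2, b2 = 3),
          c = list(c1 = list(c11 = "a", c12 = "x"), c.2 = data.frame(t = 1)))

# Hypothetical helper: returns one character vector of names per leaf.
# Data frames and anything below max_depth levels are treated as leaves.
leaf_paths <- function(l, max_depth, path = character(0)) {
  if (!is.list(l) || is.data.frame(l) || max_depth < 1) {
    return(list(path))
  }
  out <- list()
  for (nm in names(l)) {
    # Extend the path by the current name and recurse into the child
    out <- c(out, leaf_paths(l[[nm]], max_depth - 1, c(path, nm)))
  }
  out
}

paths <- leaf_paths(p, 3)

# Join with "$" so that a dot inside a name stays unambiguous
vapply(paths, paste, character(1), collapse = "$")
# [1] "a"        "b$b1"     "b$b2"     "c$c1$c11" "c$c1$c12" "c$c.2"
```

Because the path is kept as a vector until the very end, a name such as "c.2" never gets mixed up with a delimiter.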

R use mapply on nested list

Using base R, I'd like to use the mapply function on a nested list. For example, in the code below, I'm trying to remove the letter "a" from each element of a nested list. I'd like to replace the last two lines with just a single line of code.
mylist <- list(
list(c("a", "b", "c"), c("d", "e", "f")),
list(c("a", "v", "w"), c("x", "y"), c("c", "b", "a"))
)
mylist
not_a <- lapply(mylist, lapply, `!=`, "a")
not_a
mylist[[1]] <- mapply(`[`, mylist[[1]], not_a[[1]], SIMPLIFY = FALSE)
mylist[[2]] <- mapply(`[`, mylist[[2]], not_a[[2]], SIMPLIFY = FALSE)
One option could be:
rapply(mylist, how = "replace", function(x) x[x != "a"])
[[1]]
[[1]][[1]]
[1] "b" "c"
[[1]][[2]]
[1] "d" "e" "f"
[[2]]
[[2]][[1]]
[1] "v" "w"
[[2]][[2]]
[1] "x" "y"
[[2]][[3]]
[1] "c" "b"
Or using map2
library(purrr)
map2(mylist, not_a, ~ map2(.x, .y, `[`))
Or using map_depth (if the OP is interested only in the final outcome)
map_depth(mylist, 2, ~ .x[.x != 'a'])
#[[1]]
#[[1]][[1]]
#[1] "b" "c"
#[[1]][[2]]
#[1] "d" "e" "f"
#[[2]]
#[[2]][[1]]
#[1] "v" "w"
#[[2]][[2]]
#[1] "x" "y"
#[[2]][[3]]
#[1] "c" "b"
Or more compactly
map_depth(mylist, 2, setdiff, 'a')
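One caveat worth checking before relying on setdiff: it is a set operation, so repeated values are collapsed as well. A small base R sketch of the difference, using a made-up vector that is not part of the question:

```r
x <- c("b", "a", "b", "c")

# Logical subsetting removes "a" and keeps the duplicated "b"
kept_dups <- x[x != "a"]
kept_dups
# [1] "b" "b" "c"

# setdiff() removes "a" but also de-duplicates the result
no_dups <- setdiff(x, "a")
no_dups
# [1] "b" "c"
```

For the example data in the question there are no repeated values within a vector, so both variants give the same result there.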
A double loop Map/mapply will do what the question asks for.
Map(function(i) mapply(`[`, mylist[[i]], not_a[[i]], SIMPLIFY = FALSE), seq_along(mylist))
Simpler:
Map(function(x, y) Map(`[`, x, y), mylist, not_a)

How to remove duplicate elements from two lists (pairwise)?

I have two very large lists (13,000 elements). I would like to remove duplicates pairwise, i.e. remove element i from both lists if the two lists contain the same object at position i.
The function unique() works very well for a single list, but does not work pairwise.
a = matrix(c(50,70,45,89), ncol = 2)
b = matrix(c(45,86), ncol = 2)
c = matrix(c(20,35), ncol = 2)
df1 = list(a,b,c)
df2 = list(a,b,a)
df3 = cbind(df1,df2)
v = unique(df3, incomparables = FALSE)
In the end, the expected result would be df1 = list(c) and df2 = list(a). Do you have a good approach for this? Thank you a lot!
If you only have a single element in each component of your list, then you can do:
df1 <- list("a", "b", "c")
df2 <- list("a", "b", "a")
comp <- unlist(df1) != unlist(df2)
df1[comp]
[[1]]
[1] "c"
df2[comp]
[[1]]
[1] "a"
Is that what you were looking for?
A more generic solution (whatever you have in your lists) using purrr would be:
comp2 <- !purrr::map2_lgl(df1, df2, identical)
df1[comp2]
[[1]]
[1] "c"
df2[comp2]
[[1]]
[1] "a"
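The same comparison also works without purrr and on the matrices from the question. This is a sketch using mapply() with identical(); the names m_a, m_b and m_c are introduced here only to avoid reusing c as a variable name.

```r
m_a <- matrix(c(50, 70, 45, 89), ncol = 2)
m_b <- matrix(c(45, 86), ncol = 2)
m_c <- matrix(c(20, 35), ncol = 2)
df1 <- list(m_a, m_b, m_c)
df2 <- list(m_a, m_b, m_a)

# TRUE where the elements at the same position differ;
# identical() compares whole objects, so dimensions may differ freely
keep <- !mapply(identical, df1, df2)
keep
# [1] FALSE FALSE  TRUE

df1_unique <- df1[keep]  # list holding only m_c
df2_unique <- df2[keep]  # list holding only m_a
```

identical() returns a single TRUE or FALSE per pair, which is why it is safer here than `!=` when the elements are not length-one vectors.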
You can try
Filter(length, Map(function(x, y) x[x != y], df1, df2))
#[[1]]
#[1] "c"
Filter(length, Map(function(x, y) x[x != y], df2, df1))
#[[1]]
#[1] "a"

Concat named list of vectors

I'm trying to figure out how to transform a named list whose values are vectors into a named list where each value is the concatenation of the elements of the corresponding vector.
I'm not sure I have explained this clearly, so here is an example.
x <- list(A = c("e", "f", "g"), B = c("a", "b", "c"), C = c("m", "l", "w"))
#$A
#[1] "e" "f" "g"
#$B
#[1] "a" "b" "c"
#$C
#[1] "m" "l" "w"
named_list_concat <- function(data){ ... }
named_list_concat(x)
#$A
#[1] "efg"
#$B
#[1] "abc"
#$C
#[1] "mlw"
One base possibility:
lapply(x, function(x) paste(x, collapse = ""))
$A
[1] "efg"
$B
[1] "abc"
$C
[1] "mlw"
Or the same thing in a shortened form:
lapply(x, paste, collapse = "")
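Wrapped into the function name used in the question, this could look like the following sketch. The body is one possible implementation based on the lapply call above, not the asker's own code.

```r
x <- list(A = c("e", "f", "g"), B = c("a", "b", "c"), C = c("m", "l", "w"))

# Collapse each vector into a single string; lapply() keeps the list names
named_list_concat <- function(data) {
  lapply(data, paste, collapse = "")
}

result <- named_list_concat(x)
str(result)
```

Using lapply() rather than sapply() keeps the result a named list, which is what the expected output in the question shows.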

Replace one element in vector with multiple elements

I have a vector in which I want to replace one element with multiple elements. I am able to replace it with one element but not with multiple. Can anyone help?
For example I have
data <- c('a', 'x', 'd')
> data
[1] "a" "x" "d"
I want to replace "x" with "b", "c" to get
[1] "a" "b" "c" "d"
However
gsub('x', c('b', 'c'), data)
gives me
[1] "a" "b" "d"
Warning message:
In gsub("x", c("b", "c"), data) :
argument 'replacement' has length > 1 and only the first element will
be used
Here's how I would tackle it:
data <- c('a', 'x', 'd')
lst <- as.list(data)
unlist(lapply(lst, function(x) if(x == "x") c("b", "c") else x))
# [1] "a" "b" "c" "d"
We're making use of the fact that list-structures are more flexible than atomic vectors. We can replace a length-1 list-element with a length>1 element and then unlist the result to go back to an atomic vector.
Since you want to replace exact matches of "x" I prefer not to use sub/gsub in this case.
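The list-based idea generalises to a small helper. The function name replace_with_many and its arguments are hypothetical, and identical() is used here in place of `==` as an assumption, so that NA elements do not break the if() condition.

```r
# Hypothetical helper: replace every exact match of `target`
# with all elements of `replacement`
replace_with_many <- function(x, target, replacement) {
  # Each element becomes either itself or the full replacement vector
  pieces <- lapply(x, function(el) if (identical(el, target)) replacement else el)
  # Flatten the list back into an atomic vector
  unlist(pieces)
}

replace_with_many(c("a", "x", "d"), "x", c("b", "c"))
# [1] "a" "b" "c" "d"
```

Because every element is handled independently, the helper also copes with several matches in the same vector.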
You may try this, although I believe the accepted answer is great:
unlist(strsplit(gsub("x", "b c", data), split = " "))
Logic: replace "x" with "b c" (separated by a space), then split on the space with strsplit; once split, unlist converts the result back to a vector.
This is a somewhat tricky problem because the replacement also has to grow your vector. That being said, I believe this should work:
replacement <- c("b","c")
new_data <- rep(data, times = ifelse(data=="x", length(replacement), 1))
new_data[new_data=="x"] <- replacement
new_data
#[1] "a" "b" "c" "d"
This will also work if you have multiple "x"s in your vector like:
data <- c("a","x","d","x")
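Running the same steps on that vector, as a sketch: each "x" is repeated once per replacement value, and the replacement is then recycled across the repeated slots.

```r
data <- c("a", "x", "d", "x")
replacement <- c("b", "c")

# Repeat each "x" length(replacement) times, leave other elements alone
new_data <- rep(data, times = ifelse(data == "x", length(replacement), 1))

# Four "x" slots are filled by recycling the two replacement values
new_data[new_data == "x"] <- replacement
new_data
# [1] "a" "b" "c" "d" "b" "c"
```

The recycling works cleanly because the number of "x" slots is always a multiple of the replacement length.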
Another approach:
data <- c('a', 'x', 'd')
pattern <- "x"
replacement <- c("b", "c")
sub_one_for_many <- function(pattern, replacement, x) {
  # position of the first exact match (NA if there is none)
  removal_index <- which(x == pattern)[1]
  if (is.na(removal_index)) return(x)
  # head() and tail() also cope with a match at the first or last position
  c(head(x, removal_index - 1), replacement, tail(x, length(x) - removal_index))
}
answer <- sub_one_for_many(pattern, replacement, data)
Output:
> answer
[1] "a" "b" "c" "d"
