I have a lot of named lists. Now I want to split them according to the number of times the letter "a" occurs in each element. For instance,
library(stringr)
data1 <- c("apple","appreciate","available","account","adapt")
data2 <- c("tab","banana","cable","tatabox","aaaaaaa","aaaaaaaaaaa")
list1 <- list(data1,data2)
names(list1) <- c("a","b")
ca <- lapply(list1, function(x) str_count(x, "a")) #counting letter a
factor1 <- lapply(ca,as.factor) #convert ca to factor
#Is it possible to associate factor1 with list1, so that I can split the
#elements according to factor1?
#ideal results
result$1 or result[1]
$`1`
$`a`$`1`
[1] "apple" "account"
$`b`$`1`
[1] "tab" "cable"
You can get very close with one line using split and Map:
Map(split, list1, Map(stringr::str_count, list1, "a"))
$a
$a$`1`
[1] "apple" "account"
$a$`2`
[1] "appreciate" "adapt"
$a$`3`
[1] "available"
$b
$b$`1`
[1] "tab" "cable"
$b$`2`
[1] "tatabox"
$b$`3`
[1] "banana"
$b$`7`
[1] "aaaaaaa"
$b$`11`
[1] "aaaaaaaaaaa"
The result is grouped by list name first ("a", then "b") and, within each list, by the number of "a" characters, so the nesting is the reverse of the layout asked for in the question.
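If the count should be the outer level, as in the layout sketched in the question, the nested result can be regrouped afterwards. The following is only a sketch in base R: count_a is a hypothetical helper standing in for stringr::str_count(x, "a"), and by_list and by_count are names chosen for illustration.

```r
# Sample data from the question
list1 <- list(
  a = c("apple", "appreciate", "available", "account", "adapt"),
  b = c("tab", "banana", "cable", "tatabox", "aaaaaaa", "aaaaaaaaaaa")
)

# Hypothetical helper: base R stand-in for stringr::str_count(x, "a")
count_a <- function(x) lengths(regmatches(x, gregexpr("a", x, fixed = TRUE)))

# Group by list name first, then by count (same shape as the Map/split result)
by_list <- Map(split, list1, lapply(list1, count_a))

# Every count that occurs anywhere, in increasing order
counts <- sort(unique(unlist(lapply(list1, count_a))))

# Regroup: count on the outside, list name on the inside;
# lists that have no element with a given count are dropped
by_count <- lapply(setNames(counts, counts), function(n) {
  hits <- lapply(by_list, function(groups) groups[[as.character(n)]])
  Filter(Negate(is.null), hits)
})

by_count[["1"]]
```

With the sample data, by_count[["1"]] should hold "apple" and "account" under a, and "tab" and "cable" under b.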
I have a list containing a number of lists containing character vectors. The lists are always arranged so that the first list contains a vector with a single element, the second list contains a vector with two elements and the third contains one or more vectors containing three elements.
fruits <- list(
list(c("orange")),
list(c("pear", "orange")),
list(c("orange", "pear", "grape"),
c("orange", "lemon", "pear"))
)
I need to iterate through the lists in order and remove from each vector the elements that appear in the previous list. That is, I would first take the value from the vector in the first list ("orange") and remove it from the vector in the second list, then take the values from the second list ("pear", "orange") and remove them from both vectors in the third list, so that I end up with:
new_fruits <- list(
list(c("orange")),
list(c("pear")),
list(c("grape"),
c("lemon"))
)
I should add that I have had a go at doing this, but I'm finding the lists within lists make it quite complicated and my solution is long and not very efficient.
Maybe you can try the code below
new_fruits <- s <- c()
for (k in seq_along(fruits)) {
new_fruits[[k]] <- lapply(fruits[[k]],function(x) x[!x%in%s])
s <- union(s,unlist(fruits[[k]]))
}
which gives
> new_fruits
[[1]]
[[1]][[1]]
[1] "orange"
[[2]]
[[2]][[1]]
[1] "pear"
[[3]]
[[3]][[1]]
[1] "grape"
[[3]][[2]]
[1] "lemon"
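The same accumulate-then-filter idea can also be written without an explicit loop. This is a sketch only, and the names seen and prior are chosen here for illustration: Reduce with accumulate = TRUE builds the running union of values, which is then shifted by one position so that each sub-list is filtered against everything that came before it.

```r
fruits <- list(
  list(c("orange")),
  list(c("pear", "orange")),
  list(c("orange", "pear", "grape"),
       c("orange", "lemon", "pear"))
)

# seen[[k]] holds every value that occurs in sub-lists 1..k
seen <- Reduce(union, lapply(fruits, unlist), accumulate = TRUE, simplify = FALSE)

# prior[[k]] holds every value that occurs before sub-list k
prior <- c(list(character(0)), seen[-length(seen)])

# Drop the previously seen values from each vector of each sub-list
new_fruits <- Map(
  function(group, s) lapply(group, function(x) x[!x %in% s]),
  fruits, prior
)
```

Like the loop, this removes everything seen in any earlier sub-list, not only in the immediately preceding one; for the example data the two readings give the same result.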
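Here is an idea where we unlist, convert to strings and re-split to differentiate between different vectors of the same element. We then unlist one more time and get the unique values. Note that this returns a flat list of the unique values rather than the nested structure shown in the question, i.e.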
as.list(unique(unlist(strsplit(unlist(lapply(fruits, function(i) sapply(i, toString))), ', '))))
#[[1]]
#[1] "orange"
#[[2]]
#[1] "pear"
#[[3]]
#[1] "grape"
#[[4]]
#[1] "lemon"
Another two compact options:
> mapply(fruits,append(list(list("")),fruits[-length(fruits)], after = length(fruits)), FUN = function(x,y) sapply(x,function(item)list(setdiff(item,y[[1]]))))
[[1]]
[[1]][[1]]
[1] "orange"
[[2]]
[[2]][[1]]
[1] "pear"
[[3]]
[[3]][[1]]
[1] "grape"
[[3]][[2]]
[1] "lemon"
or also
> append(fruits[[1]],mapply(fruits[-1],fruits[-length(fruits)], FUN = function(x,y) sapply(x,function(item)list(setdiff(item,y[[1]])))), after = length(fruits))
[[1]]
[1] "orange"
[[2]]
[[2]][[1]]
[1] "pear"
[[3]]
[[3]][[1]]
[1] "grape"
[[3]][[2]]
[1] "lemon"
library(tidyverse)
ridiculous_function <- function(a, b){
moo <- a
baz <- b
list(moo, baz)
}
test <- ridiculous_function("apple", "A")
> test
[[1]]
[1] "apple"
[[2]]
[1] "A"
This code produces a list containing a and b. However, what I would like is to run the function over two vectors in parallel and then put all of the results in the same list.
For example, with these two vectors:
fruits10 <- fruit[1:10]
letters10 <- LETTERS[1:10]
I want to create a list whose elements are the character vectors "apple", "A", "apricot", "B", "avocado", "C", and so on. My real scenario is a lot more complex, so I need a solution that works within the confines of my function.
Expected output:
> test
[[1]]
[1] "apple"
[[2]]
[1] "A"
[[3]]
[1] "apricot"
[[4]]
[1] "B"
[[5]]
[1] "avocado"
[[6]]
[1] "C"
....
[[19]]
[1] "blueberry"
[[20]]
[1] "J"
How about:
fruits10 <- fruit[1:10]
letters10 <- LETTERS[1:10]
ridiculous_function <- function(a, b){
moo <- a
baz <- b
list(moo, baz)
}
library(tidyverse)
flatten(map2(fruits10, letters10, ridiculous_function))
which gives you
[[1]]
[1] "apple"
[[2]]
[1] "A"
[[3]]
[1] "apricot"
[[4]]
[1] "B"
[[5]]
[1] "avocado"
[[6]]
[1] "C"
[[7]]
[1] "banana"
[[8]]
[1] "D"
etc...
Here are a few different ways of doing this:
library(tidyverse)
fruits10 <- fruit[1:10]
letters10 <- LETTERS[1:10]
ridiculous_function <- function(a, b){
moo <- a
baz <- b
list(moo, baz)
}
# using mapply (base R, handy when writing packages)
mapply(ridiculous_function, fruits10, letters10) %>%
split(rep(1:ncol(.), each = nrow(.)))
# using map2, takes two args
map2(fruits10, letters10, ridiculous_function)
# using pmap, can take as many args as you want
list(a = fruits10,
b = letters10) %>%
pmap(ridiculous_function)
You ask for results in a flat list format, so you can pop a flatten at the end
of each of these, but usually you would want to retain the list structure.
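For a version without any tidyverse dependency, the flattening step can be sketched in base R. This is an illustration only: fruits3 is a hand-typed stand-in for fruit[1:3] from stringr, and pairs and flat are names chosen here.

```r
# Stand-ins for fruit[1:3] (from stringr) and LETTERS[1:3]
fruits3  <- c("apple", "apricot", "avocado")
letters3 <- LETTERS[1:3]

ridiculous_function <- function(a, b) {
  moo <- a
  baz <- b
  list(moo, baz)
}

# Map() calls the function on each pair and returns one list per pair
pairs <- Map(ridiculous_function, fruits3, letters3)

# Remove exactly one level of nesting and drop the names Map() added
flat <- unlist(pairs, recursive = FALSE, use.names = FALSE)
```

Here recursive = FALSE matters: it removes only the outer level, so whatever the function returns inside each pair is left untouched.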
ll<-list(list(c('A', 'B', 'C'),"Peter"),"John","Hans")
looks like:
[[1]]
[[1]][[1]]
[1] "A" "B" "C"
[[1]][[2]]
[1] "Peter"
[[2]]
[1] "John"
[[3]]
[1] "Hans"
Let's say I have the indices in a list for "Peter" and "B" respectively.
peter.ind <- list(1,2) # corresponds to ll[[1]][[2]]
B.ind <- list(1,1,2) # corresponds to ll[[1]][[1]][[2]]
So how can I most effectively extract a "tangled" list element by its cascaded index chain?
Here is my already working function:
extract0r <- function(x,l) {
for(ind in l) {
x <- x[[ind]]
}
return(x)
}
Calling the function:
extract0r(ll,peter.ind) #evals [1] "Peter"
extract0r(ll,B.ind) #evals [1] "B"
Is there a neater alternative to my function?
You can use a recursive function:
ll <- list(list(c('A', 'B', 'C'),"Peter"),"John","Hans")
my.ind <- function(L, ind) {
if (length(ind)==1) return(L[[ind]])
my.ind(L[[ind[1]]], ind[-1])
}
my.ind(ll, c(1,2))
my.ind(ll, c(1,1,2))
# > my.ind(ll, c(1,2))
# [1] "Peter"
# > my.ind(ll, c(1,1,2))
# [1] "B"
The recursive function is relatively clear to read, but at run time it incurs the overhead of one function call per index level.
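The same walk down the index chain can be done without recursion. The sketch below shows two options: a fold with Reduce (extract_reduce is a hypothetical name), and the fact that `[[` itself accepts an index vector and applies it level by level, provided every object indexed before the last step is a list.

```r
ll <- list(list(c("A", "B", "C"), "Peter"), "John", "Hans")
peter.ind <- list(1, 2)
B.ind <- list(1, 1, 2)

# Fold over the index chain, starting from the whole list
extract_reduce <- function(x, ind) Reduce(function(acc, i) acc[[i]], ind, x)

extract_reduce(ll, peter.ind)  # "Peter"
extract_reduce(ll, B.ind)      # "B"

# `[[` with an index vector does the same recursive lookup
ll[[unlist(peter.ind)]]        # "Peter"
ll[[unlist(B.ind)]]            # "B"
```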
There are many ways of doing this.
For example, you can build the commands from character strings:
my.ind.str <- function(L, ind) {
command <- paste0(c("L",sprintf("[[%i]]", ind)),collapse="")
return(eval(parse(text=command)))
}
With your example, I had to convert the lists of indices to vectors:
my.ind.str(ll, unlist(peter.ind))
[1] "Peter"
my.ind.str(ll, unlist(B.ind))
[1] "B"
I am currently trying to subset a list in R from a dataframe. My current attempt looks like:
list.level <- unique(buckets$group)
bucket.group <- vector("list",length(list.level))
for(i in list.level){
bucket.group[[i]] <- subset(buckets$group,buckets$group == i)
}
However, instead of filling the list it seems to create a duplicate list of the same amount of rows, returning:
[[1]]
NULL
[[2]]
NULL
...
NULL
[[22]]
NULL
[[23]]
NULL
$A
[1] "A"
$C
[1] "C" "C" "C"
$D
[1] "D" "D" "D"
...
$AJ
[1] "AJ" "AJ" "AJ" "AJ" "AJ"
$AK
[1] "AK" "AK"
A should be filling into 1, C into 2, etc. etc. How do I get these to fill in the original rows rather than creating extra rows at the bottom of the list?
Here is what is going on. Suppose your buckets$group is c("a","a","b","b").
list.level <- unique(buckets$group)
Now list.level is c("a","b")
bucket.group <- vector("list",length(list.level))
Since length(list.level) is 2, bucket.group is now a list of 2 NULL elements. They have no names and can only be reached by position, as bucket.group[[1]] and bucket.group[[2]].
for(i in list.level){
Recalling the value of list.level, it is the same as for i in c("a","b").
bucket.group[[i]] <- subset(buckets$group,buckets$group == i)
Since i loops over "a" and "b", you now fill bucket.group[["a"]] and bucket.group[["b"]]. Neither name exists yet, so both are appended as new elements, while bucket.group[[1]] and bucket.group[[2]] remain NULL.
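The effect is easy to reproduce in isolation. A minimal sketch with a two-element list and a single assignment by name:

```r
# A list of two NULL elements with no names
bucket.group <- vector("list", 2)

# Indexing with a character string looks up a name; none exists yet,
# so a new named element is appended instead of filling position 1
bucket.group[["a"]] <- c("a", "a")

length(bucket.group)  # 3
names(bucket.group)   # "" "" "a"
```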
To fix this, you should write instead
list.level <- unique(buckets$group) # ok, this was correct
bucket.group <- list() # just empty list
for(i in seq_along(list.level)){
bucket.group[[i]] <- buckets$group[buckets$group == list.level[[i]] ]
}
I think the issue is with your for statement.
Your code is like this:
list.level<-letters[1:10]
> for(i in list.level) print(i)
[1] "a"
[1] "b"
[1] "c"
[1] "d"
[1] "e"
[1] "f"
[1] "g"
[1] "h"
[1] "i"
[1] "j"
It assigns each element in list.level to i, so i is a letter. When you do
bucket.group[[i]] <- subset(buckets$group,buckets$group == i)
in the first iteration, i is a letter. So it looks for a list element called bucket.group[["a"]] and does not find it, so it creates it and stores the data there. If instead you use seq_along
for(i in seq_along(list.level)) print(i)
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10
Now i will always be a number and the code will do what you want.
So use seq_along instead.
This should work:
list.level <- unique(buckets$group)
bucket.group <- vector("list",length(list.level))
for(i in seq_along(list.level)){
bucket.group[[i]] <- subset(buckets$group,buckets$group == list.level[i])
}
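The loop can also be avoided altogether, since grouping a vector by its own values is what split does. The following is a sketch under assumptions: the buckets data frame is a small hypothetical stand-in for the real data, and by_name is a name chosen for illustration.

```r
# Hypothetical stand-in for the asker's data frame
buckets <- data.frame(group = c("A", "C", "C", "C", "D", "D", "D"),
                      stringsAsFactors = FALSE)

# One list element per group, named after the group;
# factor(levels = unique(...)) keeps the order of first appearance
by_name <- split(buckets$group,
                 factor(buckets$group, levels = unique(buckets$group)))

# Drop the names to get purely positional elements 1, 2, 3, ...
bucket.group <- unname(by_name)
```

Keeping the names (by_name) is often more convenient than positions, because by_name[["C"]] then returns the "C" bucket directly.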
I have a list of character vectors, where some elements are actual strings, such as "FA" and "EX". However, some others are just "". I want to delete these.
list1 <- c("FA", "EX", "")
list2 <- c("FA")
list3 <- c("")
list <- list(list1, list2, list3)
> list
[[1]]
[1] "FA" "EX" ""
[[2]]
[1] "FA"
[[3]]
[1] ""
Should then be
[[1]]
[1] "FA" "EX"
[[2]]
[1] "FA"
How can I accomplish this?
Try
lapply(list[list!=''], function(x) x[x!=''])
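A more explicit two-step variant may be easier to read. It is a sketch rather than the only way to do it: empty strings are first dropped inside each vector with nzchar, then vectors that end up empty are dropped with lengths. The variable is called strings here (an assumed name) so that base::list() is not masked.

```r
# Named `strings` rather than `list` to avoid masking base::list()
strings <- list(c("FA", "EX", ""), c("FA"), c(""))

# Drop empty strings inside each vector
cleaned <- lapply(strings, function(x) x[nzchar(x)])

# Drop vectors that are now empty
cleaned <- cleaned[lengths(cleaned) > 0]
```

Unlike comparing the whole list against '', this also removes vectors that consist of several empty strings, such as c("", "").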