Checking for names in nested named list in R - r

I have a nested named list in R and, given a name, I want to check whether it is present among the names of that nested list.
For depth 1, given_name %in% names(list) works fine. But how do I search for names at deeper levels?
For example:
list(a = 1, b = 1, c = list(c_a = 2, c_b = 3)). How do I check whether c$c_a is in the list?

I. Creating Nested List
Your_list <- list(a=list(x=c(4,5)),b=list(c=list(y=c(8,99)),d=c("a","b")))
names(Your_list)
# [1] "a" "b"
names(unlist(Your_list))
# [1] "a.x1" "a.x2" "b.c.y1" "b.c.y2" "b.d1" "b.d2"
str(Your_list)
# List of 2
# $ a:List of 1
# ..$ x: num [1:2] 4 5
# $ b:List of 2
# ..$ c:List of 1
# .. ..$ y: num [1:2] 8 99
# ..$ d: chr [1:2] "a" "b"
II. Removing Nesting from the list
New_list <- unlist(Your_list)
New_list
# a.x1 a.x2 b.c.y1 b.c.y2 b.d1 b.d2
# "4" "5" "8" "99" "a" "b"
class(New_list)
# [1] "character"
str(New_list)
# Named chr [1:6] "4" "5" "8" "99" "a" "b"
# - attr(*, "names")= chr [1:6] "a.x1" "a.x2" "b.c.y1" "b.c.y2" ...
III. Converting it to list without nesting
New_list <- as.list(New_list)
New_list
# $a.x1
# [1] "4"
# $a.x2
# [1] "5"
# $b.c.y1
# [1] "8"
# $b.c.y2
# [1] "99"
# $b.d1
# [1] "a"
# $b.d2
# [1] "b"
class(New_list)
# [1] "list"
str(New_list)
# List of 6
# $ a.x1 : chr "4"
# $ a.x2 : chr "5"
# $ b.c.y1: chr "8"
# $ b.c.y2: chr "99"
# $ b.d1 : chr "a"
# $ b.d2 : chr "b"
IV. Accessing elements from Flat list New_list by names
New_list$a.x1
# [1] "4"
New_list$a.x2
# [1] "5"
New_list$b.d2
# [1] "b"
New_list$b.c.y2
# [1] "99"
Note: The class of the elements is not preserved in the flattened list, because unlist() coerces all values to a common type. If you need the original classes, you will have to preserve them yourself when unlisting.
As you can see, all of the elements end up as character.
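To tie the flattening steps back to the original question, here is a minimal sketch of two ways to test for a name at any depth. The helpers all_names() and has_path() are hypothetical names introduced for illustration; they are not part of base R or of the answer above.

```r
Your_list <- list(a = list(x = c(4, 5)),
                  b = list(c = list(y = c(8, 99)), d = c("a", "b")))

# Hypothetical helper: collect the names found at every level of a nested list
all_names <- function(x) {
  if (!is.list(x)) return(character(0))
  c(names(x), unlist(lapply(x, all_names), use.names = FALSE))
}

# Hypothetical helper: check that a full path of names exists, e.g. b$c$y
has_path <- function(x, path) {
  for (nm in path) {
    if (!is.list(x) || !(nm %in% names(x))) return(FALSE)
    x <- x[[nm]]
  }
  TRUE
}

"y" %in% all_names(Your_list)        # name present at some depth
has_path(Your_list, c("b", "c", "y")) # name present at this exact position
has_path(Your_list, c("a", "y"))      # "y" is not a child of "a"
```

The first helper answers "does this name occur anywhere?", while the second answers "does this specific nested element exist?", which avoids false positives when the same name appears in different branches.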

Related

How to get the non-empty lists from a list of lists

I have a list which contain a list of list. The structure looks like this:
Is it possible to create a list that contains only the non-empty lists?
I tried datalist2 <- datalist[!is.na(datalist[[]])], which returns 0 lists, and datalist2 <- datalist[!is.na(datalist[[]])], which returns 5 lists (no change). How can I get only the 3 non-empty lists?
Any suggestions?
You can use sapply and length and then select those with non-zero length:
# create an example
dat <- list(list(1:3), list(), list(letters[1:4]), list(LETTERS[1:4]),
list(), list())
str(dat)
#R> List of 6
#R> $ :List of 1
#R> ..$ : int [1:3] 1 2 3
#R> $ : list()
#R> $ :List of 1
#R> ..$ : chr [1:4] "a" "b" "c" "d"
#R> $ :List of 1
#R> ..$ : chr [1:4] "A" "B" "C" "D"
#R> $ : list()
#R> $ : list()
# get the non-empty lists
res <- dat[sapply(dat, length) > 0]
# show the results
str(res)
#R> List of 3
#R> $ :List of 1
#R> ..$ : int [1:3] 1 2 3
#R> $ :List of 1
#R> ..$ : chr [1:4] "a" "b" "c" "d"
#R> $ :List of 1
#R> ..$ : chr [1:4] "A" "B" "C" "D"
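As a small variation on the same idea, base R also offers lengths() (available since R 3.2.0) and Filter(). The sketch below reuses the example data from this answer; the variable names res1 and res2 are only illustrative.

```r
dat <- list(list(1:3), list(), list(letters[1:4]), list(LETTERS[1:4]),
            list(), list())

# lengths() returns the length of each element as an integer vector
res1 <- dat[lengths(dat) > 0]

# Filter() keeps the elements for which the function returns a truthy value;
# a length of 0 is treated as FALSE, any positive length as TRUE
res2 <- Filter(length, dat)

length(res1)
identical(res1, res2)
```

Both forms should select the same three non-empty sublists as the sapply() version.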
You might want to use purrr:
datalist2 <- datalist[!sapply(datalist, purrr::is_empty)]
Don't know if it works though, could you please provide a sample?

How to return only the wanted vector in the which() function

I have this initial matrix:
> fil
2 3 6
1 1 1
> str(fil)
Named num [1:3] 1 1 1
- attr(*, "names")= chr [1:3] "2" "3" "6"
When I do this:
which(fil==min(fil,na.rm = TRUE))
I have this returned
> which(fil==min(fil,na.rm = TRUE))
2 3 6
1 2 3
And I wanted the names of the vector to be returned:
2 3 6
When you see output like the one in the question, you should suspect that the upper line holds the names of the vector printed below it. The lower line is the actual vector, that is, its values; the first line of the output is not.
This is confirmed with str
str(fil)
# Named num [1:3] 1 1 1
# - attr(*, "names")= chr [1:3] "2" "3" "6"
It starts by saying Named num, so it is a named numeric vector.
Then there is an attributes line. The attribute in question is "names". And there are functions to get some frequent attributes, such as the "names" attribute.
fil <- c('2' = 1, '3' = 1, '6' = 1)
fil
#2 3 6
#1 1 1
attributes(fil)
#$names
#[1] "2" "3" "6"
There are two ways to get the "names" attribute. The second is the shortcut I will use:
attr(fil, "names")
#[1] "2" "3" "6"
names(fil)
#[1] "2" "3" "6"
Now, to answer the question, just subset the names that correspond to the minimum of the vector fil.
names(fil)[which(fil==min(fil,na.rm = TRUE))]
#[1] "2" "3" "6"
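Because which() keeps the names of the elements it selects, the same result can also be read off the names of its return value. The sketch below additionally shows which.min(), which returns only the first minimum, in case a single name is wanted.

```r
fil <- c('2' = 1, '3' = 1, '6' = 1)

# which() returns a named integer vector, so its names are the answer
all_min <- names(which(fil == min(fil, na.rm = TRUE)))
all_min

# which.min() stops at the first minimum and is also named
first_min <- names(which.min(fil))
first_min
```

Here all_min holds every name tied for the minimum, while first_min holds only the first of them.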

remove nested list components that don't match

Say I have a nested list like this
lst <- list(a=list(b=list("a", "b")), c=list("d"))
str(lst)
#List of 2
# $ a:List of 1
# ..$ b:List of 2
# .. ..$ : chr "a"
# .. ..$ : chr "b"
# $ c:List of 1
# ..$ : chr "d"
and I want to remove all the elements that don't match a vector of names (characters here), but I also want to remove the entire nested component if there are no matches. So, for example, using rapply I have this
## Just keep the branches that have an "a" value
keeps <- "a"
## Pass this function to rapply
f <- function(x) if(any(unlist(x) %in% keeps)) x else NULL
res <- rapply(lst, f, how="replace")
str(res)
# List of 2
# $ a:List of 1
# ..$ b:List of 2
# .. ..$ : chr "a"
# .. ..$ : NULL
# $ c:List of 1
# ..$ : NULL
So, I would have liked the entire c list to be cleaved. I don't think I can do this with a single rapply operation? If not, what would be a good way to do this.
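One possible approach, offered as a sketch rather than as the accepted answer to this question, is a small recursive function instead of rapply: prune each child first, then drop any child that came back NULL, and return NULL for a branch that ends up empty. The name prune() is hypothetical.

```r
lst <- list(a = list(b = list("a", "b")), c = list("d"))
keeps <- "a"

# Hypothetical helper: keep only leaves matching `keeps`, and remove
# every branch that has no matching leaf left
prune <- function(x, keeps) {
  if (is.list(x)) {
    kids <- lapply(x, prune, keeps = keeps)   # prune children first
    kids <- Filter(Negate(is.null), kids)     # drop children that vanished
    if (length(kids) > 0) kids else NULL      # an empty branch vanishes too
  } else if (any(x %in% keeps)) {
    x
  } else {
    NULL
  }
}

res <- prune(lst, keeps)
str(res)
```

With this data the c branch is removed entirely, and the non-matching "b" leaf is dropped from a$b instead of being replaced by NULL.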

replicate within list element to make nested list

Say we have
a <- list(letters[1:4],letters[5:6])
How can we duplicate each element in place to get a list like
b <- list(list(letters[1:4],letters[1:4]),list(letters[5:6],letters[5:6]))
I could make an empty list, fill it with replicated vectors for a[1] and a[2], and then combine everything into one big list,
but I think there should be a quicker way that I am missing.
I tried
lapply(a, function(x){replicate(2,x, simplify=FALSE)})
but the indexing seems strange,
e.g.
[[1]]
[[1]][[1]]
[[1]][[1]][[1]]
[1] "a" "b" "c" "d"
[[1]][[1]][[2]]
[1] "a" "b" "c" "d"
Here's one option:
lapply(a, function(X) rep(list(X), 2))
# [[1]]
# [[1]][[1]]
# [1] "a" "b" "c" "d"
#
# [[1]][[2]]
# [1] "a" "b" "c" "d"
#
#
# [[2]]
# [[2]][[1]]
# [1] "e" "f"
#
# [[2]][[2]]
# [1] "e" "f"
You can apply replicate to each element in your list. Here we do so with lapply:
lapply(a, replicate, n=2, simplify=FALSE)
n and simplify are arguments forwarded by lapply to replicate (see ?replicate). This produces:
List of 2
$ :List of 2
..$ : chr [1:4] "a" "b" "c" "d"
..$ : chr [1:4] "a" "b" "c" "d"
$ :List of 2
..$ : chr [1:2] "e" "f"
..$ : chr [1:2] "e" "f"
Note we're showing the output of str(...) for clarity, not the actual result.
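As a quick sanity check, the sketch below compares both approaches against the target list b from the question using identical(). The variable names via_rep and via_replicate are only illustrative.

```r
a <- list(letters[1:4], letters[5:6])
b <- list(list(letters[1:4], letters[1:4]),
          list(letters[5:6], letters[5:6]))

# Wrap each element in a list and repeat that list twice
via_rep <- lapply(a, function(x) rep(list(x), 2))

# replicate() with simplify = FALSE returns a list of n copies
via_replicate <- lapply(a, function(x) replicate(2, x, simplify = FALSE))

identical(via_rep, b)
identical(via_replicate, b)
```

Both comparisons are expected to be TRUE, which suggests the two-level structure is as intended.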

rbind() returning an odd result

This has all the signs of being something that's so trivially stupid that I'll regret asking it in a public forum, but I've now stumped a few people on it so c'est la vie.
I'm running the following block of code, and not getting the result that I expect:
zz <- list(a=list('a', 'b', 'c', 'd'), b=list('f', 'g', '2', '1'),
c=list('t', 'w', 'x', '6'))
padMat <- do.call('cbind', zz)
headMat <- matrix(c(colnames(padMat), rep('foo', ncol(padMat))), nrow=2, byrow=TRUE)
rbind(headMat, padMat)
I had expected:
a b c
foo foo foo
a f t
b g w
c 2 x
d 1 6
Instead I'm getting:
a b c
a f t
b g w
c 2 x
d 1 6
NULL NULL NULL
It appears that it's filling in the upper part of the rbind by row, and then adding a row of NULL values at the end.
A couple of notes:
This works AOK as long as headMat is a single row
To double check, I also got rid of the dimnames for padMat, this wasn't affecting things
Another thought was that it somehow had to do with the byrow=TRUE, but the same behavior happens if you take that out
padMat is a list (with a dim attribute), not what you usually think of as a matrix.
> padMat <- do.call('cbind', zz)
> str(padMat)
List of 12
$ : chr "a"
$ : chr "b"
$ : chr "c"
$ : chr "d"
$ : chr "f"
$ : chr "g"
$ : chr "2"
$ : chr "1"
$ : chr "t"
$ : chr "w"
$ : chr "x"
$ : chr "6"
- attr(*, "dim")= int [1:2] 4 3
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:3] "a" "b" "c"
I suspect you want something like:
> padMat <- do.call(cbind,lapply(zz,c,recursive=TRUE))
> str(padMat)
chr [1:4, 1:3] "a" "b" "c" "d" "f" "g" "2" "1" "t" "w" ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:3] "a" "b" "c"
The lesson here is, "str is your friend." :)
The problem appears to stem from the fact that padMat is a strange matrix. R reports that it is a list of 12 with dimensions:
R> str(padMat)
List of 12
$ : chr "a"
$ : chr "b"
$ : chr "c"
$ : chr "d"
$ : chr "f"
$ : chr "g"
$ : chr "2"
$ : chr "1"
$ : chr "t"
$ : chr "w"
$ : chr "x"
$ : chr "6"
- attr(*, "dim")= int [1:2] 4 3
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:3] "a" "b" "c"
That appears to be the source of the problem, as recasting as a matrix works:
R> rbind(headMat, matrix(unlist(padMat), ncol = 3))
[,1] [,2] [,3]
[1,] "a" "b" "c"
[2,] "foo" "foo" "foo"
[3,] "a" "f" "t"
[4,] "b" "g" "w"
[5,] "c" "2" "x"
[6,] "d" "1" "6"
Others have correctly pointed out that padMat has mode list, which, if you look at the docs for rbind and cbind, is where the trouble starts:
In the default method, all the vectors/matrices must be atomic (see vector) or lists.
That's why the do.call works, since the elements of zz are themselves lists. If you change the definition of zz to the following:
zz <- list(a=c('a', 'b', 'c', 'd'), b=c('f', 'g', '2', '1'),
c=c('t', 'w', 'x', '6'))
the code works as expected.
More insight can be had, I think, from this nugget also in the docs for rbind and cbind:
The type of a matrix result determined from the highest type of any of the inputs
in the hierarchy raw < logical < integer < real < complex < character < list .
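To make the type hierarchy concrete, the following sketch keeps the original zz (a list of lists) but flattens each element to a character vector before binding, so every input to rbind is atomic. Using sapply(zz, unlist) here is one option among several and is not taken from the answers above.

```r
zz <- list(a = list('a', 'b', 'c', 'd'), b = list('f', 'g', '2', '1'),
           c = list('t', 'w', 'x', '6'))

# cbind-ing lists gives a list with a dim attribute, not a character matrix
listMat <- do.call(cbind, zz)
typeof(listMat)

# Flatten each column to an atomic character vector first
padMat <- sapply(zz, unlist)
typeof(padMat)

# Build the two header rows and stack them on top of the data
headMat <- rbind(colnames(padMat), rep('foo', ncol(padMat)))
res <- rbind(headMat, padMat)
res
```

Because both headMat and padMat are character, the result stays a character matrix and the rows appear in the expected order.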
