R: get element by name from a nested list - r

I have a nested list like so:
smth <- list()
smth$a <- list(a1=1, a2=2, a3=3)
smth$b <- list(b1=4, b2=5, b3=6)
smth$c <- "C"
The names of every element in the list are unique.
I would like to get an element from such a list merely by name without knowing where it is located.
Example:
getByName(smth, "c") = "C"
getByName(smth, "b2") = 5
Also I don't really want to use unlist since the real list has a lot of heavy elements in it.

The best solution so far is the following:
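rmatch <- function(x, name) {
  # Look for the name at the current level first
  pos <- match(name, names(x))
  if (!is.na(pos)) return(x[[pos]])
  # Otherwise search each sub-list depth-first
  for (el in x) {
    # inherits() is safe when class(el) has more than one entry
    if (inherits(el, "list")) {
      out <- Recall(el, name)
      if (!is.null(out)) return(out)
    }
  }
  NULL  # name not found anywhere
}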
rmatch(smth, "a1")
[1] 1
rmatch(smth, "b3")
[1] 6
Full credit goes to @akrun for finding it and to mbedward for posting it here.
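A self-contained way to try the same depth-first idea is sketched below. The name get_by_name is a hypothetical stand-in for the getByName requested in the question, not part of the original answer. Like rmatch, it cannot tell a missing name apart from an element whose stored value is NULL.

```r
# Hypothetical helper: depth-first lookup by name, without unlist().
get_by_name <- function(x, name) {
  if (!is.list(x)) return(NULL)              # only lists can hold named children
  if (name %in% names(x)) return(x[[name]])  # exact match at this level
  for (el in x) {
    out <- get_by_name(el, name)             # descend into each child
    if (!is.null(out)) return(out)
  }
  NULL                                       # not found in this branch
}

smth <- list(
  a = list(a1 = 1, a2 = 2, a3 = 3),
  b = list(b1 = 4, b2 = 5, b3 = 6),
  c = "C"
)

get_by_name(smth, "c")
get_by_name(smth, "b2")
get_by_name(smth, "zz")  # a name that does not exist
```

Because the search stops at the first match, the heavy elements are never copied or flattened, which is the concern the question raises about unlist.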

Related

Nested list assignment R

I have a list of the following type
categories = list(
c("Women","Clothing", "Jeans"),
c("Women","Clothing", "Sweaters"),
c("Men","Accessories", "Belts"),
c("Women", "Accessories", "Jewelry" ))
I want to parse this list and create a list of lists to export in JSON and it should have the following structure:
Women = {
  Clothing = {
    Jeans = {},
    Sweaters = {}
  },
  Accessories = {
    Jewelry = {}
  }
},
Men = {
  Accessories = {
    Belts = {}
  }
}
So it should go over each element (a character vector contained in the list) and check whether that element already exists in the final list; if it does not, it should be appended at the proper level. For example, if Clothing is the second element after Women, it should be appended to the Women list of the final list. If Sweaters is the third element after Women and Clothing, it should be appended to the Clothing list inside the Women list of the final list.
If the element already exists at the given level, it should not be appended; instead, processing should move on to the next element in the character vector.
In the character vectors of the input list, the first element is always level 1, the second level 2, the third level 3, and so on.
It should be done recursively. I tried a few times, but I have no idea how to assign to a nested list; specifically, I need to do nested assignments.
I made the data into a matrix, transposed, then a dataframe:
x <- data.frame(t(vapply(categories, identity, character(3))), stringsAsFactors = F)
Then split, and lapply. You could do this recursively if you have more than 3 levels:
lapply(split(x, x$X1), function(df) {
lapply(split(df, df$X2), function(df) {
lapply(split(df, df$X3), function(x) list())
})
})
If you are looking for a recursive solution, then the following may help you.
Output the full category path as a string at each leaf:
## construct a data frame from list
df <- data.frame(matrix(unlist(categories),nrow = length(categories),byrow = T),stringsAsFactors = F)
## recursion function that makes nested list
f <- function(df, k=1) {
if (k == ncol(df)) return(lapply(split(df,df[,k]), toString)) ##
return(lapply(split(df,df[,k]), function(df) f(df, k+1)))
}
The nested list output looks as below
> f(df)
$Men
$Men$Accessories
$Men$Accessories$Belts
[1] "Men, Accessories, Belts"
$Women
$Women$Accessories
$Women$Accessories$Jewelry
[1] "Women, Accessories, Jewelry"
$Women$Clothing
$Women$Clothing$Jeans
[1] "Women, Clothing, Jeans"
$Women$Clothing$Sweaters
[1] "Women, Clothing, Sweaters"
Output empty lists at each leaf:
f <- function(df, k=1) {
if (k == ncol(df)) return(lapply(split(df,df[,k]), function(v) list()))
return(lapply(split(df,df[,k]), function(df) f(df, k+1)))
}
which gives:
> f(df)
$Men
$Men$Accessories
$Men$Accessories$Belts
list()
$Women
$Women$Accessories
$Women$Accessories$Jewelry
list()
$Women$Clothing
$Women$Clothing$Jeans
list()
$Women$Clothing$Sweaters
list()
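The question also asks how to assign into a nested list directly. A minimal sketch of that approach follows, assuming each path is a character vector ordered from the top level down. The helper name insert_path is hypothetical and does not come from either answer.

```r
categories <- list(
  c("Women", "Clothing", "Jeans"),
  c("Women", "Clothing", "Sweaters"),
  c("Men", "Accessories", "Belts"),
  c("Women", "Accessories", "Jewelry")
)

# Hypothetical helper: insert one path into the tree, creating levels as needed.
insert_path <- function(tree, path) {
  if (length(path) == 0) return(tree)
  key <- path[1]
  child <- tree[[key]]                   # NULL when the key is not present yet
  if (is.null(child)) child <- list()
  tree[[key]] <- insert_path(child, path[-1])
  tree
}

# Fold every path into an initially empty tree.
tree <- Reduce(insert_path, categories, list())
str(tree)
```

Each call returns a modified copy of the tree, so no in-place nested assignment is needed. Existing levels are reused rather than appended again, and the paths may have different lengths.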

trying to get a proper names(list) output

I'm trying to split a 2 level deep list of characters into a 1 level list using a suffix.
More precisely, I have a list of genes, each containing 6 lists of probes corresponding to 6 bins. The architecture looks like :
feat_indexed_probes_bin$HSPB6$bin1
[1] "cg14513218" "cg22891287" "cg20713852" "cg04719839" "cg27580050" "cg18139462" "cg02956481" "cg26608795" "cg15660498" "cg25654926" "cg04878216"
I'm trying to get a list "bins_indexed_probes" with the following architecture :
bins_indexed_probes$HSPB6_bin6 containing the same probes so I can pass it to my map-reducing function.
I tried many solutions such as melt(), for loop, etc but I can't figure how to perform a double nested loop ( on genes and on bins) and get a list output with only 1 level depth.
For the moment, my func to do so is the following :
create_map <- function(indexes = feat_indexed_probes_bin, binlist = c("bin1", "bin2", "bin3", "bin4", "bin5", "bin6"), genes = features) {
map <- list()
ret <- lapply(binlist, function(bin) {
lapply(rownames(features), function(gene) {
map[[paste(gene, "_", bin, sep = "")]] <- feat_indexed_probes_bin[[gene]][[bin]]
tmp_names <<- paste(gene, "_", bin, sep = "")
return(map)
})
names(map) <- tmp_names
rm(tmp_names)
})
return(ret)
}
it returns:
[[6]][[374]]
GDF10_bin6
"cg13565300"
[[6]][[375]]
NULL
[[6]][[376]]
[[6]][[376]]$HNF1B_bin6
[1] "cg03433642" "cg09679923" "cg17652435" "cg03348978" "cg02435495" "cg02701059" "cg05110178" "cg11862993" "cg09463047"
[[6]][[377]]
[[6]][[377]]$GPIHBP1_bin6
[1] "cg01953797" "cg00152340"
instead, I would expect something like
$GPIHBP1_bin1
"cg...." "cg...."
...
$GPIHBP1_bin6
"someotherprobe"
$someothergene_bin1
"probe" "probe"
...
I hope I'm being clear, and since this is my first time asking question, I already apologise if I didn't follow the stackoverflow protocol.
Thank you already for reading me
Consider a nested lapply with extract, [[, and setNames calls, all wrapped in do.call using c to bind return elements together.
bins_indexed_probes <- do.call(c,
lapply(1:6, function(i)
setNames(lapply(feat_indexed_probes_bin, `[[`, i),
paste0(names(feat_indexed_probes_bin), "_bin", i))
)
)
# RE-ORDER ELEMENTS BY NAME
bins_indexed_probes <- bins_indexed_probes[sort(names(bins_indexed_probes))]
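An alternative sketch, shown on toy data because the real feat_indexed_probes_bin is not available here: unlist with recursive = FALSE removes exactly one level of nesting, and the names are then rebuilt with an underscore. The gene and probe names below are made up for illustration.

```r
# Toy stand-in for feat_indexed_probes_bin (names are invented).
feat <- list(
  GENE1 = list(bin1 = c("cg01", "cg02"), bin2 = "cg03"),
  GENE2 = list(bin1 = "cg04", bin2 = c("cg05", "cg06"))
)

# Flatten one level only; the probe vectors themselves stay intact.
flat <- unlist(feat, recursive = FALSE)

# unlist() joins names with ".", so rebuild them as gene_bin in the same order.
names(flat) <- unlist(lapply(names(feat), function(g) {
  paste(g, names(feat[[g]]), sep = "_")
}))

str(flat)
```

Rebuilding the names explicitly avoids ambiguity when gene names themselves contain a dot.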

Change data type of elements in a nested list

Is it possible to scan a list of lists for elements with a certain name and change their datatype but retain their value?
As an example, the following list containing elements 'N' of class 'character' or 'numeric'
x = list(list(N=as.character(1)),
list(a=1,b=2,c="another element",N=as.character(5)),
list(a=2,b=2,N=as.character(7),c=NULL),
list(a=2,b=2,list(N=as.character(3))))
should then become:
x = list(list(N=as.numeric(1)),
list(a=1,b=2,c="another element",N=as.numeric(5)),
list(a=2,b=2,N=as.numeric(7),c=NULL),
list(a=2,b=2,list(N=as.numeric(3))))
To be clear, the solution should allow for deeper nesting, and respect the data type of fields with names other than "N". I have not been able to find a general solution that works for lists with an arbitrary structure.
I have tried something along the lines of the solution given in this post:
a <- as.relistable(x)
u <- unlist(a)
u[names(u) == "N"] <- as.numeric(u[names(u) == "N"])
relist(u, a)
Unfortunately, the substitution does not work in its current form. In addition, relist does not seem to work when the list contains NULL elements.
Use lapply to repeat the process over the list elements with a condition to check for your element of interest, so you don't inadvertently add elements to your sublists:
x <- lapply(x, function(i) {
if(length(i$N) > 0) {
i$N <- as.numeric(i$N)
}
return(i)
})
A solution that works only on a list of lists containing numbers or strings with numbers:
x <- list(list(N=as.character(1)),
list(a=1,b=2,N=as.character(5)),
list(a=2,b=2,N=as.character(7)),
list(a=2,b=2))
y1 <- lapply(x, function(y) lapply(y, as.numeric))
y2 <- list(list(N=as.numeric(1)),
list(a=1,b=2,N=as.numeric(5)),
list(a=2,b=2,N=as.numeric(7)),
list(a=2,b=2))
identical(y1,y2)
# [1] TRUE
EDIT. Here is more general code that works on nested lists of numbers and strings. It uses a recursive function as_num and the list.apply function of the rlist package.
library(rlist)
x = list(list(N=as.character(1)),
list(a=1,b=2,c="another element",N=as.character(5)),
list(a=2,b=2,N=as.character(7),c=NULL),
list(a=2,b=2,list(N=as.character(3))))
# Test if the string contains a number
is_num <- function(x) grepl("[-]?[0-9]+[.]?[0-9]*|[-]?[0-9]+[L]?|[-]?[0-9]+[.]?[0-9]*[eE][0-9]+",x)
# A recursive function for numeric conversion of strings containing numbers
as_num <- function(x) {
  if (!is.null(x)) {
    if (!inherits(x, "list")) {
      y <- x
      # && and all() keep the condition a single TRUE/FALSE
      if (is.character(x) && all(is_num(x))) y <- as.numeric(x)
    } else {
      y <- list.apply(x, as_num)
    }
  } else {
    y <- x
  }
  return(y)
}
y <- list.apply(x, as_num)
z = list(list(N=as.numeric(1)),
list(a=1,b=2,c="another element",N=as.numeric(5)),
list(a=2,b=2,N=as.numeric(7),c=NULL),
list(a=2,b=2,list(N=as.numeric(3))))
identical(y,z)
# [1] TRUE
The answer provided by marco sandri can be further generalised to:
is_num <- function(x) grepl("^[-]?[0-9]+[.]?[0-9]*|^[-]?[0-9]+[L]?|^[-]?[0-9]+[.]?[0-9]*[eE][0-9]+",x)
as_num <- function(x) {
  if (is.null(x) || length(x) == 0) return(x)
  if (inherits(x, "list")) return(lapply(x, as_num))
  if (is.character(x) && all(is_num(x))) return(as.numeric(x))
  return(x)
}
# Convert the character input x and compare against the expected result z
y <- as_num(x)
identical(y,z)
This solution also allows list elements to contain numeric(0) and mixed datatypes such as 'data2005'.
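The answers above convert every numeric-looking string, whereas the question asks to change only elements with a given name. A minimal sketch of that narrower behaviour follows. The function name convert_named and its arguments are hypothetical, chosen for illustration.

```r
# Hypothetical helper: apply f only to non-list elements called `name`,
# at any depth, leaving NULLs and all other fields untouched.
convert_named <- function(x, name = "N", f = as.numeric) {
  if (!is.list(x)) return(x)
  nms <- names(x)
  if (is.null(nms)) nms <- rep("", length(x))  # unnamed list
  for (i in seq_along(x)) {
    el <- x[[i]]
    if (is.null(el)) next                      # keep NULL entries as they are
    if (is.list(el)) {
      x[[i]] <- convert_named(el, name, f)     # recurse into sub-lists
    } else if (identical(nms[i], name)) {
      x[[i]] <- f(el)
    }
  }
  x
}

x <- list(list(N = as.character(1)),
          list(a = 1, b = 2, c = "another element", N = as.character(5)),
          list(a = 2, b = 2, N = as.character(7), c = NULL),
          list(a = 2, b = 2, list(N = as.character(3))))

z <- list(list(N = as.numeric(1)),
          list(a = 1, b = 2, c = "another element", N = as.numeric(5)),
          list(a = 2, b = 2, N = as.numeric(7), c = NULL),
          list(a = 2, b = 2, list(N = as.numeric(3))))

identical(convert_named(x), z)
```

Matching on the element name rather than on a regular expression means a field such as c = "2005" would stay a character string.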

Refer in name of element to other element

I have a problem in R and it is the following:
How can you assign a value to an element and then later refer to another element whose name contains that value?
Thus you define an element x
i <- value
Later you use x.i where "i" should be its value.
This is a problem in the following two cases:
1)
First you create 10 elements named X.1 to X.10
for(i in 1:10){
assign(paste0("X.", i), 1:3)
}
Then you want to change the names of the elements in X.1 to X.10
for(i in 1:10){
assign(names(paste0("X.", i)), c("foo","bar","norf"))
}
This does not work.
2)
I want to define two values:
year <- 1
code <- 2
And then, from a dataframe "Data.year" (="Data.1"), only those observations where the column "code" is equal to the value of the previously defined "code" (=2) should be stored, under a name of the format "Data.year.code" (=Data.1.2).
assign(paste0("Data.", i, code, sep="."), as.name(paste("Data",year , sep="."))[as.name(paste("Data",y , sep="."))$code==code,])
Here I tried to use the as.name function, but this does not work.
The problem is that R obviously cannot recognise that "year" and "code" in the expression "Data.year.code" have a value. In Stata you solve this by using `, but I do not know how you do this in R.
Normally I just google something when I do not know the answer. But I have no idea how I should name this problem and thus can't find it...
It should have an easy and straightforward solution.
Based on your code with assign, an option is the following (but as @Roland mentioned in the comments, it would be easier and safer to work with a "list"):
for(i in 1:10){
assign(paste0('X.',i), `names<-`(get(paste0('X.', i)),
c('foo', 'bar', 'norf')))
}
X.1
# foo bar norf
# 1 2 3
X.2
# foo bar norf
# 1 2 3
Or you can try it in a list
lst <- lapply(mget(paste0('X.',1:10)), function(x) {
names(x) <- c('foo', 'bar', 'norf')
x})
If you need to change the original variables to reflect the changes
list2env(lst, envir=.GlobalEnv)
Update
If you need to change the values in the vector, it is easier
list2env(lapply(mget(paste0('X.', 1:10)),
function(x) x <- c('foo', 'bar', 'norf')), envir=.GlobalEnv)
X.1
#[1] "foo" "bar" "norf"
Or just
for(i in 1:10){
assign(paste0('X.', i), c('foo', 'bar', 'norf'))
}
The idea (although I need to tell you that it is a bit frowned upon by the R community) is to make a 'big' string in the loop which will be evaluated afterwards:
This works as you say:
for(i in 1:10){
assign(paste0("X.", i), 1:3)
}
and so in order to change the names you do the following:
for(i in 1:10){
eval(parse(text=sprintf('X.%s <- c("foo","bar","norf")',i)))
}
Output:
> X.1
[1] "foo" "bar" "norf"
> X.2
[1] "foo" "bar" "norf"
So you make the big string with sprintf in this occasion:
> cat(sprintf('X.%s <- c("foo","bar","norf")',i)) #cat is used for demonstration here
X.1 <- c("foo","bar","norf") #this is the string for i==1 for example
and then you convert it to an expression with parse and then you evaluate the expression with eval.
As I said previously it is not the best technique to use but to be honest it has helped me quite a few times.
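Both cases in the question can also be handled without assign or eval by keeping the objects in named lists, as suggested in the comments. The sketch below assumes a made-up data frame for year 1, since the real Data.1 is not shown. The names Data and subsets are hypothetical.

```r
# Case 1: ten vectors held in one named list instead of X.1 ... X.10.
X <- setNames(replicate(10, 1:3, simplify = FALSE), paste0("X.", 1:10))
X <- lapply(X, setNames, c("foo", "bar", "norf"))  # rename every vector
X[["X.1"]]

# Case 2: data frames keyed by year (toy data, invented for illustration).
Data <- list(`1` = data.frame(code = c(1, 2, 2), value = c(10, 20, 30)))
year <- 1
code <- 2

d <- Data[[as.character(year)]]
subsets <- list()
# Build the element name from the values of year and code.
subsets[[paste("Data", year, code, sep = ".")]] <- d[d$code == code, ]
names(subsets)
```

Building the name with paste and indexing with [[ ]] plays the role that the backtick macro plays in Stata, without creating variables in the global environment.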

Recursively editing a list in R

In my program, I am recursively going over a nested list and adding elements to an overall list that I will return. There are a few details to be taken care of, so I can't just use unlist.
formulaPart is taken to be a formula object.
My code is:
parseVariables <- function(formulaPart, myList){
  for(currentVar in as.list(formulaPart))
    if(typeof(currentVar) == 'language')
      parseVariables(currentVar, myList)
    else
      if(! toString(currentVar) %in% c(\\various characters))
        myList <- c(myList, currentVar)
}
I have checked that the function correctly adds elements to the list when it should. The problem is that the list loses elements due to recursion. The elements added during one inner recursive call are not saved for another recursive call.
If this was in C++, I could just use a pointer; the same for Java. However, I do not understand how to handle this error in R.
R does something like pass-by-value, so you can't modify (most) existing objects just by passing them into a function. If you want to add to something recursively, one trick is to use an environment instead, which gets passed by reference. This can easily be coerced to a list when you're done.
parseVariables <- function(formulaPart, myList){
for(currentVar in as.list(formulaPart)) {
if(typeof(currentVar) == 'language') {
parseVariables(currentVar, myList)
}
else {
if(! toString(currentVar) %in% c(':', '+', '~'))
assign(toString(currentVar), currentVar, myList)
}
}
}
f1 <- z ~ a:b + x
f2 <- z ~ x + y
myList <- new.env()
parseVariables(f1, myList)
parseVariables(f2, myList)
ls(myList)
# [1] "a" "b" "x" "y" "z"
as.list(myList)  # element order is not guaranteed
# $x
# x
#
# $y
# y
#
# $z
# z
#
# $a
# a
#
# $b
# b
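Another way around pass-by-value is to have each recursive call return what it found and let the caller combine the results. The sketch below takes that approach. The name collect_vars is hypothetical, and the ignored operators are the same three used in the answer above.

```r
# Hypothetical helper: return the variable names found in a formula or call.
collect_vars <- function(expr, ignore = c(":", "+", "~")) {
  found <- character(0)
  for (part in as.list(expr)) {
    if (is.call(part)) {
      # Results of the inner call are returned and appended, not lost.
      found <- c(found, collect_vars(part, ignore))
    } else {
      s <- as.character(part)
      if (!s %in% ignore) found <- c(found, s)
    }
  }
  unique(found)
}

collect_vars(z ~ a:b + x)
collect_vars(z ~ x + y)
```

For the plain case of listing the variables in a formula, base R's all.vars does the same job. The hand-written recursion is only needed when extra filtering rules apply.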

Resources