conversion of nested list to unnested with cumulative concatenation

I would like to convert a nested list like this:
l <- list(A=list(a=list(1),b=list(2)),
B=list(cd=list(c=list(3,4,5),d=list(6,7,8)),e=list(c(9,10))))
into a list like this:
o <- list(A=c(1,2),A.a=1,A.b=2,B=c(3:10),
B.cd=c(3:8),B.cd.c=c(3:5),B.cd.d=c(6:8),B.e=c(9:10))
At each list level, the values from all lists nested below it should be concatenated.

Clearly a case for a recursive function, but getting the return values to unlist properly is tricky. Here's a function that will do it; it doesn't get the names quite right but that's easily fixed afterwards.
unnest <- function(x) {
if(is.null(names(x))) {
list(unname(unlist(x)))
}
else {
c(list(all=unname(unlist(x))), do.call(c, lapply(x, unnest)))
}
}
Output from unnest(l) is
$all
[1] 1 2 3 4 5 6 7 8 9 10
$A.all
[1] 1 2
$A.a
[1] 1
$A.b
[1] 2
$B.all
[1] 3 4 5 6 7 8 9 10
$B.cd.all
[1] 3 4 5 6 7 8
$B.cd.c
[1] 3 4 5
$B.cd.d
[1] 6 7 8
$B.e
[1] 9 10
and can be massaged into your desired output with
out <- unnest(l)
names(out) <- sub("\\.*all", "", names(out))
out[-1]
To not recurse when there's only one element, try
unnest2 <- function(x) {
if(is.null(names(x)) || length(x)==1) {
list(unname(unlist(x)))
} else {
c(list(all=unname(unlist(x))), do.call(c, lapply(x, unnest2)))
}
}
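If you would rather build the final names directly than patch them afterwards, a recursive helper can carry the name prefix down with it. The following is a minimal sketch, not part of the answer above: unnest_named and its prefix argument are hypothetical names, and it assumes every level you want reported is fully named.

```r
l <- list(A=list(a=list(1),b=list(2)),
          B=list(cd=list(c=list(3,4,5),d=list(6,7,8)),e=list(c(9,10))))

# Hypothetical helper: walk the named levels, storing the flattened
# values of each child under its full dotted name.
unnest_named <- function(x, prefix = NULL) {
  out <- list()
  for (nm in names(x)) {
    key <- if (is.null(prefix)) nm else paste(prefix, nm, sep = ".")
    child <- x[[nm]]
    out[[key]] <- unname(unlist(child))
    # Only recurse into children that are themselves named lists
    if (is.list(child) && !is.null(names(child))) {
      out <- c(out, unnest_named(child, key))
    }
  }
  out
}

res <- unnest_named(l)
names(res)
```

Unnamed levels such as list(3,4,5) are treated as leaves, which mirrors the is.null(names(x)) test in unnest().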

Related

How to organize the output of the list of list in R

Suppose this is my list of lists. I would like to organize the result programmatically, since my real data contains more than 40 results and it is difficult to arrange them manually.
s <- c(1,2,3)
ss <- c(4,5,6)
S <- list(s,ss)
h <- c(4,8,7)
hh <- c(0,3,4)
H <- list(h,hh)
HH <- list(S,H)
names1 <- c("First","Second")
lapply(setNames(HH, paste0(names1, '_Model')), function(x)
setNames(x, paste0('Res_', seq_along(x))))
#$First_Model
#$First_Model$Res_1
#[1] 1 2 3
#$First_Model$Res_2
#[1] 4 5 6
#$Second_Model
#$Second_Model$Res_1
#[1] 4 8 7
#$Second_Model$Res_2
#[1] 0 3 4
I would like to have the result similar to the following:
#$First_Model
#$First_Model$Res_1
#[1] 1 2 3
#$Second_Model
#$Second_Model$Res_1
#[1] 4 8 7
#$First_Model$Res_2
#[1] 4 5 6
#$Second_Model$Res_2
#[1] 0 3 4

The problem in question is how to rearrange the nested list from "Model No. > Results No." to "Results No. > Model No."
I was going for something similar to Wimpel's answer.
Res_no <- seq_along(HH[[1]]) # results elements
lapply(setNames(Res_no, paste0("Res_", Res_no)), function(x)
lapply(setNames(HH, paste0(names1, '_Model')), `[[`, x)
)
Output
#$Res_1
#$Res_1$First_Model
#[1] 1 2 3
#
#$Res_1$Second_Model
#[1] 4 8 7
#
#
#$Res_2
#$Res_2$First_Model
#[1] 4 5 6
#
#$Res_2$Second_Model
#[1] 0 3 4
The basis of this solution is to extract the x-th element from each inner list (this is what the inner lapply() call does). You can do this with lapply() or purrr::map().
The outer lapply() function lets you repeat it for all the "Results No."
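The same rearrangement can also be written as a single call that hands all the inner lists to Map() at once. This is only a sketch of an alternative idiom, assuming every model holds the same number of results (Map() would otherwise recycle the shorter lists); flipped is a name chosen here for illustration.

```r
HH <- list(list(c(1,2,3), c(4,5,6)), list(c(4,8,7), c(0,3,4)))
names1 <- c("First", "Second")
L <- lapply(setNames(HH, paste0(names1, "_Model")), function(x)
  setNames(x, paste0("Res_", seq_along(x))))

# Map() walks the inner lists in parallel; list() collects the i-th
# result of every model, keeping the model names as element names.
flipped <- do.call(Map, c(f = list, L))
str(flipped)
```

The outer names (Res_1, Res_2) come from the names of the first inner list.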

Something like this perhaps?
# From your code, create a list L
L <- lapply(setNames(HH, paste0(names1, '_Model')), function(x)
setNames(x, paste0('Res_', seq_along(x))))
# get all x-th elements from the list, and add them to new list L2
L2 <- lapply(seq_along(L[[1]]), function(x) {
lapply(L, "[[", x)
})
# set names of L2
names(L2) <- names(L[[1]])
output
# $Res_1
# $Res_1$First_Model
# [1] 1 2 3
#
# $Res_1$Second_Model
# [1] 4 8 7
#
#
# $Res_2
# $Res_2$First_Model
# [1] 4 5 6
#
# $Res_2$Second_Model
# [1] 0 3 4

Return a vector free of duplicates [without using unique() or duplicated()]

I am trying to get rid of duplicates within a vector without using the unique() function (as it doesn't work in this instance).
My loop looks as follows:
#finding and deleting duplicates
dupes <- function(x) {
for (i in 1:(length(x))){
while (is_true(all.equal(x[i],x[i+1]))){
x=x[-i]
}
}
print(x)
}
I want to run a vector through the function and get a vector (free of dupes) returned.

Here's one simple way to do it -
# for numeric vector
x <- c(1:8, 4:10)
# [1] 1 2 3 4 5 6 7 8 4 5 6 7 8 9 10
x[ave(x, x, FUN = seq_along) == 1]
# [1] 1 2 3 4 5 6 7 8 9 10
# for character vector
x <- as.character(iris$Species)
x[ave(x, x, FUN = seq_along) == 1]
# [1] "setosa" "versicolor" "virginica"
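The idea behind the ave() call is to keep only the first occurrence of each value. A closely related sketch, offered as an alternative and not as the answerer's method, uses match(), which returns the position of the first occurrence of each element; an element is a first occurrence exactly when that position equals its own index.

```r
x <- c(1:8, 4:10)

first_pos <- match(x, x)            # position where each value first appears
keep <- first_pos == seq_along(x)   # TRUE only for first occurrences
x[keep]
# [1]  1  2  3  4  5  6  7  8  9 10

s <- c("b", "a", "b", "c", "a")
s[match(s, s) == seq_along(s)]
# [1] "b" "a" "c"
```

Unlike the table() and rle(sort()) approaches, this keeps the values in their original order.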

Here are a couple of ways to do it, assuming that your vector is not floating-point (i.e. it is integer or character):
set.seed(666)
v1 <- sample(15:20, 10, replace = TRUE)
as.integer(names(table(v1)))
#[1] 15 16 17 19 20
rle(sort(v1))$values
#[1] 15 16 17 19 20

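dplyr's distinct() function will work for you, but it operates on data frames, so wrap the vector in one first.
library(dplyr)
df_new <- distinct(data.frame(x = your_vector))
df_new$x # the vector without duplicates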

Remove elements in a list in R

I want to remove every element of the list that is a complete subset of another element. For example, B is a subset of A and E is a subset of C, so B and E should be removed.
MyList <- list(A=c(1,2,3,4,5), B=c(3,4,5), C=c(6,7,8,9), E=c(7,8))
MyList
$A
[1] 1 2 3 4 5
$B
[1] 3 4 5
$C
[1] 6 7 8 9
$E
[1] 7 8
MyListUnique <- RemoveSubElements(MyList)
MyListUnique
$A
[1] 1 2 3 4 5
$C
[1] 6 7 8 9
Any ideas? Is there a known function to do it?

As long as your data is not too huge, you can use an approach like the following:
# preparation
MyList <- MyList[order(lengths(MyList))]
idx <- vector("list", length(MyList))
# loop through list and compare with other (longer) list elements
for(i in seq_along(MyList)) {
idx[[i]] <- any(sapply(MyList[-seq_len(i)], function(x) all(MyList[[i]] %in% x)))
}
# subset the list
MyList[!unlist(idx)]
#$C
#[1] 6 7 8 9
#
#$A
#[1] 1 2 3 4 5

Similar to the other answer, but hopefully clearer, using a helper function and 2 sapplys.
#helper function to determine a proper subset - shortcuts to avoid setdiff calculation if they are equal
is.proper.subset <- function(x,y) !setequal(x,y) && length(setdiff(x,y))==0
#double loop over the list to find elements which are proper subsets of other elements
idx <- sapply(MyList, function(x) any(sapply(MyList, function(y) is.proper.subset(x,y))))
#filter out those that are proper subsets
MyList[!idx]
$A
[1] 1 2 3 4 5
$C
[1] 6 7 8 9
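Either approach can be wrapped into the function the question asks for. The sketch below is one possible packaging, not a known built-in: RemoveSubElements is the asker's placeholder name, and the helper is_proper_subset is introduced here for illustration. It assumes that elements with identical contents should both be kept.

```r
RemoveSubElements <- function(lst) {
  # x is a proper subset of y: everything in x is in y, but not vice versa
  is_proper_subset <- function(x, y) all(x %in% y) && !all(y %in% x)
  drop <- vapply(seq_along(lst), function(i) {
    # compare element i against every other element
    any(vapply(lst[-i], function(y) is_proper_subset(lst[[i]], y), logical(1)))
  }, logical(1))
  lst[!drop]
}

MyList <- list(A=c(1,2,3,4,5), B=c(3,4,5), C=c(6,7,8,9), E=c(7,8))
MyListUnique <- RemoveSubElements(MyList)
names(MyListUnique)
# [1] "A" "C"
```

Like the answers above, this compares every pair of elements, so the cost grows quadratically with the length of the list.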

Getting index of vector with delimited parts

I have vectors that look like these variations:
cn1 <- c("Probe","Genes","foo","bar","Probe","Genes","foo","bar")
# 0 1 2 3 4 5 6 7
cn2 <- c("Probe","Genes","foo","bar","qux","Probe","Genes","foo","bar","qux")
# 0 1 2 3 4 5 6 7 8 9
Note that each vector above consists of two parts, and each part starts with the separator "Probe","Genes".
What I want to do is get the indexes of the entries in the first part, i.e. those between the first and second separator. Yielding
cn1_id ------> [2,3]
cn2_id ------> [2,3,4]
How can I achieve that in R?
I tried this but it doesn't do what I want:
> split(cn1,c("Probe","Genes"))
$Genes
[1] "Genes" "bar" "Genes" "bar"
$Probe
[1] "Probe" "foo" "Probe" "foo"

Here's a function that you can use. Note that R vectors are 1-based so counting starts at 1 rather than 0.
findidx <- function(x) {
idx <- which(x=="Probe" & c(tail(x,-1),NA)=="Genes")
if (length(idx)>1) {
(idx[1]+2):(idx[2]-1)
} else {
NA # what to return if no match found
}
}
findidx(cn1)
# [1] 3 4
findidx(cn2)
# [1] 3 4 5

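You could try between() from data.table. It compares the strings by sort order, so it relies on the other entries sorting outside the range "Genes" to "Probe"; the final -1 converts to the 0-based indexes used in the question.
library(data.table)
indx <- between(cn1, 'Genes', 'Probe')
indx2 <- between(cn2, 'Genes', 'Probe')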
which(cumsum(indx)==2)[-1]-1
#[1] 2 3
which(cumsum(indx2)==2)[-1]-1
#[1] 2 3 4
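A variant that does not depend on how the strings sort is to mark the separator entries explicitly and count how many "Probe" markers have been seen so far. This is a minimal base R sketch; first_block and its sep argument are hypothetical names, and it assumes each part begins with "Probe".

```r
first_block <- function(x, sep = c("Probe", "Genes")) {
  is_sep <- x %in% sep            # TRUE for the separator entries
  block <- cumsum(x == sep[1])    # part number: 1 for the first part, 2 for the next
  which(!is_sep & block == 1)     # non-separator entries of the first part
}

cn1 <- c("Probe","Genes","foo","bar","Probe","Genes","foo","bar")
cn2 <- c("Probe","Genes","foo","bar","qux","Probe","Genes","foo","bar","qux")

first_block(cn1)      # 1-based positions
# [1] 3 4
first_block(cn2) - 1  # 0-based, as written in the question
# [1] 2 3 4
```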

Calculate the rank of each index in a vector

I'd like to calculate the rank of each index within a vector, e.g.:
x <- c(0.82324952352792, 0.11953364405781, 0.588659686036408, 0.41683742380701,
0.11452184105292, 0.438547774450853, 0.586471405345947, 0.943002870306373,
0.28184655145742, 0.722095313714817)
calcRank <- function(x){
sorted <- x[order(x)]
ranks <- sapply(x, function(x) which(sorted==x))
return(ranks)
}
calcRank(x)
# [1] 9 2 7 4 1 5 6 10 3 8
Is there a better way to do this?

Why not just:
rank(x) # ..... ?
# [1] 9 2 7 4 1 5 6 10 3 8

match is what you want:
match(x, sort(x))
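The two suggestions agree as long as the vector has no ties, but they treat ties differently. The short sketch below uses made-up values (not the data from the question) to show the difference, and adds order(order(x)) as a third common idiom for comparison.

```r
x <- c(0.82, 0.12, 0.59, 0.42, 0.11)   # no ties
rank(x)
# [1] 5 2 4 3 1
match(x, sort(x))
# [1] 5 2 4 3 1
order(order(x))
# [1] 5 2 4 3 1

y <- c(2, 1, 2)                        # tied values
rank(y)             # ties share the average rank by default
# [1] 2.5 1.0 2.5
match(y, sort(y))   # ties share the lowest rank
# [1] 2 1 2
order(order(y))     # ties are broken by position
# [1] 2 1 3
```

rank() also has a ties.method argument if you need one of the other behaviours from a single function.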
