I would like to convert a nested list like this:
l <- list(A=list(a=list(1),b=list(2)),
B=list(cd=list(c=list(3,4,5),d=list(6,7,8)),e=list(c(9,10))))
into a list like this:
o <- list(A=c(1,2),A.a=1,A.b=2,B=c(3:10),
B.cd=c(3:8),B.cd.c=c(3:5),B.cd.d=c(6:8),B.e=c(9:10))
At each list level values from nested lists should be concatenated.
Clearly a case for a recursive function, but getting the return values to unlist properly is tricky. Here's a function that will do it; it doesn't get the names quite right but that's easily fixed afterwards.
unnest <- function(x) {
if(is.null(names(x))) {
list(unname(unlist(x)))
}
else {
c(list(all=unname(unlist(x))), do.call(c, lapply(x, unnest)))
}
}
Output from unnest(l) is
$all
[1] 1 2 3 4 5 6 7 8 9 10
$A.all
[1] 1 2
$A.a
[1] 1
$A.b
[1] 2
$B.all
[1] 3 4 5 6 7 8 9 10
$B.cd.all
[1] 3 4 5 6 7 8
$B.cd.c
[1] 3 4 5
$B.cd.d
[1] 6 7 8
$B.e
[1] 9 10
and can be massaged into your desired output with
out <- unnest(l)
names(out) <- sub("\\.*all", "", names(out))
out[-1]
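If you would rather build the dotted names during the recursion instead of fixing them afterwards, the following is a minimal sketch of that idea. The function name unnest_named and its prefix argument are made up for illustration, and it assumes every level you want to descend into is fully named and non-empty.

```r
l <- list(A = list(a = list(1), b = list(2)),
          B = list(cd = list(c = list(3, 4, 5), d = list(6, 7, 8)),
                   e = list(c(9, 10))))

# Hypothetical helper: carries the dotted path down the recursion
unnest_named <- function(x, prefix = NULL) {
  out <- list()
  for (nm in names(x)) {
    path <- if (is.null(prefix)) nm else paste(prefix, nm, sep = ".")
    child <- x[[nm]]
    # store all values below this node under its full dotted name
    out[[path]] <- unname(unlist(child))
    # descend only into children that have named elements of their own
    if (is.list(child) && !is.null(names(child))) {
      out <- c(out, unnest_named(child, path))
    }
  }
  out
}

res <- unnest_named(l)
names(res)
```

Unnamed sublists such as list(3, 4, 5) are treated as leaves, which mirrors the is.null(names(x)) test in unnest().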
To not recurse when there's only one element, try
unnest2 <- function(x) {
if(is.null(names(x)) || length(x)==1) {
list(unname(unlist(x)))
} else {
c(list(all=unname(unlist(x))), do.call(c, lapply(x, unnest2)))
}
}
Suppose this is my list of lists (I would like to reorganize the result programmatically, as my data contains more than 40 results and it is difficult to organize them manually).
s <- c(1,2,3)
ss <- c(4,5,6)
S <- list(s,ss)
h <- c(4,8,7)
hh <- c(0,3,4)
H <- list(h,hh)
HH <- list(S,H)
names1 <- c("First","Second")
lapply(setNames(HH, paste0(names1, '_Model')), function(x)
setNames(x, paste0('Res_', seq_along(x))))
#$First_Model
#$First_Model$Res_1
#[1] 1 2 3
#$First_Model$Res_2
#[1] 4 5 6
#$Second_Model
#$Second_Model$Res_1
#[1] 4 8 7
#$Second_Model$Res_2
#[1] 0 3 4
I would like to have the result similar to the following:
#$First_Model
#$First_Model$Res_1
#[1] 1 2 3
#$Second_Model
#$Second_Model$Res_1
#[1] 4 8 7
#$First_Model$Res_2
#[1] 4 5 6
#$Second_Model$Res_2
#[1] 0 3 4
The problem in question is how to rearrange the nested list from "Model No. > Results No." to "Results No. > Model No."
I was going for something similar to Wimpel's answer.
Res_no <- seq_along(HH[[1]]) # results elements
lapply(setNames(Res_no, paste0("Res_", Res_no)), function(x)
lapply(setNames(HH, paste0(names1, '_Model')), `[[`, x)
)
Output
#$Res_1
#$Res_1$First_Model
#[1] 1 2 3
#
#$Res_1$Second_Model
#[1] 4 8 7
#
#
#$Res_2
#$Res_2$First_Model
#[1] 4 5 6
#
#$Res_2$Second_Model
#[1] 0 3 4
The core of this solution is extracting the x-th element of each nested list (the inner lapply() call). You can do this with lapply() or purrr::map().
The outer lapply() function lets you repeat it for all the "Results No."
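The same swap of levels can also be written as a single Map() call. This is only a sketch, and it assumes every model holds the same number of results in the same order; the name L_swapped is made up for illustration.

```r
L <- list(
  First_Model  = list(Res_1 = c(1, 2, 3), Res_2 = c(4, 5, 6)),
  Second_Model = list(Res_1 = c(4, 8, 7), Res_2 = c(0, 3, 4))
)

# Map() walks the models in parallel and calls list() on the i-th result
# of each one; outer names come from the first model, inner names from L
L_swapped <- do.call(Map, c(list(f = list), L))

str(L_swapped)
```

Because the models are passed to Map() as named arguments, this would break if a model happened to be called f.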
Something like this perhaps?
# From your code, create a list L
L <- lapply(setNames(HH, paste0(names1, '_Model')), function(x)
setNames(x, paste0('Res_', seq_along(x))))
# get all x-th elements from the list, and add them to new list L2
L2 <- lapply(seq_along(L[[1]]), function(x) {
lapply(L, "[[", x)
})
# set names of L2
names(L2) <- names(L[[1]])
Output
# $Res_1
# $Res_1$First_Model
# [1] 1 2 3
#
# $Res_1$Second_Model
# [1] 4 8 7
#
#
# $Res_2
# $Res_2$First_Model
# [1] 4 5 6
#
# $Res_2$Second_Model
# [1] 0 3 4
I am trying to get rid of duplicates within a vector without using the unique() function (as it doesn't work in this instance).
My loop looks as follows:
#finding and deleting duplicates
dupes <- function(x) {
for (i in 1:(length(x))){
while (is_true(all.equal(x[i],x[i+1]))){
x=x[-i]
}
}
print(x)
}
I want to run a vector through the function and get a vector (free of dupes) returned.
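For reference, here is a minimal sketch of how a loop-based version could look. It assumes that values counted as equal by all.equal() (that is, within its default numeric tolerance) should be treated as duplicates; the function name dedupe_loop is made up for illustration.

```r
dedupe_loop <- function(x) {
  keep <- logical(length(x))
  for (i in seq_along(x)) {
    seen <- FALSE
    # compare x[i] only against earlier elements that were kept
    for (j in seq_len(i - 1)) {
      if (keep[j] && isTRUE(all.equal(x[i], x[j]))) {
        seen <- TRUE
        break
      }
    }
    keep[i] <- !seen
  }
  x[keep]  # return the result instead of printing it
}

dedupe_loop(c(1:8, 4:10))
dedupe_loop(c(0.1 + 0.2, 0.3, 1))
```

Tracking a keep mask avoids shrinking x while iterating over it, which is what sends the index past the end of the vector in the original loop. The nested loop is quadratic, so this is only reasonable for short vectors.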
Here's one simple way to do it -
# for numeric vector
x <- c(1:8, 4:10)
# [1] 1 2 3 4 5 6 7 8 4 5 6 7 8 9 10
x[ave(x, x, FUN = seq_along) == 1]
# [1] 1 2 3 4 5 6 7 8 9 10
# for character vector
x <- as.character(iris$Species)
x[ave(x, x, FUN = seq_along) == 1]
# [1] "setosa" "versicolor" "virginica"
Here are a couple of ways to do it, assuming that your vector is not floating-point (i.e. it is integer or character). Note that both return the unique values in sorted order.
set.seed(666)
v1 <- sample(15:20, 10, replace = TRUE)
as.integer(names(table(v1)))
#[1] 15 16 17 19 20
rle(sort(v1))$values
#[1] 15 16 17 19 20
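dplyr's distinct() function will work for you, provided you wrap the vector in a data frame first.
library(dplyr)
# distinct() operates on data frames, so wrap the vector in one
df_new <- distinct(data.frame(value = your_vector))
df_new$value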
I want to remove every element of a list that is a complete subset of another element. For example, B is contained in A and E is contained in C, so B and E should be removed.
MyList <- list(A=c(1,2,3,4,5), B=c(3,4,5), C=c(6,7,8,9), E=c(7,8))
MyList
$A
[1] 1 2 3 4 5
$B
[1] 3 4 5
$C
[1] 6 7 8 9
$E
[1] 7 8
MyListUnique <- RemoveSubElements(MyList)
MyListUnique
$A
[1] 1 2 3 4 5
$C
[1] 6 7 8 9
Any ideas? Is there a known function that does this?
As long as your data is not too huge, you can use an approach like the following:
# preparation
MyList <- MyList[order(lengths(MyList))]
idx <- vector("list", length(MyList))
# loop through list and compare with other (longer) list elements
for(i in seq_along(MyList)) {
idx[[i]] <- any(sapply(MyList[-seq_len(i)], function(x) all(MyList[[i]] %in% x)))
}
# subset the list
MyList[!unlist(idx)]
#$C
#[1] 6 7 8 9
#
#$A
#[1] 1 2 3 4 5
Similar to the other answer, but hopefully clearer, using a helper function and 2 sapplys.
#helper function to determine a proper subset - shortcuts to avoid setdiff calculation if they are equal
is.proper.subset <- function(x,y) !setequal(x,y) && length(setdiff(x,y))==0
#double loop over the list to find elements which are proper subsets of other elements
idx <- sapply(MyList, function(x) any(sapply(MyList, function(y) is.proper.subset(x,y))))
#filter out those that are proper subsets
MyList[!idx]
$A
[1] 1 2 3 4 5
$C
[1] 6 7 8 9
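The proper-subset idea can be wrapped into the RemoveSubElements() function the question asks for. This is a sketch rather than an existing function from any package, and it assumes that two elements holding exactly the same values should both be kept.

```r
RemoveSubElements <- function(lst) {
  # TRUE when every value of x occurs in y, but not the other way round
  is_proper_subset <- function(x, y) all(x %in% y) && !all(y %in% x)
  drop <- vapply(lst, function(x) {
    any(vapply(lst, function(y) is_proper_subset(x, y), logical(1)))
  }, logical(1))
  lst[!drop]
}

MyList <- list(A = c(1, 2, 3, 4, 5), B = c(3, 4, 5), C = c(6, 7, 8, 9), E = c(7, 8))
MyListUnique <- RemoveSubElements(MyList)
names(MyListUnique)
```

Using vapply() with logical(1) guarantees a logical vector even for an empty list, and the original element order is preserved.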
I have vectors that look like these variations:
cn1 <- c("Probe","Genes","foo","bar","Probe","Genes","foo","bar")
# 0 1 2 3 4 5 6 7
cn2 <- c("Probe","Genes","foo","bar","qux","Probe","Genes","foo","bar","qux")
# 0 1 2 3 4 5 6 7 8 9
Note that each vector above consists of two parts, and each part starts with the separator "Probe","Genes".
What I want is the indexes of the entries in the first part, i.e. those between the first separator and the second one. Yielding
cn1_id ------> [2,3]
cn2_id ------> [2,3,4]
How can I achieve that in R?
I tried this but it doesn't do what I want:
> split(cn1,c("Probe","Genes"))
$Genes
[1] "Genes" "bar" "Genes" "bar"
$Probe
[1] "Probe" "foo" "Probe" "foo"
Here's a function that you can use. Note that R vectors are 1-based so counting starts at 1 rather than 0.
findidx <- function(x) {
idx <- which(x=="Probe" & c(tail(x,-1),NA)=="Genes")
if (length(idx)>1) {
(idx[1]+2):(idx[2]-1)
} else {
NA # what to return if no match found
}
}
findidx(cn1)
# [1] 3 4
findidx(cn2)
# [1] 3 4 5
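You could try between() from data.table. Subtracting 1 at the end gives the 0-based positions requested in the question.
library(data.table)
indx <- between(cn1, 'Genes', 'Probe')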
indx2 <- between(cn2, 'Genes', 'Probe')
which(cumsum(indx)==2)[-1]-1
#[1] 2 3
which(cumsum(indx2)==2)[-1]-1
#[1] 2 3 4
I'd like to calculate the rank of each element within a vector, e.g.:
x <- c(0.82324952352792, 0.11953364405781, 0.588659686036408, 0.41683742380701,
0.11452184105292, 0.438547774450853, 0.586471405345947, 0.943002870306373,
0.28184655145742, 0.722095313714817)
calcRank <- function(x){
sorted <- x[order(x)]
ranks <- sapply(x, function(x) which(sorted==x))
return(ranks)
}
> calcRank(x)
[1] 9 2 7 4 1 5 6 10 3 8
Is there a better way to do this?
Why not just:
rank(x) # ..... ?
# [1] 9 2 7 4 1 5 6 10 3 8
match is what you want:
match(x, sort(x))
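The two suggestions agree on the example above because it has no repeated values. With ties they differ, as this small sketch shows (the vector y is made up for illustration):

```r
y <- c(0.5, 0.2, 0.5, 0.9)

r_avg   <- rank(y)                       # ties share the average rank: 2.5 1.0 2.5 4.0
r_min   <- rank(y, ties.method = "min")  # ties share the lowest rank:  2 1 2 4
r_match <- match(y, sort(y))             # first position in sort(y):   2 1 2 4
r_order <- order(order(y))               # ties ranked by position:     2 1 3 4
```

So match(x, sort(x)) behaves like rank(x, ties.method = "min"), while the default rank(x) can return non-integer ranks.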