R Order elements of list - r

I want to order two lists in R by the length of their elements, and then intersect them pairwise on their two longest elements. Example:
Membership1
[[1]]
[1] 3 4 6 7 8
[[2]]
[1] 5 13 23
[[3]]
[1] 1 2 12 14 15 16 18 21 25 28
Membership2
[[1]]
[1] 8 13 20 21 23
[[2]]
[1] 3 6 7
[[3]]
[1] 1 2 4 5 10 15 17 19 24 25 29
Here, the result would be:
[[3]]
[1] 1 2 12 14 15 16 18 21 25 28
[[1]]
[1] 3 4 6 7 8
[[2]]
[1] 5 13 23
and
[[3]]
[1] 1 2 4 5 10 15 17 19 24 25 29
[[1]]
[1] 8 13 20 21 23
[[2]]
[1] 3 6 7
And then 1, 2, 15, 25 (the intersection of both [[3]]) and 8 (the intersection of both [[1]]).
The intersect function is pretty straightforward, but I don't understand how to order those lists the way I want.
Manuel

To order a list w.r.t. the lengths of its elements, nonincreasingly, call:
x <- list(1:5, 1:3, 1:7)
(x <- x[order(sapply(x, length), decreasing=TRUE)])
## [[1]]
## [1] 1 2 3 4 5 6 7
##
## [[2]]
## [1] 1 2 3 4 5
##
## [[3]]
## [1] 1 2 3
Thus, the whole task may be solved with e.g.:
Membership1 <- list(c(3, 4, 6, 7, 8), c(5, 13, 23), c(1, 2, 12, 14, 15, 16, 18, 21, 25, 28))
Membership2 <- list(c(8, 13, 20, 21, 23), c(3, 6, 7), c(1, 2, 4, 5, 10, 15, 17, 19, 24, 25, 29))
Membership1 <- Membership1[order(sapply(Membership1, length), decreasing=TRUE)]
Membership2 <- Membership2[order(sapply(Membership2, length), decreasing=TRUE)]
lapply(seq_along(Membership1), function(i) intersect(Membership1[[i]], Membership2[[i]]))
## [[1]]
## [1] 1 2 15 25
##
## [[2]]
## [1] 8
##
## [[3]]
## numeric(0)
Equivalently, as @flodel suggested, the last step may be performed as follows:
Map(intersect, Membership1, Membership2)
or even:
mapply(intersect, Membership1, Membership2, SIMPLIFY=FALSE)
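For the original question, which only needs the two longest elements of each list, the steps above might be combined as in the following sketch. The helper name order_by_length is hypothetical, and lengths(x) is used as a shorthand for sapply(x, length).

```r
# Hypothetical helper: sort a list by element length, longest first.
order_by_length <- function(x) x[order(lengths(x), decreasing = TRUE)]

Membership1 <- list(c(3, 4, 6, 7, 8), c(5, 13, 23),
                    c(1, 2, 12, 14, 15, 16, 18, 21, 25, 28))
Membership2 <- list(c(8, 13, 20, 21, 23), c(3, 6, 7),
                    c(1, 2, 4, 5, 10, 15, 17, 19, 24, 25, 29))

# Intersect longest with longest, second longest with second longest, ...
common <- Map(intersect, order_by_length(Membership1), order_by_length(Membership2))

# Keep only the pairs built from the two longest elements and flatten them.
top2 <- unlist(head(common, 2))
top2
## [1]  1  2 15 25  8
```

If two elements of a list have the same length, their relative order after sorting depends on how order() breaks ties, so the pairing may need an explicit tie-break rule.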

Related

how to find where the interval of continuous numbers starts and ends?

I have a vector
vec <- c(2, 3, 5, 6, 7, 8, 16, 19, 22, 23, 24)
The continuous numbers are:
c(2, 3)
c(5, 6, 7, 8)
c(22, 23, 24)
So the first vector starts at 2 and ends at 3;
the second vector starts at 5 and ends at 8;
the third vector starts at 22 and ends at 24.
Is there a function to identify where the continuous numbers start and end?
By using diff to check the differences between each consecutive value, you can find where the difference is not +1.
diff(vec)
## [1] 1 2 1 1 1 8 3 3 1 1
c(1, diff(vec)) != 1
## [1] FALSE FALSE TRUE FALSE FALSE FALSE TRUE TRUE TRUE FALSE FALSE
Then use cumsum to make a group identifier:
cumsum(c(1, diff(vec))!=1)
## [1] 0 0 1 1 1 1 2 3 4 4 4
And use this to split your data up:
split(vec, cumsum(c(1, diff(vec))!=1))
##$`0`
##[1] 2 3
##
##$`1`
##[1] 5 6 7 8
##
##$`2`
##[1] 16
##
##$`3`
##[1] 19
##
##$`4`
##[1] 22 23 24
Which can be filtered with Filter to keep only the runs of more than one value:
Filter(\(x) length(x) > 1, split(vec, cumsum(c(1, diff(vec))!=1)))
##$`0`
##[1] 2 3
##
##$`1`
##[1] 5 6 7 8
##
##$`4`
##[1] 22 23 24
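Since the question asks for where each run starts and ends, the grouping above could also be summarised into a small table. This is only a sketch; the names grp, pieces and runs are made up for the illustration.

```r
vec <- c(2, 3, 5, 6, 7, 8, 16, 19, 22, 23, 24)

# Same group identifier as above: a new group starts wherever the step is not +1.
grp <- cumsum(c(1, diff(vec)) != 1)
pieces <- split(vec, grp)

# One row per run, with its first value, last value and length.
runs <- data.frame(
  start = unname(vapply(pieces, min, numeric(1))),
  end   = unname(vapply(pieces, max, numeric(1))),
  n     = unname(lengths(pieces))
)

# Keep only runs of at least two consecutive values.
runs <- runs[runs$n > 1, ]
runs$start
## [1]  2  5 22
runs$end
## [1]  3  8 24
```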
Another one
vec=c( 2 , 3 , 5 , 6 , 7 , 8 , 16 , 19 , 22 , 23 , 24 )
x <- replace(NA, vec, vec)
# [1] NA 2 3 NA 5 6 7 8 NA NA NA NA NA NA NA 16 NA NA 19 NA NA 22 23 24
l <- split(x, with(rle(is.na(x)), rep(seq.int(length(lengths)), lengths)))
# l <- split(x, data.table::rleid(is.na(x))) ## same as above
l <- Filter(Negate(anyNA), l)
l
# $`2`
# [1] 2 3
#
# $`4`
# [1] 5 6 7 8
#
# $`6`
# [1] 16
#
# $`8`
# [1] 19
#
# $`10`
# [1] 22 23 24
If you have a length requirement:
l[lengths(l) > 1]
# $`2`
# [1] 2 3
#
# $`4`
# [1] 5 6 7 8
#
# $`10`
# [1] 22 23 24

Adding a vector to components of a list

I have the following list:
A <- c(11)
B <- c(7, 13)
C <- c(1, 10, 11, 12)
my_list <- list(A, B, C)
> my_list
[[1]]
[1] 11
[[2]]
[1] 7 13
[[3]]
[1] 1 10 11 12
I would like to add -2, -1, 0, 1, and 2 to each number in this list, and retain all of the unique values within each list element, to obtain the following resulting list:
> my_new_list
[[1]]
[1] 9 10 11 12 13
[[2]]
[1] 5 6 7 8 9 11 12 13 14 15
[[3]]
[1] -1 0 1 2 3 8 9 10 11 12 13 14
I tried the following code, but I did not get the result I was hoping for:
my_new_list <- lapply(res, `+`, -2:2)
> my_new_list
$`1`
[1] 9 10 11 12 13
$`2`
[1] 5 12 7 14 9
$`3`
[1] -1 9 11 13 3
Why is this happening, and how can I obtain the result I'd like? Thanks!
Assuming that we need the unique values
lapply(my_list, function(x) sort(unique(unlist(lapply(x, `+`, -2:2)))))
Or with outer
lapply(my_list, function(x) sort(unique(c(outer(x, -2:2, `+`)))))
Or with rep and recycling
lapply(my_list, function(x) sort(unique(rep(-2:2, each = length(x)) + x)))
#[[1]]
# [1] 9 10 11 12 13
#[[2]]
# [1] 5 6 7 8 9 11 12 13 14 15
#[[3]]
# [1] -1 0 1 2 3 8 9 10 11 12 13 14
How about this:
my_new_list <- lapply(my_list, function(x) unique(union(x,sapply(x, function(y) y +c(-2:2)) )))
my_new_list <- lapply(my_new_list, sort)
my_new_list
[[1]]
[1] 9 10 11 12 13
[[2]]
[1] 5 6 7 8 9 11 12 13 14 15
[[3]]
[1] -1 0 1 2 3 8 9 10 11 12 13 14
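As for why the original attempt gives unexpected values: `+` works element by element and recycles the shorter operand, so each number is paired with only one of the offsets rather than with all five. The sketch below reproduces this for the second list element and contrasts it with outer, which builds every combination; the variable names are chosen for the illustration only.

```r
B <- c(7, 13)

# Element-wise addition with recycling: 7-2, 13-1, 7+0, 13+1, 7+2.
# R also warns that the longer length is not a multiple of the shorter one.
recycled <- suppressWarnings(B + -2:2)
recycled
## [1]  5 12  7 14  9

# outer() adds every offset to every element instead.
all_pairs <- outer(B, -2:2, `+`)
result <- sort(unique(c(all_pairs)))
result
##  [1]  5  6  7  8  9 11 12 13 14 15
```

The first list element only looked correct because it has length one, so recycling happened to add all five offsets to it.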

Duplicating elements in a list based on count vector

I have a list like the following:
example <- list(c(1, 5, 3, 6, 3), c(4, 2, 56, 2, 56, 2), c(4, 2, 6,
2, 6, 1, 34))
And I would like to duplicate elements of the list based on this numeric vector:
count <- c(5, 2, 1)
I want a final output to be a list of length 8 (sum(count)) which has the first element of the list repeated 5 times, second element 2 times, and third element only once.
How would you do this?
If I understand the question correctly, the base R function rep() should do what the OP expects:
rep(example, count)
[[1]]
[1] 1 5 3 6 3
[[2]]
[1] 1 5 3 6 3
[[3]]
[1] 1 5 3 6 3
[[4]]
[1] 1 5 3 6 3
[[5]]
[1] 1 5 3 6 3
[[6]]
[1] 4 2 56 2 56 2
[[7]]
[1] 4 2 56 2 56 2
[[8]]
[1] 4 2 6 2 6 1 34
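An equivalent way to read this, sketched below, is to repeat the positions first and then index the list with them. The names idx and repeated are made up for the example.

```r
example <- list(c(1, 5, 3, 6, 3), c(4, 2, 56, 2, 56, 2), c(4, 2, 6, 2, 6, 1, 34))
count <- c(5, 2, 1)

# Repeat each position as many times as its count, then subset the list.
idx <- rep(seq_along(example), times = count)
idx
## [1] 1 1 1 1 1 2 2 3
repeated <- example[idx]
length(repeated)
## [1] 8
```

Keeping idx around can be handy when the same repetition pattern has to be applied to another object of the same length.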

R: create a vector based on a list

I have the following list called m1:
> m1
[[1]]
[1] 36 37 38
[[2]]
[1] 34 35
[[3]]
[1] 30 31 32 33
[[4]]
[1] 24 25 26 27 28 29
[[5]]
[1] 20 21 22 23
[[6]]
[1] 14 15 16 17 18 19
[[7]]
[1] 11 12 13
[[8]]
[1] 7 8 9 10
[[9]]
[1] 5 6
[[10]]
[1] 1 2 3 4
[[11]]
integer(0)
I would like to create a vector based on this list, which has the value 1 at positions 36, 37, and 38; the value 2 at positions 34 and 35, etc. The final output should be:
vector_1 <- c(10, 10, 10, 10, 9, 9, 8, 8, 8, 8, 7, 7, 7, 6, 6, 6, 6, 6, 6, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 2, 2, 1, 1, 1)
How can I accomplish this in R?
EDIT:
Thanks to a comment below:
> rep(length(m1):1, sapply(m1, length))
[1] 11 11 11 10 10 9 9 9 9 8 8 8 8 8 8 7 7 7 7 6 6 6 6 6 6 5 5 5 4
[30] 4 4 4 3 3 2 2 2 2
That doesn't quite give me what I want, but it's definitely on the right track!
This should handle cases with empty entries and non-sequential entries....
m1 <- list(c(7,4,5), c(2,10,9), c(1,3,6,8), integer())
# [[1]]
# [1] 7 4 5
#
# [[2]]
# [1] 2 10 9
#
# [[3]]
# [1] 1 3 6 8
#
# [[4]]
# integer(0)
rep(seq_along(m1), sapply(m1, length))[order(unlist(m1))]
#[1] 3 2 3 1 1 3 1 3 2 2
This solution should work for more general cases too even if the elements inside m1 are not in a specific order
#DATA
m1 = list(36:38, 34:35, 30:33, 24:29, 20:23,
14:19, 11:13, 7:10, 5:6, 1:4, integer(0))
#Extract the maximum element in m1
mymax = max(unlist(m1))
#Go through m1 using index and replace respective indices in the position
#defined by the elements of m1, otherwise make the elements zero
Reduce("+", lapply(1:length(m1), function(i)
replace(rep(0, mymax), m1[[i]], i)))
# [1] 10 10 10 10 9 9 8 8 8 8 7 7 7 6 6 6 6 6 6 5 5 5 5
#[24] 4 4 4 4 4 4 3 3 3 3 2 2 1 1 1
Here is a straightforward base-R solution:
# data
m1 <- list(36:38, 34:35, 30:33, 24:29, 20:23, 14:19, 11:13, 7:10, 5:6, 1:4, integer(0))
# Count length, and repeat each number in 1:11 accordingly
rev(rep(1:11, sapply(m1, length)))
[1] 10 10 10 10 9 9 8 8 8 8 7 7 7 6 6 6 6 6 6 5 5 5 5 4 4 4 4 4 4 3 3 3
[33] 3 2 2 1 1 1
Edit:
A more generalisable answer would be:
rev(rep(seq_along(m1), sapply(m1, length)))
Try this:
rev(unlist(sapply(1:length(m1), function(x) rep(x,length(m1[[x]])))))
# or even better, @snoram's edited version of this:
rev(rep(seq_along(m1), sapply(m1, length)))
Output:
[1] 10 10 10 10 9 9 8 8 8 8 7 7 7 6 6 6 6 6 6 5 5 5 5 4
[25] 4 4 4 4 4 3 3 3 3 2 2 1 1 1
Sample data:
m1 <- list(36:38,34:35,30:33,24:29,20:23,
14:19,11:13,7:10,5:6,1:4)
names(m1) <- 1:10
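Another way to express the same idea, without relying on the order of the list, is to assign each label directly at the positions it belongs to. This is a sketch that assumes no position appears in more than one list element; the name out is made up for the example.

```r
m1 <- list(36:38, 34:35, 30:33, 24:29, 20:23,
           14:19, 11:13, 7:10, 5:6, 1:4, integer(0))

# Start with zeros, one slot per position up to the largest one mentioned.
out <- integer(max(unlist(m1)))

# Label of each position = index of the list element that contains it.
out[unlist(m1)] <- rep(seq_along(m1), lengths(m1))
out
##  [1] 10 10 10 10  9  9  8  8  8  8  7  7  7  6  6  6  6  6  6  5  5  5  5  4  4
## [26]  4  4  4  4  3  3  3  3  2  2  1  1  1
```

Positions that are not listed in any element would stay 0, and empty elements such as integer(0) contribute nothing.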

how to make a vector of x from 1 to max value

In a dataset like this one
what code should I use if I want to make a vector of
x <- 1: max (day)/ID
? So x will be
1:7 for B1
1:11 for B2
1:22 for B3
I tried
max_day <- summaryBy(day ~ ID , df ,FUN=max) # to extract the maximum day per ID
df<- merge (df, max_day) ## to create another column with the maximum day
max_day<- unique(df[,c("ID", "day.max")]) ## to have one value (max) per ID
## and finally the vector
x <- 1: (max_day$day.max)
I got this message
Warning message:
In 1:(max_day$day.max) :
numerical expression has 11134 elements: only the first used
Any suggestions?
tapply(df$day, df$ID, function(x) 1:max(x))
I don't know what your output should look like, but you can try this:
my_data <- data.frame(ID = c(rep("B1", 3), rep("B2", 4), rep("B3", 3)),
day = sample(1:20, 10, replace = TRUE))
# day is random, so the output below varies between runs
tmp <- aggregate(my_data$day, by = list(my_data$ID), FUN = max)
sapply(1:nrow(tmp), function(y) return(1:tmp$x[y]))
# [[1]]
# [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
# [[2]]
# [1] 1 2 3 4 5 6 7 8 9 10 11
# [[3]]
# [1] 1 2 3 4 5 6 7 8 9 10 11
We can use sapply to loop over the unique elements of ID and generate a sequence from 1 to the max for that ID in the day column
sapply(unique(df$ID), function(x) seq(1, max(df[df$ID == x, "day"])))
#[[1]]
#[1] 1 2 3 4 5 6 7
#[[2]]
#[1] 1 2 3 4 5 6 7 8 9 10 11
#[[3]]
#[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
If we want everything as one vector, we can try unlist
unlist(sapply(unique(df$ID), function(x) seq(1, max(df[df$ID == x, "day"]))))
#[1] 1 2 3 4 5 6 7 1 2 3 4 5 6 7 8 9 10 11 1 2 3 4 5 6 7 8 9 10
# 11 12 13 14 15 16 17 18 19 20 21 22
Yet another option, using Hadley Wickham's purrr package, as part of the tidyverse.
d <- data.frame(id = rep(c("B1", "B2", "B3"), c(3, 4, 5)),
v = c(1:3, 1:4, 1:5),
day = c(1, 3, 7, 1, 5, 9, 11, 3, 5, 11, 20, 22),
number = c(15, 20, 30, 25, 26, 28, 35, 10, 12, 14, 16, 18))
library(purrr)
d %>%
split(.$id) %>%
map(~1:max(.$day))
# $B1
# [1] 1 2 3 4 5 6 7
# $B2
# [1] 1 2 3 4 5 6 7 8 9 10 11
# $B3
# [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
df <-
data.frame(ID = c(rep("B1",3),rep("B2",4),rep("B3",5)),
V = c(1,2,3,1,2,3,4,1,2,3,4,5),
day = c(1,3,7,1,5,9,11,3,5,11,20,22),
number = c(15,20,30,25,26,28,35,10,12,14,16,18))
x <- list()
n <- 1
for(i in unique(df$ID)){
max_day <- max(df$day[df$ID==i])
x[[n]] <- 1:max_day
n <- n+1
}
x
[[1]]
[1] 1 2 3 4 5 6 7
[[2]]
[1] 1 2 3 4 5 6 7 8 9 10 11
[[3]]
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
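The warning in the question arises because `:` only accepts a single value on each side, so 1:(max_day$day.max) uses just the first maximum. A compact base R sketch that builds one sequence per ID is shown below; it assumes the data frame has the ID and day columns used in the answers above, and it keeps the ID values as names of the result.

```r
df <- data.frame(
  ID  = rep(c("B1", "B2", "B3"), c(3, 4, 5)),
  day = c(1, 3, 7, 1, 5, 9, 11, 3, 5, 11, 20, 22)
)

# Split the days by ID, then build 1..max(day) for each group.
x <- lapply(split(df$day, df$ID), function(d) seq_len(max(d)))
lengths(x)
## B1 B2 B3
##  7 11 22
```

seq_len is used instead of 1:max(d) so that a maximum of 0 would give an empty sequence rather than c(1, 0).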

Resources