Split a vector by its sequences

The following vector x contains the two sequences 1:4 and 6:7, along with other values that are not part of any sequence.
x <- c(7, 1:4, 6:7, 9)
I'd like to split x by its sequences, so that the result is a list like the following.
# [[1]]
# [1] 7
#
# [[2]]
# [1] 1 2 3 4
#
# [[3]]
# [1] 6 7
#
# [[4]]
# [1] 9
Is there a quick and simple way to do this?
I've tried
split(x, c(0, diff(x)))
which gets close, but I don't feel like appending 0 to the differenced vector is the right way to go. Using findInterval didn't work either.

split(x, cumsum(c(TRUE, diff(x)!=1)))
#$`1`
#[1] 7
#
#$`2`
#[1] 1 2 3 4
#
#$`3`
#[1] 6 7
#
#$`4`
#[1] 9
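To see why this one-liner works, it may help to look at the intermediate values. The sketch below uses the x from the question; the names starts, grp and res are only illustrative. It assumes "sequence" means consecutive values increasing by exactly 1.

```r
x <- c(7, 1:4, 6:7, 9)

# Differences between neighbours; a difference of 1 continues the current run
diff(x)
# [1] -6  1  1  1  2  1  2

# TRUE marks the first element of each run; the leading TRUE opens the first run
starts <- c(TRUE, diff(x) != 1)

# The cumulative sum of the run starts gives one group id per element
grp <- cumsum(starts)
grp
# [1] 1 2 2 2 2 3 3 4

res <- split(x, grp)
```

Prepending TRUE (rather than 0, as in the question's attempt) is what keeps the grouping vector the same length as x.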

Just for fun, you can make use of Carl Witthoft's seqle function from his "cgwtools" package. (It's not going to be anywhere near as efficient as Roland's answer.)
library(cgwtools)
## Here's what seqle does...
## It's like rle, but for sequences
seqle(x)
# Run Length Encoding
# lengths: int [1:4] 1 4 2 1
# values : num [1:4] 7 1 6 9
y <- seqle(x)
split(x, rep(seq_along(y$lengths), y$lengths))
# $`1`
# [1] 7
#
# $`2`
# [1] 1 2 3 4
#
# $`3`
# [1] 6 7
#
# $`4`
# [1] 9
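If installing a package is not an option, the same lengths and starting values that seqle reported above can be recovered in base R. This is only a sketch that reuses the cumsum/diff grouping idea; run_lengths and run_starts are illustrative names, not part of cgwtools.

```r
x <- c(7, 1:4, 6:7, 9)

# Group id per element: a new group starts wherever the step is not exactly 1
grp <- cumsum(c(TRUE, diff(x) != 1))

# Length of each run of consecutive values
run_lengths <- rle(grp)$lengths

# First value of each run
run_starts <- x[!duplicated(grp)]

# Same split as above, built from the run lengths
res <- split(x, rep(seq_along(run_lengths), run_lengths))
```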

Related

How to create bijection between two lists?

Good afternoon!
I want to transform a list like the following:
list_1 = list(c(1,30,25), c(51,70), c(102,130,125))
into:
list_2 = list(c(1,2,3), c(4,5), c(6,7,8))
I know that we can retrieve the lengths of list_1 with:
lengths(list_1)
# [1] 3 2 3
list_2 holds the positions that the elements of list_1 would have if we unlisted them.
I hope my question is clear. Thank you in advance for your help!
Using split.
ll <- lengths(list_1)
unname(split(seq_along(unlist(list_1)), rep(seq_along(ll), ll)))
# [[1]]
# [1] 1 2 3
#
# [[2]]
# [1] 4 5
#
# [[3]]
# [1] 6 7 8
An option with relist
relist(seq_along(unlist(list_1)), skeleton = list_1)
#[[1]]
#[1] 1 2 3
#[[2]]
#[1] 4 5
#[[3]]
#[1] 6 7 8
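Another way to think about the problem is that each element of list_2 is a range of positions whose end is the running total of the lengths. The following is a minimal sketch of that idea, using list_1 from the question; starts and ends are illustrative names, and it assumes every element of list_1 has at least one value.

```r
list_1 <- list(c(1, 30, 25), c(51, 70), c(102, 130, 125))

ll <- lengths(list_1)

# Last position of each element once the list is flattened
ends <- cumsum(ll)

# First position of each element
starts <- ends - ll + 1

# One index range per element of list_1
list_2 <- Map(seq, starts, ends)
```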

Split to list() based on condition, omitting the FALSE elements

What is the most elegant way to split a vector into n elements based on a condition?
Every separate block of TRUE values should go into its own list element. All the FALSE elements get thrown away.
example1:
vec <- c(1:3,NA,NA,NA,4:6,NA,NA,NA,7:9,NA)
cond <- !is.na(vec)
result = list(1:3,4:6,7:9)
example2:
vec_2 <- c(3:1,11:13,6:4,14:16,9:7,20)
cond_2 <- vec_2 < 10
results_2 = list(3:1,6:4,9:7)
It would be great to have a general solution for a vector vec and a corresponding condition cond.
My best try:
res <- split(vec,data.table::rleidv(cond))
odd <- as.logical(seq_along(res)%%2)
res[if(cond[1])odd else !odd]
I guess this should work generally:
> split(vec[cond], data.table::rleid(cond)[cond])
$`1`
[1] 1 2 3
$`3`
[1] 4 5 6
$`5`
[1] 7 8 9
Let's make it a function:
> f <- function(vec, cond) split(vec[cond], data.table::rleid(cond)[cond])
> f(vec_2, cond_2)
$`1`
[1] 3 2 1
$`3`
[1] 6 5 4
$`5`
[1] 9 8 7
Here is a base R option with rle
grp <- with(rle(cond), rep(seq_along(values) * NA^ !values, lengths))
split(vec[cond], grp[cond])
#$`1`
#[1] 1 2 3
#$`3`
#[1] 4 5 6
#$`5`
#[1] 7 8 9
Similarly with 'vec_2'
grp <- with(rle(cond_2), rep(seq_along(values) * NA^ !values, lengths))
split(vec_2[cond_2], grp[cond_2])
#$`1`
#[1] 3 2 1
#$`3`
#[1] 6 5 4
#$`5`
#[1] 9 8 7
Or create a grouping variable with cumsum and diff, then split on it (split drops the elements whose group is NA)
grp <- cumsum(c(TRUE, diff(cond) < 0)) * NA^ !cond
split(vec, grp)
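The base R approach with rle can be wrapped in a small helper and checked against both examples from the question. This is a sketch, not a definitive implementation: split_true_runs is a hypothetical name, and it assumes cond is a logical vector of the same length as vec with no NA values.

```r
split_true_runs <- function(vec, cond) {
  # Number every run of identical values in cond: 1, 2, 3, ...
  runs <- rle(cond)
  run_id <- rep(seq_along(runs$lengths), runs$lengths)
  # Keep only the TRUE positions and split them by the run they belong to
  unname(split(vec[cond], run_id[cond]))
}

vec <- c(1:3, NA, NA, NA, 4:6, NA, NA, NA, 7:9, NA)
res_1 <- split_true_runs(vec, !is.na(vec))

vec_2 <- c(3:1, 11:13, 6:4, 14:16, 9:7, 20)
res_2 <- split_true_runs(vec_2, vec_2 < 10)

# Also works when the condition starts with FALSE
res_3 <- split_true_runs(c(NA, 1, 2, NA, 3), c(FALSE, TRUE, TRUE, FALSE, TRUE))
```

Indexing both vec and run_id by cond before splitting avoids any reliance on whether the first run is TRUE or FALSE.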

Getting all the combination of numbers from a list that would sum to a specific number

I have the following list of numbers (1,3,4,5,7,9,10,12,15) and I want to find out all the possible combinations of 3 numbers from this list that would sum to 20.
My research on stackoverflow has led me to this post:
Finding all possible combinations of numbers to reach a given sum
There is a solution provided by Mark, which reads as follows:
subset_sum = function(numbers, target, partial = 0) {
  if (any(is.na(partial))) return()
  s = sum(partial)
  if (s == target) print(sprintf("sum(%s)=%s", paste(partial[-1], collapse = "+"), target))
  if (s > target) return()
  for (i in seq_along(numbers)) {
    n = numbers[i]
    remaining = numbers[(i + 1):length(numbers)]
    subset_sum(remaining, target, c(partial, n))
  }
}
However, I am having a hard time adapting this code to my problem. Or maybe there is a simpler solution?
I want the output in R to show me the matching sets of numbers.
Any help would be appreciated.
You can use the combn function and filter its result to meet your criteria. I have done the calculation below in two steps, but it can also be done in a single step.
v <- c(1,3,4,5,7,9,10,12,15)
AllComb <- combn(v, 3) #generates all combination taking 3 at a time.
PossibleComb <- AllComb[,colSums(AllComb) == 20] #filter those with sum == 20
#Result: 6 sets of 3 numbers (column-wise)
PossibleComb
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 1 1 1 3 3 4
# [2,] 4 7 9 5 7 7
# [3,] 15 12 10 12 10 9
#
# Result in list
split(PossibleComb, col(PossibleComb))
# $`1`
# [1] 1 4 15
#
# $`2`
# [1] 1 7 12
#
# $`3`
# [1] 1 9 10
#
# $`4`
# [1] 3 5 12
#
# $`5`
# [1] 3 7 10
#
# $`6`
# [1] 4 7 9
combn also has a FUN parameter, which we can set to list so that each combination is returned as a list element, and then Filter the list elements based on the condition
Filter(function(x) sum(x) == 20, combn(v, 3, FUN = list))
#[[1]]
#[1] 1 4 15
#[[2]]
#[1] 1 7 12
#[[3]]
#[1] 1 9 10
#[[4]]
#[1] 3 5 12
#[[5]]
#[1] 3 7 10
#[[6]]
#[1] 4 7 9
data
v <- c(1,3,4,5,7,9,10,12,15)
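For reuse with other sizes and targets, the combn plus Filter idea can be wrapped in a function. This is a minimal sketch; combos_with_sum is a hypothetical name. It assumes v has more than one element (combn treats a single positive integer as seq_len of that number) and that exact comparison with == is acceptable, which holds for whole numbers but not necessarily for arbitrary decimals.

```r
combos_with_sum <- function(v, k, target) {
  # simplify = FALSE makes combn return a list with one vector per combination
  all_combs <- combn(v, k, simplify = FALSE)
  Filter(function(cmb) sum(cmb) == target, all_combs)
}

v <- c(1, 3, 4, 5, 7, 9, 10, 12, 15)
res <- combos_with_sum(v, 3, 20)
length(res)
# [1] 6
```

Note that combn enumerates every combination before filtering, so this is only practical when choose(length(v), k) is reasonably small.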

How to check if the given value belong to the vectors in list?

Suppose we have a value y=4 and a list of vectors. I want to check whether this value belongs to any vector in the list and, if so, add it to every element of the vectors that contain it.
y<-4
M<- list( c(1,3,4,6) , c(2,3,5), c(1,3,6) ,c(1,4,5,6))
> M
[[1]]
[1] 1 3 4 6
[[2]]
[1] 2 3 5
[[3]]
[1] 1 3 6
[[4]]
[1] 1 4 5 6
The outcomes will be similar to :
> R
[[1]]
[1] 5 7 8 10
[[2]]
[1] 5 8 9 10
We can use keep which only keeps elements that satisfy a predicate. In this case, it is only keeping the vectors that contain y.
We then add y to each of the vectors.
library('tidyverse')
keep(M, ~y %in% .) %>%
map(~. + y)
Here is a simple hacky way to do this:
lapply(M[sapply(M, function(x){y %in% x})],function(x){x+y})
returning:
[[1]]
[1] 5 7 8 10
[[2]]
[1] 5 8 9 10
Logic: use sapply to work out which parts of M have a 4 in, then add 4 to those with lapply
You can do this with...
lapply(M[sapply(M, `%in%`, x=y)], `+`, y)
[[1]]
[1] 5 7 8 10
[[2]]
[1] 5 8 9 10
Here is a method with lapply and set functions.
# loop through M, check length of intersect
myList <- lapply(M, function(x) if(length(intersect(y, x)) > 0) x + y else NULL)
# now subset, dropping the NULL elements
myList <- myList[lengths(myList) > 0]
this returns
myList
[[1]]
[1] 5 7 8 10
[[2]]
[1] 5 8 9 10
Everyone has given great answers; for completeness, here is a version that uses Map.
Map("+",M[unlist(Map("%in%", y,M))],y)
[[1]]
[1] 5 7 8 10
[[2]]
[1] 5 8 9 10
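All of the answers above follow the same two steps: select the vectors that contain y, then add y to each of them. The sketch below spells those steps out with base R's Filter; add_if_present is a hypothetical helper name used only for illustration.

```r
y <- 4
M <- list(c(1, 3, 4, 6), c(2, 3, 5), c(1, 3, 6), c(1, 4, 5, 6))

add_if_present <- function(lst, value) {
  # Step 1: keep only the vectors that contain the value
  hits <- Filter(function(v) value %in% v, lst)
  # Step 2: add the value to every element of the kept vectors
  lapply(hits, function(v) v + value)
}

R <- add_if_present(M, y)
```

If no vector contains the value, the result is an empty list.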

Splitting numeric vectors in R

If I have a vector, x <- c(1,2,3,5,7,9,10,12), and another vector, y <- c(3,7,10), how would I produce the following:
[[1]]
1,2,3
[[2]]
5,7
[[3]]
9,10
[[4]]
12
Notice how 3, 7 and 10 become the last number of each list element (except the last one); in a sense, they are the "breakpoints". I am sure there is a simple R function for this that I either don't know or have forgotten.
Here's one way using cut and split:
split(x, cut(x, c(-Inf, y, Inf)))
#$`(-Inf,3]`
#[1] 1 2 3
#
#$`(3,7]`
#[1] 5 7
#
#$`(7,10]`
#[1] 9 10
#
#$`(10, Inf]`
#[1] 12
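You could also do the following (include.lowest = TRUE is needed so that the minimum of x falls into the first interval)
split(x, cut(x, unique(c(y, range(x))), include.lowest = TRUE))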
## $`[1,3]`
## [1] 1 2 3
## $`(3,7]`
## [1] 5 7
## $`(7,10]`
## [1] 9 10
## $`(10,12]`
## [1] 12
Similar to @beginneR's answer, but using findInterval instead of cut
split(x, findInterval(x, y + 1))
# $`0`
# [1] 1 2 3
#
# $`1`
# [1] 5 7
#
# $`2`
# [1] 9 10
#
# $`3`
# [1] 12
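The y + 1 trick relies on x holding whole numbers. As a sketch of a variant that should also cope with non-integer values, findInterval has a left.open argument that makes each interval closed on the right, which matches the default behaviour of cut. The code assumes y is sorted in increasing order, which findInterval requires; by_cut and by_find are illustrative names.

```r
x <- c(1, 2, 3, 5, 7, 9, 10, 12)
y <- c(3, 7, 10)

# Intervals (-Inf,3], (3,7], (7,10], (10,Inf] via cut
by_cut <- split(x, cut(x, c(-Inf, y, Inf)))

# The same grouping via findInterval with right-closed intervals
by_find <- split(x, findInterval(x, y, left.open = TRUE))
```

Both results hold the same groups; only the names of the list elements differ (interval labels for cut, 0 to 3 for findInterval).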
