Split to list() based on condition, omitting the False elements - r

What is the most elegant way to split a vector into n elements based on a condition?
Each separate run of TRUE values should go into its own list element. All the FALSE elements are thrown away.
example1:
vec <- c(1:3,NA,NA,NA,4:6,NA,NA,NA,7:9,NA)
cond <- !is.na(vec)
result = list(1:3,4:6,7:9)
example2:
vec_2 <- c(3:1,11:13,6:4,14:16,9:7,20)
cond_2 <- vec_2 < 10
results_2 = list(3:1,6:4,9:7)
It would be great to have a general solution for a vector vec and a corresponding condition cond.
My best try:
res <- split(vec,data.table::rleidv(cond))
odd <- as.logical(seq_along(res)%%2)
res[if(cond[1])odd else !odd]

I guess this should work generally:
> split(vec[cond], data.table::rleid(cond)[cond])
$`1`
[1] 1 2 3
$`3`
[1] 4 5 6
$`5`
[1] 7 8 9
Let's make it a function:
> f <- function(vec, cond) split(vec[cond], data.table::rleid(cond)[cond])
> f(vec_2, cond_2)
$`1`
[1] 3 2 1
$`3`
[1] 6 5 4
$`5`
[1] 9 8 7

Here is a base R option with rle
grp <- with(rle(cond), rep(seq_along(values) * NA^ !values, lengths))
split(vec[cond], grp[cond])
#$`1`
#[1] 1 2 3
#$`3`
#[1] 4 5 6
#$`5`
#[1] 7 8 9
Similarly with 'vec_2'
grp <- with(rle(cond_2), rep(seq_along(values) * NA^ !values, lengths))
split(vec_2[cond_2], grp[cond_2])
#$`1`
#[1] 3 2 1
#$`3`
#[1] 6 5 4
#$`5`
#[1] 9 8 7
Or create a grouping variable with cumsum and diff (using !cond rather than is.na(vec) keeps it valid for any condition)
grp <- cumsum(c(TRUE, diff(cond) < 0)) * NA^ !cond
split(vec[cond], grp[cond])
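The cumsum/diff idea can also be wrapped into a small base R helper. This is only a sketch: the name split_true_runs is made up for illustration, and it assumes cond is a logical vector of the same length as vec that contains no NA values.

```r
# Hypothetical helper (name chosen for illustration): split `vec` into one
# list element per consecutive run of TRUE values in `cond`.
split_true_runs <- function(vec, cond) {
  stopifnot(length(vec) == length(cond), !anyNA(cond))
  # The run id increases by one every time cond drops from TRUE to FALSE
  run_id <- cumsum(c(TRUE, diff(cond) < 0))
  # Keep only the TRUE positions, then split them by run id
  unname(split(vec[cond], run_id[cond]))
}

vec_2 <- c(3:1, 11:13, 6:4, 14:16, 9:7, 20)
res_2 <- split_true_runs(vec_2, vec_2 < 10)
str(res_2)
```

Because the values are subset with cond before splitting, the NA^ trick is not needed here, and unname() drops the run ids if plain list positions are preferred.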

Related

How do I Split a vector to equal length with the left over adding up from the first element(s) of parent vector with R

I want to split a vector into subvectors of equal length in R. If the last subvector is shorter than the others, it should be filled up with the first element(s) of the parent vector.
I tried the following, taken from an answer to another question, but it does not give what I want.
ts <- 1:11
bs <- 3
nb <- ceiling(length(ts) / bs)
split(ts, rep(1:nb, each=bs, length.out = length(ts)))
#$`1`
#[1] 1 2 3
#$`2`
#[1] 4 5 6
#$`3`
#[1] 7 8 9
#$`4`
#[1] 10 11
What I want as output
#$`1`
#[1] 1 2 3
#$`2`
#[1] 4 5 6
#$`3`
#[1] 7 8 9
#$`4`
#[1] 10 11 1
#Extend the `ts` to have a total length of `bs * nb`
split(rep(ts, length.out = nb * bs), rep(1:nb, each = bs))
#OR use modular arithmetic
split(ts[((sequence(nb * bs) - 1) %% length(ts)) + 1], rep(1:nb, each = bs))
#$`1`
#[1] 1 2 3
#$`2`
#[1] 4 5 6
#$`3`
#[1] 7 8 9
#$`4`
#[1] 10 11 1
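As a sketch, the first variant can be packaged into a reusable function. The name split_wrap and its argument names are assumptions made for this example; it recycles the vector with rep(..., length.out = ) so that the last chunk is filled from the start of the vector.

```r
# Hypothetical helper: split `x` into chunks of `size`, wrapping around to the
# start of `x` to fill the last chunk when length(x) is not a multiple of size.
split_wrap <- function(x, size) {
  n_chunks <- ceiling(length(x) / size)
  # Recycle x until it has exactly n_chunks * size elements
  padded <- rep(x, length.out = n_chunks * size)
  split(padded, rep(seq_len(n_chunks), each = size))
}

chunks <- split_wrap(1:11, 3)
chunks[[4]]
```

When length(x) is already a multiple of size, nothing is recycled and the result matches the plain split().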

Convert each row of dataframe to new list in R

I have the sample input data below:
> df <- data.frame(a=c(1,2,9),b=c(3,4,5),c=c(2,6,7))
> df
a b c
1 1 3 2
2 2 4 6
3 9 5 7
I am trying to convert each row into a separate list.
My attempt:
> apply(df,1,as.list)
The above solution converts each row into a sublist. But I am looking for 3 separate lists in this case.
nrow(df) = number of lists
Desired output:
> list1
$a
[1] 1
$b
[1] 3
$c
[1] 2
> list2
$a
[1] 2
$b
[1] 4
$c
[1] 6
> list3
$a
[1] 9
$b
[1] 5
$c
[1] 7
You can use by and as.list
out <- by(df, 1:nrow(df), as.list)
out
#1:nrow(df): 1
#$a
#[1] 1
#
#$b
#[1] 3
#$c
#[1] 2
#------------------------------------------------------------------------------
#1:nrow(df): 2
#$a
#[1] 2
#$b
#[1] 4
#$c
#[1] 6
#------------------------------------------------------------------------------
#1:nrow(df): 3
#$a
#[1] 9
#$b
#[1] 5
#$c
#[1] 7
That creates an object of class by, so you may want to call unclass(out) at the end.
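If a plain list of lists is preferred over a by object, a minimal alternative sketch with lapply is shown here. The names rows and list1, list2, list3 are chosen only to mirror the desired output in the question.

```r
df <- data.frame(a = c(1, 2, 9), b = c(3, 4, 5), c = c(2, 6, 7))

# One plain named list per row; drop = FALSE keeps a one-row data frame even
# when df has a single column, and as.list() turns it into a named list.
rows <- lapply(seq_len(nrow(df)), function(i) as.list(df[i, , drop = FALSE]))
names(rows) <- paste0("list", seq_along(rows))

rows$list2
```

Keeping the rows together in one named list is usually easier to work with than creating separate variables list1, list2, list3 in the workspace.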

How to remove zero vectors from a list of list in r

Problem:
I have a list of two lists of three vectors. I would like to remove the zero vector from each sublist.
Example:
x <- list(x1=c(0,0,0), x2=c(3,4,5), x3=c(45,34,23))
y <- list(y1=c(2,33,4), y2=c(0,0,0), y3=c(4,5,44))
z <- list(x, y)
Try:
I tried this:
res <- lapply(1:2, function(i) {lapply(1:3, function(j) z[[i]][[j]][z[[i]][[j]] != 0])})
Which gave me this:
> res
[[1]]
[[1]][[1]]
numeric(0)
[[1]][[2]]
[1] 3 4 5
[[1]][[3]]
[1] 45 34 23
[[2]]
[[2]][[1]]
[1] 2 33 4
[[2]][[2]]
numeric(0)
[[2]][[3]]
[1] 4 5 44
Problem with the output:
I do not want numeric(0).
Expected output:
x= list(x2, x3)
y=list(y1, y3)
Any idea, please?
You can try a tidyverse approach if the nested list structure is not important
library(tidyverse)
z %>%
flatten() %>%
keep(~all(. != 0))
$x2
[1] 3 4 5
$x3
[1] 45 34 23
$y1
[1] 2 33 4
$y3
[1] 4 5 44
Given your structure of list of lists I would go with the following:
filteredList <- lapply(z, function(i) Filter(function(x) any(x != 0), i))
x <- filteredList[[1]]
y <- filteredList[[2]]
x
##$`x2`
##[1] 3 4 5
##$x3
##[1] 45 34 23
y
##$`y1`
##[1] 2 33 4
##$y3
##[1] 4 5 44
Define z as
z <- c(x, y)
# z <- unlist(z, recursive = FALSE) if you cannot define z by yourself.
Then use:
z[sapply(z, any)]
#$`x2`
#[1] 3 4 5
#$x3
#[1] 45 34 23
#$y1
#[1] 2 33 4
#$y3
#[1] 4 5 44
Please note:
As in C, every non-zero integer or numeric value is coerced to TRUE, so we can use that logic in this task: any() evaluates to FALSE only if all values are 0. R does issue a warning when it coerces doubles to logical this way.
Or:
x <- list(x1=c(0,0,0), x2=c(3,4,5), x3=c(45,34,23))
y <- list(y1=c(2,33,4), y2=c(0,0,0), y3=c(4,5,44))
z <- list(x, y)
lapply(z, function(a) a[unlist(lapply(a, function(b) !identical(b, rep(0,3))))])
#[[1]]
#[[1]]$`x2`
#[1] 3 4 5
#
#[[1]]$x3
#[1] 45 34 23
#
#
#[[2]]
#[[2]]$`y1`
#[1] 2 33 4
#
#[[2]]$y3
#[1] 4 5 44
With purrr it can be really compact
library(purrr)
map(z, keep ,~all(.!=0))
# [[1]]
# [[1]]$x2
# [1] 3 4 5
#
# [[1]]$x3
# [1] 45 34 23
#
#
# [[2]]
# [[2]]$y1
# [1] 2 33 4
#
# [[2]]$y3
# [1] 4 5 44
If it wasn't for the annoying warnings we could do just map(z, keep , all)
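The answers above use two different predicates, all(. != 0) and any(x != 0). They agree on the example data but differ for vectors that mix zeros and non-zeros. The sketch below uses a slightly modified, hypothetical input (x3 contains zeros) to show the difference; comparing with != 0 explicitly also avoids the coercion warnings.

```r
# Modified example data (x3 deliberately mixes zero and non-zero values)
z <- list(
  list(x1 = c(0, 0, 0), x2 = c(3, 4, 5), x3 = c(0, 34, 0)),
  list(y1 = c(2, 33, 4), y2 = c(0, 0, 0))
)

# Drop only vectors that are entirely zero
keep_any <- lapply(z, function(sub) Filter(function(v) any(v != 0), sub))
# Drop every vector that contains at least one zero
keep_all <- lapply(z, function(sub) Filter(function(v) all(v != 0), sub))

names(keep_any[[1]])
names(keep_all[[1]])
```

If the goal is strictly to remove zero vectors, the any(v != 0) form is the one that matches that wording.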

Replace NAs in multiple list elements in R

Assume I have the following list:
list(c(1:5,NA,NA),NA,c(NA,6:10))
[[1]]
[1] 1 2 3 4 5 NA NA
[[2]]
[1] NA
[[3]]
[1] NA 6 7 8 9 10
I want to replace all NAs with 0:
[[1]]
[1] 1 2 3 4 5 0 0
[[2]]
[1] 0
[[3]]
[1] 0 6 7 8 9 10
I was originally thinking is.na would be involved, but couldn't get it to affect all list elements. I learned from the related question (Remove NA from list of lists), that using lapply would allow me to apply is.na to each element, but that post demonstrates how to remove (not replace) NA values.
How do I replace NA values from multiple list elements?
I've tried for loops and ifelse approaches, but everything I've tried is either slow, doesn't work or just plain clunky. There's got to be a simple way to do this with an apply function...
And there is!
Here's a simple lapply approach using the replace function:
L1 <-list(c(1:5,NA,NA),NA,c(NA,6:10))
lapply(L1, function(x) replace(x,is.na(x),0))
With the desired result:
[[1]]
[1] 1 2 3 4 5 0 0
[[2]]
[1] 0
[[3]]
[1] 0 6 7 8 9 10
There are multiple ways to do this:
Using map from the purrr package:
library(purrr)
lt <- list(c(1:5,NA,NA),NA,c(NA,6:10))
lt %>%
map(~replace(., is.na(.), 0))
#output
[[1]]
[1] 1 2 3 4 5 0 0
[[2]]
[1] 0
[[3]]
[1] 0 6 7 8 9 10
kk<- list(c(1:5,NA,NA),NA,c(1,6:10))
lapply(kk, function(i)
{ p<- which(is.na(i)==TRUE)
i[p] <- 0
i
})
Edited upon Gregor's comment:
lapply(kk, function(i) {i[is.na(i)] <- 0; i})
I've decided to benchmark the various lapply approaches mentioned:
lapply(Lt, function(x) replace(x,is.na(x),0))
lapply(Lt, function(x) {x[is.na(x)] <- 0; x})
lapply(Lt, function(x) ifelse(is.na(x), 0, x))
Benchmarking code:
Lt <- lapply(1:10000, function(x) sample(c(1:10000,rep(NA,1000))) ) ##Sample list
elapsed.time <- data.frame(
m1 = mean(replicate(25,system.time(lapply(Lt, function(x) replace(x,is.na(x),0)))[3])),
m2 = mean(replicate(25,system.time(lapply(Lt, function(x) {x[is.na(x)] <- 0; x}))[3])),
m3 = mean(replicate(25,system.time(lapply(Lt, function(x) ifelse(is.na(x), 0, x)))[3]))
)
Results:
Function                                          Average elapsed time (s)
lapply(Lt, function(x) replace(x,is.na(x),0))     0.8684
lapply(Lt, function(x) {x[is.na(x)] <- 0; x})     0.8936
lapply(Lt, function(x) ifelse(is.na(x), 0, x))    8.3176
The replace approach is fastest, followed closely by the [] approach. The ifelse approach is roughly 10x slower.
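Timings like these depend on the machine and R version, so they are best re-run locally. Independently of speed, it is worth confirming that the three approaches return the same values. A small sketch on the list from the question (the name Lt_small is made up here):

```r
Lt_small <- list(c(1:5, NA, NA), NA, c(NA, 6:10))

r1 <- lapply(Lt_small, function(x) replace(x, is.na(x), 0))
r2 <- lapply(Lt_small, function(x) { x[is.na(x)] <- 0; x })
r3 <- lapply(Lt_small, function(x) ifelse(is.na(x), 0, x))

# Both comparisons should report TRUE if the approaches agree
all.equal(r1, r2)
all.equal(r1, r3)
```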
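This will deal with any list depth and structure (a is the list here; note that gsub replaces every literal "NA" in the deparsed text, including inside strings or names):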
x <- eval(parse(text=gsub("NA","0",capture.output(dput(a)))))
# [[1]]
# [1] 1 2 3 4 5 0 0
#
# [[2]]
# [1] 0
#
# [[3]]
# [1] 0 6 7 8 9 10
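For nested lists, base R's rapply offers a way to reach every leaf without deparsing and re-parsing the object. A minimal sketch, assuming the leaves are atomic vectors (the names nested and filled are illustrative):

```r
# A hypothetical nested list with NAs at different depths
nested <- list(c(1, NA), list(NA, c(NA, 2)))

# how = "replace" keeps the list structure and swaps in the modified leaves
filled <- rapply(nested, function(v) replace(v, is.na(v), 0), how = "replace")
str(filled)
```

This leaves character data untouched unless it is genuinely NA, which the text-substitution approach cannot guarantee.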
If you want to drop the NAs instead of replacing them, try this:
lapply(enlist, function(x) { x[!is.na(x)]})
where:
enlist <- list(c(1:5,NA,NA),NA,c(NA,6:10))
This yields:
[[1]]
[1] 1 2 3 4 5
[[2]]
logical(0)
[[3]]
[1] 6 7 8 9 10

Sub-setting elements of a list in R

Assume that this is my list
a <- list(c(1,2,4))
a[[2]] <- c(2,10,3,2,7)
a[[3]] <- c(2, 2, 14, 5)
How do I subset this list to exclude all the 2s, so that I obtain the following?
[[1]]
[1] 1 4
[[2]]
[1] 10 3 7
[[3]]
[1] 14 5
My current solution:
for(j in seq(1, length(a))){
a[[j]] <- a[[j]][a[[j]] != 2]
}
However, this approach feels a bit unnatural. How would I do the same thing with a function from the apply family?
Thanks!
lapply(a, function(x) x[x != 2])
#[[1]]
#[1] 1 4
#
#[[2]]
#[1] 10 3 7
#
#[[3]]
#[1] 14 5
Using lapply you can apply the subset to each vector in the list. The subset used is x[x != 2].
Or use setdiff by looping over the list with lapply
lapply(a, setdiff, 2)
#[[1]]
#[1] 1 4
#[[2]]
#[1] 10 3 7
#[[3]]
#[1] 14 5
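The two approaches give the same result for the list above, but they are not equivalent in general: setdiff treats its input as a set and therefore also collapses duplicated values. A small sketch with a made-up vector:

```r
x <- c(5, 5, 2, 7, 5)

kept_by_subset  <- x[x != 2]      # 5 5 7 5: duplicates are preserved
kept_by_setdiff <- setdiff(x, 2)  # 5 7: duplicates are collapsed

kept_by_subset
kept_by_setdiff
```

So x[x != 2] is the safer choice when repeated values matter.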
