Remove negative values from nested list - r

I have the following list:
[[1]]
[1] 0 -1 -2 -3 -4
[[2]]
[1] 2 1 0 -1 -2
[[3]]
[1] 2 3 4 5 6
I want to remove the negative values from the list above.
I tried the following code in R:
x[ x > 0 ]
But it does not remove the negative values.

We can use lapply over each list element and select only those values which are greater than or equal to 0.
lapply(lst, function(x) x[x >= 0])
#$l1
#[1] 0
#$l2
#[1] 2 1 0
#$l3
#[1] 2 3 4 5 6
Data
lst = list(l1 = c(0,-1,-2,-3,-4),l2 = c(2,1,0,-1,-2), l3 = c(2,3,4,5,6))
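The question says "negative" but the attempted code uses x > 0, so it is not fully clear whether zeros should stay. A minimal sketch of the stricter variant, assuming zeros should be dropped too and that elements left empty should be removed from the list (both are assumptions, not stated in the question):

```r
lst <- list(l1 = c(0, -1, -2, -3, -4), l2 = c(2, 1, 0, -1, -2), l3 = c(2, 3, 4, 5, 6))

# Keep strictly positive values in every element (assumption: zeros go too).
positive_only <- lapply(lst, function(x) x[x > 0])

# Drop elements that ended up empty; Filter() keeps those with non-zero length.
non_empty <- Filter(length, positive_only)
```

Here l1 contains no positive values, so it becomes numeric(0) after the first step and disappears after Filter().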

Related

Estimate how many consecutive TRUE elements there are in a vector in R

I have a really large boolean vector (i.e. T or F) and I want to simply be able to estimate how many "blocks" of consecutive T there are in my vector contained between the F elements.
A simple example of a vector with 3 of these consecutive "blocks" of T elements:
x <- c(T,T,T,T,F,F,F,F,T,T,T,T,F,T,T)
Output:
1,1,1,1,0,0,0,0,2,2,2,2,0,3,3
You can do:
rle <- rle(x)
rle$values <- with(rle, cumsum(values) * values)
inverse.rle(rle)
[1] 1 1 1 1 0 0 0 0 2 2 2 2 0 3 3
And a simplified and more elegant version of the basic idea (proposed by @Lyngbakr):
with(rle(x), rep(cumsum(values) * values, lengths))
Another solution with rle/inverse.rle:
x <- c(T,T,T,T,F,F,F,F,T,T,T,T,F,T,T)
rle_x <- rle(x)
rle_x$values[rle_x$values] <- seq_len(sum(rle_x$values))
inverse.rle(rle_x)
# [1] 1 1 1 1 0 0 0 0 2 2 2 2 0 3 3
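If only the number of TRUE blocks is needed, rather than a label for each element, the same rle() idea can be shortened. A small sketch using the example vector from the question; the variable names are illustrative only:

```r
x <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE,
       TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE)

# Each run of TRUE contributes exactly one TRUE to rle(x)$values.
n_blocks <- sum(rle(x)$values)

# Alternative without rle(): count the FALSE -> TRUE transitions,
# padding with a leading FALSE so a block at the start is counted.
n_blocks_alt <- sum(diff(c(FALSE, x)) == 1)
```

Both expressions count the three blocks in this example.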

na.pad not working in diff() function

For some reason the diff() function's na.pad parameter is not working properly. Is anyone else having this problem, or does anyone have a workaround?
yo <- c(5,3,3,4,5,6,5,8,9)
diff(yo, na.pad = TRUE)
[1] -2 0 1 1 1 -1 3 1
The resulting vector should be:
[1] NA -2 0 1 1 1 -1 3 1
The diff method you have in mind almost certainly comes from the zoo/xts packages; na.pad has no effect on plain base R vectors, where it is silently ignored. You also need to convert your vector to a time series object:
library(xts)
library(zoo)
yy = zoo(yo)
diff(yy, na.pad=TRUE)
# 1 2 3 4 5 6 7 8 9
#NA -2 0 1 1 1 -1 3 1
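If a plain vector is all that is needed, the padding can also be done by hand in base R without any package. A minimal sketch, assuming a lag of 1 and a single round of differencing:

```r
yo <- c(5, 3, 3, 4, 5, 6, 5, 8, 9)

# Base diff() returns length(yo) - 1 values; prepend one NA to realign.
padded <- c(NA, diff(yo))
```

The result has the same length as yo, with NA in the first position.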

Unexpected output from rollapply due to zero length output from the function

I'm encountering a problem that I'm failing to understand. Here is the commented code:
library(zoo)
#Pattern matrix p, whose rows are fed to apply() inside function f
> p
[,1] [,2] [,3]
[1,] -1 1 1
[2,] 1 1 1
#Function f is supposed to take a vector x, compare sign(x) with each row of matrix p,
#and return the index of the matching row
f <- function(x) { # identifies which row of `p` matches sign(x)
which(apply(p,1,function(row)all(row==sign(x))))
}
#rollapplying f over c(1,-1,1,1): there is no row c(1,-1,1) in p,
#so why is the first element 1?
> rollapply(c(1,-1,1,1),width=3,f,align="left")
[1] 1 1
#rollapply with the identity function: rollapply is supposed to feed each window to the function, right?
> t = rollapply(c(1,-1,1,1),width=3,function(x)x,align="left")
[,1] [,2] [,3]
[1,] 1 -1 1
[2,] -1 1 1
#Feeding the first row of the preceding matrix to f gives the expected result
> f(t[1,])
integer(0)
#rollapply feeds the rolls to the function
> rollapply(c(1,-1,1,1),width=3,function(x) paste(x,collapse=","),align="left")
[1] "1,-1,1" "-1,1,1"
#rollapply feeds vectors to the function
> rollapply(c(1,-1,1,1),width=3,function(x) is.vector(x),align="left")
[1] TRUE TRUE
#Inconsistent with the two preceding results
> f(c(1,-1,1))
integer(0)
Basically I don't understand why rollapply(c(1,-1,1,1),width=3,f,align="left") is returning 1 1 when the first roll from rollapply is supposed to yield the vector 1 -1 1 that is absent from the pattern matrix p. I was expecting the result NA 1 instead. There must be something I don't understand about rollapply but strangely enough if I feed the vector c(-1, -1, -1 ,-1) to rollapply I get the expected result NA NA. In some cases I have a mix 1 2 but never a mix NA 1 or NA 2.
According to G. Grothendieck rollapply does not support functions producing zero length outputs. It is possible to get rid of the problem by adding a condition in the function f returning a specific output in case it was returning zero length output.
f <- function(x) { # identifies which row of `p` matches sign(x)
idx <- which(apply(p,1,function(row)all(row==sign(x))))
if (length(idx) == 0) 0 else idx # return 0 instead of a zero length result
}
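Since the question expected NA 1 rather than 0 1, the same guard can return NA instead. The sketch below uses base R only, so the windows are built by hand rather than by rollapply; the name f_na and the use of idx[1] to force a single value per window are assumptions made for illustration:

```r
# Pattern matrix from the question: rows are c(-1, 1, 1) and c(1, 1, 1).
p <- matrix(c(-1, 1, 1, 1, 1, 1), nrow = 2)

# Hypothetical variant of f that always returns exactly one integer.
f_na <- function(x) {
  idx <- which(apply(p, 1, function(row) all(row == sign(x))))
  if (length(idx) == 0) NA_integer_ else idx[1]
}

x <- c(1, -1, 1, 1)
width <- 3

# Apply f_na to each left-aligned window of the given width.
res <- vapply(seq_len(length(x) - width + 1),
              function(i) f_na(x[i:(i + width - 1)]),
              integer(1))
```

With every call returning a length-one value, res holds NA for the first window and 1 for the second.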
For completeness, quoting GGrothendieck's comment. "rollapply does not support functions producing zero length outputs. " That is consistent with the behavior below.
Further confusion, at least for me (this should be a comment but I wanted some decent formatting):
sfoo<-c(1,-1,1,1,1,-1,1,1)
rollapply(sfoo,width=3,function(j){k<-f(j);print(j);return(k)})
[1] 1 -1 1
[1] -1 1 1
[1] 1 1 1
[1] 1 1 -1
[1] 1 -1 1
[1] -1 1 1
[1] 1 2 1 1 2 1
I then tried:
ff<-function(x) which(rowSums(p)==sum(x))
sbar<-c(0,1,1,1,-1,0,-1)
rollapply(sbar,width=3,function(j){k<-ff(j);print(j);return(k)})
[1] 0 1 1
[1] 1 1 1
[1] 1 1 -1
[1] 1 -1 0
[1] -1 0 -1
[1] 2 1 2 1 2
Which sure looks like rollapply is doing a na.locf-sort of filling in operation.
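One reading that reproduces both printed results is that the zero length outputs are simply dropped and the remaining values are recycled to the number of windows. This is an observation about the outputs shown above, not something confirmed from zoo's source; the base R sketch below builds the windows by hand to check it:

```r
p <- matrix(c(-1, 1, 1, 1, 1, 1), nrow = 2)
f <- function(x) which(apply(p, 1, function(row) all(row == sign(x))))
ff <- function(x) which(rowSums(p) == sum(x))

# Apply fun to each window, drop empty results, recycle to the window count.
drop_and_recycle <- function(x, width, fun) {
  n_windows <- length(x) - width + 1
  per_window <- lapply(seq_len(n_windows),
                       function(i) fun(x[i:(i + width - 1)]))
  rep_len(unlist(per_window), n_windows)
}

sfoo <- c(1, -1, 1, 1, 1, -1, 1, 1)
sbar <- c(0, 1, 1, 1, -1, 0, -1)
out_foo <- drop_and_recycle(sfoo, 3, f)
out_bar <- drop_and_recycle(sbar, 3, ff)
```

For sfoo the non-empty results are 1 2 1, which recycle to 1 2 1 1 2 1; for sbar they are 2 1, which recycle to 2 1 2 1 2. Both match the rollapply outputs printed above.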

Replace NaN values in a list with zero (0)

Hi, I have a problem with NaN. I am working with a large dataset with many variables, and they contain NaN values. The data looks like this:
z=list(a=c(1,2,3,NaN,5,8,0,NaN),b=c(NaN,2,3,NaN,5,8,NaN,NaN))
I used these commands to coerce the list to a data frame, but I got this:
z=as.data.frame(z)
> is.list(z)
[1] TRUE
> is.data.frame(z)
[1] TRUE
> replace(z,is.nan(z),0)
Error in is.nan(z) : default method not implemented for type 'list'
I coerced z to a data frame but it wasn't enough; maybe there is a way to replace NaN in a list. Thanks for your help. This data is only an example; my original data has 36000 observations and 40 variables.
This is a perfect use case for rapply.
> rapply( z, f=function(x) ifelse(is.nan(x),0,x), how="replace" )
$a
[1] 1 2 3 0 5 8 0 0
$b
[1] 0 2 3 0 5 8 0 0
lapply would work too, but rapply deals properly with nested lists in this situation.
As you don't seem to mind having your data in a dataframe, you can do something highly vectorised too. However, this will only work if each list element is of equal length. I am guessing in your data (36000/40 = 900) that this is the case:
z <- as.data.frame(z)
dim <- dim(z)
y <- unlist(z)
y[ is.nan(y) ] <- 0
x <- matrix( y , nrow = dim[1] , ncol = dim[2] )
# [,1] [,2]
# [1,] 1 0
# [2,] 2 2
# [3,] 3 3
# [4,] 0 0
# [5,] 5 5
# [6,] 8 8
# [7,] 0 0
# [8,] 0 0
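If the result should stay a data frame with its column names, the replacement can also be done column by column. A minimal sketch, assuming every column is numeric; z_df is an illustrative name:

```r
z <- list(a = c(1, 2, 3, NaN, 5, 8, 0, NaN), b = c(NaN, 2, 3, NaN, 5, 8, NaN, NaN))
z_df <- as.data.frame(z)

# is.nan() works on each atomic column, just not on the data frame as a whole.
# Assigning into z_df[] keeps the data.frame class and the column names.
z_df[] <- lapply(z_df, function(col) replace(col, is.nan(col), 0))
```

This avoids the round trip through unlist() and matrix(), at the cost of one function call per column.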
Following OP's edit to the title, this should do it:
unstack(within(stack(z), values[is.nan(values)] <- 0))
# a b
# 1 1 0
# 2 2 2
# 3 3 3
# 4 0 0
# 5 5 5
# 6 8 8
# 7 0 0
# 8 0 0
unstack automatically gives you a data.frame if the resulting output is of equal length (unlike the first example, shown below).
Old solution (for continuity).
Try this:
unstack(na.omit(stack(z)))
# $a
# [1] 1 2 3 5 8 0
# $b
# [1] 2 3 5 8
Note 1: It seems from your post that you want to replace NaN with 0. You can save the output of stack(z) to a variable, replace the NaN values with 0, and then unstack.
Note 2: Also, since na.omit removes NA as well as NaN, I also assume that your data contains no NA (from your data above).
library(data.table)
z = do.call(data.table, rapply(z, function(x) ifelse(is.nan(x),0,x), how="replace"))
This is useful if you start with a data.table and want to do the replacement in one line.
But keep in mind that keys need to be redefined afterwards:
> key(x1)
[1] "date"
> x1 = do.call(data.table, rapply(x1, function(x) ifelse(is.na(x), 0, x), how="replace"))
> key(x1)
NULL

Generate vectors using R

I would like to ask whether anyone knows a simple way to solve this kind of problem:
I need to generate all combinations of A numbers taken from the set {0, 1, 2, ..., B} whose sum equals C.
For example, if A=2, B=3, C=2:
Solution in this case:
(1,1);(0,2);(2,0)
So the vectors have length 2 (A), the sum of their items is 2 (C), and the possible values for each element come from the set {0,1,2,3} (the maximum is B).
A functional version since I already started before SO updated:
A=2
B=3
C=2
myfun <- function(a=A, b=B, c=C) {
out <- do.call(expand.grid, lapply(1:a, function(x) 0:b))
return(out[rowSums(out)==c,])
}
> myfun()
Var1 Var2
3 2 0
6 1 1
9 0 2
z <- expand.grid(0:3,0:3)
z[rowSums(z)==2, ]
Var1 Var2
3 2 0
6 1 1
9 0 2
If you wanted to do the expand grid programmatically this would work:
z <- expand.grid( rep( list(0:B), A) )
You need to expand as a list so that the items remain separate. rep(0:3, 3) would not return 3 separate sequences. So for A=3:
> z <- expand.grid(rep(list(0:3), 3))
> z[rowSums(z)==2, ]
Var1 Var2 Var3
3 2 0 0
6 1 1 0
9 0 2 0
18 1 0 1
21 0 1 1
33 0 0 2
Using the nifty partitions package, and more interesting values of A, B, and C:
library(partitions)
A <- 2
B <- 5
C <- 7
comps <- t(compositions(C, A))
ii <- apply(comps, 1, FUN=function(X) all(X %in% 0:B))
comps[ii, ]
# [,1] [,2]
# [1,] 5 2
# [2,] 4 3
# [3,] 3 4
# [4,] 2 5
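For larger A, the full grid from expand.grid has (B+1)^A rows, most of which are discarded. A recursive base R sketch that only builds vectors with the right sum is shown below; the function name bounded_compositions and its argument names are hypothetical, and it assumes A >= 1 and non-negative integer B and C:

```r
# Hypothetical helper: all vectors of length n_slots with entries in
# 0:max_val that sum to total, one per row (NULL if there are none).
bounded_compositions <- function(n_slots, max_val, total) {
  if (n_slots == 1) {
    # One slot left: it must hold the whole remaining sum, if allowed.
    if (total <= max_val) return(matrix(total, nrow = 1)) else return(NULL)
  }
  pieces <- lapply(0:min(max_val, total), function(first) {
    rest <- bounded_compositions(n_slots - 1, max_val, total - first)
    if (is.null(rest)) return(NULL)
    # Prepend the chosen first value to every completion of the rest.
    cbind(first, rest, deparse.level = 0)
  })
  do.call(rbind, pieces)  # NULL pieces are ignored by rbind
}

small <- bounded_compositions(2, 3, 2)   # A=2, B=3, C=2
larger <- bounded_compositions(2, 5, 7)  # A=2, B=5, C=7
```

For A=2, B=3, C=2 this yields the same three vectors as in the question, and for A=2, B=5, C=7 the same four as the partitions example, though the row order differs.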
