Replacing values in a list based on a condition - r

I have a list of values called squares and would like to replace every value that is 0 with 40.
I tried:
replace(squares, squares==0, 40)
but the list remains unchanged.

If squares is a list of vectors, loop over it with lapply, apply replace to each element, and assign the result back to squares:
squares <- lapply(squares, function(x) replace(x, x==0, 40))
squares
#[[1]]
#[1] 40 1 2 3 4 5
#[[2]]
#[1] 1 2 3 4 5 6
#[[3]]
#[1] 40 1 2 3
data
squares <- list(0:5, 1:6, 0:3)
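A note on why the call in the question appeared to do nothing: replace returns a modified copy and leaves its argument untouched, so the result has to be assigned. A minimal sketch, assuming the data is a plain numeric vector (squares_vec and replaced are made-up names for illustration):

```r
# Hypothetical data: a plain numeric vector instead of a list
squares_vec <- c(0, 1, 4, 0, 9)

# replace() builds a new vector; squares_vec itself is not changed by this call
replaced <- replace(squares_vec, squares_vec == 0, 40)

# Assign the result back to keep the change
squares_vec <- replaced
squares_vec
```

The same assignment is what makes the lapply version above stick.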

I think for this purpose you can treat it as if it were a vector, provided every element of the list is a single value:
squares <- list(2, 4, 6, 0, 8, 0, 10, 20)
squares[squares == 0] <- 40
Output:
[[1]]
[1] 2
[[2]]
[1] 4
[[3]]
[1] 6
[[4]]
[1] 40
[[5]]
[1] 8
[[6]]
[1] 40
[[7]]
[1] 10
[[8]]
[1] 20
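If you would rather not rely on the list being compared like a vector, the test can be spelled out per element. This is only a sketch under the assumption that an element counts as zero when it is a single value equal to 0; is_zero is a made-up helper name:

```r
squares <- list(2, 4, 6, 0, 8, 0, 10, 20)

# TRUE for elements that are a single value equal to 0
is_zero <- vapply(squares, function(x) length(x) == 1 && x == 0, logical(1))

# Overwrite the matching elements
squares[is_zero] <- 40
squares
```

vapply with logical(1) guarantees one TRUE/FALSE per element, so the index always has the same length as the list.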

Related

Get subsets between one element and the previous same element

Consider a vector:
vec <- c(1, 3, 4, 3, 3, 1, 1)
I'd like to get, for each element of the vector, a subset of the values in between the nth element and its previous occurrence.
The expected output is:
f(vec)
# [[1]]
# [1] 1
#
# [[2]]
# [1] 3
#
# [[3]]
# [1] 4
#
# [[4]]
# [1] 3 4 3
#
# [[5]]
# [1] 3 3
#
# [[6]]
# [1] 1 3 4 3 3 1
#
# [[7]]
# [1] 1 1
We can loop over the positions of the vector, find the index of the most recent earlier occurrence of the same value ('i1'), fall back to the current position when there is none, and use the sequence i1:i to subset the vector:
lapply(seq_along(vec), function(i) {
i1 <- tail(which(vec[1:(i-1)] == vec[i]), 1)[1]
i1[is.na(i1)] <- i
vec[i1:i]
})
-output
[[1]]
[1] 1
[[2]]
[1] 3
[[3]]
[1] 4
[[4]]
[1] 3 4 3
[[5]]
[1] 3 3
[[6]]
[1] 1 3 4 3 3 1
[[7]]
[1] 1 1
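The same idea can be wrapped as the f from the question. One detail worth knowing: in the code above, 1:(i-1) evaluates to c(1, 0) when i is 1, which happens to give the right answer. The sketch below uses seq_len(i - 1) so the first position is handled explicitly; prev and start are illustrative names:

```r
f <- function(vec) {
  lapply(seq_along(vec), function(i) {
    # Positions before i that hold the same value as vec[i]
    prev <- which(vec[seq_len(i - 1)] == vec[i])
    # Start at the most recent earlier occurrence, or at i if there is none
    start <- if (length(prev) > 0) max(prev) else i
    vec[start:i]
  })
}

vec <- c(1, 3, 4, 3, 3, 1, 1)
result <- f(vec)
result
```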

Convert a vector into a specialised list efficiently

I'm looking for a more efficient way to get from my current input to my expected output.
Input
vec <- 1:4
Expected output
[[1]]
[1] 1 2 3 4
[[2]]
[1] 1
[[3]]
[1] 2
[[4]]
[1] 3
[[5]]
[1] 4
Current solution:
lis <- list()
lis[2:5] <- as.list(vec)
lis[[1]] <- vec
We can do
c(list(vec), as.list(vec))
#[[1]]
#[1] 1 2 3 4
#[[2]]
#[1] 1
#[[3]]
#[1] 2
#[[4]]
#[1] 3
#[[5]]
#[1] 4
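To check that the one-liner really builds the same object, here is a small comparison. It is a sketch only; the hard-coded 2:5 from the question is swapped for positions computed from the length of vec so it works for other inputs too:

```r
vec <- 1:4

# The approach from the question, with computed positions instead of 2:5
lis <- list()
lis[seq_along(vec) + 1] <- as.list(vec)
lis[[1]] <- vec

# The one-liner from the answer
out <- c(list(vec), as.list(vec))

# Compare the two results
identical(lis, out)
```

The one-liner builds the list in a single step rather than filling it by assignment, which is likely where any speed difference comes from.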

Remove outliers based on a preceding value

How can I remove outliers using the criterion that a value cannot be more than 2-fold higher than its preceding one?
Here is my try:
x<-c(1,2,6,4,10,20,50,10,2,1)
remove_outliers <- function(x, na.rm = TRUE, ...) {
for(i in 1:length(x))
x < (x[i-1] + 2*x)
x
}
remove_outliers(y)
expected outcome: 1,2,4,10,20,2,1
Thanks!
I think the first 10 should be removed in your data because 10>2*4. Here's a way to do what you want without loops. I'm using the dplyr version of lag.
library(dplyr)
x<-c(1,2,6,4,10,20,50,10,2,1)
x[c(TRUE,na.omit(x<=dplyr::lag(x)*2))]
[1] 1 2 4 20 10 2 1
EDIT
To use this with a data.frame:
df <- data.frame(id=1:10, x=c(1,2,6,4,10,20,50,10,2,1))
df[c(TRUE,na.omit(df$x<=dplyr::lag(df$x,1)*2)),]
   id  x
1   1  1
2   2  2
4   4  4
6   6 20
8   8 10
9   9  2
10 10  1
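If you don't want the dplyr dependency, the same comparison can be done in base R by lining up each value with its predecessor. A minimal sketch; prev, curr and keep are illustrative names:

```r
x <- c(1, 2, 6, 4, 10, 20, 50, 10, 2, 1)

prev <- head(x, -1)   # every value except the last
curr <- tail(x, -1)   # every value except the first

# The first value has no predecessor, so it is always kept
keep <- c(TRUE, curr <= 2 * prev)
x[keep]
```

Like the lag version, this compares each value with the one immediately before it in the original data, not with the last value that was kept.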
A simple sapply over positions 2 onward, keeping each value that is at most twice its predecessor (the first value has no predecessor, so it is always kept):
keep <- c(TRUE, sapply(2:length(x), function(i) x[i] <= 2*x[i-1]))
keep
[1]  TRUE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE
resulting in:
x[keep]
[1] 1 2 4 20 10 2 1

Reverse sort list by max element

I have a list of vectors
l = list(c(1,2),c(3,4),c(2,3),c(7,8),c(5,6))
and would like to reverse-sort it by the vector maximums:
> l
[[1]]
[1] 7 8
[[2]]
[1] 5 6
[[3]]
[1] 3 4
[[4]]
[1] 2 3
[[5]]
[1] 1 2
Any idea how I could do this in a one-liner? Thanks.
One way is
l[order(sapply(l, max), decreasing=TRUE)]
#[[1]]
#[1] 7 8
#[[2]]
#[1] 5 6
#[[3]]
#[1] 3 4
#[[4]]
#[1] 2 3
#[[5]]
#[1] 1 2
You could replace sapply(l, max) with vapply(l, max, numeric(1L)) as well.
Or a compact form suggested by @DavidArenburg:
l[order(-sapply(l, max))]
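On the vapply suggestion: the practical difference is type stability. As far as I know, sapply returns an empty list for zero-length input, and negating a list fails, whereas vapply always returns a numeric vector. A short sketch (maxima, sorted and empty are illustrative names):

```r
l <- list(c(1, 2), c(3, 4), c(2, 3), c(7, 8), c(5, 6))

# One numeric maximum per element, guaranteed by numeric(1L)
maxima <- vapply(l, max, numeric(1L))
sorted <- l[order(maxima, decreasing = TRUE)]

# The same call is safe on an empty list and returns an empty list
empty <- list()
sorted_empty <- empty[order(vapply(empty, max, numeric(1L)), decreasing = TRUE)]
```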

Splitting numeric vectors in R

If I have a vector, c(1,2,3,5,7,9,10,12)...and another vector c(3,7,10), how would I produce the following:
[[1]]
1,2,3
[[2]]
5,7
[[3]]
9,10
[[4]]
12
Notice how 3, 7 and 10 become the last number of each list element (except the last one); in a sense they are the breakpoints. I am sure there is a simple R function for this that I either don't know or have forgotten.
Here's one way using cut and split, with the two vectors from the question stored as x and y:
x <- c(1,2,3,5,7,9,10,12)
y <- c(3,7,10)
split(x, cut(x, c(-Inf, y, Inf)))
#$`(-Inf,3]`
#[1] 1 2 3
#
#$`(3,7]`
#[1] 5 7
#
#$`(7,10]`
#[1] 9 10
#
#$`(10, Inf]`
#[1] 12
You could also do
split(x, cut(x, unique(c(y, range(x))), include.lowest = TRUE))
## $`[1,3]`
## [1] 1 2 3
## $`(3,7]`
## [1] 5 7
## $`(7,10]`
## [1] 9 10
## $`(10,12]`
## [1] 12
Similar to @beginneR's answer, but using findInterval instead of cut:
split(x, findInterval(x, y + 1))
# $`0`
# [1] 1 2 3
#
# $`1`
# [1] 5 7
#
# $`2`
# [1] 9 10
#
# $`3`
# [1] 12
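One caveat on the y + 1 shift: it relies on the data being whole numbers. Assuming your R version has the left.open argument of findInterval (I believe it was added in R 3.3.0), the breakpoints can be used directly so that each one closes its interval on the right. A sketch with the vectors from the question:

```r
x <- c(1, 2, 3, 5, 7, 9, 10, 12)
y <- c(3, 7, 10)

# left.open = TRUE makes the intervals (y[i], y[i + 1]], so each breakpoint
# ends up as the last value of its group
groups <- split(x, findInterval(x, y, left.open = TRUE))
unname(groups)
```

unname only drops the "0", "1", ... labels so the result prints like the plain list in the question.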

Resources