I need to replace the sequence "1,0,1" with "1,1,1" whenever it is found in a vector. How can I do this?
x <- c(1,2,3,4,1,0,1)
Edit:
This search needs to be dynamic. If after changing from 1,0,1 to 1,1,1 another 1,0,1 occurs, this must also be replaced.
Considering:
x <- c(1,2,3,4,1,0,1,0,1,2)
I want the algorithm to first produce:
x <- c(1,2,3,4,1,1,1,0,1,2)
And then:
x <- c(1,2,3,4,1,1,1,1,1,2)
Here is a function that adapts to the length of the sub-vector being sought. Solutions that convert to and from strings are going to be hugely inefficient asymptotically, and solutions that hard-code a sub-vector of length 3 only work for sub-vectors of length 3. This one handles a sub-vector of any length; if the sub-vector is longer than the source vector, it simply reports no matches.
#' Find a matching sub-vector
#'
#' Given a vector (`invec`) and a no-larger sub-vector (`subvec`),
#' determine where the latter occurs exactly.
#' @param invec vector
#' @param subvec vector
#' @return integer positions, length 0 or more
find_subvec <- function(invec, subvec) {
  sublen <- seq_along(subvec) - 1L
  if (length(subvec) > length(invec)) return(integer(0))
  which(
    sapply(seq_len(length(invec) - length(subvec) + 1L),
           function(i) all(subvec == invec[i + sublen]))
  )
}
Use:
find_subvec(c(1,2,3,4,1,0,1), c(1,0,1))
# [1] 5
find_subvec(c(1,2,3,4,1,0,1,0,1), c(1,0,1))
# [1] 5 7
A literal replacement.
x <- c(1,2,3,4,1,0,1)
y <- c(1,0,1)
z <- c(1,1,1)
ind <- find_subvec(x, y)
for (i in ind) x[i + seq_along(y) - 1] <- z
x
# [1] 1 2 3 4 1 1 1
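For the "dynamic" requirement in the question, one possible approach is to search again after every replacement until nothing matches. The sketch below is only an illustration under assumptions: `replace_subvec` and its `max_iter` safety cap are hypothetical names that are not part of the answer above, and `find_subvec` is restated (with `vapply`) so the block runs on its own.

```r
# Restated helper: start positions where `subvec` occurs in `invec`.
find_subvec <- function(invec, subvec) {
  if (length(subvec) > length(invec)) return(integer(0))
  offsets <- seq_along(subvec) - 1L
  starts <- seq_len(length(invec) - length(subvec) + 1L)
  hits <- vapply(starts,
                 function(i) all(invec[i + offsets] == subvec),
                 logical(1))
  which(hits)
}

# Hypothetical wrapper: replace the first match, search again, and repeat.
# `max_iter` is an assumed guard against patterns that never settle.
replace_subvec <- function(x, pattern, replacement, max_iter = 10L * length(x)) {
  stopifnot(length(pattern) == length(replacement))
  for (iter in seq_len(max_iter)) {
    ind <- find_subvec(x, pattern)
    if (length(ind) == 0L) break          # nothing left to replace
    x[ind[1] + seq_along(pattern) - 1L] <- replacement
  }
  x
}

res <- replace_subvec(c(1, 2, 3, 4, 1, 0, 1, 0, 1, 2), c(1, 0, 1), c(1, 1, 1))
res
# [1] 1 2 3 4 1 1 1 1 1 2
```

Re-searching after each single replacement is slower than replacing all matches at once, but it also picks up matches that only come into existence because of an earlier replacement.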
There could be edge cases, as mentioned by @Onyambu, where the expected results are not clear, but one option could be:
x + (x == 0 & c(NA, head(x, -1)) == 1 & c(tail(x, -1), NA) == 1)
# [1] 1 2 3 4 1 1 1
Here, it is not treating x as a string, but it is assessing whether the lag and lead values are 1 and the value in the middle is 0.
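To make the lag/lead idea easier to follow, the same expression can be split into named steps. This is just a sketch: the intermediate names (`prev_val`, `next_val`, `is_gap`) are made up for illustration, and the extra `NA` handling is an assumption about how a zero at either end of the vector should be treated (here: left unchanged).

```r
x <- c(1, 2, 3, 4, 1, 0, 1, 0, 1, 2)

prev_val <- c(NA, head(x, -1))   # value to the left of each element (lag)
next_val <- c(tail(x, -1), NA)   # value to the right of each element (lead)

# TRUE where a 0 sits between two 1s
is_gap <- x == 0 & prev_val == 1 & next_val == 1
# Assumption: a comparison that comes out as NA at either end counts as "no match"
is_gap[is.na(is_gap)] <- FALSE

result <- x + is_gap             # TRUE adds 1, FALSE adds 0
result
# [1] 1 2 3 4 1 1 1 1 1 2
```

Because every zero is tested against its original neighbours, both zeros in this input are filled in a single pass.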
This should work well enough, although note that the result is a single character string rather than a numeric vector:
library(tidyverse)
x <- c(1,2,3,4,1,0,1,0,1)
x %>%
reduce(str_c) %>%
str_replace_all("(?<=1)0(?=1)","1")
#> [1] "123411111"
Created on 2020-06-14 by the reprex package (v0.3.0)
I am not sure why I get different results from these functions.
change_it1 <- function(x) {
x[x == 5] <- -10
}
change_it2 <- function(x) {
x[x == 5] <- -10
x
}
x <- 1:5
x <- change_it1(x)
x
x <- 1:5
x <- change_it2(x)
x
Why don't both functions change x in the same way as the following?
x[x==5] <- -10
The assignment operator <- is really a function that has the side effect of changing a variable's value. But as a function, it also invisibly returns the value that was used on the right-hand side of the assignment. We can force the invisible value to be seen with print(). For example:
x <- 1:2
print(names(x) <- c("a","b"))
# [1] "a" "b"
or again with subsetting
print(x[1] <- 10)
# [1] 10
print(x[2] <- 20)
# [1] 20
x
# a b
# 10 20
Notice that in each case the assignment returned the right-hand-side value and not the updated value of x. A function returns whatever value was returned by its last expression. In the first case, you are returning the value returned by the assignment (which is just the value -10), and in the second case you are explicitly returning the updated x value.
The functions both change x in the same way (at least in the scope of the function), but you are just not returning the updated x value in both cases.
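One way to check this is base R's withVisible(), which reports both the value of a call and whether that value would be auto-printed. A small sketch, reusing change_it1 from the question (the names `out` and `vis` are just illustrative):

```r
change_it1 <- function(x) {
  x[x == 5] <- -10   # last expression is an assignment
}

out <- change_it1(1:5)
out
# [1] -10

vis <- withVisible(change_it1(1:5))
vis$value     # the right-hand side of the assignment, not the modified vector
# [1] -10
vis$visible   # FALSE: the value is returned invisibly
# [1] FALSE
```

This is also why calling change_it1(1:5) on its own at the console prints nothing.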
I have two theoretically identical solutions: one is vectorized and the other uses a for-loop. But the vectorized solution returns the wrong result, and I want to understand why. The logic is simple: replace each NA with the previous non-NA value in the vector.
# vectorized
f1 <- function(x) {
idx <- which(is.na(x))
x[idx] <- x[ifelse(idx > 1, idx - 1, 1)]
x
}
# non-vectorized
f2 <- function(x) {
for (i in 2:length(x)) {
if (is.na(x[i]) && !is.na(x[i - 1])) {
x[i] <- x[i - 1]
}
}
x
}
v <- c(NA,NA,1,2,3,NA,NA,6,7)
f1(v)
# [1] NA NA 1 2 3 3 NA 6 7
f2(v)
# [1] NA NA 1 2 3 3 3 6 7
The two pieces of code are different.
The first one replaces each NA with the previous element of the original vector, which only helps if that element is not itself NA.
The second one replaces each NA with the previous element if that element is not NA, but the previous element can itself be the result of an earlier NA substitution.
Which one is correct really depends on you. The second behaviour is more difficult to vectorize, but there are some already implemented functions like zoo::na.locf.
Or, if you only want to use base packages, you could have a look at this answer.
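For reference, here is one possible base-R way to vectorize the second behaviour (carrying the last non-NA value forward). It is only a sketch and not necessarily what the linked answer does; `fill_forward` is a hypothetical name.

```r
fill_forward <- function(x) {
  not_na <- !is.na(x)
  # For every position: how many non-NA values have been seen so far.
  # 0 means "none yet", so leading NAs stay NA.
  grp <- cumsum(not_na)
  c(NA, x[not_na])[grp + 1L]
}

v <- c(NA, NA, 1, 2, 3, NA, NA, 6, 7)
filled <- fill_forward(v)
filled
# [1] NA NA  1  2  3  3  3  6  7
```

The cumsum() index points every NA at the most recent non-NA value, which matches the output of the loop-based f2 for this input.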
These two solutions are not equivalent. The first function is rather like:
f2_as_f1 <- function(x) {
y <- x # a copy of x
for (i in 2:length(x)) {
if (is.na(y[i])) {
x[i] <- y[i - 1]
}
}
x
}
Note the usage of the y vector.
Suppose I have a collection of independent vectors, of the same length. For example,
x <- 1:10
y <- rep(NA, 10)
and I wish to turn them into a list whose length is that common length (10 in the given example), in which each element is a vector whose length is the number of independent vectors that were given. In my example, assuming output is the output object, I'd expect
> str(output)
List of 10
$ : num [1:2] 1 NA
...
> output
[[1]]
[1] 1 NA
...
What's the common method of doing that?
Use mapply and c:
mapply(c, x, y, SIMPLIFY=FALSE)
[[1]]
[1] 1 NA
[[2]]
[1] 2 NA
..<cropped>..
[[10]]
[1] 10 NA
Another option:
split(cbind(x, y), seq(length(x)))
or even:
split(c(x, y), seq(length(x)))
or even (assuming x has no duplicate values as in your example):
split(c(x, y), x)
Here is a solution that lets you zip an arbitrary number of equal-length vectors into a list, based on the position of each element:
merge_by_pos <- function(...){
  dotlist <- list(...)
  # iterate over element positions, not over the number of input vectors
  lapply(seq_along(dotlist[[1]]), function(i){
    Reduce('c', lapply(dotlist, '[[', i))
  })
}
x <- 1:10
y <- rep(NA, 10)
z <- 21:30
merge_by_pos(x, y, z)
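The same position-wise zip can also be sketched with Map, which is a thin wrapper around mapply with SIMPLIFY = FALSE. `zip_vectors` is a hypothetical name, and the up-front length check is an added assumption about how mismatched inputs should be handled (here: an error rather than silent recycling).

```r
zip_vectors <- function(...) {
  vecs <- list(...)
  # Assumption: all inputs must share one common length
  stopifnot(length(vecs) > 0L, length(unique(lengths(vecs))) == 1L)
  # Equivalent to Map(c, vec1, vec2, ...)
  do.call(Map, c(f = c, unname(vecs)))
}

x <- 1:10
y <- rep(NA, 10)
z <- 21:30
zipped <- zip_vectors(x, y, z)
length(zipped)
# [1] 10
zipped[[1]]
# [1]  1 NA 21
```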
I am painfully new to R. I have a list of data, and I wrote a loop to find which values are greater than a certain number:
for (i in listname){
if(i > x)
print(i)
}
I would like for the printed values to also include the row name... how would I go about doing that?
Thanks for your patience.
Strangely, when the item itself is the iterator, the name is lost. If you instead iterate over the number of the item, print works as expected:
for (i in seq_along(listname)){   # seq_along() is also safe when listname is empty
  if (listname[i] > x){
    print(listname[i]) # value with name
  }
}
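If the elements are named, another option is to loop over the names themselves and look each value up by name. A minimal sketch with made-up example data (the values in `listname`, the threshold `x`, and the `hits` collector are all illustrative):

```r
listname <- c(a = 1, b = 5, c = 7)   # example data, not from the question
x <- 4

hits <- character(0)                  # names of the values that exceed x
for (nm in names(listname)) {
  if (listname[[nm]] > x) {
    hits <- c(hits, nm)
    cat(nm, ":", listname[[nm]], "\n")
  }
}
# b : 5 
# c : 7 
```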
Once you've learned more about R, you will probably want to do this in a "vectorized" way, instead of using a loop:
idx <- which(listname > x) # row numbers
listname[idx] # values with names
or with logical subsetting
gt_x <- listname > x # TRUE or FALSE
listname[gt_x] # values with names
Example: Try this with
listname <- 1:10
names(listname) <- letters[1:10]
x <- 4
idx <- which(listname > x) # row numbers
listname[idx] # values with names
# e f g h i j
# 5 6 7 8 9 10