Generate an increasing sequence of varying length in R

Given n, generate a sequence like this:
0, 0, 1, 0, 1, 2, ........, 0, 1, 2, 3, 4, 5, 6, ....n
Let's say n=3, then the sequence should be:
0, 0, 1, 0, 1, 2, 0, 1, 2, 3
I've tried using rep, but it only generates sequences of a fixed length, whereas I need the sequence length to increase each time.

You can use a simple Map with unlist to get the result you want:
n <- 3
unlist(Map(seq, from=0, to=0:n))
# [1] 0 0 1 0 1 2 0 1 2 3

Adapted from another answer, using sequence:
n <- 3
sequence(0:(n+1))-1
# [1] 0 0 1 0 1 2 0 1 2 3
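Either approach can be wrapped in a small function and checked against the other. The sketch below is only illustrative: ramp_seq is a hypothetical name that appears in neither answer, and it uses seq_len(n + 1) in place of 0:(n + 1), which should produce the same runs.

```r
# ramp_seq is a hypothetical helper name used only for this sketch.
ramp_seq <- function(n) {
  # sequence(c(1, 2, ..., n + 1)) yields 1, 1 2, 1 2 3, ...; subtract 1 to start at 0
  sequence(seq_len(n + 1)) - 1
}

ramp_seq(3)
# [1] 0 0 1 0 1 2 0 1 2 3

# The result has 1 + 2 + ... + (n + 1) = (n + 1) * (n + 2) / 2 elements
length(ramp_seq(3))
# [1] 10
```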

Related

R Lookback few days and assign new value if old value exists

I have two timeseries vectors as follows -
a <- c(1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0)
b <- c(1, 0, 1, 0)
I want to look back 7 days and replace only 1's in vectors a and b with 2. It is important to check if there were any values 7 days before replacing.
The expected result is -
a = c(1, 0, 0, 0, 1, 0, 2, 1, 1, 0, 2, 0)
b = c(1, 0, 1, 0) - Since no value existed 7 days ago, nothing changes here.
Thanks!
We can create a condition with lag
library(dplyr)
f1 <- function(vec) replace(vec, lag(vec, 6) == 1, 2)
Output:
f1(a)
#[1] 1 0 0 0 1 0 2 1 1 0 2 0
f1(b)
#[1] 1 0 1 0
A base R option, defining a user function f:
f <- function(v) replace(v, (ind <- which(v == 1) + 6)[ind <= length(v)], 2)
such that
> f(a)
[1] 1 0 0 0 1 0 2 1 1 0 2 0
> f(b)
[1] 1 0 1 0
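Both answers treat "7 days" as an offset of 6 positions. As a rough sketch, the offset can be made a parameter so the same logic works for other look-back windows. The name mark_lagged and its arguments k and new are assumptions made for illustration; they do not come from either answer.

```r
# mark_lagged is a hypothetical helper: position i becomes `new`
# whenever the value k positions earlier equals 1.
mark_lagged <- function(v, k = 6, new = 2) {
  n <- length(v)
  if (n <= k) return(v)                     # nothing lies k positions back
  hit <- which(v[seq_len(n - k)] == 1) + k  # targets that stay inside the vector
  replace(v, hit, new)
}

a <- c(1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0)
b <- c(1, 0, 1, 0)

mark_lagged(a)
# [1] 1 0 0 0 1 0 2 1 1 0 2 0
mark_lagged(b)
# [1] 1 0 1 0
```

Restricting the search to the first n - k elements avoids producing indexes past the end of the vector, so no separate bounds filter is needed.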

Creating multiple summary tables with one function in R

I couldn't find an answer to this specific question; sorry if it's been asked before.
library(tidyverse)
# sample data
df <- data.frame(group=c(1, 1, 1, 1, 0, 0, 0, 0),
v1=c(1, 0, 0, 1, 0, 1, 1, 1),
v2=c(0, 0, 0, 0, 1, 0, 0, 1),
v3=c(0, 1, 0, 1, 1, 0, 1, 1))
I want to find the number of "1"s and "0"s in each v1, v2, v3 for each level of "group".
Currently I have been using
table(df$group, df$v1)
table(df$group, df$v2)
table(df$group, df$v3)
ad nauseam to get the number of 1s in each variable, but I can't figure out how to create many such tables with one function. Any help would be greatly appreciated.
We can use lapply to apply the same function to multiple columns.
lapply(df[-1], function(x) table(df$group, x))
#$v1
#   x
#    0 1
#  0 1 3
#  1 2 2
#$v2
#   x
#    0 1
#  0 2 2
#  1 4 0
#$v3
#   x
#    0 1
#  0 1 3
#  1 2 2
Or, with dplyr and purrr, we can use count:
purrr::map(names(df)[-1], ~count(df, group, !!sym(.x)))
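If only the number of 1s per group is needed, a more compact base R sketch is possible. It assumes every v column holds only 0 and 1, so that summing within a group counts the 1s. The name ones is chosen here for illustration.

```r
df <- data.frame(group = c(1, 1, 1, 1, 0, 0, 0, 0),
                 v1 = c(1, 0, 0, 1, 0, 1, 1, 1),
                 v2 = c(0, 0, 0, 0, 1, 0, 0, 1),
                 v3 = c(0, 1, 0, 1, 1, 0, 1, 1))

# Sum every non-group column within each level of group;
# with 0/1 data the sum is the count of 1s.
ones <- aggregate(. ~ group, data = df, FUN = sum)
ones
#   group v1 v2 v3
# 1     0  3  2  3
# 2     1  2  0  2
```

The count of 0s can then be recovered by subtracting from the group size (4 in this sample), which is why this only suits strictly binary columns.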

Recode a value in a vector based on surrounding values

I'm trying to programmatically change a variable from a 0 to a 1 if there are three 1s before and after a 0.
For example, if the numbers in a vector were 1, 1, 1, 0, 1, 1, 1, then I want to change the 0 to a 1.
Here is the data in the vector dummy_code in the data.frame original_df:
original_df <- data.frame(dummy_code = c(1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1))
Here is how I'm trying to have the values be recoded:
desired_df <- data.frame(dummy_code = c(1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1))
I tried to use the function fill in the package tidyr, but this fills in missing values, so it won't work. If I were to recode the 0 values to be missing, then that would not work either, because it would simply code every NA as 1, when I would only want to code every NA surrounded by three 1s as 1.
Is there a way to do this in an efficient way programmatically?
An rle alternative, using the x from @G. Grothendieck's answer:
r <- rle(x)
Find the indexes of runs of three 1s:
i1 <- which(r$lengths == 3 & r$values == 1)
Check which of these runs of 1s surround a 0, and get the indexes of the 0 runs to be replaced:
i2 <- i1[which(diff(i1) == 2)] + 1
Replace relevant 0 with 1:
r$values[i2] <- 1
Reverse the rle operation on the updated runs:
inverse.rle(r)
# [1] 1 0 0 1 1 1 1 1 1 1 0 0 1
A similar solution based on data.table::rleid, slightly more compact and perhaps easier to read:
library(data.table)
d <- data.table(x)
Calculate length of each run:
d[ , n := .N, by = rleid(x)]
Where "x" is zero and the preceding and subsequent runs of 1 are of length 3, set "x" to 1:
d[x == 0 & shift(n) == 3 & shift(n, type = "lead") == 3, x := 1]
d$x
# [1] 1 0 0 1 1 1 1 1 1 1 0 0 1
Here is a one-liner using rollapply from zoo:
library(zoo)
rollapply(c(0, 0, 0, x, 0, 0, 0), 7, function(x) if (all(x[-4] == 1)) 1 else x[4])
## [1] 1 0 0 1 1 1 1 1 1 1 0 0 1
Note: Input used was:
x <- c(1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1)
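For comparison, the same rule can be written as a plain base R loop with no extra packages. This is a sketch under one assumption that differs slightly from the rle version: it fills a 0 when at least k 1s sit immediately on each side, rather than requiring runs of exactly three. The name fill_surrounded and the parameter k are hypothetical.

```r
# fill_surrounded is a hypothetical helper: a 0 becomes 1 when the k values
# immediately before it and the k values immediately after it are all 1.
fill_surrounded <- function(x, k = 3) {
  n <- length(x)
  out <- x
  for (i in seq_len(n)) {
    has_room <- i > k && i + k <= n   # a full window exists on both sides
    if (x[i] == 0 && has_room &&
        all(x[(i - k):(i - 1)] == 1) &&
        all(x[(i + 1):(i + k)] == 1)) {
      out[i] <- 1
    }
  }
  out
}

x <- c(1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1)
fill_surrounded(x)
# [1] 1 0 0 1 1 1 1 1 1 1 0 0 1
```

The checks read from the original x and write to out, so a 0 that has just been filled never influences the decision for a later position.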

How to remove a single occurrence of a value in R

In R I have a vector:
x <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0)
I want to remove only one 0 from x, but my attempt removes every 0 in the vector.
Example:
x = x[!x %in% 0]
Every zero has been removed from x.
For example, in Python:
x = [0,1,0,1,0,0,0,1]
x.remove(0)
x
[1, 0, 1, 0, 0, 0, 1]
x.remove(0)
x
[1, 1, 0, 0, 0, 1]
We can use match to remove the first occurrence of a particular number
x <- c(1, 0, 1, 0, 0, 0, 1)
x[-match(1, x)]
#[1] 0 1 0 0 0 1
If you need to remove some other number from the vector, for example 5 in the case below:
x <- c(1, 0, 5, 5, 0, 0, 1)
x[-match(5, x)]
#[1] 1 0 5 0 0 1
You may need which.min(), which determines the index of the first minimum of a vector:
x <- c(0,1,0,1,0,0,0,1)
x <- x[-which.min(x)]
x
# [1] 1 0 1 0 0 0 1
If your vector contains elements other than 0 or 1: x <- x[-which.min(x != 0)]
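One caveat with the snippets above: when the value is absent, match() returns NA and which.min(x != 0) returns 1, so negative indexing can give a surprising result. A minimal sketch of a guard follows; remove_first is a hypothetical name chosen to mirror Python's list.remove, except that it returns the vector unchanged instead of raising an error.

```r
# remove_first is a hypothetical helper: drop the first element equal to
# `value`, or return x unchanged when no such element exists.
remove_first <- function(x, value) {
  i <- match(value, x)        # index of the first match, NA if none
  if (is.na(i)) x else x[-i]
}

x <- c(0, 1, 0, 1, 0, 0, 0, 1)
x <- remove_first(x, 0)
x
# [1] 1 0 1 0 0 0 1
x <- remove_first(x, 0)
x
# [1] 1 1 0 0 0 1
remove_first(x, 7)            # 7 is absent, so x comes back unchanged
# [1] 1 1 0 0 0 1
```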

R: creating sequence of numbers by group and starting the sequence by a particular condition

I would like to create a new variable, Number, which generates sequential numbers within each groupID, starting when a particular condition is met (in this case, when Percent >= 5).
groupID <- c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3)
Percent <- c( 3, 4, 5, 10, 2, 1, 6, 8, 4, 8, 10, 11)
Number <- ifelse (Percent < 5, 0, 1:4)
I get:
> Number
[1] 0 0 3 4 0 0 3 4 0 2 3 4
But I'd like:
0 0 1 2 0 0 1 2 0 1 2 3
I did not include groupID variable within the ifelse statement and used 1:4 instead, as there are always 4 rows within each groupID.
Any suggestions or clues? Thank you!
ave(Percent, groupID, FUN=function(x) cumsum(x>=5))
[1] 0 0 1 2 0 0 1 2 0 1 2 3
To the example in the comments below, this is my alternate logical test to be cumsum()-ed:
ave(Percent, groupID, FUN=function(x) cumsum(seq_along(x)>= which(x >=5)[1]) )
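The two logical tests give the same result on the question's data but differ once a group dips back below 5. The sketch below uses made-up Percent values (group 1 reordered to 3, 5, 4, 10) purely to show that difference; the names by_hits and from_first are assumptions for illustration.

```r
groupID <- c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3)
# Assumed example data: group 1 falls below 5 again after first reaching it
Percent <- c(3, 5, 4, 10, 2, 1, 6, 8, 4, 8, 10, 11)

# Counts only the rows where Percent >= 5
by_hits <- ave(Percent, groupID, FUN = function(x) cumsum(x >= 5))
by_hits
# [1] 0 1 1 2 0 0 1 2 0 1 2 3

# Counts every row from the first Percent >= 5 onward
from_first <- ave(Percent, groupID,
                  FUN = function(x) cumsum(seq_along(x) >= which(x >= 5)[1]))
from_first
# [1] 0 1 2 3 0 0 1 2 0 1 2 3
```

Note that the second form yields NA for any group that never reaches 5, because which(x >= 5)[1] is then NA.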
It's ugly and throws warnings, but it gets you what you want:
ave(Percent,groupID,FUN=function(x) {x[x<5] <- 0; x[x>=5] <- 1:4; x} )
#[1] 0 0 1 2 0 0 1 2 0 1 2 3
@BondedDust's answer using cumsum is almost certainly more appropriate, though.
If your data were not always in ascending order within each group, you could also replace all the >=5 values like this:
Percent <- c( 3, 5, 4, 10, 2, 1, 6, 8, 4, 8, 10, 11)
ave(Percent, list(groupID,Percent>=5), FUN=function(x) cumsum(x>=5))
#[1] 0 1 0 2 0 0 1 2 0 1 2 3
Try this:
ID <- c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3)
Percent <- c( 3, 4, 5, 10, 2, 1, 6, 8, 4, 8, 10, 11)
Number <- Percent >= 5
result <- lapply(seq_along(Number), function(i) {
  # position of the most recent FALSE at or before i (1 if there is none)
  if (length(which(!Number[1:i])) == 0) {
    start <- 1
  } else {
    start <- max(which(!Number[1:i]))
  }
  # number of TRUE values since that position
  sum(Number[start:i])
})
> unlist(result)
[1] 0 0 1 2 0 0 1 2 0 1 2 3
