Code onset from event occurrence - r

I have a vector that gives presence/absence of an event (insurgency in this case) over time, and I'd like to create another vector that gives onset of the event, i.e.:
occurrence <- c(1, 1, 0, 0, 1, 0, 0, 1, 1, 1)
onset <- c(0, 0, 0, 0, 1, 0, 0, 1, 0, 0)
The following loop will get what I need:
answer <- 0
for (t in 2:length(occurrence)) {
answer[t] <- ifelse((occurrence[t]-occurrence[t-1])==1, 1, 0)
}
> answer
[1] 0 0 0 0 1 0 0 1 0 0
Is there an easier way of doing this?
Thanks.

Use pmax() and diff():
c(NA, pmax(0, diff(occurrence)))
[1] NA 0 0 0 1 0 0 1 0 0
This works because diff() calculates the difference between successive elements, giving 1 wherever the event starts (a 0 followed by a 1), -1 wherever it ends, and 0 otherwise. Only the -1 values then need to be set to 0. pmax() is a parallel (element-wise) version of max(), so pmax(0, ...) is a handy way to change all -1s to zero:
diff(occurrence)
[1] 0 -1 0 1 -1 0 1 0 0
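If a leading 0 is preferred over NA, so that the result matches the onset vector in the question exactly, one possible variant compares the differences to 1 directly. This is only a sketch, and the name onset2 is made up for illustration:

```r
occurrence <- c(1, 1, 0, 0, 1, 0, 0, 1, 1, 1)

# diff() equals 1 exactly where a 0 is followed by a 1; the first
# element has no predecessor, so it is padded with 0 rather than NA
onset2 <- c(0, as.numeric(diff(occurrence) == 1))
onset2
# [1] 0 0 0 0 1 0 0 1 0 0
```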

You can compare the prior time interval to the current one and choose those where a 0 is followed by a 1 with this code:
> c(0, as.numeric(occurrence[-length(occurrence)] == 0 & occurrence[-1]==1) )
[1] 0 0 0 0 1 0 0 1 0 0
(I padded it with a leading 0 because you didn't want leading occurrence:1's to be counted as new events.)

Related

How to produce a binary variable that takes value of 0 before the last non-zero value of another variable and 1 after it

Given the variable
a <- c(1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0)
I would like to mark the last non-zero value to produce a variable that takes value of 0 before the benchmark and 1 after it. In this case
b <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1)
A simple command that gives the position of the last non-zero value of a is
tail(which(a!=0),1)
But how to create the variable I need conditional on such benchmark? Any help would be much appreciated!
One option could be:
+(cumsum(a) == max(cumsum(a)))
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1
Or:
+(cumsum(a) == sum(a))
We can use which() to get the positions where a is 1, wrap it in max() to get the last such position, compare the sequence of indices of a against it to get a logical vector, and coerce that to binary with the unary +:
+(seq_along(a) >= max(which(a == 1)))
#[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1
Or use Position
+(seq_along(a) >= Position(function(x) x == 1, a, right = TRUE))
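One caveat with the max(which(...)) approach: if a contains no matching value at all, max() is applied to an empty vector, which returns -Inf with a warning, and every element would then be marked 1. A minimal sketch of a guarded variant, building on the tail(which(a != 0), 1) command from the question, is below. The name last_nz is made up, and returning all zeros in the "no non-zero value" case is an assumption about what is wanted:

```r
a <- c(1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0)

# position of the last non-zero value; integer(0) if there is none
last_nz <- tail(which(a != 0), 1)

# start from all zeros and switch on everything from last_nz onwards
b <- numeric(length(a))
if (length(last_nz) == 1) {
  b[last_nz:length(a)] <- 1
}
b
# [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1
```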

Increase the data resolution of a vector of episode durations without a loop function

I have a vector Z <- c(0, 0, 0, 0, 360, 0, 0, 0, 0) of daily event durations: each element of the vector is a day, and its value is the episode duration in minutes, so each element lies in [0, 1440].
I would like to convert the vector from daily minutes to 1-minute intervals, so that each element of the vector is a 1-minute interval and its value is either 0 or 1. I came up with the following solution:
Z<-c(0, 0, 0, 0, 360, 0, 0, 0, 0)
c.Z<-1440- Z
temp<-NULL
for (i in 1:length(Z)){
temp<-c(temp, c(rep(1,Z[i]), rep(0,c.Z[i])))
}
temp
> sum(temp)==sum (Z)
[1] TRUE
> length(temp)==1440*length(Z)
[1] TRUE
Is there a faster, nicer or quicker way to do this without the for function?
You can use rep in this fashion :
temp2 <- rep(rep(c(1, 0), length(Z)), c(rbind(Z, 1440-Z)))
identical(temp, temp2)
#[1] TRUE
The first rep repeats 1, 0 for each day.
rep(c(1, 0), length(Z))
#[1] 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
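We use c(rbind(Z, 1440-Z)) to create the times argument for the second rep. rbind() stacks the two vectors as rows, and c() reads the resulting matrix column by column, so the durations and their complements are interleaved.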
c(rbind(Z, 1440-Z))
#[1] 0 1440 0 1440 0 1440 0 1440 360 1080 0 1440 0 1440 0 1440 0 1440
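The interleaving is easier to see on a tiny example. The sketch below also shows another possible formulation (not from the answer above) that compares the minute-of-day against each day's duration; the names minute and temp3 are made up for illustration:

```r
Z <- c(0, 0, 0, 0, 360, 0, 0, 0, 0)

# rbind() stacks the vectors as rows; c() then reads the matrix
# column by column, which interleaves the two inputs
c(rbind(c(1, 2), c(8, 9)))
# [1] 1 8 2 9

# alternative sketch: minute m of a day is 1 when m <= that day's duration
minute <- rep(seq_len(1440), times = length(Z))
temp3 <- as.numeric(minute <= rep(Z, each = 1440))

# same sanity checks as in the question
sum(temp3) == sum(Z)
length(temp3) == 1440 * length(Z)
```

The comparison-based version builds two vectors of full length, so it trades some memory for not having to construct the times argument.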

How to reference previous entries in a column while building with for loop in R?

In R, I'm trying to create a new column in a data frame with a for loop that references the previous row in the same column. I get an error message that reads "replacement has length zero."
I have tried using the "reduce" and "filter" functions.
df$STATUS <- 0
for(i in 1:nrow(df)) {
df$STATUS[i] <- ifelse(df$start[i]==1 | ((df$STATUS[i-1])==1 & df$stop[i]==0), 1, 0)
}
I expected this code to fill the STATUS column according to the if statement nested in the for loop. The STATUS column is intended to write a 1 when start =1, and remain 1 until stop = 1. Instead, I received the error message:
Error in
df$STATUS <- ifelse(df$start[i] == 1 | ((df$STATUS[i - :
replacement has length zero
As Yannis mentioned, assign a non-NA value for df$STATUS[1] and then start the loop from 2.
df <- data.frame(start = c(1, 1, 0, 0, 1, 1, 0, 0), stop = c(0,
1, 1, 0, 1, 0, 1, 0))
df$STATUS <- 0 # this sets every value of STATUS, including row 1, to 0
print(df)
# start stop STATUS
# 1 0 0
# 1 1 0
# 0 1 0
# 0 0 0
# 1 1 0
# 1 0 0
# 0 1 0
# 0 0 0
for(i in 2:nrow(df)) {
df$STATUS[i] <- ifelse(df$start[i]==1 | ((df$STATUS[i-1])==1 &
df$stop[i]==0), 1, 0)
}
print(df)
# start stop STATUS
# 1 0 0
# 1 1 1
# 0 1 0
# 0 0 0
# 1 1 1
# 1 0 1
# 0 1 0
# 0 0 0
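For context, the original loop fails at i = 1 because df$STATUS[0] is an empty vector, so the whole condition, and therefore the value returned by ifelse(), has length zero. Since the question mentions trying "reduce", here is a minimal sketch of the same recurrence written with base R's Reduce(). The helper name next_status is hypothetical, and seeding row 1 with 0 simply mirrors the loop above:

```r
df <- data.frame(start = c(1, 1, 0, 0, 1, 1, 0, 0),
                 stop  = c(0, 1, 1, 0, 1, 0, 1, 0))

# indexing with 0 returns an empty vector, which is what made the
# original loop fail on its first iteration
length(df$start[0])
# [1] 0

# hypothetical helper: same rule as in the loop, with the previous
# status passed in explicitly
next_status <- function(prev, i) {
  as.numeric(df$start[i] == 1 | (prev == 1 & df$stop[i] == 0))
}

# seed row 1 with 0, then carry the state through rows 2..n
df$STATUS <- Reduce(next_status, seq_len(nrow(df))[-1],
                    accumulate = TRUE, init = 0)
df$STATUS
# [1] 0 1 0 0 1 1 0 0
```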

Generating lists in R with patterns related to the entry number

Is there a smart way to generate a list like the one below in R using perhaps lapply() or other more extrapolable procedures?
ones = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
twos = c(1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1)
threes = c(1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0)
fours = c(1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0)
fives = c(1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1)
l = list(ones, twos, threes, fours)
[[1]]
[1] 1 1 1 1 1 1 1 1 1 1 1
[[2]]
[1] 1 0 1 0 1 0 1 0 1 0 1
[[3]]
[1] 1 0 0 1 0 0 1 0 0 1 0
[[4]]
[1] 1 0 0 0 1 0 0 0 1 0 0
These correspond to polynomial coefficients in generating functions for partitions.
The first list is for ones and so the counting is in steps of 1 integer at a time; hence the vector 1,1,1,1,1,1,1,... In the entry [[2]] we have the twos, and we are counting by 2's starting at 0, skipping the 1 (coded as 0). In [[3]] we are counting by 3's: zero, three, six, nine, etc.
A fairly straightforward way in base R is
lapply(seq(0L, 4L), function(i) rep(c(1L, integer(i)), length.out=11L))
[[1]]
[1] 1 1 1 1 1 1 1 1 1 1 1
[[2]]
[1] 1 0 1 0 1 0 1 0 1 0 1
[[3]]
[1] 1 0 0 1 0 0 1 0 0 1 0
[[4]]
[1] 1 0 0 0 1 0 0 0 1 0 0
[[5]]
[1] 1 0 0 0 0 1 0 0 0 0 1
seq(0L, 4L) produces the vector 0 through 4; an equivalent would be seq_len(5L)-1L, which is faster for creation of large vectors.
c(1L, integer(i)) produces the inner, repeated part of the 0-1 vectors, which rep repeats according to the desired length (here 11) using the length.out argument.
lapply and function(i) allow the number of 0s to increase as we loop through the vector.
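Following the "counting by k's starting at 0" description in the question, another possible formulation marks the positions 0, k, 2k, ... with a modulo test. This is only a sketch; step_indicator and l2 are made-up names:

```r
# hypothetical helper: 1 at positions 0, k, 2k, ... and 0 elsewhere,
# for a vector of n coefficients (positions 0 to n - 1)
step_indicator <- function(k, n = 11L) {
  as.integer((seq_len(n) - 1L) %% k == 0L)
}

l2 <- lapply(1:5, step_indicator)
l2[[3]]
# [1] 1 0 0 1 0 0 1 0 0 1 0
```

Here k is the step size itself (1 for the ones, 2 for the twos, and so on) rather than the number of zeros between the 1s.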

R : Updating a vector given a set of indices

I have a vector (initialized to zeros) and a set of indices of that vector. For each value in indices, I want to increment the corresponding index in the vector. So, say 6 occurs twice in indices (as in the example below), then the value of the 6th element of the vector should be 2.
Eg:
> v = rep(0, 10)
> v
[1] 0 0 0 0 0 0 0 0 0 0
> indices
[1] 7 8 6 6 2
The updated vector should be
> c(0, 1, 0, 0, 0, 2, 1, 1, 0, 0)
[1] 0 1 0 0 0 2 1 1 0 0
What is the most idiomatic way of doing this without using loops?
The function tabulate is made for that:
> indices = c(7,8,6,6,2)
> tabulate(bin=indices, nbins=10)
[1] 0 1 0 0 0 2 1 1 0 0
You can use rle for this:
x <- rle(sort(indices))
v[x$values] <- x$lengths
v
# [1] 0 1 0 0 0 2 1 1 0 0
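Note that the rle approach assigns the counts, which is fine for a vector of zeros but would overwrite any values already in v. A small sketch of why plain indexed assignment does not work here, and of adding the tabulate counts instead (w is a made-up scratch copy):

```r
v <- rep(0, 10)
indices <- c(7, 8, 6, 6, 2)

# plain indexed assignment does not accumulate: a repeated index is
# simply written twice, so position 6 ends up as 1, not 2
w <- v
w[indices] <- w[indices] + 1
w[6]
# [1] 1

# adding the counts increments v even if it already holds values
v <- v + tabulate(indices, nbins = length(v))
v
# [1] 0 1 0 0 0 2 1 1 0 0
```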
