The diff function for vectors [closed] - r

I am trying out RStudio on my own and have a question.
I have a vector
vec <- c(1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1)
I want to write a function that does the following: if the distance between two subsequences of 1's is less than 5, it returns 0; if the distance is more than 5, it returns 1.
So, if looking at
vec <- c(1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1)
the output is going to be:
0 0 1
I understand how I can find a position of 1:
function_start_of_seq <- function(x) {
  # indices, within the rle() output, of the runs whose value is 1
  one_pos <- which(rle(x)$values == 1)
}
I also know that I need to use the diff and cumsum functions, but I don't know how.

Perhaps it is more appropriate to look at the 0s rather than the 1s. The function below first finds the runs of 0s in the rle() output, then checks which of those runs (i.e. the number of 0s between the 1s) is longer than 5. The result is converted to 0/1 with as.numeric() at the end.
fun1 <- function(x) {
  null_pos <- which(rle(x)$values == 0)
  tf <- rle(x)$lengths[null_pos] > 5
  return(as.numeric(tf))
}
> fun1(vec)
[1] 0 0 1
Does that make sense?
In case you want a one-liner, just do
> as.numeric(rle(vec)$lengths[which(rle(vec)$values == 0)] > 5)
[1] 0 0 1
The part which(rle(vec)$values == 0) selects the positions of the runs of 0s in the rle() output. The lengths of those runs are the distances between the sequences of 1s, and the comparison > 5 tests each of them.
as.numeric() then "translates" the logical result into the 0-1 form you want.
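One caveat with the approach above: if the vector starts or ends with 0s, those runs are counted as distances too. A minimal sketch that only looks at runs of 0s lying strictly between runs of 1s is shown here. The name gap_flags and its threshold argument are made up for illustration, and it assumes the vector contains only 0s and 1s.

```r
# Hypothetical helper: flag gaps between runs of 1s that are longer than `threshold`.
# Leading and trailing runs of 0s are ignored because they are not between two runs of 1s.
gap_flags <- function(x, threshold = 5) {
  r <- rle(x)
  n <- length(r$values)
  pos <- seq_len(n)
  # a gap is a run of 0s that is neither the first nor the last run
  is_gap <- r$values == 0 & pos > 1 & pos < n
  as.integer(r$lengths[is_gap] > threshold)
}

vec <- c(1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1)
res_vec <- gap_flags(vec)
res_vec
## [1] 0 0 1

# Seven leading 0s are not a gap; the only gap has length 1.
res_lead <- gap_flags(c(0, 0, 0, 0, 0, 0, 0, 1, 0, 1))
res_lead
## [1] 0
```

For the example vector the result is the same as fun1(vec); the two only differ when the vector begins or ends with 0s.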

An uncool, non-obfuscated, only-calling-rle-once, no-use-of-which answer:
vec <- c(1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1)
r <- rle(vec)
r
## Run Length Encoding
## lengths: int [1:7] 1 2 2 4 1 6 1
## values : num [1:7] 1 0 1 0 1 0 1
So it seems the distance between the 1 sequences is what you're after. We'll assume you know you always have 0's and 1's.
r$values == 0 will return a vector with TRUE or FALSE for the result of each positional evaluation. We can use that directly to index r$lengths.
rl <- r$lengths[r$values == 0]
rl
## [1] 2 4 6
Since it's just 0 and 1, we don't need a double. Integers will do just fine:
as.integer(rl > 5)
## [1] 0 0 1
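Since the question mentions diff, here is a sketch of the same calculation without rle. It assumes the "distance" is the number of 0s between two consecutive 1s; the variable names are only illustrative.

```r
vec <- c(1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1)

ones <- which(vec == 1)      # positions of the 1s: 1 4 5 10 17
gaps <- diff(ones) - 1       # number of 0s between consecutive 1s: 2 0 4 6
gaps <- gaps[gaps > 0]       # drop adjacent 1s, which belong to the same run
flags <- as.integer(gaps > 5)
flags
## [1] 0 0 1
```

Because only positions of 1s are used, leading and trailing 0s never enter the result.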

Related

Count the max number of ones in a vector

I am working on the following task.
Suppose that I have the following vector.
(1,1,0,0,0,1,1,1,1,0,0,1,1,1,0)
I need to extract the following information:
the maximum number of sets of consecutive zeros
the mean number of consecutive zeros.
For instance, in the previous vector
the maximum is 3, because I have 000, 00 and 0.
Then the mean number of zeros is 2.
I have this in mind because I need to do the same for several observations. I am thinking of implementing it inside an apply call.
We could use rle for this. As there are only binary values, we can apply rle to the entire vector and then extract the lengths that correspond to 0 (!values returns TRUE for 0 and FALSE otherwise).
out <- with(rle(v1), lengths[!values])
And get the length and the mean from the output
> length(out)
[1] 3
> mean(out)
[1] 2
data
v1 <- c(1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0)
You can try another option using regmatches
> v <- c(1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0)
> s <- paste0(v, collapse = "")
> zeros <- unlist(regmatches(s, gregexpr("0+", s)))
> length(zeros)
[1] 3
> mean(nchar(zeros))
[1] 2
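The question mentions doing this for several observations with apply. A minimal sketch under the assumption that each observation is one row of a 0/1 matrix is shown here; zero_run_summary and the second row of obs are made up for illustration.

```r
# Hypothetical helper: number of runs of 0s and their mean length for one vector.
zero_run_summary <- function(v) {
  runs <- with(rle(v), lengths[values == 0])
  c(n_runs   = length(runs),
    mean_len = if (length(runs) > 0) mean(runs) else NA_real_)
}

# Two observations, one per row (the second row is invented example data).
obs <- rbind(
  c(1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0),
  c(0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1)
)

# apply() returns one column per row of obs, so transpose to get one row per observation.
res <- t(apply(obs, 1, zero_run_summary))
res
##      n_runs mean_len
## [1,]      3      2.0
## [2,]      2      1.5
```

Returning NA for a vector without any 0s is a design choice here; mean() of an empty vector would otherwise give NaN.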

How to solve a system of linear inequalities in R

Suppose I have a system of linear inequalities: Ax <= b. I'm trying to figure out how to solve this in R.
I know that the eliminate function from the package lintools performs variable elimination. The output is a list of the following information:
A: the A corresponding to the system with variables eliminated.
b: the constant vector corresponding to the resulting system
neq: the number of equations
H: The memory matrix storing how each row was derived
h: The number of variables eliminated from the original system.
I wrote a loop to try to perform variable elimination. However, I am not sure how to get the final solutions from this system of linear inequalities:
library(lintools)
A <- matrix(c(
   4, -5, -3, 1,
  -1,  1, -1, 0,
   1,  1,  2, 0,
  -1,  0,  0, 0,
   0, -1,  0, 0,
   0,  0, -1, 0), byrow = TRUE, nrow = 6)
b <- c(0, 2, 3, 0, 0, 0)
L <- vector("list", length = nrow(A))
L[[1]] <- list(A = A, b = b, neq = 0, nleq = nrow(A), variable = 1)
for (i in 1:(nrow(A) - 3)) {
  print(i)
  L[[i + 1]] <- eliminate(A = L[[i]]$A, b = L[[i]]$b, neq = L[[i]]$neq,
                          nleq = L[[i]]$nleq, variable = i + 1)
}
Presumably you will know what to do with this (I don't):
str(L) # the last two items in L are NULL
tail(L,n=3)[[1]] #Take the first of the last three.
$A
[1,] -0.5 0 0 0
[2,] 0.5 0 0 0
[3,] -1.0 0 0 0
$b
[1] 3.5 1.5 0.0
$neq
[1] 0
$nleq
[1] 3
$H
NULL
$h
[1] 0
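One possible way to read the result above: only the first column of A is nonzero, so each row is a single-variable inequality a * x1 <= b. A rough sketch in base R that turns those rows into an interval for x1 follows. The matrix and vector are copied by hand from the printed output, and the names A_final, b_final, lower and upper are illustrative. This gives only the feasible range of x1 in the reduced system, not values for the eliminated variables.

```r
# Reduced system copied from the printed output: A_final %*% x <= b_final
A_final <- matrix(c(-0.5, 0, 0, 0,
                     0.5, 0, 0, 0,
                    -1.0, 0, 0, 0), byrow = TRUE, nrow = 3)
b_final <- c(3.5, 1.5, 0.0)

a <- A_final[, 1]  # coefficients of the one remaining variable, x1

# a > 0:  a * x1 <= b  means  x1 <= b / a  (upper bounds)
# a < 0:  a * x1 <= b  means  x1 >= b / a  (lower bounds, inequality flips)
upper <- min(b_final[a > 0] / a[a > 0])
lower <- max(b_final[a < 0] / a[a < 0])
c(lower = lower, upper = upper)
## lower upper
##     0     3
```

Under these assumptions, x1 can take any value from 0 to 3; the other variables would still have to be recovered from the earlier systems stored in L.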

R: randomly sample a nonzero element in a vector and replace other elements with 0

Suppose I have a vector
vec <- c(0, 1, 0, 0, 0, 1, 1, 1, 1, 2)
How do I randomly sample a nonzero element and turn the other elements into 0?
Suppose the element sampled was vec[2], then the resulting vector would be
vec <- c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0)
I know that I can sample the index of one nonzero element with sample(which(vec != 0), 1), but I am not sure how to proceed from there. Thanks!
You can try the code below
> replace(0 * vec, sample(which(vec != 0), 1), 1)
[1] 0 0 0 0 0 0 0 1 0 0
where
which returns the indices of non-zero values
sample gives a random index
replace sets the value at that index to 1
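Watch out for sample's behavior if which returns only one value: when its first argument is a single number n >= 1, sample draws from 1:n instead of from that one value, so the position of a zero can be returned: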
> vec <- c(rep(0, 9), 1)
> sample(which(vec != 0), 1)
[1] 4
The following preserves the value of the sampled element (instead of turning it into 1) and guards against vectors with only one nonzero value by using rep to guarantee that sample gets a vector with more than one element:
vec[-sample(rep(which(vec != 0), 2), 1)] <- 0
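An alternative sketch that avoids the single-value pitfall by sampling a position within the index vector rather than sampling the indices themselves. The function name keep_one_nonzero is made up for illustration; it leaves a vector with no nonzero elements unchanged, which is an assumption about the desired behavior.

```r
# Hypothetical helper: keep one randomly chosen nonzero element, zero out the rest.
keep_one_nonzero <- function(vec) {
  idx <- which(vec != 0)
  if (length(idx) == 0) return(vec)        # nothing to sample from
  # sample.int() picks a position in idx, so a single index is handled correctly
  keep <- idx[sample.int(length(idx), 1)]
  replace(vec * 0, keep, vec[keep])        # preserves the sampled value
}

set.seed(1)
vec <- c(0, 1, 0, 0, 0, 1, 1, 1, 1, 2)
res <- keep_one_nonzero(vec)

# Only one nonzero value: position 10 must be kept.
single <- keep_one_nonzero(c(rep(0, 9), 1))
```

Here sum(res != 0) is always 1, and the surviving element has the same value it had in vec.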

Maximum consecutive repeats in a vector in R

I have a vector containing 0 and 1. I want to return the maximum number of times 1 appears consecutively. For example, if x is the input vector
x <-c(0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1)
Expected Output: 3
My attempt:
I'm using function rle to do this job. Here is my sample code:
x<-c(0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1)
y<-rle(x)
max_repeat <-max(y$lengths)
In this scenario, I get output as 4 (corresponding to 0 instead of 1). I tried to use tapply to access the complete output of rle, but I am not able to extract out the maximum repeat corresponding to value 1.
out <-tapply(y$lengths, y$values, max)
This is what I get for out:
0 1
4 3
When I look at the structure of out, it is " int [1:2(1d)] 4 3". I do not have enough experience dealing with this type of variable. I need to extract the value corresponding to 1, i.e. 3. Any help will be appreciated!
Thanks
You can try this:
x<-c(0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1)
y<-rle(x)
max(y$lengths[y$values==1])
# 3
If you want information about any object (here y), you can use the function str, which shows what the object contains.
I did it like this:
x <-c(0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1)
y=rle(x)
max(y$lengths[y$values==1])
Hope it meets your expectations.
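For completeness, the tapply result from the question can also be used directly: it is a one-dimensional array whose names are the distinct values, so the entry for 1 can be picked by name. The sketch below shows that, plus a small wrapper (max_run is a made-up name) that returns 0 when the value never occurs, instead of the -Inf and warning that max() gives on an empty vector.

```r
x <- c(0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1)
y <- rle(x)

# Index the tapply() result by name; as.vector() drops the array attributes.
out <- tapply(y$lengths, y$values, max)
by_name <- as.vector(out["1"])
by_name
## [1] 3

# Hypothetical helper: longest run of `value`, or 0 if it does not occur.
max_run <- function(x, value = 1) {
  r <- rle(x)
  runs <- r$lengths[r$values == value]
  if (length(runs) == 0) 0L else max(runs)
}

longest <- max_run(x)
longest
## [1] 3
```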

Determining size of identical adjacents values in a vector

I have a vector made of 0 and 1. It refers to hourly met data with 0 = no rain, 1 = rain event during the corresponding hour.
The objective is to determine the duration of all rain events i.e. the length of each block of 1s in the vector.
Is there anything better than a loop screening all values and neighbours one by one?
Thanks in advance for your help.
All the best,
Vincent
As @joran suggests, rle is what you want.
hourly.rain <- c(0, 0, 1, 1, 0, 1, 0, 0, 1, 1)
with(rle(hourly.rain), lengths[values == 1])
#[1] 2 1 2
If you want to apply an inter-event time, say 2 hours (i.e., events separated by 2 hours or less are considered the same event), you can also use rle to replace the 0s within the inter-event period with 1s.
inter.event <- 2
hourly.rain <- c(0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1)
with(rle(hourly.rain), {
     fwd.lag <- c(head(values, -1), 1)
     bkwd.lag <- c(1, tail(values, -1))
     replace.vals <- values == 0 & lengths <= inter.event & fwd.lag == bkwd.lag
     rep(replace(values, replace.vals, 1) , lengths)
})
# [1] 0 0 0 1 1 1 1 0 0 0 1 1 1 1
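To get back to the original goal (event durations) after merging, the two steps can be combined. The following is a sketch, not the answer's own code: merge_events is a made-up helper that fills short dry spells lying between two rain runs, and it assumes the vector contains only 0s and 1s.

```r
# Hypothetical helper: fill runs of 0s of length <= inter.event that lie
# between two runs of 1s, so that nearby events are merged into one.
merge_events <- function(x, inter.event) {
  r <- rle(x)
  n <- length(r$values)
  pos <- seq_len(n)
  # only interior runs can separate two events; leading/trailing 0s are kept
  fill <- r$values == 0 & r$lengths <= inter.event & pos > 1 & pos < n
  r$values[fill] <- 1
  inverse.rle(r)
}

hourly.rain <- c(0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1)
merged <- merge_events(hourly.rain, inter.event = 2)
merged
##  [1] 0 0 0 1 1 1 1 0 0 0 1 1 1 1

# Duration of each merged rain event, in hours.
durations <- with(rle(merged), lengths[values == 1])
durations
## [1] 4 4
```

Note that the merged durations include the filled dry hours, which matches the definition of an event with an inter-event period.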
