R generate number (Id) along sequence using two different vectors, lapply?

I have two vectors that hold the starting and ending row indices of each group.
I want to generate a group ID for every row using these vectors.
Example
a<-c(1,4,7,12)
b<-c(3,6,11,15)
my output vector should be
d <- c(1,1,1,2,2,2,3,3,3,3,3,4,4,4,4)

You can use rep to repeat each group index b - a + 1 times, which is the number of rows in that group.
rep(seq_along(a), (b - a) + 1)
#[1] 1 1 1 2 2 2 3 3 3 3 3 4 4 4 4

Will this work? Note that it assumes the ranges are contiguous and start at row 1:
rep(1:length(a), c(b[1], diff(b)))
#[1] 1 1 1 2 2 2 3 3 3 3 3 4 4 4 4
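To see where the two answers above differ, here is a small sketch. The vectors a2 and b2 are made-up inputs with a gap between ranges (rows 7 and 8 belong to no group); they are not from the question.

```r
a <- c(1, 4, 7, 12)
b <- c(3, 6, 11, 15)

# Repeat each group index by the width of its range
by_width <- rep(seq_along(a), b - a + 1)
# Repeat each group index by the distance between consecutive end points
by_diff <- rep(seq_along(a), c(b[1], diff(b)))
identical(by_width, by_diff)  # TRUE for contiguous ranges starting at 1

# Hypothetical input with a gap: rows 7 and 8 are in no range
a2 <- c(1, 4, 9)
b2 <- c(3, 6, 11)
width2 <- rep(seq_along(a2), b2 - a2 + 1)         # 9 labels, one per covered row
diff2  <- rep(seq_along(a2), c(b2[1], diff(b2)))  # 11 labels, gap rows counted in group 3
length(width2)
length(diff2)
```

With gaps, the width-based version labels only the rows inside a range, while the diff-based version silently assigns the gap rows to the following group.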

We can use
rep(seq_len(length(a)), (b - a) + 1)
#[1] 1 1 1 2 2 2 3 3 3 3 3 4 4 4 4

Related

rep and/or seq function to create continuously reducing vector?

Suppose I have a vector from 1 to 5,
a<-c(1:5)
What I need to do is repeat the vector, dropping the last element on each repetition. That is, the final outcome should be
1 2 3 4 5 1 2 3 4 1 2 3 1 2 1
We can reverse the vector and apply sequence
sequence(rev(a))
#[1] 1 2 3 4 5 1 2 3 4 1 2 3 1 2 1
Or another option is toeplitz
m1 <- toeplitz(a)
m1[lower.tri(m1, diag=TRUE)]
#[1] 1 2 3 4 5 1 2 3 4 1 2 3 1 2 1
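For readers unfamiliar with sequence, a minimal sketch of what it does here: for each count in its argument it builds 1:n and concatenates the results. The names via_sequence and via_lapply are just illustrative.

```r
a <- 1:5

# sequence() turns the counts 5, 4, 3, 2, 1 into 1:5, 1:4, 1:3, 1:2, 1
via_sequence <- sequence(rev(a))

# The same idea spelled out with lapply and seq_len
via_lapply <- unlist(lapply(rev(a), seq_len))

all(via_sequence == via_lapply)  # TRUE
```

This relies on a being 1:n; for an arbitrary vector you would index it with the result, as in a[sequence(rev(seq_along(a)))].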

R merge matrices with function

I would like to merge two matrices of different lengths on their common row names, applying a function:
My first matrix (T) looks similar to this:
   1  2  3  4
1 -4  3  2  2
1  2  1  1  5
2  3 -2  4  6
2 -2  1 -1 -9
Now I want to fill my new matrix (M) from it. For each row name, every entry of M should be the number of values >= 0 in that column among the matching rows, plus 1:
  1 2 3 4
1 2 3 3 3
2 2 2 2 2
I tried the following call, which I found here in the forum, but it does not work:
merge.default(as.data.frame(M), as.data.frame(T), by = "row.names", function(x){colSums(T[,]>0)+1})
Do you have any idea where my mistake is?
Thank you very much
EDIT: my desired output would be my Matrix T, which is at the moment empty:
T now:
1 2 3 4
1
2
T after merge which is now filled with the function:
colSums(T[,] >= 0) + 1
  1 2 3 4
1 2 3 3 3
2 2 2 2 2
T[1,1] = 2, as there is 1 value in matrix M which is >= 0, and then I add 1 to it
T[1,2] = 3: two values >= 0, plus 1
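One possible way to get the described output, as a minimal sketch rather than a fix of the merge call above: count the non-negative values per row name with base R's rowsum, then add 1. The names mat and counts are illustrative, and the data are typed in from the first matrix in the question (mat is used instead of T to avoid masking the shorthand for TRUE).

```r
# First matrix from the question, with duplicated row names
mat <- matrix(c(-4,  3,  2,  2,
                 2,  1,  1,  5,
                 3, -2,  4,  6,
                -2,  1, -1, -9),
              nrow = 4, byrow = TRUE,
              dimnames = list(c("1", "1", "2", "2"),
                              c("1", "2", "3", "4")))

# (mat >= 0) * 1 is a 0/1 matrix; rowsum() adds it up within each row name
counts <- rowsum((mat >= 0) * 1, group = rownames(mat)) + 1
counts
```

The result has one row per distinct row name, which matches the desired 2 x 4 matrix in the question.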

Proportion of dataset equal to a value

I have the following dataset called asteroids
3 4 3 3 1 4 1 3 2 3
1 1 4 2 3 3 2 6 1 1
3 3 2 2 2 2 1 3 2 1
6 1 3 2 2 1 2 2 4 2
I need to find out what proportion of this dataset is 1.
If you have a specific value in mind you can just do an equality comparison and then use mean on the resulting logical vector.
> asteroids <- scan(what=numeric())
1: 3 4 3 3 1 4 1 3 2 3 1 1 4 2 3 3 2 6 1 1 3 3 2 2 2 2 1 3 2 1 6 1 3 2 2 1 2 2 4 2
41:
Read 40 items
> mean(asteroids == 1)
[1] 0.25
This works because the equality comparison returns TRUE and FALSE. When those are coerced to numbers they become 1s and 0s, so mean gives us the proportion of TRUEs.
I assumed asteroids was a vector. You don't specify in your question but if it's a different type of structure you'll probably need to coerce it into a vector in some way or another.
Assuming that 'asteroids' is a data.frame, unlist it, get the table and find the proportion with prop.table.
prop.table(table(unlist(asteroids)==1))
# FALSE TRUE
# 0.75 0.25
Or as @Richard Scriven mentioned, we can convert the data.frame to a logical matrix and use table directly on it, since a matrix is a vector with a dim attribute.
prop.table(table(asteroids == 1))
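A non-interactive sketch of the same idea, for anyone who wants to paste and run it. The 4 x 10 data.frame layout (df) is an assumption, since the question does not say how the data are stored.

```r
asteroids <- c(3, 4, 3, 3, 1, 4, 1, 3, 2, 3,
               1, 1, 4, 2, 3, 3, 2, 6, 1, 1,
               3, 3, 2, 2, 2, 2, 1, 3, 2, 1,
               6, 1, 3, 2, 2, 1, 2, 2, 4, 2)

# Vector case: the mean of a logical vector is the proportion of TRUEs
p_vec <- mean(asteroids == 1)

# Assumed data.frame case: df == 1 is a logical matrix, so mean works the same way
df <- as.data.frame(matrix(asteroids, nrow = 4, byrow = TRUE))
p_df <- mean(df == 1)

p_vec  # 0.25
p_df   # 0.25
```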

In R: How to create a vector of lagged differences but keep the original value for negative differences without using loops

I have a vector in R of the form:
> a <- c(1,3,5,7,9,11,1,3,5,7,9,11,1,3,5,7,9,11)
> a
[1] 1 3 5 7 9 11 1 3 5 7 9 11 1 3 5 7 9 11
I can take the lagged differences like this:
b <- diff(a)
> b
[1] 2 2 2 2 2 -10 2 2 2 2 2 -10 2 2 2 2 2
But I would like the negative differences to be replaced by the original values in the vector a. In this case, the -10's should be replaced by the 1's.
Is there a way to do this without looping through the vectors?
Thanks
One possible way:
indices<-which(b<0)
b[indices]<-a[indices+1]
One approach using replacement:
d <- diff(a)
d_neg <- d < 0
d[d_neg] <- a[-1][d_neg]
# [1] 2 2 2 2 2 1 2 2 2 2 2 1 2 2 2 2 2
One approach using ifelse:
d <- diff(a)
ifelse(d < 0, a[-1], d)
# [1] 2 2 2 2 2 1 2 2 2 2 2 1 2 2 2 2 2
One approach using mathematics and pmax:
d <- diff(a)
(d < 0) * a[-1] + pmax(d, 0)
# [1] 2 2 2 2 2 1 2 2 2 2 2 1 2 2 2 2 2
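All of these rely on the same alignment: element i of diff(a) is a[i + 1] - a[i], so the value to put back is a[i + 1], i.e. a[-1] at the same position. A small sketch that checks the approaches above against each other (the by_* names are only for illustration):

```r
a <- rep(c(1, 3, 5, 7, 9, 11), times = 3)
d <- diff(a)  # d[i] is a[i + 1] - a[i], so d lines up with a[-1]

# Index-based replacement
by_index <- d
idx <- which(d < 0)
by_index[idx] <- a[idx + 1]

# Vectorised ifelse
by_ifelse <- ifelse(d < 0, a[-1], d)

# Arithmetic with pmax
by_pmax <- (d < 0) * a[-1] + pmax(d, 0)

all(by_index == by_ifelse, by_ifelse == by_pmax)  # TRUE
```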

R - How create a variable based in another variable

I have:
v1 <- c(1,1,1,2,2,2,3,3,3,3,3,3,3,3,3,4,4,4,4,4,4)
and I want to create v2, which numbers each consecutive set of 3 elements within every run of v1:
v2 <- c(1,1,1,1,1,1,1,1,1,2,2,2,3,3,3,1,1,1,2,2,2)
Explanation:
For the first three times a number is repeated the value corresponding to that number is a 1, for the second three times it's a 2, and so on.
v1 <- c(1,1,1,2,2,2,3,3,3,3,3,3,3,3,3,4,4,4,4,4,4)
Use rle to find the run lengths:
l <- rle(v1)$lengths
#[1] 3 3 9 6
Create a sequence 1:n for each run length n:
s <- sequence(l)
#[1] 1 2 3 1 2 3 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6
Use integer division:
(s - 1) %/% 3 + 1
#[1] 1 1 1 1 1 1 1 1 1 2 2 2 3 3 3 1 1 1 2 2 2
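The three steps can be wrapped into one helper. This is a minimal sketch; the function name chunk_id and its size argument are hypothetical, not part of the answer above.

```r
# Number consecutive chunks of `size` elements within each run of x
chunk_id <- function(x, size = 3) {
  s <- sequence(rle(x)$lengths)  # position of each element within its run
  (s - 1) %/% size + 1           # positions 1..size -> 1, size+1..2*size -> 2, ...
}

v1 <- c(1,1,1,2,2,2,3,3,3,3,3,3,3,3,3,4,4,4,4,4,4)
v2 <- chunk_id(v1)
v2
```

Because rle works on runs, a value that reappears later in x starts counting again from 1.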
