Group sequence of integers

Group sequence of integers - r

I have a bunch of observations
x = c(1, 2, 4, 1, 6, 7, 11, 11, 12, 13, 14)
that I want to turn into the groups:
y = c(1, 1, 1, 1, 2, 2, 3, 3, 3, 3, 3)
I.e., I want the integers 1 to 5 to constitute one group, the integers 6 to 10 to constitute the next group, and so on.
Is there a straightforward way to accomplish this without a loop?
Clarification: I need to programmatically create the groups from the input vector (x).

We can use %/% to create the groups; subtracting 1 first keeps multiples of 5 (5, 10, ...) in the lower group
(x - 1) %/% 5 + 1
#[1] 1 1 1 1 2 2 3 3 3 3 3

You can use ceiling to create groups
ceiling(x/5)
# [1] 1 1 1 1 2 2 3 3 3 3 3
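
A quick boundary check, as a sketch: the sample x contains neither 5 nor 10, so the values below (x_edge is a made-up test vector) probe the group edges. For positive integers, ceiling and a shifted integer division agree, while an unshifted x %/% 5 + 1 pushes 5 and 10 into the next group.

```r
# Hypothetical edge values; the group width of 5 is taken from the question
x_edge <- c(1, 5, 6, 10, 11)
width <- 5

by_ceiling <- ceiling(x_edge / width)
by_intdiv  <- (x_edge - 1) %/% width + 1  # shift by 1 so that 5 stays in group 1
unshifted  <- x_edge %/% width + 1        # places 5 and 10 one group too high

print(by_ceiling)  # 1 1 2 2 3
print(by_intdiv)   # 1 1 2 2 3
print(unshifted)   # 1 2 2 3 3
```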

Related

How to identify each integer sequence regardless of ties in a vector

This question is related to "identify whenever values repeat in r".
While searching for an answer there, this new question arose:
I have this vector:
vector <- c(1, 1, 2, 3, 5, 6, 6, 7, 1, 1, 1, 1, 2, 3, 3)
I would like to identify each consecutive (by 1) integer sequence, e.g. 1,2,3,.. or 3,4,5,.. or 4,5,6,7,...
However, it should allow ties, e.g. 1,1,2,3,.. or 3,3,4,5,... or 4,5,5,6,6,7
The expected output would be a list like:
sequence1 <- c(1, 1, 2, 3)
sequence2 <- c(5, 6, 6, 7)
sequence3 <- c(1, 1, 1, 1, 2, 3, 3)
So far, the nearest approach I have found is in "Check whether vector in R is sequential?", but I could not adapt it to what I want.

An option is with diff and cumsum
split(vector, cumsum(c(TRUE, abs(diff(vector)) > 1)))
Output:
$`1`
[1] 1 1 2 3
$`2`
[1] 5 6 6 7
$`3`
[1] 1 1 1 1 2 3 3
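
As a sketch, the same diff/cumsum idea can be wrapped in a small helper. The names split_runs, split_increasing and max_step are hypothetical, not from the answer. Because abs() is used, a step of -1 (such as 3, 2) also continues a run; the second variant below assumes that only steps of 0 or +1 should continue one.

```r
# Hypothetical helper: start a new run wherever consecutive values
# differ by more than `max_step` in absolute value.
split_runs <- function(v, max_step = 1) {
  new_run <- c(TRUE, abs(diff(v)) > max_step)  # TRUE marks the start of a run
  unname(split(v, cumsum(new_run)))
}

vector <- c(1, 1, 2, 3, 5, 6, 6, 7, 1, 1, 1, 1, 2, 3, 3)
runs <- split_runs(vector)
print(runs)

# Stricter variant (an assumption about the desired behaviour):
# only steps of 0 or +1 continue a run, so 3, 2 starts a new one.
split_increasing <- function(v) {
  d <- diff(v)
  unname(split(v, cumsum(c(TRUE, d < 0 | d > 1))))
}
print(split_increasing(c(1, 2, 3, 2, 3, 4)))
```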

Vector of repeated index values

I have a vector of the following form:
a <- c(4, 6, 3, 6, 1)
I want to build a vector in which each index of a is repeated as many times as the value stored at that index.
For example, the first index has value 4, so there should be 4 ones, followed by 6 twos, followed by 3 threes, and so on.
The resulting vector should be of the following form:
b <- c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4, 4, 5)
Thanks in advance.

We can use rep with seq_along:
a <- c(4, 6, 3, 6, 1)
rep(seq_along(a), a)
#[1] 1 1 1 1 2 2 2 2 2 2 3 3 3 4 4 4 4 4 4 5
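
One way to sanity-check the result, as a minimal sketch: counting how often each index occurs in b should give back a. tabulate is used here for the count.

```r
a <- c(4, 6, 3, 6, 1)
b <- rep(seq_along(a), times = a)

# Counting the occurrences of each index recovers the original counts
recovered <- tabulate(b, nbins = length(a))
print(recovered)  # 4 6 3 6 1
```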

We can use sequence
cumsum(sequence(a) == 1)
#[1] 1 1 1 1 2 2 2 2 2 2 3 3 3 4 4 4 4 4 4 5
Or using uncount
library(dplyr)
library(tidyr)
tibble(a) %>%
  mutate(rn = row_number()) %>%
  uncount(a)
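
The base R approaches agree on the example, but they diverge if a contains a zero. A small sketch with a made-up input a0 shows the difference: rep keeps the original index, while the cumsum/sequence version numbers the non-empty groups consecutively.

```r
a0 <- c(2, 0, 1)  # hypothetical input with a zero count

by_rep <- rep(seq_along(a0), a0)         # index 2 is skipped, index 3 is kept
by_cumsum <- cumsum(sequence(a0) == 1)   # groups are renumbered 1, 2, ...

print(by_rep)     # 1 1 3
print(by_cumsum)  # 1 1 2
```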

index from one vector to another by closest values

Given two sorted vectors, how can you get, for each value in one, the index of the closest value in the other?
For example, given:
a = 1:20
b = seq(from=1, to=20, by=5)
how can I efficiently get the vector
c = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4)
which, for each value in a, provides the index of the largest value in b that is less than or equal to it. But the solution needs to work for unpredictable (though sorted) contents of a and b, and needs to be fast when a and b are large.

You can use findInterval, which constructs a sequence of intervals given by breakpoints in b and returns the interval indices in which the elements of a are located (see also ?findInterval for additional arguments, such as behavior at interval boundaries).
a = 1:20
b = seq(from = 1, to = 20, by = 5)
findInterval(a, b)
#> [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4
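
Since the question asks for unpredictable sorted contents, here is a sketch with made-up non-integer values. It also shows that findInterval returns 0 for values below the first breakpoint, which needs handling before the result is used as an index.

```r
a <- c(0.5, 1, 2.7, 3.2, 3.2, 8, 15.9)  # hypothetical sorted values
b <- c(1, 3.2, 10)                       # hypothetical sorted breakpoints

idx <- findInterval(a, b)
print(idx)  # 0 1 1 2 2 2 3

# 0 means "below b[1]"; map it to NA before subsetting b
floor_val <- b[replace(idx, idx == 0, NA)]
print(floor_val)  # NA 1.0 1.0 3.2 3.2 3.2 10.0
```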

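We can use cut with left-closed intervals that start at the values of b (this avoids the b - 1 shift, which only works for integer data, and assumes b has no duplicates)
as.integer(cut(a, breaks = c(b, Inf), right = FALSE, labels = seq_along(b)))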
#[1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4

Count occurrence of multiple numbers in vector one by one

I have two vectors
a <- c(1, 5, 2, 1, 2, 3, 3, 4, 5, 1, 2)
b <- c(1, 2, 3, 4, 5, 6)
I want to know how many times each element in b occurs in a. So the result should be
c(3, 3, 2, 1, 2, 0)
All methods I found, like match(), == and %in%, are not suited for entire vectors. I know I can use a loop over all elements in b,
counts <- integer(length(b))
for (i in seq_along(b)) {
  counts[i] <- sum(a == b[i], na.rm = TRUE)
}
but this is used often and takes too long. That's why I'm looking for a vectorized way, or a way to use apply().

You can do this using factor and table
table(factor(a, unique(b)))
#
#1 2 3 4 5 6
#3 3 2 1 2 0
Since you mentioned match, here is a possibility without an sapply loop (thanks to @thelatemail). match returns positions in b, so the factor levels are the positions seq_along(b)
table(factor(match(a, b), seq_along(b)))
#
#1 2 3 4 5 6
#3 3 2 1 2 0
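
If a plain integer vector is preferred over a table, one possible sketch combines match with tabulate. It assumes b contains no duplicates; the extra value 9 in a_extra is made up to show that values absent from b are ignored.

```r
a <- c(1, 5, 2, 1, 2, 3, 3, 4, 5, 1, 2)
b <- c(1, 2, 3, 4, 5, 6)

# match() gives, for each element of a, its position in b (NA if absent);
# tabulate() counts positions 1..length(b) and ignores NA
counts <- tabulate(match(a, b), nbins = length(b))
print(counts)  # 3 3 2 1 2 0

# A value that does not occur in b leaves the counts unchanged
a_extra <- c(a, 9)
counts_extra <- tabulate(match(a_extra, b), nbins = length(b))
```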

Here is a base R option, using sapply with which:
a <- c(1, 5, 2, 1, 2, 3, 3, 4, 5, 1, 2)
b <- c(1, 2, 3, 4, 5, 6)
sapply(b, function(x) length(which(a == x)))
# [1] 3 3 2 1 2 0

Here is a vectorised method
x = expand.grid(b,a)
rowSums( matrix(x$Var1 == x$Var2, nrow = length(b)))
# [1] 3 3 2 1 2 0
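
The same all-pairs comparison can be sketched more compactly with outer. Like the expand.grid version, it builds a length(b) by length(a) logical matrix, so memory use grows with the product of the two lengths.

```r
a <- c(1, 5, 2, 1, 2, 3, 3, 4, 5, 1, 2)
b <- c(1, 2, 3, 4, 5, 6)

# Rows correspond to elements of b, columns to elements of a
counts_outer <- rowSums(outer(b, a, "=="))
print(counts_outer)  # 3 3 2 1 2 0
```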

Group matching numbers in random order in R

I'm working on a Monte-Carlo simulation type problem and need to generate a vector of repeated random numbers, with the matching numbers grouped together, but in random order.
It's easier to explain with an example. If I had:
1, 3, 7, 12, 1, 3, 7, 12, 1, 3, 7, 12
I would like it sorted as:
7, 7, 7, 3, 3, 3, 12, 12, 12, 1, 1, 1 (or with the groups of matching numbers in any order but ascending/descending).
The reason I need the random order is because my MC simulation is for 2 variables, so if both are in order they won't vary independently.
I've got as far as:
sort(rep(runif(50,1,10),10), decreasing = FALSE)
Which generates 50 random numbers between 1 and 10, repeats each 10 times, then sorts the 50 groups of 10 matching random numbers in ascending order (or it could easily be descending order if I changed "FALSE" to "TRUE"). I just can't figure out the last step of getting 50 groups of 10 matching numbers in random order. Can anyone help?

Here is one option with split
unlist(sample(split(v1, v1)), use.names = FALSE)
#[1] 3 3 3 1 1 1 12 12 12 7 7 7
Or another option is match with unique
v1[order(match(v1, sample(unique(v1))))]
data
v1 <- c(1, 3, 7, 12, 1, 3, 7, 12, 1, 3, 7, 12)
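
For the simulation setup described in the question, where the values are generated rather than given, a possible shortcut is to skip sort() altogether. This is a sketch under the assumption that the groups only need to be contiguous: runif already returns the draws in random order, and rep with each = keeps the copies of each draw together.

```r
set.seed(42)  # only to make this sketch reproducible

draws <- runif(50, 1, 10)         # 50 random values, in random order
grouped <- rep(draws, each = 10)  # each value 10 times, copies kept adjacent

print(head(grouped, 12))

# Check the structure: 50 runs of identical values, each of length 10
runs <- rle(grouped)
print(length(runs$lengths))
```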

An option could be as:
v <- c(1, 3, 7, 12, 1, 3, 7, 12, 1, 3, 7, 12)
lst <- split(v, v)
sapply(sample(seq(length(lst)),length(lst)), function(i)lst[[i]])
# [,1] [,2] [,3] [,4]
#[1,] 3 12 7 1
#[2,] 3 12 7 1
#[3,] 3 12 7 1
#OR for having just a vector
as.vector(sapply(sample(seq(length(lst)),length(lst)), function(i)lst[[i]]))
#[1] 3 3 3 12 12 12 7 7 7 1 1 1
