R - find all possible combinations of numbers WITH constraints on combination length - r

Let's say you have the following vector of numbers:
1, 2, 3, 4, 5
I want to find all possible combinations of numbers with the combination length 3. The combinations must not overlap, i.e. 1, 2, 3 is the same as 1, 3, 2 and only one of those should appear in the output!
So, the answers would be:
1, 2, 3
1, 2, 4
1, 2, 5
1, 3, 4
1, 3, 5
1, 4, 5
2, 3, 4
2, 3, 5
2, 4, 5
3, 4, 5
This is just a simple example. In reality, I have a vector of length 10000 and I need to find all combinations of length 8000. What code would you use to generate those combinations in R?

@chinsoon12 suggested the package RcppAlgos. I investigated it and found that the following works:
comboIter(1:10000, 8000)
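For the small case in the question, the expected output can be reproduced with base R's combn(). This is only a minimal sketch with illustrative variable names. It also hints at why an iterator is needed for the real problem: the number of combinations of 8000 items out of 10000 has more than two thousand digits, so the full set can never be held in memory.

```r
# Small case from the question: all 3-element combinations of 1:5.
v <- 1:5
combos <- t(combn(v, 3))   # combn() returns one combination per column; transpose to rows
print(combos)

# Size of the real problem on a log10 scale
# (the count itself is too large to hold in a double).
log10_count <- lchoose(10000, 8000) / log(10)
print(log10_count)
```

Each row of combos is one combination, listed in the same order as in the question.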

Related

Function that reveals the position in R [duplicate]

This question already has answers here:
rank and order in R
(7 answers)
Closed 4 years ago.
What is the difference between sort(), rank(), and order() in R?
Can you explain with examples?
sort() sorts the vector in ascending order.
rank() gives the rank of each number in the vector, with the smallest number receiving rank 1.
order() returns the indices that would put the vector in sorted order.
For example, apply these functions to the vector c(3, 1, 2, 5, 4):
sort(c(3, 1, 2, 5, 4)) will give c(1, 2, 3, 4, 5)
rank(c(3, 1, 2, 5, 4)) will give c(3, 1, 2, 5, 4)
order(c(3, 1, 2, 5, 4)) will give c(2, 3, 1, 5, 4).
If you index the vector with these indices in this order, you get the sorted vector. Notice how v[2] = 1, v[3] = 2, v[1] = 3, v[5] = 4 and v[4] = 5.
rank() also handles ties. If you run rank(c(3, 1, 2, 5, 4, 2)), 1 gets rank 1. The two 2s occupy positions 2 and 3, so by default R assigns each of them the average rank 2.5. Next, 3 gets rank 4.0, and so on, so
rank(c(3, 1, 2, 5, 4, 2)) will give you the output [4.0 1.0 2.5 6.0 5.0 2.5]
Hope this is helpful.
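The relationship between the three functions can be checked directly. The sketch below uses the vectors from the answer; the variable names are illustrative. It also shows ties.method = "min" as one alternative to the default averaging of tied ranks.

```r
v <- c(3, 1, 2, 5, 4)

sorted <- sort(v)    # the values in ascending order
ranks  <- rank(v)    # rank of each element within v
idx    <- order(v)   # positions that sort v

# Indexing with order() reproduces sort().
stopifnot(identical(v[idx], sorted))

# With ties, the default ties.method = "average" shares the mean rank;
# ties.method = "min" gives every tied value the lowest rank instead.
w <- c(3, 1, 2, 5, 4, 2)
avg_ranks <- rank(w)
min_ranks <- rank(w, ties.method = "min")
print(avg_ranks)
print(min_ranks)
```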

R purrr, can I use two arrays in a modify_if statement?

I want to use one array to decide whether or not to modify another, hoping to use modify_if, but there might be a simpler way. Here is my mwe.
vec1 <- array( c(1, 3, 5, 6, 7, 3, 2, 3))
vec2 <- array(c(TRUE, TRUE, FALSE,TRUE, FALSE, FALSE, TRUE, FALSE))
vec1 %<>% purrr::modify_if(~ .x[vec2], vec1 + 1)
So I already have the logical array to tell me which ones to change. If vec2 is TRUE, then I want to modify the value at that index to increment the current value of vec1 by 1, otherwise if vec2 is false then leave in the original value. The result I am looking for is
2, 4, 5, 7, 7, 3, 3, 3. Thx, J.
Within the tidyverse, I think you want
vec1
# [1] 1 3 5 6 7 3 2 3
vec1 %<>% purrr::modify_if(vec2, ~ .x + 1)
vec1
# [1] 2 4 5 7 7 3 3 3
Though as @MartinGal suggested, base R might be more direct:
vec1 <- array( c(1, 3, 5, 6, 7, 3, 2, 3))
vec1[vec2] <- vec1[vec2] + 1
vec1
# [1] 2 4 5 7 7 3 3 3
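Another base R option is ifelse(), which builds a new vector instead of assigning into a subset. This is a sketch only, and it uses plain vectors rather than arrays for brevity (an assumption, not something the question requires).

```r
vec1 <- c(1, 3, 5, 6, 7, 3, 2, 3)
vec2 <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE)

# ifelse() picks vec1 + 1 where vec2 is TRUE and vec1 elsewhere.
result <- ifelse(vec2, vec1 + 1, vec1)
print(result)
```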

Number Duplicated Cases

I want to identify duplicate cases and number them as a vector (such as with an ID variable). Any case without any direct matches should be labeled as a fixed value (such as zero). Any case with a corresponding duplicate should be labeled 1, with each subsequent case being labeled n+1. So, if I have an ID variable like this 1, 2, 2, 2, 3, 4, 4, 5, I'd want the corresponding vector to produce: 0, 1, 2, 3, 0, 1, 2, 0.
How can I do this?
duplicated() identifies the first case as a non-duplicate, so that doesn't work.
Base R, ave with seq_along
x<-c(1,2,2,2,3,4,4,5)
ave(seq_along(x),x,FUN=function(g) if(length(g)>1) seq_along(g) else 0)
#> 0 1 2 3 0 1 2 0
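Because ave() groups by value rather than by position, the same idea should also work when the duplicates are not adjacent. The sketch below wraps the call in a helper; the name number_dups and the sample IDs are hypothetical, chosen only for illustration.

```r
# number_dups() is an illustrative helper name, not part of any package.
number_dups <- function(x) {
  ave(seq_along(x), x, FUN = function(g) {
    # g holds the positions of one distinct value of x
    if (length(g) > 1) seq_along(g) else 0
  })
}

ids <- c("a", "b", "a", "c", "b", "a")
counts <- number_dups(ids)
print(counts)
```

Here "c" occurs once and gets 0, while the occurrences of "a" and "b" are numbered in order of appearance.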

How to preserve the order of a vector in a table in R?

Pretty simple question, I assume. I am trying to do this for a different type of object (with class 'acf' and type 'list'), but I assume the answer is easily extendable for a vector (class numeric, type 'double'):
x<-c(4, 5, 6, 1, 2, 10, 15)
table(x)
x
 1  2  4  5  6 10 15
 1  1  1  1  1  1  1
I would like the output of the table to be in the same order as the vector (4, 5, 6, 1, 2, 10, 15). How can I achieve this?
table(factor(x, levels=unique(x)))
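The reason this works is that table() reports counts in the order of the factor levels, and unique(x) lists the values in order of first appearance. A small sketch to confirm the ordering (the variable name tab is illustrative):

```r
x <- c(4, 5, 6, 1, 2, 10, 15)

# unique(x) lists the values in order of first appearance; using it as the
# factor levels makes table() count in that order instead of sorted order.
tab <- table(factor(x, levels = unique(x)))
print(names(tab))
```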

Reduce each consecutive sequence to its value and length

Assume you have a vector with runs of consecutive values:
v <- c(1, 1, 1, 2, 2, 2, 2, 1, 1, 3, 3, 3, 3)
How can it best be reduced to one value per run and the length of each run? I.e. the first run is 1 repeated three times; 2nd run: 2 repeated four times; 3rd run: 1 repeated two times, and so on:
v.df <- data.frame(value = c(1, 2, 1, 3),
repetitions = c(3, 4, 2, 4))
In a procedural language I might just iterate through a loop and build the data.frame as I go, but with a large dataset in R such an approach is inefficient. Any advice?
with(rle(v), data.frame(values, lengths))
should get you what you need.
  values lengths
1      1       3
2      2       4
3      1       2
4      3       4
Or, more simply:
data.frame(rle(v)[])
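To get the exact column names asked for in the question, the components of the rle() result can be picked out by name. This is a minimal sketch with illustrative variable names; it also uses inverse.rle() to show that the encoding loses no information.

```r
v <- c(1, 1, 1, 2, 2, 2, 2, 1, 1, 3, 3, 3, 3)

r <- rle(v)   # list with components `lengths` and `values`
runs <- data.frame(value = r$values, repetitions = r$lengths)
print(runs)

# inverse.rle() expands the encoding back into the original vector.
restored <- inverse.rle(r)
```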
