Find all m-tuples that sum to n - r

I want to find ALL the non-negative integer solutions to the equation i+j+k+l+m=n, where n is a non-negative integer. That is, I want to find all possible 5-tuples (i,j,k,l,m) for a given n, in R.
I wrote some code, but it is not working. I suspect something is wrong with the looping.
For your convenience, I have taken n=3, so I am basically trying to compute all vectors (i,j,k,l,m) which are 35 in number, and the matrix a(35 by 5) is the matrix that is supposed to display those vectors. The whole thing is in the function "sample(n)", where if I put n=3 i.e. sample(3) when called will give me the matrix a. Please note that a (35 by 5) is defined beforehand with all entries 0.
sample=function(n){
i=0
j=0
k=0
l=0
m=0
for(p in 1:35){
while(i<=3){
while(j<=3){
while(k<=3){
while(l<=3){
m=n-(i+j+k+l)
if(m>-1){
a[p,]=c(i,j,k,l,m)
}
l=l+1}
k=k+1}
j=j+1}
i=i+1}
}
return(a)
}
When I call sample(3), I get my original a i.e. the matrix with all elements 0. What is wrong with this code? Please rectify it.

I don't think a brute-force approach will bring you much joy for this task. Instead you should look for existing functions that can be used and are efficient (i.e. implemented in C/C++).
n <- 3
library(partitions)
blockparts(rep(n, 5), n)
#[1,] 3 2 1 0 2 1 0 1 0 0 2 1 0 1 0 0 1 0 0 0 2 1 0 1 0 0 1 0 0 0 1 0 0 0 0
#[2,] 0 1 2 3 0 1 2 0 1 0 0 1 2 0 1 0 0 1 0 0 0 1 2 0 1 0 0 1 0 0 0 1 0 0 0
#[3,] 0 0 0 0 1 1 1 2 2 3 0 0 0 1 1 2 0 0 1 0 0 0 0 1 1 2 0 0 1 0 0 0 1 0 0
#[4,] 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 2 2 2 3 0 0 0 0 0 0 1 1 1 2 0 0 0 1 0
#[5,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 3
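If installing the partitions package is not an option, the same set of solutions can be sketched in base R with the stars-and-bars idea: choose where the separators go among n + parts - 1 slots and read off the gap sizes. The helper name stars_and_bars and its arguments are made up for this illustration, and the sketch assumes parts >= 2.

```r
# Hypothetical helper (not part of any package): all non-negative integer
# tuples of length `parts` that sum to `n`, one tuple per row.
# Assumes parts >= 2.
stars_and_bars <- function(n, parts) {
  # Each column holds the positions of the (parts - 1) bars
  # among the n + parts - 1 available slots.
  bars <- combn(n + parts - 1L, parts - 1L)
  # The gaps between consecutive bars (minus the bar itself) are the counts.
  gaps <- apply(bars, 2, function(b) diff(c(0L, b, n + parts)) - 1L)
  t(gaps)
}

res <- stars_and_bars(3, 5)
dim(res)  # 35 5
```

Each row here is one solution, i.e. the transpose of the layout that blockparts prints. The number of rows equals choose(n + parts - 1, parts - 1), which is 35 for n = 3 and five parts.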

I believe that your code isn't answering your stated problem (as I understand it), on top of possible errors in your code.
One way to think of the problem is that, once the quadruple (i,j,k,l) is chosen, m is fully determined: m = n - (i + j + k + l). The quadruple itself is constrained by i + j + k + l <= n and i, j, k, l >= 0. For example, consider the following algorithm:
Let i freely take any value between 0 and n.
Given i, j can take values between 0 and n-i.
Given (i,j), k takes values between 0 and n-i-j.
Given (i,j,k), l takes values between 0 and n-i-j-k.
Given (i,j,k,l), m is defined as m = n - i - j - k - l.
The following code ought to answer your question. Please comment if this is not what you were looking for.
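sample.example = function(n){
  a = array(0, c(0, 5))   # start with an empty 0 x 5 matrix
  for(i in seq(from=0, to=n, by=1)){
    for(j in seq(from=0, to=n-i, by=1)){
      for(k in seq(from=0, to=n-i-j, by=1)){
        for(l in seq(from=0, to=n-i-j-k, by=1)){
          m = n - i - j - k - l            # m is fixed once i, j, k, l are chosen
          a = rbind(a, c(i, j, k, l, m))   # append one solution per row
        }
      }
    }
  }
  return(a)
}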
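The nested loops hard-code five variables and grow the matrix with rbind on every step. A minimal recursive sketch of the same idea (fix the first entry, then solve the smaller problem for the remaining entries) handles any tuple length. The name weak_compositions is hypothetical and does not come from any package.

```r
# Hypothetical helper: all non-negative integer tuples of length `parts`
# that sum to `n`, one tuple per row.
weak_compositions <- function(n, parts) {
  # Base case: a single entry must take the whole remaining sum.
  if (parts == 1L) {
    return(matrix(n, nrow = 1L, ncol = 1L))
  }
  # Fix the first entry, then recurse on what is left of the sum.
  blocks <- lapply(0:n, function(first) {
    rest <- weak_compositions(n - first, parts - 1L)
    cbind(first, rest, deparse.level = 0)
  })
  do.call(rbind, blocks)
}

res <- weak_compositions(3, 5)
nrow(res)  # 35
```

Collecting the pieces in a list and binding them once per level avoids calling rbind once per solution, although for large n a compiled implementation is still likely to be faster.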

Related

Drawing conditional combinations of a binary vector one by one

I am trying to write a routine to find combinations conditionally of a binary vector. For example, consider the following vector:
> A <- rep(c(1,0,0),3)
> A
[1] 1 0 0 1 0 0 1 0 0
Note that, length of the vector A is always multiple of 3. So the following condition always holds:
length(A) %% 3 == 0
The main condition is that there must be exactly one 1 in each consecutive block of 3 elements. In this example, for instance, one element of A[1:3] will be 1, one element of A[4:6] will be 1 and one element of A[7:9] will be 1, and the rest are all 0. Therefore, for this example, there will be a total of 27 possible combinations.
The objective is to write a routine that returns the next valid combination on each call, until all the legal combinations have been returned.
Note that I am not looking for a table with all the possible combinations. That solution is already available in my other question on StackOverflow. However, with that method I run into memory problems once A is longer than 45 elements, because it returns the full matrix, which is huge. Therefore, instead of storing the full matrix, I want to retrieve one combination at a time and then decide later whether to store it.
What the OP is after is an iterator. If we were to do this properly, we would write a class in C++ with a get_next method, and expose this to R. As it stands, with base R, since everything is passed by value, we must call a function on our object-to-be-updated and reassign the object-to-be-updated every time.
Here is a very crude implementation:
get_next <- function(comb, v, m) {
  ## Start and end index of each block of length(v) elements
  s <- seq(1L, length(comb), length(v))
  e <- seq(length(v), length(comb), length(v))
  ## A block is maxed out once it equals the reversed starting block
  last_comb <- rev(v)
  can_be_incr <- sapply(seq_len(m), function(x) {
    !identical(comb[s[x]:e[x]], last_comb)
  })
  if (all(!can_be_incr)) {
    ## Every block is maxed out: signal that there is nothing left
    return(FALSE)
  } else {
    ## Move the 1 in the first non-maxed block one position to the right
    idx <- which(can_be_incr)[1L]
    span <- s[idx]:e[idx]
    j <- which(comb[span] == 1L)
    comb[span[j]] <- 0L
    comb[span[j + 1L]] <- 1L
    if (idx > 1L) {
      ## Reset previous maxed out sections
      for (i in 1:(idx - 1L)) {
        comb[s[i]:e[i]] <- v
      }
    }
  }
  return(comb)
}
And here is a simple usage:
m <- 3L
v <- as.integer(c(1,0,0))
comb <- rep(v, m)
count <- 1L
while (!is.logical(comb)) {
  cat(count, ": ", comb, "\n")
  comb <- get_next(comb, v, m)
  count <- count + 1L
}
1 : 1 0 0 1 0 0 1 0 0
2 : 0 1 0 1 0 0 1 0 0
3 : 0 0 1 1 0 0 1 0 0
4 : 1 0 0 0 1 0 1 0 0
5 : 0 1 0 0 1 0 1 0 0
6 : 0 0 1 0 1 0 1 0 0
7 : 1 0 0 0 0 1 1 0 0
8 : 0 1 0 0 0 1 1 0 0
9 : 0 0 1 0 0 1 1 0 0
10 : 1 0 0 1 0 0 0 1 0
11 : 0 1 0 1 0 0 0 1 0
12 : 0 0 1 1 0 0 0 1 0
13 : 1 0 0 0 1 0 0 1 0
14 : 0 1 0 0 1 0 0 1 0
15 : 0 0 1 0 1 0 0 1 0
16 : 1 0 0 0 0 1 0 1 0
17 : 0 1 0 0 0 1 0 1 0
18 : 0 0 1 0 0 1 0 1 0
19 : 1 0 0 1 0 0 0 0 1
20 : 0 1 0 1 0 0 0 0 1
21 : 0 0 1 1 0 0 0 0 1
22 : 1 0 0 0 1 0 0 0 1
23 : 0 1 0 0 1 0 0 0 1
24 : 0 0 1 0 1 0 0 0 1
25 : 1 0 0 0 0 1 0 0 1
26 : 0 1 0 0 0 1 0 0 1
27 : 0 0 1 0 0 1 0 0 1
Note that this implementation is memory efficient, but it will be very slow.
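One possible alternative in base R, assuming a closure is acceptable as the iterator, is to keep only the position of the 1 within each block and advance those positions like an odometer. This is a rough sketch rather than a tuned implementation; make_iterator and its arguments are names invented for this example.

```r
# Hypothetical closure-based iterator: `m` blocks of `block` elements,
# exactly one 1 per block. Each call returns the next combination,
# or NULL once all combinations have been returned.
make_iterator <- function(m, block = 3L) {
  pos <- rep(1L, m)   # position of the 1 inside each block
  done <- FALSE
  function() {
    if (done) return(NULL)
    # Build the current combination from the stored positions.
    comb <- integer(m * block)
    comb[(seq_len(m) - 1L) * block + pos] <- 1L
    # Advance like an odometer: the first block changes fastest.
    i <- 1L
    while (i <= m && pos[i] == block) {
      pos[i] <<- 1L
      i <- i + 1L
    }
    if (i > m) done <<- TRUE else pos[i] <<- pos[i] + 1L
    comb
  }
}

next_comb <- make_iterator(3L)
count <- 0L
while (!is.null(comb <- next_comb())) {
  count <- count + 1L
}
count  # 27
```

Only the m positions are stored between calls, so memory use stays small, and each call does work proportional to the length of the vector instead of comparing every block against the maxed-out pattern.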

Creating a repeated sequence of zero and ones with uneven "breaks" between

I am trying to create a sequence consisting of 1 and 0 using Rstudio.
My desired output is a sequence that first has five 1s then six 0s, followed by four 1s then six 0s. This whole pattern should then be repeated until the end of a given vector.
The result should be like this:
1 1 1 1 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 .....
Hope someone has a good solution, and sorry if I have some grammar mistakes
Best,
HB
rep(c(rep(1,5),rep(0,6),rep(1,4),rep(0,6)),n)
This repeats your full pattern n times.
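If the goal is to fill a vector of a given length rather than to repeat the pattern a whole number of times, a small sketch with rep_len may help. The value of target_length is an arbitrary assumption for illustration.

```r
# One full cycle: five 1s, six 0s, four 1s, six 0s (21 elements).
pattern <- rep(c(1, 0, 1, 0), times = c(5, 6, 4, 6))

# Hypothetical target length; the last cycle is truncated if needed.
target_length <- 50
out <- rep_len(pattern, target_length)
length(out)  # 50
```

Here rep_len recycles the 21-element cycle and cuts it off at the requested length, so the result ends partway through the third cycle.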
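You could use Map. In the example below it produces runs of 1s of decreasing length (5, 4, ..., 1), each followed by six 0s.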
unlist(Map(function(x, ...) c(rep(x, ...), rep(0, 6)), 1, times=length(v):1))
# [1] 1 1 1 1 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0
Instead of length(v):1 you may also use rev(seq(v)) but it's slower.
Data
v <- c("Vector", "of", "specific", "length", "five")

R - Creating a new column within a data frame when two or more columns are a match in a row

I'm currently stuck on a part of my code that feels intuitive but I can't figure a way to do it. I have a very big data frame (nrows = 34036, ncol = 43) in which I want to create a continuous sequence of the variables where the value of the row is 1 (without having multiple columns with 1). It consists of only zeros and ones similar to the following:
A B C D
1 0 0 0
0 0 0 1
0 0 0 1
0 0 0 0
0 0 0 0
1 0 1 0
1 0 1 0
0 1 0 0
0 1 0 0
1 0 0 1
I was able to remove the zeroes using:
#find the sum of each row
placeholderData <- transform(placeholderData, sum=rowSums(placeholderData))
placeholderData <- placeholderData[!(placeholderData$sum <= 0),]
And the data frame now looks like:
A B C D sum
1 0 0 0 1
0 0 0 1 1
0 0 0 1 1
1 0 1 0 2
1 0 1 0 2
0 1 0 0 1
0 1 0 0 1
1 0 0 1 2
My main problem comes when there are two or more 1's in a row. To try to solve this, I used the following code to identify the columns that have a sum of 2 or more:
placeholderData$Matches <- lapply(apply(placeholderData == 1, 1, which), names)
Which added the following column to the data frame:
A B C D sum Matches
1 0 0 0 1 A
0 0 0 1 1 D
0 0 0 1 1 D
1 0 1 0 2 c("A","C")
1 0 1 0 2 c("A","C")
0 1 0 0 1 B
0 1 0 0 1 B
1 0 0 1 2 c("A", "D")
I added the Matches column as a step towards solving the problem, but I'm not sure how I would do it without using a lot of logical operators (I don't know in advance which columns have matches). What I would like to do is move the rows that have two or more 1's into a new column, so that the data frame looks like this:
A B C D AC AD sum Matches
1 0 0 0 0 0 1 A
0 0 0 1 0 0 1 D
0 0 0 1 0 0 1 D
0 0 0 0 1 0 1 c("A","C")
0 0 0 0 1 0 1 c("A","C")
0 1 0 0 0 0 1 B
0 1 0 0 0 0 1 B
0 0 0 0 0 1 1 c("A", "D")
Then, I would be able to use my code as normal (It works just fine when there are no repeated values in rows). I tried searching to find similar questions, but I'm not sure if I was even asking the right question. I was wondering if anyone could provide some help or some ideas that I could try.
Thank you very much!
This seems a lot like making dummy variables, so I would use the model.matrix function commonly used for dummy variables (one-hot encoding):
m = read.table(header = TRUE, text = "A B C D
1 0 0 0
0 0 0 1
0 0 0 1
0 0 0 0
0 0 0 0
1 0 1 0
1 0 1 0
0 1 0 0
0 1 0 0
1 0 0 1")
m = m[rowSums(m) > 0, ]
d = factor(sapply(apply(m == 1, 1, which), function(x) paste(names(m)[x], collapse = "")))
result = data.frame(model.matrix(~ d + 0))
names(result) = levels(d)
# A AC AD B D
# 1 1 0 0 0 0
# 2 0 0 0 0 1
# 3 0 0 0 0 1
# 4 0 1 0 0 0
# 5 0 1 0 0 0
# 6 0 0 0 1 0
# 7 0 0 0 1 0
# 8 0 0 1 0 0
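The columns above come out in alphabetical order of the factor levels, whereas the desired layout in the question lists the single-letter columns before the combined ones. A self-contained sketch of one way to reorder them follows; sorting by the length of the column name is an assumption about what ordering is wanted, and key is a name introduced only for this example.

```r
m <- read.table(header = TRUE, text = "A B C D
1 0 0 0
0 0 0 1
0 0 0 1
0 0 0 0
0 0 0 0
1 0 1 0
1 0 1 0
0 1 0 0
0 1 0 0
1 0 0 1")
m <- m[rowSums(m) > 0, ]

# Label each row by the names of the columns that contain a 1.
key <- apply(m == 1, 1, function(r) paste(names(m)[r], collapse = ""))
d <- factor(key)

# One indicator column per label.
result <- as.data.frame(model.matrix(~ d + 0))
names(result) <- levels(d)

# Single-letter columns first, then the combined ones.
result <- result[order(nchar(names(result)), names(result))]
names(result)  # "A" "B" "D" "AC" "AD"
```

Column C does not appear because no remaining row has a 1 in C alone; an all-zero C column would have to be added by hand if the layout from the question is required exactly.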

Permutation position of numbers in R

I'm looking for a function in R which can do the permutation. For example, I have a vector with five 1 and ten 0 like this:
> status=c(rep(1,5),rep(0,10))
> status
[1] 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
Now I'd like to randomly permute the position of these numbers but keep the same number of 0 and 1 in vector and to get new series of number, for example to get something like this:
1 1 0 1 0 1 0 0 0 0 0 1 0 0 0
or
1 0 0 0 0 0 0 1 1 0 0 1 0 1 0
I found the function sample() can help us to sample, but the number of 1 and 0 is not the same each time. Do you know how can I do this with R? Thanks in advance.
We can use sample
sample(status)
#[1] 1 0 0 1 0 0 1 0 0 0 0 1 0 1 0
sample(status)
#[1] 0 0 0 0 1 1 0 0 1 1 0 0 0 1 0
When sample is called on the whole vector with no other arguments, it returns a random permutation of that vector, so the count of each distinct value stays the same.
colSums(replicate(5, sample(status)))
#[1] 5 5 5 5 5
That is, each of the 5 permutations contains five 1s, so the remaining ten entries are 0s.
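As a quick check, the short sketch below permutes the vector once and confirms that the counts are preserved. The seed value is arbitrary and is there only so that the run can be repeated.

```r
status <- c(rep(1, 5), rep(0, 10))

set.seed(42)            # arbitrary seed, only for reproducibility
perm <- sample(status)  # default size = length(status), replace = FALSE

# Same multiset of values, so the counts of 0s and 1s are unchanged.
table(perm)
identical(sort(perm), sort(status))  # TRUE
```

The counts can only change if sample is called with replace = TRUE or with a size different from the length of the vector.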

reverse lexicographic order after using expand.grid

I'm trying to generate the following matrix, based on a multinomial framework. For example, if I had three columns, I'd get:
0 0 0
1 0 0
0 1 0
0 0 1
1 1 0
1 0 1
0 1 1
1 1 1
But, I want many more columns. I know I can use expand.grid, like:
u <- list(0:1)
expand.grid(rep(u,3))
But, it returns what I want in the wrong order:
0 0 0
1 0 0
0 1 0
1 1 0
0 0 1
1 0 1
0 1 1
1 1 1
Any ideas? Thanks.
You can reorder your rows to match your expected output:
u <- list(0:1)
g <- expand.grid(rep(u,3))
g <- g[order(rowSums(g)), ]
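The reordering relies on order keeping tied rows in their original relative order, so rows with the same sum stay in the sequence expand.grid produced. A self-contained sketch for an arbitrary number of columns is shown below; k is a hypothetical parameter for the column count.

```r
k <- 3  # hypothetical number of columns

# All 0/1 combinations, with the first column varying fastest.
g <- expand.grid(rep(list(0:1), k))

# Sort by the number of 1s; ties keep the expand.grid order.
g <- g[order(rowSums(g)), ]
rownames(g) <- NULL
g
```

For k = 3 this gives the rows 000, 100, 010, 001, 110, 101, 011, 111, matching the layout requested in the question.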
