Finding n tuples from a list whose aggregation satisfies a condition - r

I have a list of two-element vectors. From this list, I'd like to find n vectors (x,y) (not necessarily distinct) such that the sum of their ys is greater than or equal to a number k. If multiple selections satisfy this condition, select the one where the sum of the xs is the smallest.
For example, I'd like to find n=2 vectors (x1,y1) and (x2,y2) such that y1+y2 >= k. If more than one pair satisfies this condition, select the one where x1+x2 is the smallest.
So far I've only managed to set up the following code:
X <- c(3, 2, 3, 8, 7, 7, 13, 11, 12, 12)
Y <- c(2, 1, 3, 6, 5, 6, 8, 9, 10, 9)
df <- data.frame(X, Y)
l <- list()
for (i in seq_len(nrow(df))) {
  l[[i]] <- as.numeric(df[i, ]) # store row i as a two-element vector c(x, y)
}
Using the values above, if n=1 and k=9, I'll pick the tuple (x,y)=(11,9): (12,9) also satisfies y>=k, but 11 is the smaller x.
If n=2, k=6, then I'll pick (x1,y1)=(3,3) and (x2,y2)=(3,3) because it's the smallest x1+x2 that satisfies y1+y2 >= 6.
If n=2, k=8, then I'll pick (x1,y1)=(3,3) and (x2,y2)=(7,5) because y1+y2>=8, and the next alternative, (3,3) and (8,6), has x1+x2=3+8=11, which is larger than 3+7=10.
I feel like a brute-force solution would be possible: generate all possible n-sized combinations of the vectors, calculate yTotal=y1+y2+y3... for each combination, keep the combinations that satisfy yTotal>=k and, of those, pick the one where xTotal=x1+x2+x3... is minimal.
I definitely struggle to put this into R code and wonder if it's even the right choice. Thank you for your help!

First, it seems from your question that you allow the same vector to be selected more than once (selection with replacement). The code below implements your brute-force approach: use permutations() from the gtools package to generate every ordered selection of n indices, keep the selections with sum(Y)>=k, and order them by sum(X) first and sum(Y) second, so that the first one has the smallest sum(X).
X <- c(3, 2, 3, 8, 7, 7, 13, 11, 12, 12)
Y <- c(2, 1, 3, 6, 5, 6, 8, 9, 10, 9)
n <- 1
perm <- gtools::permutations(n=length(Y), r=n, repeats.allowed=TRUE) # each row holds n indices into X and Y
result <- apply(perm, 1, function(x){ c(sum(Y[x]), sum(X[x])) }) # row 1: sum(Y), row 2: sum(X)
dim(result) # 2 10
k <- 9 ## Case of n=1, k=9
keep <- which(result[1,]>=k)
result[,keep[order(result[2,keep],result[1,keep])[1]]] # 9 and 11
##### n=2 cases ##########
n <- 2
perm <- gtools::permutations(n=length(Y), r=n, repeats.allowed=TRUE)
result <- apply(perm, 1, function(x){ c(sum(Y[x]), sum(X[x])) })
dim(result) # 2 100
## n=2, k=6
keep <- which(result[1,]>=6)
keep[order(result[2,keep],result[1,keep])[1]] # the 23rd permutation
perm[23,] # 3 3: element 3 twice, i.e. (3,3) and (3,3)
result[,keep[order(result[2,keep],result[1,keep])[1]]] # sum(Y)=6 and sum(X)=6
## n=2, k=8
keep <- which(result[1,]>=8)
keep[order(result[2,keep],result[1,keep])[1]] # the 6th permutation
perm[6,] # 1 6: elements 1 and 6, i.e. (3,2) and (7,6)
result[,keep[order(result[2,keep],result[1,keep])[1]]] # sum(Y)=8 and sum(X)=10
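The same brute-force search can be wrapped in a function that uses only base R. This is a minimal sketch rather than the answer's own code: the name best_selection() and its return format are hypothetical, and expand.grid() stands in for gtools::permutations(). It enumerates length(X)^n index tuples, so it is only practical for small inputs.

```r
# Hypothetical helper: best_selection() is an illustration, not part of gtools.
best_selection <- function(X, Y, n, k) {
  # Every ordered tuple of n indices, repeats allowed (length(X)^n rows)
  idx <- as.matrix(expand.grid(rep(list(seq_along(X)), n)))
  sum_x <- rowSums(matrix(X[as.vector(idx)], nrow = nrow(idx)))
  sum_y <- rowSums(matrix(Y[as.vector(idx)], nrow = nrow(idx)))
  ok <- which(sum_y >= k)               # selections that reach the threshold
  if (length(ok) == 0) return(NULL)     # nothing reaches k
  best <- ok[which.min(sum_x[ok])]      # first feasible row with minimal sum(X)
  list(index = unname(idx[best, ]), sum_x = sum_x[best], sum_y = sum_y[best])
}

X <- c(3, 2, 3, 8, 7, 7, 13, 11, 12, 12)
Y <- c(2, 1, 3, 6, 5, 6, 8, 9, 10, 9)
best_selection(X, Y, n = 1, k = 9)$index   # 8, i.e. the tuple (11, 9)
best_selection(X, Y, n = 2, k = 6)$index   # 3 3, i.e. (3, 3) twice
```

When several selections tie on sum(X), this sketch returns whichever comes first in expand.grid() order, so the reported indices can differ from the gtools version even though sum(X) is the same.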

Related

Permute the position of a subset of a vector

I want to permute a subset of a vector.
For example, say I have a vector (x) and I select a random subset of the vector (e.g., 40% of its values).
What I want to do is output a new vector (x2) that is identical to (x) except the positions of the values within the random subset are randomly swapped.
For example:
x = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
random subset = 1, 4, 5, 8
x2 could be = 4, 2, 3, 8, 1, 6, 7, 5, 9, 10
Here's an example vector (x) and how I'd select the indices of a random subset of 40% of its values. Any help making (x2) would be appreciated!
x <- seq(1,10,1)
which(x%in%sample(x)[seq_len(length(x)*0.40)])
First draw a sample of proportion p from the indices, then shuffle the elements at those indices and assign them back.
f <- \(x, p=0.4) {
r <- sample(seq_along(x), length(x)*p)
x[r] <- sample(x[r])
`attr<-`(x, 'subs', r) ## add attribute w/ indices that were sampled
}
set.seed(42)
f(x)
# [1] 8 2 3 4 1 6 7 10 9 5
# attr(,"subs")
# [1] 1 5 10 8
Data:
x <- 1:10
There is surely faster code for what you are asking, but one solution would be:
x <- seq(1,10,1)
y <- which(x%in%sample(x)[seq_len(length(x)*0.40)]) # "y" holds the indices of the random subset
# required libraries
library(combinat)
permutation <- permn(y) # permn() generates a list of all permutations of the elements of y.
# https://www.geeksforgeeks.org/calculate-combinations-and-permutations-in-r/
permutation_sampled <- sample(permutation,1) # Sample one of the permutations.
x[y] <- x[permutation_sampled[[1]]] # Write the values found at the permuted indices back into positions y.
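One detail worth guarding against when shuffling a subset: if sample() is given a single number n >= 1, it samples from 1:n instead of returning that number. The sketch below avoids this by shuffling positions with sample.int(). The helper name permute_subset() is hypothetical and does not come from either answer.

```r
# Hypothetical helper: shuffle x only at the positions given in idx.
permute_subset <- function(x, idx) {
  if (length(idx) > 1) {
    # Shuffle positions, not values, so sample() never sees a single number
    x[idx] <- x[idx[sample.int(length(idx))]]
  }
  x
}

x <- 1:10
idx <- c(1, 4, 5, 8)
set.seed(1)
x2 <- permute_subset(x, idx)
identical(x2[-idx], x[-idx])   # TRUE: untouched positions keep their values
setequal(x2[idx], x[idx])      # TRUE: the subset holds the same values, possibly reordered
```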

Find index of element comparing with sorted vector

If I have a sorted vector, like
vec <- c(5, 6, 7, 8, 9, 10, 11, 12, 13, 14)
and I have
x <- 9.5
Then x is between the 5th and 6th value in my sorted row, and I want to get the index 5. How can I do it?
The following will give the result you're looking for:
x<-c(5, 6, 7, 8, 9, 10, 11, 12, 13, 14)
findInterval(9.5,x)
# [1] 5
Alternative solutions include:
max(which(x < 9.5))
# [1] 5
There are multiple ways to do this. One way uses which.max:
which.max(vec > x) - 1
#[1] 5
This finds the first index where vec is greater than x and then returns the index just before it.
Because vec is sorted, the opposite comparison works as well:
which.min(vec < x) - 1
#[1] 5
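The approaches above agree for x = 9.5 but differ at the edges. The following sketch compares them for a value below the range, an exact match, and a value above the range. It assumes vec is sorted in increasing order, as in the question.

```r
vec <- c(5, 6, 7, 8, 9, 10, 11, 12, 13, 14)

# findInterval() handles values outside the range and is vectorised
findInterval(c(4, 9, 9.5, 15), vec)
# [1]  0  5  5 10

# which.max() returns 1 when no element is TRUE, so this gives 0, not 10
which.max(vec > 15) - 1
# [1] 0

# A strict "<" excludes an exact match, unlike findInterval()
max(which(vec < 9))
# [1] 4
```

If exact matches or out-of-range values can occur, findInterval() is probably the safest of the three, since its behaviour in those cases is documented.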

How can you find the indexes of the n least values in a vector?

Is there a quick way to find the indexes of the n least values in a vector in R?
I know that to find the least value you can use which.min(c(1, 5, 6, 4)). Can this be extended?
From the comments, you can use either order
n <- 3
head(order(vect), n)
or sort
head(sort(vect, index.return = TRUE)$ix, n)
Data:
vect <- c(1, 5, 6, 4)
which.nmin <- function(x, n){
order(x)[seq_len(n)]
}
set.seed(123)
x <- rnorm(100)
which.nmin(x, 6)
# [1] 72 18 26 57 43 8
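As a quick sanity check on the small example vector, the indexes returned by order() should point at the n smallest values, in increasing order of value. This is only an illustrative sketch using the data from the first answer.

```r
vect <- c(1, 5, 6, 4)
n <- 3

idx <- head(order(vect), n)
idx
# [1] 1 4 2
vect[idx]
# [1] 1 4 5

# The selected values match the n smallest values of the vector
identical(vect[idx], head(sort(vect), n))
# [1] TRUE
```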

R-Randomly pick a number and do it over and over until a condition is achieved

I want to randomly pick a number from a vector with 8 elements whose sum is 35. If the number is 0, pick another number. If the number is greater than 0, decrease it by 1. Do this in a loop until the sum of the vector is 20. How can I do this in R?
For example: vec<-c(2,3,6,0,8,5,6,5)
Pick a number from this vector at random and decrease it by 1, repeating until the sum of the elements becomes 20.
I'm really not sure this is what you want, but here is my solution based on what I understand of your question. You'll find most of the concepts and key functions in my script. Use it, together with help(), to understand them and optimize it.
vec <- c(2, 3, 6, 0, 8, 5, 6, 5)
summ <- 0
new.vec <- NULL
iter <- 1
while(summ<20) {
  selected <- sample(vec,1) # draw one value from vec at random
  if(selected!=0) {
    new.vec[iter] <- selected-1 # store the decremented value
    iter <- iter+1 # advance only after a store, so new.vec never contains NA
  }
  summ <- sum(new.vec)
}
Try this:
vec <- c(2, 3, 6, 0, 8, 5, 6, 5)
#just setting the seed for reproducibility
set.seed(19)
tabulate(sample(rep(seq_along(vec),vec),20), nbins=length(vec)) # nbins keeps all 8 positions even if the last ones are never drawn
#[1] 0 2 4 0 4 5 3 2
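For comparison, the procedure can also be written as a literal loop that modifies the vector in place: pick a random position, skip it if its value is 0, otherwise subtract 1, and stop once the sum reaches the target. This is a minimal sketch assuming non-negative integer values. The function name decrement_until() is hypothetical.

```r
# Hypothetical helper: decrement random positive entries until sum(vec) == target.
decrement_until <- function(vec, target) {
  stopifnot(target >= 0, sum(vec) >= target)
  while (sum(vec) > target) {
    i <- sample.int(length(vec), 1)       # pick a random position
    if (vec[i] > 0) vec[i] <- vec[i] - 1  # zeros are skipped and another draw follows
  }
  vec
}

vec <- c(2, 3, 6, 0, 8, 5, 6, 5)
set.seed(19)
out <- decrement_until(vec, 20)
sum(out)   # 20
```

This picks positions uniformly, whereas the tabulate() one-liner picks units uniformly, so the two do not produce the same distribution of results. Which one matches the intended process depends on how "randomly pick a number" is read.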

Variable sample upper value in R

I have the following matrix
m <- matrix(c(2, 4, 3, 5, 1, 5, 7, 9, 3, 7), nrow=5, ncol=2)
colnames(m) = c("Y","Z")
m <-data.frame(m)
I am trying to create a random number in each row where the upper limit depends on a variable's value in that row (in this case, each row's value of Z).
I currently have:
samp<-function(x){
sample(0:1,1,replace = TRUE)}
m$randoms <- apply(m,1,samp)
which works well, applying the sample function independently to each row, but I always get an error when I try to alter the x argument of sample. I thought I could do something like this:
samp<-function(x){
sample(0:m$Z,1,replace = TRUE)}
m$randoms <- apply(m,1,samp)
but I guess that was wishful thinking.
Ultimately I want the result:
Y Z randoms
2 5 4
4 7 7
3 9 3
5 3 1
1 7 6
Any ideas?
The following will sample from 0 to x$Y for each row, and store the result in randoms:
x$randoms <- sapply(x$Y + 1, sample, 1) - 1
Explanation:
The sapply takes each value in x$Y separately (let's call this y), and calls sample(y + 1, 1) on it.
Note that (e.g.) sample(y+1, 1) will sample 1 random integer from the range 1:(y+1). Since you want a number from 0 to y rather than 1 to y + 1, we subtract 1 at the end.
Also, just pointing out - no need for replace=T here because you are only sampling one value anyway, so it doesn't matter whether it gets replaced or not.
Based on @mathematical.coffee's suggestion and my edited example, this is the slick final result:
m <- matrix(c(2, 4, 3, 5, 1, 5, 7, 9, 3, 7), nrow=5, ncol=2)
colnames(m) <- c("Y","Z")
m <- data.frame(m)
m$randoms <- sapply(m$Z + 1, sample, 1) - 1
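Two equivalent ways to draw one integer between 0 and Z per row are sketched below, assuming Z holds non-negative whole numbers. The column name randoms2 is made up for the comparison. The first variant is type-checked by vapply(), and the second avoids the per-row function call entirely.

```r
m <- data.frame(Y = c(2, 4, 3, 5, 1), Z = c(5, 7, 9, 3, 7))
set.seed(1)

# Row-wise draw: sample.int(z + 1, 1) is uniform on 1..(z + 1), so subtract 1
m$randoms <- vapply(m$Z, function(z) sample.int(z + 1, 1) - 1, numeric(1))

# Vectorised alternative: floor of a uniform draw on [0, Z + 1)
m$randoms2 <- floor(runif(nrow(m)) * (m$Z + 1))

m
```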
