Finding sum of number nearest to specific number - r

I have the following vectors of numbers in R:
bay_no <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20)
bay_cont <- c(45,25,25,0,19,61,2,134,5,27,0,54,102,97,5,6,65,47,85,0)
count <- 3
bay_to_serve <- sum(bay_cont)/count
In the bay_cont vector above, I want to find the running sum that is closest to bay_to_serve; in this case bay_to_serve = 268.
Starting from the first element, the sum from 45 through 2 is 177 and the sum from 45 through 134 is 311. Since 311 is closer to 268, it should return the index of that element, i.e. 8 from bay_no.
That gives the first group, bay_no = 1-8.
Then start again from the next element of bay_cont (5) and continue until the sum is closest to 268.
Desired output is
bay_no 1-8, 9-14, and then the remaining bay_nos
How can we do this in R?

I don't know if there is a smarter way to do this, but I'd think of nested loops.
Your inner loop may look like this (please note that I have no access to R right now, so I can't test it):
old_sum <- bay_cont[1]
for (i in 2:length(bay_cont)) {
  new_sum <- sum(bay_cont[1:i])
  # stop as soon as adding entry i moves the sum further from the target
  if (abs(bay_to_serve - new_sum) > abs(bay_to_serve - old_sum)) {
    output <- paste("bay_no", paste(1, i - 1, sep = "-"), sep = " ")
    break
  } else {
    old_sum <- new_sum
  }
}
This way, as soon as the sum of the first i entries is further from bay_to_serve than the previous sum, the loop breaks and creates an output string for entries 1 to i-1. Just add another loop around the first one, plus one or two more if statements, to run from j:length(bay_cont), where j is first set to 1 and is then set to i (the first entry of the next group) within the inner loop.

You can try:
res <- NULL
i = 1
while(i < length(bay_cont)){
tmp <- which.min(abs(cumsum(bay_cont[i:length(bay_cont)]) - bay_to_serve))
res <- append(res,tmp)
i = tmp + i
}
cumsum(res)
[1] 8 14 19
If you want to break ties in a specific way, you can use rank together with which.min as follows:
which.min(rank(abs(cumsum(bay_cont[i:length(bay_cont)]) - bay_to_serve), ties.method = "last"))
Then I would create a matrix instead of pasting it together:
cbind(c(1, cumsum(res)[-length(cumsum(res))]+1), cumsum(res))
[,1] [,2]
[1,] 1 8
[2,] 9 14
[3,] 15 19
Of course you can paste it together as well:
apply(cbind(c(1, cumsum(res)[-length(cumsum(res))]+1), cumsum(res)), 1, paste, collapse="-")
[1] "1-8" "9-14" "15-19"
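The question asks for the last group to hold all remaining bays (15-20 rather than 15-19). As a minimal sketch of the same cumsum/which.min idea, assuming the last group should simply take whatever is left, the loop could be wrapped in a function. The name split_near_target and its arguments are hypothetical, not part of the answer above.

```r
# Hypothetical helper: greedily split x into k consecutive groups whose
# sums are each as close as possible to sum(x) / k.
split_near_target <- function(x, k) {
  target <- sum(x) / k
  n <- length(x)
  ends <- integer(0)
  start <- 1
  # Find the end of each of the first k - 1 groups
  while (length(ends) < k - 1 && start <= n) {
    offset <- which.min(abs(cumsum(x[start:n]) - target))
    ends <- c(ends, start + offset - 1)
    start <- start + offset
  }
  # Assumption: the last group takes all remaining elements
  if (start <= n) ends <- c(ends, n)
  starts <- c(1, head(ends, -1) + 1)
  cbind(start = starts, end = ends)
}

bay_cont <- c(45,25,25,0,19,61,2,134,5,27,0,54,102,97,5,6,65,47,85,0)
groups <- split_near_target(bay_cont, 3)
groups
```

Each row of groups holds the first and last bay_no of one group.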

My solution uses a dirty for loop, but it yields the required indices.
I hope that works for you.
bay_no <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20)
bay_cont <- c(45,25,25,0,19,61,2,134,5,27,0,54,102,97,5,6,65,47,85,0)
count <- 3
bay_to_serve <- sum(bay_cont)/count
temp_sum=0
for (i in 1:(length(bay_cont)-1)) {
temp_sum=temp_sum+bay_cont[i]
if ( abs(bay_to_serve-temp_sum)<abs(bay_to_serve-(temp_sum +bay_cont[i+1]))) {
print(i)
temp_sum=0
}
}

I have probably misunderstood the question, but it seems easier to do this:
bay_no[ which.min(abs(cumsum(bay_cont) - bay_to_serve)) ]
To start at 5, omit elements 1:4 and add 4 to the which.min index
bay_no[ which.min(abs(cumsum(bay_cont[-(1:4)]) - bay_to_serve))+4 ]

Related

Randomly select values from a given number list to add to a certain value in r

If I have a set of values such as
c(1,2,5,6,7,15,19,20)
and I want to randomly select 2 values where the sum equals 20. From the above list possible samples that I would like to see would be
[19,1], [15,5]
How do I do this in R. Any help would be greatly appreciated.
This computes all possible combinations of your input vector, so if this is very long, this might be a problem.
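getVal <- function(vec, val) {
  comb <- combn(vec, 2)          # all pairs, one per column
  idx <- colSums(comb) == val    # pairs that sum to the target
  if (sum(idx)) {
    # drop = FALSE keeps a matrix even when only one pair matches
    return(comb[, idx, drop = FALSE][, sample(sum(idx), 1)])
  }
  return(FALSE)
}
vec = (c(1,4,6,9))
val = 10
getVal(vec,val)
>>[1] 1 9
val = 11
getVal(vec,val)
>>[1] FALSE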
For a small vector of values you can do an exhaustive search by working out all the combinations of pairs in the values. Example:
> values = c(1,2,5,6,7,15,19,20)
> pairs = matrix(values[t(combn(length(values),2))],ncol=2)
That is a 2-column matrix of all pairs from values. Now sum the rows and look for the target value of 20:
> targets = apply(pairs,1,sum)==20
> pairs[targets,]
[,1] [,2]
[1,] 1 19
[2,] 5 15
The size of pairs increases such that if you have 100 values then pairs will have nearly 5000 rows.
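The exhaustive search above lists every matching pair, while the question asks for one pair chosen at random. A minimal sketch of that last step, assuming a target of 20 and using the hypothetical names hits and picked:

```r
values <- c(1, 2, 5, 6, 7, 15, 19, 20)

# All pairs of values, one pair per row
pairs <- t(combn(values, 2))

# Keep the pairs that sum to the target; drop = FALSE keeps a matrix
# even when only one pair matches
hits <- pairs[rowSums(pairs) == 20, , drop = FALSE]

# Pick one matching pair at random, or NULL if there is none
picked <- if (nrow(hits) > 0) hits[sample(nrow(hits), 1), ] else NULL
picked
```

Sampling a row index with sample(nrow(hits), 1) rather than sampling the values themselves keeps the two members of a pair together.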
You can do this with the sample() function and a while loop. It isn't the prettiest solution, but it is certainly simple to implement.
First you sample two values from the vector and store them in an object, like:
values <- c(1, 2, 5, 6, 7, 15, 19, 20)
randomTwo <- sample(values, 2)
Then you start your while loop. This loop checks whether the sum of the two sampled values modulo 10 equals 0 (I assumed you meant modulo from the examples in your question, see https://en.wikipedia.org/wiki/Modulo_operation to see what it does). If the result is not 0, the loop samples two new values, and it repeats until the result is 0, at which point you have your two values.
Here's what it looks like:
while (sum(randomTwo) %% 10 != 0) {
randomTwo <- sample(values, 2)
}
Now this might take more iterations than checking all combinations, or it might take fewer, depending on chance. If you only have this small vector, then it's a nice solution. Good luck!
Here is a way that doesn't need to compute an immense matrix (much faster):
findpairs=function(a,sum,num){
list=list()
aux=1
for (i in 1:length(a)){
n=FALSE
n=which((a+a[i])==sum)
if (length(n)){
for (j in n){
if (j!=i){
list[[aux]]=c(a[i],a[j])
aux=aux+1
}
}
}
}
return(sample(list[1:(length(list)/2)],num))
}
a=c(1,2,5,6,19,7,15,20)
a=a[order(a)]
sum=20
findpairs(a,sum,2)
[[1]]
[1] 5 15
[[2]]
[1] 1 19
The issue was that it gave repeated pairs.
Edit:
Solved. Just take the first half of the list, as the other half contains the same pairs in reverse order.
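Another way to avoid the repeated pairs, sketched here under the assumption that the values in a are distinct, is to look up each value's partner directly and keep a pair only when the smaller value comes first. The names target, partner, keep and picked are illustrative only.

```r
a <- c(1, 2, 5, 6, 19, 7, 15, 20)
target <- 20

# The value each element would need as a partner to reach the target
partner <- target - a

# Keep a pair only if the partner exists in a and the smaller value is first,
# so each pair appears once and no element is paired with itself
keep <- a < partner & partner %in% a
pairs <- cbind(a[keep], partner[keep])

# Assumption: at least one pair exists; pick one at random
stopifnot(nrow(pairs) > 0)
picked <- pairs[sample(nrow(pairs), 1), ]
pairs
```

This needs no sorting and no halving of the result, because the a < partner condition already removes the mirrored pairs.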

Monte Carlo Simulation with Replacement Based On Sum of A Column

I am trying to simulate an unlikely situation in a videogame using a Monte Carlo simulation. I'm extremely new at coding and thought this would be a fun situation to simulate.
There are 3 targets and they are being attacked 8 times independently. My problem comes with how to deal with the fact that one of the columns cannot be attacked more than 6 times, when there are 8 attacks.
I would like to take any attack aimed at column 2 select one of the other 2 columns at random to attack instead, but only if column 2 has been attacked 6 times already.
Here is my attempt to simulate with 5000 repeats, for example.
#determine number of repeats
trial <- 5000
#create matrix with a row for each trial
m <- matrix(0, nrow = trial, ncol = 3)
#The first for loop is for each row
#The second for loop runs each attack independently, sampling 1:3 at random, then adding one to that position of the row.
#The function that is called by ifelse() when m[trial, 2] > 6 = TRUE is the issue.
for (trial in 1:trial){
for (attack in 1:8) {
target <- sample(1:3, 1)
m[trial, target] <- m[trial, target] + 1
ifelse(m[trial, 2] > 6, #determines if the value of column 2 is greater than 6 after each attack
function(m){
m[trial, 2] <- m[trial, 2] - 1 #subtract the value from the second column to return it to 6
newtarget <- sample(c(1,3), 1) #select either column 1 or 3 as a new target at random
m[trial, newtarget] <- m[trial, newtarget] + 1 #add 1 to indicate the new target has been selected
m}, #return the matrix after modification
m) #do nothing if the value of the second column is <= 6
}
}
For example, if I have the matrix below:
> matrix(c(2,1,5,7,1,0), nrow = 2, ncol = 3)
[,1] [,2] [,3]
[1,] 2 5 1
[2,] 1 7 0
I would like the function to look at the 2nd line of the matrix, subtract 1 from 7, and then add 1 to either column 1 or 3 to create c(2,6,0) or c(1,6,1). I would like to learn how to do this within the loop, but it could be done afterwards as well.
I think I am making a serious, fundamental error in how I use function(x) or ifelse.
Thank you.
Here's an improved version of your code:
set.seed(1)
trial <- 5000
#create matrix with a row for each trial
m <- matrix(0, nrow = trial, ncol = 3)
#The first for loop is for each row
#The second for loop runs each attack independently, sampling 1:3 at random, then adding one to that position of the row.
#An if block replaces the ifelse() call that caused the issue.
for (i in 1:trial){
for (attack in 1:8) {
target <- sample(1:3, 1)
m[i, target] <- m[i, target] + 1
#determines if the value of column 2 is greater than 6 after each attack
if(m[i, 2] > 6){
#subtract the value from the second column to return it to 6
m[i, 2] <- m[i, 2] - 1
#select either column 1 or 3 as a new target at random
newtarget <- sample(c(1,3), 1)
#add 1 to indicate the new target has been selected
m[i, newtarget] <- m[i, newtarget] + 1
}
}
}
# Notice the largest value in column 2 is no greater than 6.
apply(m, 2, max)
set.seed is used to make the results reproducible (usually just used for testing). The ifelse function has a different purpose than the normal if-else control flow. Here's an example:
x = runif(100)
ifelse(x < 0.5, 0, x)
You'll notice any element in x that is less than 0.5 is now zero. I changed your code to have an if block. Notice that m[i, 2] > 6 returns a single TRUE or FALSE, whereas in the small example above, x < 0.5 returns a vector of logicals. So ifelse can take a vector of logicals, but the if block requires a single logical.
You were on the right track with using function, but it just isn't necessary in this case. Often, but not always, you'll define a function like this:
f = function(x)
x^2
But just returning the value doesn't mean what you want is changed:
x = 5
f(5) # 25
x # still 5
For more on this, look up function scope in R.
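To connect this back to the question, here is a small sketch of why the anonymous function inside ifelse() could not change m: a function works on its own copy of its argument, so the result has to be assigned back. The names add_one and counts are hypothetical.

```r
# Hypothetical helper: returns a modified copy of its argument
add_one <- function(v) {
  v[2] <- v[2] + 1   # changes only the local copy of v
  v
}

counts <- c(1, 6, 0)
add_one(counts)            # returns the modified copy
counts                     # unchanged: the original is untouched

counts <- add_one(counts)  # assign the result to keep the change
counts
```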
Lastly, I changed the loop to be i in 1:trial instead of trial in 1:trial. You probably wouldn't notice any issues in your case, but it is better practice to use a separate variable than that which makes up the range of the loop.
Hope this helps.
P.S. R isn't really known for its speed when looping. If you want to make things go faster, you'll typically need to vectorize your code.
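As a rough sketch of what a vectorized version could look like: the final counts do not depend on the order of the attacks, so one option is to draw all attacks at once, count them per target, and then hand each attack beyond the sixth on column 2 to column 1 or 3 with equal probability. The names attacks, excess and to_first are assumptions for illustration, not part of the code above.

```r
set.seed(1)
trials <- 5000

# One row per trial, one column per attack
attacks <- matrix(sample(1:3, trials * 8, replace = TRUE), nrow = trials)

# Count how often each target was chosen in each trial
m <- cbind(rowSums(attacks == 1), rowSums(attacks == 2), rowSums(attacks == 3))

# Attacks on column 2 beyond the cap of 6
excess <- pmax(m[, 2] - 6, 0)

# Each excess attack goes to column 1 with probability 0.5, otherwise column 3
to_first <- rbinom(trials, size = excess, prob = 0.5)
m[, 1] <- m[, 1] + to_first
m[, 3] <- m[, 3] + (excess - to_first)
m[, 2] <- m[, 2] - excess

apply(m, 2, max)
```

Every row still sums to 8 attacks, and column 2 never exceeds 6.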

adding values to the vector inside for loop in R

I have just started learning R and I wrote this code to learn on functions and loops.
squared<-function(x){
m<-c()
for(i in 1:x){
y<-i*i
c(m,y)
}
return (m)
}
squared(5)
NULL
Why does this return NULL? I want the i*i values to be appended to the end of m and returned as a vector. Can someone please point out what's wrong with this code?
You haven't put anything inside m <- c() in your loop since you did not use an assignment. You are getting the following -
m <- c()
m
# NULL
You can change the function to return the desired values by assigning m in the loop.
squared <- function(x) {
m <- c()
for(i in 1:x) {
y <- i * i
m <- c(m, y)
}
return(m)
}
squared(5)
# [1] 1 4 9 16 25
But this is inefficient because we know the length of the resulting vector will be 5 (or x). So we want to allocate the memory first before looping. This will be the better way to use the for() loop.
squared <- function(x) {
m <- vector("integer", x)
for(i in seq_len(x)) {
m[i] <- i * i
}
m
}
squared(5)
# [1] 1 4 9 16 25
Also notice that I have removed return() from the second function. It is not necessary there, so it can be removed. It's a matter of personal preference to leave it in this situation. Sometimes it will be necessary, like in if() statements for example.
I know the question is about looping, but I also must mention that this can be done more efficiently with seven characters using the primitive ^, like this
(1:5)^2
# [1] 1 4 9 16 25
^ is a primitive function, which means the code is written entirely in C and will be the most efficient of these three methods
`^`
# function (e1, e2) .Primitive("^")
Here's a general approach:
# Create empty vector
vec <- c()
for(i in 1:10){
# Inside the loop, make one or more elements to add to vector
new_elements <- i * 3
# Use 'c' to combine the existing vector with the new_elements
vec <- c(vec, new_elements)
}
vec
# [1] 3 6 9 12 15 18 21 24 27 30
If you happen to run out of memory (e.g. if your loop has a lot of iterations or vectors are large), you can try vector preallocation which will be more efficient. That's not usually necessary unless your vectors are particularly large though.
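For comparison, here is a minimal sketch of the preallocated version of the same loop, next to vapply(), which builds the result without an explicit loop. The names n, vec and vec2 are only for illustration.

```r
n <- 10

# Preallocate the full-length result, then fill it in place
vec <- numeric(n)
for (i in seq_len(n)) {
  vec[i] <- i * 3
}

# vapply() builds the same vector; numeric(1) declares that each
# call returns a single number
vec2 <- vapply(seq_len(n), function(i) i * 3, numeric(1))

vec
vec2
```

Using seq_len(n) instead of 1:n also keeps the loop from running when n is 0.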

Putting generated data in a matrix format

I have a question about putting simulated data into a matrix. I cannot write the program correctly in R and constantly receive an error. I guess my "rep" definition and final "Matrix" expression are somehow wrong, but I do not know how to fix them. My specific question is:
I would like to produce a matrix that contains generated values. I have 20000 generated values for x and y. As the output, I would like a (2000 by 10) matrix in which each column contains the output of the following for loop.
My R.code:
x=rnorm(2e4,5,6)
vofdiv=quantile(x,probs=seq(0,1,0.1))
y=rnorm(2e4,4,6)
Matrix=rep(NULL,2000)
for(i in 1:10)
{
Matrix[i]=y[(x>=vofdiv[i] & x<vofdiv[i+1])] #The i(th) col of matrix
}
Matrix # A 2000*10 Matrix, as the final output
I highly appreciate that someone helps me!
You have several problems here.
First of all, the correct way to define an empty matrix of size 2000*10 would be
Matrix <- matrix(NA, 2e3, 10)
Although you could potentially create the matrix your way (with rep) and then set dim, something like
Matrix <- rep(NA, 2e4)
dim(Matrix) <- c(2e3, 10)
The second problem is that when inserting into a column of a matrix, you need to index it correctly, i.e.,
Matrix[, i] <-
instead of
Matrix[i] <-
The latter will index Matrix as if it were a vector (which it basically is). In other words, it will treat the 2000*10 matrix as a single vector of length 20000 and index that.
The third problem is that when your loop reaches i = 10 and you evaluate x<vofdiv[i+1], you always exclude the largest value, x == vofdiv[11], so the last column always gets fewer than 2000 values:
for(i in 1:10)
{
print(length(y[ (x >= vofdiv[i] & x < vofdiv[i+1])]))
}
# [1] 2000
# [1] 2000
# [1] 2000
# [1] 2000
# [1] 2000
# [1] 2000
# [1] 2000
# [1] 2000
# [1] 2000
# [1] 1999 <----
Thus, you will get an error if you try to fill a column of length 2000 with a vector of length 1999, because all columns of a matrix in R must have the same length.
The workaround would be to add = to your last comparison, such as
Matrix <- matrix(NA, 2e3, 10)
for(i in 1:10)
{
Matrix[, i] <- y[x >= vofdiv[i] & x <= vofdiv[i + 1]]
}
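As an alternative sketch, the same 2000 by 10 matrix can be built without a loop by assigning each x to its decile with cut() and splitting y by that label. This assumes, as with continuous simulated data, that no x value falls exactly on an inner quantile; the name bin and the seed are illustrative only.

```r
set.seed(42)
x <- rnorm(2e4, 5, 6)
y <- rnorm(2e4, 4, 6)
vofdiv <- quantile(x, probs = seq(0, 1, 0.1))

# Label each x with its decile (1 to 10); include.lowest = TRUE keeps the
# minimum of x in the first bin
bin <- cut(x, breaks = vofdiv, include.lowest = TRUE, labels = FALSE)

# Split y by decile and bind the ten groups together as columns
Matrix <- do.call(cbind, split(y, bin))
dim(Matrix)
```

Column i of Matrix then holds the y values whose x lies in the i-th decile, in their original order.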

fill up a matrix one random cell at a time

I am filling a 10x10 matrix (mat) randomly until sum(mat) == 100.
I wrote the following (i = 2 for another reason not specified here, but I kept it at 2 to be consistent with my actual code):
mat <- matrix(rep(0, 100), nrow = 10)
mat[1,] <- c(0,0,0,0,0,0,0,0,0,1)
mat[2,] <- c(0,0,0,0,0,0,0,0,1,0)
mat[3,] <- c(0,0,0,0,0,0,0,1,0,0)
mat[4,] <- c(0,0,0,0,0,0,1,0,0,0)
mat[5,] <- c(0,0,0,0,0,1,0,0,0,0)
mat[6,] <- c(0,0,0,0,1,0,0,0,0,0)
mat[7,] <- c(0,0,0,1,0,0,0,0,0,0)
mat[8,] <- c(0,0,1,0,0,0,0,0,0,0)
mat[9,] <- c(0,1,0,0,0,0,0,0,0,0)
mat[10,] <- c(1,0,0,0,0,0,0,0,0,0)
i <- 2
set.seed(129)
while( sum(mat) < 100 ) {
# pick random cell
rnum <- sample( which(mat < 1), 1 )
mat[rnum] <- 1
##
print(paste0("i =", i))
print(paste0("rnum =", rnum))
print(sum(mat))
i = i + 1
}
For some reason, when sum(mat) == 99 there are several extra steps. I would assume that once i = 91 the while loop would stop, but it continues past this. Can someone explain what I have done wrong?
If I change the while condition to
while( sum(mat) < 100 & length(which(mat < 1)) > 0 )
the issue remains..
Your problem is equivalent to randomly ordering the indices of a matrix that are equal to 0. You can do this in one line with sample(which(mat < 1)). I suppose if you wanted to get exactly the same sort of output, you might try something like:
set.seed(144)
idx <- sample(which(mat < 1))
for (i in seq_along(idx)) {
print(paste0("i =", i))
print(paste0("rnum =", idx[i]))
print(sum(mat)+i)
}
# [1] "i =1"
# [1] "rnum =5"
# [1] 11
# [1] "i =2"
# [1] "rnum =70"
# [1] 12
# ...
See ?sample
Arguments:
x: Either a vector of one or more elements from which to choose,
or a positive integer. See ‘Details.’
...
If ‘x’ has length 1, is numeric (in the sense of ‘is.numeric’) and
‘x >= 1’, sampling _via_ ‘sample’ takes place from ‘1:x’. _Note_
that this convenience feature may lead to undesired behaviour when
‘x’ is of varying length in calls such as ‘sample(x)’. See the
examples.
In other words, if x in sample(x) is of length 1, sample returns a random number from 1:x. This happens towards the end of your loop, where there is just one 0 left in your matrix and one index is returned by which(mat < 1).
The iteration repeats at level 99 because sample() behaves very differently when the first parameter is a vector of length 1 than when it is longer. When it has length 1, it assumes you want a random number from 1 to that number. When it has length >1, you get a random element of that vector.
Compare
sample(c(99,100),1)
and
sample(c(100),1)
Of course, this is an inefficient way of filling your matrix. As @josilber pointed out, a single call to sample could do everything you need.
The issue comes from how sample and which do the sampling when you have only a single '0' value left.
For example, do this:
mat <- matrix(rep(1, 100), nrow = 10)
Now you have a matrix of all 1's. Now let's make two numbers 0:
mat[15]<-0
mat[18]<-0
and then sample
sample(which(mat<1))
[1] 18 15
By adding a size = 1 argument, you get one or the other.
Now let's try this:
mat[18]<-1
sample(which(mat<1))
[1] 3 13 8 2 4 14 11 9 10 5 15 7 1 12 6
Oops, you did not get [1] 15. Instead, only a single integer (15 in this case) is passed to sample. When you do sample(x) and x is a single integer, it gives you the integers 1:x in random order.
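One way around this, sketched below, is the resample() helper shown in the examples of ?sample, which always samples from the elements of x, even when x has length 1. The loop is a simplified version of the one in the question; the steps counter is a hypothetical name.

```r
# Safe sampling helper, as given in the examples of ?sample
resample <- function(x, ...) x[sample.int(length(x), ...)]

# 10x10 matrix with ones on the anti-diagonal, as in the question
mat <- matrix(0, nrow = 10, ncol = 10)
mat[cbind(1:10, 10:1)] <- 1

steps <- 0
while (sum(mat) < 100) {
  # Always picks one of the remaining empty cells, even if only one is left
  rnum <- resample(which(mat < 1), 1)
  mat[rnum] <- 1
  steps <- steps + 1
}
steps
```

Because each iteration now fills exactly one empty cell, the loop stops after 90 steps, one per cell that started at 0.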

Resources