Recently, I learned how to write a loop that initializes some number, and then randomly generates numbers until the initial number is guessed (while recording the number of guesses it took) such that no number will be guessed twice:
# https://stackoverflow.com/questions/73216517/making-sure-a-number-isnt-guessed-twice
all_games <- vector("list", 100)
for (i in 1:100){
guess_i = 0
correct_i = sample(1:100, 1)
guess_sets <- 1:100 ## initialize a set
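trial_index <- 0 ## number of guesses made so far
while(guess_i != correct_i){
guess_i = guess_sets[sample.int(length(guess_sets), 1)] ## draw by index: sample(x, 1) would draw from 1:x once a single number is left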
guess_sets <- setdiff(guess_sets, guess_i) ## remove it from the set
trial_index <- trial_index + 1
}
## no need to store `i` and `guess_i` (as same as `correct_i`), right?
game_results_i <- data.frame(i, trial_index, guess_i, correct_i)
all_games[[i]] <- game_results_i
}
all_games <- do.call("rbind", all_games)
I am now trying to modify the above code to create the following two loops:
(Deterministic) Loop 1 always guesses the midpoint of the current range (rounded up) and is told whether the guess is smaller or bigger than the correct number. It then takes the midpoint again (e.g. between its guess and the floor/ceiling) until it reaches the correct number.
(Semi-Deterministic) Loop 2 first makes a random guess and is told whether the guess is bigger or smaller than the correct number. It then halves the difference and makes its next guess at random within the smaller range, repeating this process until it reaches the correct number.
I tried to write a sketch of the code:
#Loop 2:
correct = sample(1:100, 1)
guess_1 = sample(1:100, 1)
guess_2 = ifelse(guess_1 > correct, sample(50:guess_1, 1), sample(guess_1:100, 1))
guess_3 = ifelse(guess_2 > correct, sample(50:guess_2, 1), sample(guess_2:100, 1))
guess_4 = ifelse(guess_3 > correct, sample(50:guess_3, 1), sample(guess_3:100, 1))
#etc
But I am not sure if I am doing this correctly.
Can someone please help me with this?
Thank you!
Example : Suppose I pick the number 68
Loop 1: first random guess = 51, (100-51)/2 + 51 = 75, (75-50)/2 + 50 = 63, (75 - 63)/2 + 63 = 69, (69 - 63)/2 + 63 = 66, etc.
Loop 2: first random guess = 53, rand_between(53,100) = 71, rand_between(51,71) = 65, rand(65,71) = 70, etc.
I don't think you need a for loop for this; you can create the structures from the beginning with sample, sapply and which:
## correct values can repeat, so we set replace to TRUE
corrects <- sample(1:100, 100, replace = TRUE)
## replace is by default FALSE in sample(), if you don't want repeated guesses
## sapply() creates a matrix
guesses <- sapply(1:100, function(x) sample(1:100, 100))
## constructing game_results_i equal to yours, but could be simplified
game_results_i <- data.frame(
i = 1:100,
trial_index = sapply(
1:100,
function(x) which(
## which() returns every index where the predicate is TRUE; each column is a permutation of 1:100, so exactly one index matches
guesses[, x] == corrects[x]
)
),
guess_i = corrects,
correct_i = corrects # guess_i and correct_i are obviously equal
)
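As a small check of the idea above, here is a sketch on a reduced problem (5 games over the numbers 1 to 10). The names n_games, n_max and game_results are my own choices for illustration, not from the question. Since each column of guesses is a permutation, match() finds the single position of the correct number:

```r
set.seed(123)
n_games <- 5   # hypothetical, reduced from 100 for readability
n_max <- 10    # hypothetical, reduced from 100 for readability

corrects <- sample(n_max, n_games, replace = TRUE)
# one column per game; each column is a random permutation of 1:n_max
guesses <- sapply(seq_len(n_games), function(g) sample(n_max))

# position of the correct number within each game's guessing order
trial_index <- vapply(
  seq_len(n_games),
  function(g) match(corrects[g], guesses[, g]),
  integer(1)
)

game_results <- data.frame(game = seq_len(n_games), trial_index, correct = corrects)
print(game_results)
```

vapply() is used instead of sapply() for the index so that the result is guaranteed to be an integer vector with one entry per game.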
Ok, let's see if now I match question and answer properly :)
If I understood your intentions correctly, both loops set increasingly tighter lower and upper bounds, so each guess reduces the search space. However, this interpretation does not always match your description, so please double-check that it is acceptable for your purposes.
I wrote two functions, guess_bisect for the deterministic loop_1 and guess_sample for loop_2:
guess_bisect <- function(correct, n = 100) {
lb <- 0
ub <- n + 1
trial_index <- 1
guess <- round((ub - lb) / 2) + lb
while (guess != correct) {
# cat(lb, ub, guess, "\n") # uncomment to print the guess iteration
if (guess < correct)
lb <- guess
else
ub <- guess
guess <- round((ub - lb) / 2) + lb
trial_index <- trial_index + 1
}
trial_index
}
guess_sample <- function(correct, n = 100) {
lb <- 0
ub <- n + 1
trial_index <- 1
guess <- lb + sample.int(ub - lb - 1, 1) # uniform on (lb + 1):(ub - 1); sample(x, 1) would draw from 1:x when only one candidate is left
while (guess != correct) {
# cat(lb, ub, guess, "\n") # uncomment to print the guess iteration
if (guess < correct)
lb <- guess
else
ub <- guess
guess <- lb + sample.int(ub - lb - 1, 1)
trial_index <- trial_index + 1
}
trial_index
}
Obviously, guess_bisect always produces the same result for the same input, whereas guess_sample varies randomly.
Plotting the results in a simple chart suggests that deterministic bisection is much better on average, since random sampling may happen to pick a guess near the wrong end of the range and narrow it only slightly. The x-axis is the correct number, spanning 1 to 100, and the y-axis is the trial index; guess_bisect gives the red curve, and many runs of guess_sample give the blue curves.
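To put rough numbers on that comparison without a chart, here is a minimal sketch. It assumes a hypothetical helper play_game() that takes the guessing rule as a function; pick_mid and pick_random are also names of my own. Note that pick_mid uses integer division rather than round(), so individual guesses can differ slightly from guess_bisect:

```r
# Hypothetical helper: play one game with a pluggable guessing rule.
# pick(lb, ub) must return an integer strictly between lb and ub.
play_game <- function(correct, pick, n = 100) {
  lb <- 0
  ub <- n + 1
  trials <- 0
  repeat {
    guess <- pick(lb, ub)
    trials <- trials + 1
    if (guess == correct) return(trials)
    if (guess < correct) lb <- guess else ub <- guess
  }
}

# deterministic rule: midpoint of the open interval (lb, ub)
pick_mid <- function(lb, ub) lb + (ub - lb) %/% 2
# random rule: uniform on (lb + 1):(ub - 1), safe when one candidate is left
pick_random <- function(lb, ub) lb + sample.int(ub - lb - 1, 1)

set.seed(1)
bisect_trials <- vapply(1:100, play_game, numeric(1), pick = pick_mid)
random_trials <- vapply(
  1:100,
  function(k) mean(replicate(200, play_game(k, pick_random))),
  numeric(1)
)

cat("mean trials, bisection:", mean(bisect_trials), "\n")
cat("mean trials, random   :", mean(random_trials), "\n")
```

Averaging 200 random games per target is an arbitrary choice; more repetitions give a smoother estimate at the cost of run time.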
I have the following code for a random walk, in which I start from i and add up cumulatively for each line.
However, I need to limit the random walk on each line. One way I thought of doing this is to find, for each line, the index j at which the value is less than or equal to 0 or greater than or equal to t, and replace everything from that index onwards with null.
simulate_binomial = function(cenarios, rodadas, p){
return(matrix(data=rbinom(cenarios*rodadas, 1, p), nrow=cenarios, ncol=rodadas))
}
i = 2
t = 10
p = 0.8
max_walk = 100
samples = simulate_binomial(1000, max_walk, p)
samples[samples==0] = -1
walk = t(apply(cbind(i, samples), 1, cumsum))
walk1 = apply(walk, 1, function(x) (which((x <= 0) | (x >= t))[1]))
So my walk1 would be the indices of each line that would have a value less than or equal to zero or greater than or equal to t. However, I don't know how to assign null for this index onwards in the line.
My intention is to assign null so that I can plot precisely without this null part and see the effect of the ruin on each line / "scenario".
Can anyone help me, please?
You can change your last apply to :
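walk1 <- t(apply(walk, 1, function(x) {
# index of the first step at or beyond a boundary (NA if no boundary is reached)
inds <- which((x <= 0) | (x >= t))[1]
# blank out everything after that step; leave the row alone if there is nothing to blank
if (!is.na(inds) && inds < length(x)) x[(inds+1):length(x)] <- NA
x
}))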
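An alternative that avoids the row-wise assignment is to compute the first hitting index per row and mask the matrix in one step with col(). This is only a sketch on a small toy walk; the names start, upper, first_hit and masked are mine (using upper instead of t also avoids shadowing the transpose function):

```r
set.seed(1)
start <- 2    # starting value (i in the question)
upper <- 10   # upper barrier (t in the question)

# 5 scenarios of 20 steps each, +1 with probability 0.8, -1 otherwise
steps <- matrix(sample(c(-1, 1), 5 * 20, replace = TRUE, prob = c(0.2, 0.8)), nrow = 5)
walk <- t(apply(cbind(start, steps), 1, cumsum))

# first column at which each row touches a barrier (NA if it never does)
first_hit <- apply(walk, 1, function(x) which(x <= 0 | x >= upper)[1])
# rows that never touch a barrier keep all their values
first_hit[is.na(first_hit)] <- ncol(walk)

# first_hit has one entry per row, so it recycles down the columns and
# lines up with the row of every cell in the comparison below
masked <- walk
masked[col(masked) > first_hit] <- NA
```

The masked matrix can then be passed to something like matplot(t(masked), type = "l"), which skips NA values when drawing the lines.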
Hi everyone!
How can I generate a vector that satisfies some conditions?
Problem: generate a vector a with length(a) = 400000 that is made up of 8 elements: 0, 5, 10, 50, 500, 5000, 50000, 300000. Each element appears a set number of times, namely 290205, 100000, 8000, 1600, 160, 32, 2, 1, respectively. Further, a is blocked into 4,000 "groups" of 100 consecutive elements; call them a_k, k=1,...,4000. These groups must satisfy the following:
The sum of every group exceeds 150, i.e. sum_i a_k_i>150 for all k.
The elements 5, 10 and 50 appear between 25 and 29 times in each group, i.e. for all k, the set {i|a_k_i in (5,10,50)} has magnitude between 25 and 29.
0 never appears more than 8 times in a row in any group.
I have tried this many times, but it does not seem to work:
My current code is as follows:
T <- 4*10^(5) # data size
x <- c(0, 5, 10, 50, 500, 5000, 50000, 300000) #seed vector
t <- c(290205, 100000, 8000, 1600, 160, 32, 2, 1) #frequency
A <- matrix(0, 4000, 100) #4000 groups
k <- rep(0, times = 8) #record the number of seeds
for(m in 1:4000) {
p <- (t - k)/(T - 100*(m - 1)) #seed probability
A[, m] <- sample(x, 100, replace = TRUE, prob = p) #group m
sm <- 0
i <- 0
for(j in 1:92) {
if(sum(A[m,j:j + 8])==0){
if(A[m,j] > 0 & A[m,j] < 500) {i <- i+1}
sm <- sm+A[100*m+j]
}
else j <- 0
}
if (sm >= 150 & i > 24 & i < 30 & j != 0) {
m <- m + 1
for (n in seq_len(x)) {
k[n] <- sum(A[, m+1] == x[n]) + k[n]
}
}
}
How about just doing it by construction? For example:
amat<-matrix(rep(c(rep(rep(c(0,5),c(8,3)),8),
rep(c(0,NA),c(8,4))),4000),nrow=100)
amat[97:100,1:2205]<-c(rep(10,3),0)
amat[97:98,2206:4000]<-c(5,5)
amat[99:100,2206:2897]<-c(10,10)
amat[99:100,2898]<-c(5,50)
amat[99:100,2899:3307]<-c(5,50)
amat[99:100,3308:3902]<-c(50,50)
amat[which(is.na(amat))]<-rep(c(10,500,5000,5e4,3e5),c(1,160,32,2,1))
a<-c(amat)
This satisfies all your conditions:
Element counts:
>sapply(c(0,5,10,50,500,5000,50000,300000),function(x)length(which(a==x)))
[1] 290205 100000 8000 1600 160 32 2 1
Group sums:
> table(colSums(amat)>=150)
TRUE
4000
5,10,50 frequency:
> table(sapply(1:4000,function(x)abs(sum(amat[,x] %in% c(5,10,50))-27)<=2))
TRUE
4000
Runs of 0:
> table(sapply(1:4000,function(x)max(rle(amat[,x])$lengths[rle(amat[,x])$values==0])<=8))
#If this is slow, we can just use max(rle(amat[,x])$lengths)<=8
# because there aren't many valid groups with strings of 9+
# non-0 elements
TRUE
4000
If in fact we're never allowed to have strings of 9 0s anywhere in a, even across group boundaries, we'll need to make a slight adjustment to groups 2:2206, because, e.g., a[100:108]==0.
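To see the difference between checking runs within each group and checking them across the whole vector, here is a minimal sketch on a toy matrix. max_zero_run is a hypothetical helper name, and the toy data is mine, not taken from amat:

```r
# Longest run of zeros in a vector (0 if it contains no zeros)
max_zero_run <- function(v) {
  runs <- rle(v == 0)
  zero_lengths <- runs$lengths[runs$values]
  if (length(zero_lengths) == 0) 0L else max(zero_lengths)
}

# Toy example: two groups (columns) of 6 elements each. The first group
# ends in 4 zeros and the second starts with 5 zeros.
toy <- matrix(c(5, 5, 0, 0, 0, 0,
                0, 0, 0, 0, 0, 5), nrow = 6)

within_groups <- apply(toy, 2, max_zero_run)  # longest run inside each group
across_groups <- max_zero_run(c(toy))         # longest run in the flattened vector

print(within_groups)
print(across_groups)
```

Applying max_zero_run(a) to the full vector would report whether any run of zeros spans a group boundary, which the per-column check cannot see.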
I can start it off and maybe someone can help get to the next step. My approach is to start with the constraints and let sample work out the numbers.
set.seed(77)
choose <- c(0,5,10,50,500,5000,50000,300000)
freqs <- c(290205,100000,8000,1600,160,32,2,1)
probs <- freqs/sum(freqs)
check.sum <- function(vec) sum(vec) >= 150
check.interval <- function(vec) abs(sum(vec %in% c(5,10,50))-27)<=2
check.runs <- function(vec, runmax=8) max(rle(vec)$lengths[rle(vec)$values==0]) <= runmax
check.all <- function(vector) {
logicals <- c(check.sum(vector), # group sum is at least 150
check.interval(vector), # 25 to 29 elements from {5, 10, 50}
check.runs(vector) # no run of more than 8 zeros
)
return(all(logicals))
}
nums <- NULL
res <- list()
for(i in 1:4000) {
nums <- numeric(100)
while(!check.all(nums)) {nums <- sample(choose, 100, replace=TRUE, prob=probs)}
res[i] <- list(nums)
}
str(res)
List of 4000
$ : num [1:100] 1e+01
So this gets you a list of 4,000 groups of 100 numbers that fit the constraints. It only took about two seconds of system time.
Next step is for someone to get a way to build something similar except eliminate 300000 once it is used, and 50000 once it is used twice and so on.
Inspired by @plafort's approach, I've come up with the following that seems to work very quickly and should be capable of generating all vectors satisfying your conditions:
elts<-c(0,5,10,50,500,5000,50000,300000)
freq<-c(290205,100000,8000,1600,160,32,2,1)
ngrp<-4000L
grp.cond1<-function(x)sum(x)>=150
grp.cond2<-function(x)abs(sum(x %in% c(5,10,50))-27)<=2
grp.cond3<-function(x)max(rle(x)$lengths[rle(x)$values==0])<=8
check.all<-function(mat){
all(sapply(1:ncol(mat),function(y)grp.cond1(mat[,y])),
sapply(1:ncol(mat),function(y)grp.cond2(mat[,y])),
sapply(1:ncol(mat),function(y)grp.cond3(mat[,y])))}
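amat<-matrix(sample(rep(elts,freq)),ncol=ngrp) # initial shuffle, so amat exists before the first check
while(!check.all(amat)){amat<-matrix(sample(rep(elts,freq)),ncol=ngrp)}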
a<-c(amat)
I've also written the code in a way that should be easy to generalize to other element sets/counts, group numbers, and group-wise conditions.
Unfortunately it seems these conditions are pretty stringent, and it may take a long time to generate an acceptable a. I let the while loop run ~1300 times with no success...
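One rough way to see why whole-matrix rejection rarely succeeds is to estimate how often a single random group passes all three conditions. The sketch below is an approximation only: it draws each group with replacement using the overall frequencies as probabilities, whereas a true shuffle draws without replacement. The names group_ok, n_sim, p_hat and log10_all are mine:

```r
set.seed(42)
elts <- c(0, 5, 10, 50, 500, 5000, 50000, 300000)
freq <- c(290205, 100000, 8000, 1600, 160, 32, 2, 1)

# TRUE if one group of 100 values meets all three conditions
group_ok <- function(x) {
  runs <- rle(x == 0)
  zero_runs <- runs$lengths[runs$values]
  sum(x) >= 150 &&
    abs(sum(x %in% c(5, 10, 50)) - 27) <= 2 &&
    (length(zero_runs) == 0 || max(zero_runs) <= 8)
}

n_sim <- 2000  # number of simulated groups (arbitrary choice)
passes <- replicate(
  n_sim,
  group_ok(sample(elts, 100, replace = TRUE, prob = freq / sum(freq)))
)
p_hat <- mean(passes)

# If groups were independent, all 4000 would pass with probability p_hat^4000.
# Work on the log10 scale because the value underflows to 0 otherwise.
log10_all <- 4000 * log10(p_hat)
cat("estimated pass rate for one group:", p_hat, "\n")
cat("log10 of P(all 4000 groups pass) :", log10_all, "\n")
```

Unless the per-group pass rate is extremely close to 1, raising it to the 4000th power gives a vanishingly small number, which would be consistent with the loop not finishing after ~1300 attempts.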
Thanks, everyone! I have figured out my problem.
rm(list = ls())
media <- matrix(rep(rep(c(0,5,NA),c(72,25,3)),4000),nrow=100)
media[98:100,1:2400] <-c(10,10,10)
media[98:99,2401:3200] <-c(50,10)
media[98:99,3201:4000] <-c(50,0)
media[100,2401:4000] <-rep(c(0,500,5000,50000,300000),c(1405,160,32,2,1))
obj1 <- matrix(0,100L,4000)
obj2 <-obj1
grp.cond<-function(x) max(rle(x)$lengths[rle(x)$values==0])<=8
elts<-c(0,5,10,50,500,5000,50000,300000)
for(i in 1:4000){
freq<-c(sapply(elts, function(x) length(which(media[,i]==x))))
while(!grp.cond(obj1[,i])){obj1[,i]<-c(sample(rep(elts,freq)))}
}
a1<-sample(4000) # random permutation of the group (column) order
obj2<-obj1[,a1] # reorder the groups accordingly
a <- c(obj2)