I am writing a Monte Carlo simulation to check how often no y is immediately next to another y. I built a vector of 40 x's and 10 y's placed at random positions. My goal is to estimate the probability of not having any adjacent y's in the vector. Here is what I tried:
nrep = 100000
count = 0
for (i in 1:nrep) {
x = sample(c(rep('x', 40), c(rep('y', 10))))
if (x[i]!=x[i+1] && x[i+1]!=x[i+2]) count = count + 1
}
print(count/nrep)
The result is a very small number, which doesn't seem to make sense to me.
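The if condition is not correct: it only compares the elements at positions i, i+1 and i+2, where i is the loop counter, instead of scanning the whole vector. We can use head/tail to compare each element with its successor and see whether there are any two consecutive 'y's in one iteration. Note that the code below counts the iterations that do contain adjacent 'y's, so the probability of having no adjacent 'y's is 1 minus the printed value.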
nrep = 100000
count = 0
set.seed(2020)
for (i in 1:nrep) {
x = sample(rep(c('x', 'y'), c(40, 10)))
if(any(head(x, -1) == 'y' & tail(x, -1) == 'y')) count = count + 1
}
count/nrep
#[1] 0.891
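As a sanity check on the simulated value, the exact probability can be worked out by counting: lay down the 40 x's, which leaves 41 gaps (including both ends), and put each of the 10 y's into a different gap. The variable names below are my own, and this is only a small sketch of that counting argument.

```r
# Exact probability that no two of the 10 y's are adjacent among 50 positions.
n_x <- 40
n_y <- 10

# Arrangements with no adjacent y's: choose n_y of the n_x + 1 gaps.
# All arrangements: choose n_y of the n_x + n_y positions.
p_no_adjacent <- choose(n_x + 1, n_y) / choose(n_x + n_y, n_y)

p_no_adjacent       # about 0.109
1 - p_no_adjacent   # about 0.891, the quantity the loop above estimates
```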
Recently, I learned how to write a loop that initializes some number, and then randomly generates numbers until the initial number is guessed (while recording the number of guesses it took) such that no number will be guessed twice:
# https://stackoverflow.com/questions/73216517/making-sure-a-number-isnt-guessed-twice
all_games <- vector("list", 100)
for (i in 1:100){
guess_i = 0
correct_i = sample(1:100, 1)
guess_sets <- 1:100 ## initialize a set
trial_index <- 1
while(guess_i != correct_i){
guess_i = sample(guess_sets, 1) ## sample from this set
guess_sets <- setdiff(guess_sets, guess_i) ## remove it from the set
trial_index <- trial_index + 1
}
## no need to store `i` and `guess_i` (as same as `correct_i`), right?
game_results_i <- data.frame(i, trial_index, guess_i, correct_i)
all_games[[i]] <- game_results_i
}
all_games <- do.call("rbind", all_games)
I am now trying to modify the above code to create the following two loops:
(Deterministic) Loop 1 always guesses the midpoint (rounded up) and is told whether the guess is smaller or bigger than the correct number. It then takes the midpoint again (e.g. between its guess and the floor/ceiling) until it reaches the correct number.
(Semi-Deterministic) Loop 2 first makes a random guess and is told whether the guess is bigger or smaller than the correct number. It then halves the difference and makes its next guess randomly in the smaller range. It repeats this process until it reaches the correct number.
I tried to write a sketch of the code:
#Loop 2:
correct = sample(1:100, 1)
guess_1 = sample(1:100, 1)
guess_2 = ifelse(guess_1 > correct, sample(50:guess_1, 1), sample(guess_1:100, 1))
guess_3 = ifelse(guess_2 > correct, sample(50:guess_2, 1), sample(guess_2:100, 1))
guess_4 = ifelse(guess_3 > correct, sample(50:guess_3, 1), sample(guess_3:100, 1))
#etc
But I am not sure if I am doing this correctly.
Can someone please help me with this?
Thank you!
Example : Suppose I pick the number 68
Loop 1: first random guess = 51, (100-51)/2 + 51 = 75, (75-50)/2 + 50 = 63, (75 - 63)/2 + 63 = 69, (69 - 63)/2 + 63 = 66, etc.
Loop 2: first random guess = 53, rand_between(53,100) = 71, rand_between(51,71) = 65, rand(65,71) = 70, etc.
I don't think you need a for loop for this; you can build the structures up front with sample, sapply and which:
## correct values can repeat, so we set replace to TRUE
corrects <- sample(1:100, 100, replace = TRUE)
## replace is by default FALSE in sample(), if you don't want repeated guesses
## sapply() creates a matrix
guesses <- sapply(1:100, function(x) sample(1:100, 100))
## constructing game_results_i equal to yours, but could be simplified
game_results_i <- data.frame(
i = 1:100,
trial_index = sapply(
1:100,
function(x) which(
## which() returns every index where the predicate is TRUE; each column of
## guesses is a permutation of 1:100, so there is exactly one match
guesses[, x] == corrects[x]
)
),
guess_i = corrects,
correct_i = corrects # guess_i and correct_i are obviously equal
)
Ok, let's see if now I match question and answer properly :)
If I understood your intentions correctly, both loops set increasingly tight lower and upper bounds, so each guess reduces the search space. However, this interpretation does not always match your description, so please double-check whether it is acceptable for your purposes.
I wrote two functions, guess_bisect for the deterministic loop_1 and guess_sample for loop_2:
guess_bisect <- function(correct, n = 100) {
  lb <- 0
  ub <- n + 1
  trial_index <- 1
  guess <- round((ub - lb) / 2) + lb
  while (guess != correct) {
    # cat(lb, ub, guess, "\n") # uncomment to print the guess iteration
    if (guess < correct)
      lb <- guess
    else
      ub <- guess
    guess <- round((ub - lb) / 2) + lb
    trial_index <- trial_index + 1
  }
  trial_index
}
guess_sample <- function(correct, n = 100) {
  lb <- 0
  ub <- n + 1
  trial_index <- 1
  # draw from (lb + 1):(ub - 1); written with sample.int() because
  # sample(k, 1) samples from 1:k when the range shrinks to a single number k
  guess <- lb + sample.int(ub - lb - 1, 1)
  while (guess != correct) {
    # cat(lb, ub, guess, "\n") # uncomment to print the guess iteration
    if (guess < correct)
      lb <- guess
    else
      ub <- guess
    guess <- lb + sample.int(ub - lb - 1, 1)
    trial_index <- trial_index + 1
  }
  trial_index
}
Obviously, guess_bisect always produces the same result for the same input, whereas guess_sample changes randomly.
Plotting the results in a simple chart suggests that the deterministic bisection is on average much better, since the random sampling may happen to pick a guess on the wrong side that barely shrinks the range. The x-axis is the correct number, spanning 1 to 100, and the y-axis is the trial index; guess_bisect gives the red curve, and many runs of guess_sample give the blue curves.
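To put rough numbers on that comparison without the chart, here is a small self-contained sketch. The function names bisect_trials and random_trials are my own compact rewrites of the two strategies (not the exact functions above), the random draw is written as lb + sample.int(...) by my own choice so that a one-number range is handled safely, and 200 repetitions per target is an arbitrary value.

```r
# Hypothetical compact versions of the two strategies, for comparison only.
bisect_trials <- function(correct, n = 100) {
  lb <- 0; ub <- n + 1; trials <- 1
  guess <- lb + round((ub - lb) / 2)
  while (guess != correct) {
    if (guess < correct) lb <- guess else ub <- guess
    guess <- lb + round((ub - lb) / 2)
    trials <- trials + 1
  }
  trials
}

random_trials <- function(correct, n = 100) {
  lb <- 0; ub <- n + 1; trials <- 1
  # uniform draw from the integers strictly between lb and ub
  guess <- lb + sample.int(ub - lb - 1, 1)
  while (guess != correct) {
    if (guess < correct) lb <- guess else ub <- guess
    guess <- lb + sample.int(ub - lb - 1, 1)
    trials <- trials + 1
  }
  trials
}

set.seed(1)
targets <- 1:100
bisect_counts <- sapply(targets, bisect_trials)
# average the random strategy over 200 runs per target (arbitrary choice)
random_counts <- sapply(targets, function(k) mean(replicate(200, random_trials(k))))

mean(bisect_counts)   # average number of guesses, bisection
mean(random_counts)   # average number of guesses, random strategy
max(bisect_counts)    # worst case for bisection
```

Bisection halves the candidate set on every guess, so its worst case stays small, while the random strategy has no such guarantee.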
I have the following code for a random walk, in which I start from i and add up cumulatively for each line.
However, I need to limit my random walk on each line. One way I thought of doing this would be to replace with null every value from index j onwards in each line, where j is the first position whose value is less than or equal to 0 or greater than or equal to t.
simulate_binomial = function(cenarios, rodadas, p){
return(matrix(data=rbinom(cenarios*rodadas, 1, p), nrow=cenarios, ncol=rodadas))
}
i = 2
t = 10
p = 0.8
max_walk = 100
samples = simulate_binomial(1000, max_walk, p)
samples[samples==0] = -1
walk = t(apply(cbind(i, samples), 1, cumsum))
walk1 = apply(walk, 1, function(x) (which((x <= 0) | (x >= t))[1]))
So my walk1 would be the indices of each line that would have a value less than or equal to zero or greater than or equal to t. However, I don't know how to assign null for this index onwards in the line.
My intention is to assign null so that I can plot precisely without this null part and see the effect of the ruin on each line / "scenario".
Can anyone help me plz?
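You can change your last apply to the following. It uses NA rather than NULL, since a numeric matrix cannot hold NULL and plotting functions leave NA values out:
walk1 <- t(apply(walk, 1, function(x) {
  inds <- which((x <= 0) | (x >= t))[1]
  # only truncate when a boundary was hit before the last step
  if (!is.na(inds) && inds < length(x)) x[(inds + 1):length(x)] <- NA
  x
}))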
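To see what the truncation does, here is a tiny made-up example. The matrix values, the barrier name upper (standing in for t) and the helper truncate_after_hit are all hypothetical and only meant as an illustration.

```r
# Toy walk matrix (made-up values): one row per scenario, start value in column 1.
upper <- 4   # hypothetical upper barrier, playing the role of `t` in the question
walk <- rbind(c(2, 3, 4, 5, 4),    # reaches the upper barrier at step 3
              c(2, 1, 0, 1, 2),    # reaches zero at step 3
              c(2, 3, 2, 3, 2))    # never reaches a barrier

# Keep the walk up to and including the first barrier hit, NA afterwards.
truncate_after_hit <- function(x, upper) {
  hit <- which(x <= 0 | x >= upper)[1]
  if (!is.na(hit) && hit < length(x)) x[(hit + 1):length(x)] <- NA
  x
}

walk_cut <- t(apply(walk, 1, truncate_after_hit, upper = upper))
walk_cut
```

For plotting, something like matplot(t(walk_cut), type = "l") should draw one line per scenario and stop each line where the NA values begin.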
I wrote this code to check for 3 or more consecutive faces in a simulation of 100000 iterations, with five rolls of a fair die. I think it is on the right track, but I am missing something. I keep getting a missing value error:
nrep = 500000
count = 0
for (i in 1:nrep) {
roll = sample(6, 5)
print(roll)
if (roll[i] == roll[i+1] & roll[i+1] == roll[i+2]) count = count + 1
}
print(count)
Please advise on a correction using base R only.
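Adding to my comment, you can use the function rle() to compute the lengths and values of runs of equal values in a vector. The missing value error arises because roll[i], roll[i+1] and roll[i+2] index past the end of the five rolls as the loop counter i grows, which yields NA. Also note that sample(6, 5) draws without replacement, so no face could ever repeat; use replace = TRUE. You can do something like the following: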
nrep = 500000
count = 0
for (i in 1:nrep) {
roll = sample(6, 5, replace = TRUE)
roll_rle = rle(roll)
if (any(roll_rle$lengths >= 3)) {
print(roll)
count = count + 1
}
}
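Since five rolls have only 6^5 = 7776 equally likely outcomes, the simulated frequency can be checked against an exact count. This is just a sketch using base R's expand.grid() together with the same rle() test; the variable names are my own.

```r
# Enumerate all 6^5 = 7776 equally likely outcomes of five rolls.
outcomes <- as.matrix(expand.grid(rep(list(1:6), 5)))

# TRUE for each outcome that contains a run of three or more equal faces
has_run <- apply(outcomes, 1, function(roll) any(rle(roll)$lengths >= 3))

n_hits <- sum(has_run)
n_hits                    # 576
n_hits / nrow(outcomes)   # 576 / 7776 = 2/27, about 0.074
```

Dividing count by nrep at the end of the loop above should give a value close to this.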
Recently I have been trying to simulate a game.
I am going to throw 2 dice at the same time. If the sum of the 2 dice is greater than or equal to 10, I win 1 point.
If it is lower than 10, I lose 1 point. I will do this 1000 times.
At the very beginning, I draw 2000 random samples with set.seed(1234)
set.seed(1234)
d = sample(c(1:6), size = 2000, replace = T)
d
And then, I turn it into a matrix, and sum each row
a = matrix(d, nrow=1000, ncol=2, byrow=T)
t = rowSums(a)
t
Now, I have 1000 elements (the sum of the two dice for each throw). I would like to create a vector X holding the point I get for each throw.
However, how can I apply an if statement to create the vector X in this case?
Thank you very much
Do you mean this?
X <- ifelse(t>=10,1,-1)
or
X <- 2*(t>=10)-1
Using case_when
library(dplyr)
case_when(t >= 10 ~ 1, TRUE ~ -1)
You could assign a temporary variable and assign points by comparing the values.
tmp <- t
t[tmp >= 10] <- 1
t[tmp < 10] <- -1
Or without a temporary variable.
t1 <- c(-1, 1)[(t >= 10) + 1]
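For what it's worth, the base R variants above should all give the same vector. Below is a small self-contained check; the names rolls, totals and x_* are my own (I used totals instead of t only to avoid reusing the name of base::t()).

```r
set.seed(1234)
rolls <- matrix(sample(1:6, size = 2000, replace = TRUE), ncol = 2, byrow = TRUE)
totals <- rowSums(rolls)   # sum of the two dice for each of the 1000 throws

x_ifelse <- ifelse(totals >= 10, 1, -1)
x_arith  <- 2 * (totals >= 10) - 1           # TRUE/FALSE become 1/0, then map to 1/-1
x_index  <- c(-1, 1)[(totals >= 10) + 1]     # index 1 gives -1, index 2 gives 1

# all three constructions agree element by element
stopifnot(all(x_ifelse == x_arith), all(x_arith == x_index))

sum(x_ifelse)        # net score after 1000 throws
mean(totals >= 10)   # share of winning throws; the exact probability is 6/36
```

The 6/36 comes from counting the pairs that sum to 10, 11 or 12 (3 + 2 + 1 of the 36 possible pairs).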
I am working on spike trains. My code to generate spike trains for 20 trials and draw them as a raster plot is written below. (The original image showed a representative example with 5 trials.)
fr = 100
dt = 1/1000 # bin width in s (1 ms)
duration = 2 # duration in s
nBins = 2000 # number of bins (duration / dt)
nTrials = 20 # number of simulated trials
MyPoissonSpikeTrain = function(p, fr= 100) {
p = runif(nBins)
q = ifelse(p < fr*dt, 1, 0)
return(q)
}
set.seed(1)
SpikeMat <- t(replicate(nTrials, MyPoissonSpikeTrain()))
plot(x=-1,y=-1, xlab="time (s)", ylab="Trial",
main="Spike trains",
ylim=c(0.5, nTrials+1), xlim=c(0, duration))
for (i in 1: nTrials)
{
clip(x1 = 0, x2= duration, y1= (i-0.2), y2= (i+0.4))
abline(h=i, lwd= 1/4)
abline(v= dt*which( SpikeMat[i,]== 1))
}
Each trial has spikes occurring at random time points. What I am now working towards is drawing one random time point that is shared by all 20 trials, and then getting, for each trial, the length of the inter-spike interval that this point falls into. The code to get the time vector for the points where the spikes occur is,
A <- numeric()
for (i in 1: nTrials)
{
ISI <- function(i){
spike_times <- c(dt*which( SpikeMat[i, ]==1))
ISI1vec <- c(diff(spike_times))
A <- c(A, ISI1vec)
return(A)}
}
Then you call ISI(i) for whichever trial you wish to see the inter-spike interval vector for. (The original post included an image illustrating the goal.)
I want to get a vector that has the lengths of the intervals that this point falls into, one for each trial. I want to figure out its distribution as well, but that's for later. Can anybody help me figure out how to code my way to this? Any help is appreciated, even if it's just about how to start/where to look.
Your data
set.seed(1)
SpikeMat <- t(replicate(nTrials, MyPoissonSpikeTrain()))
I suggest transforming your sparse matrix data into a list of indices where spikes occur
L <- lapply(seq_len(nrow(SpikeMat)), function(i) setNames(which(SpikeMat[i, ] == 1), seq_along(which(SpikeMat[i, ] == 1))))
Grab random timepoint
set.seed(1)
RT <- round(runif(1) * ncol(SpikeMat))
# 531
Result
distances contains the distances to the 2 nearest spikes - each element of the list is a named vector where the values are the distances (to RT) and their names are their positions in the vector. nearest_columns shows the original timepoint (column number) of each spike in SpikeMat.
bookend_values <- function(vec) {
lower_val <- head(sort(vec[sign(vec) == 1]), 1)
upper_val <- head(sort(abs(vec[sign(vec) == -1])), 1)
return(c(lower_val, upper_val))
}
distances <- lapply(L, function(i) bookend_values(RT-i))
nearest_columns <- lapply(seq_along(distances), function(i) L[[i]][names(distances[[i]])])
Note that the inter-spike interval between the two nearest spikes that bookend RT (in bins; multiply by dt to convert to seconds) can be obtained with:
sapply(distances, sum)
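An alternative sketch of the same idea uses base R's findInterval() on the sorted spike positions instead of the distance bookkeeping above. Everything here is illustrative: the names spike_mat, rt and interval_length are my own, and the matrix is regenerated with an assumed spike probability of 0.1 per bin so the block runs on its own. One difference to be aware of: if a spike falls exactly on the chosen time point, this version treats it as the left end of the interval.

```r
set.seed(1)
n_bins   <- 2000   # assumed: 2 s at 1 ms resolution, as in the question
n_trials <- 20
p_spike  <- 0.1    # assumed spike probability per bin (fr * dt)

# one row per trial, 1 where a spike occurs
spike_mat <- matrix(as.integer(runif(n_trials * n_bins) < p_spike), nrow = n_trials)

rt <- sample.int(n_bins, 1)   # one random time point (bin index) shared by all trials

interval_length <- function(spikes, rt) {
  times <- which(spikes == 1)      # sorted spike positions for this trial
  k <- findInterval(rt, times)     # number of spikes at or before rt
  # NA when rt lies before the first or after the last spike of the trial
  if (k == 0 || k == length(times)) return(NA_integer_)
  times[k + 1] - times[k]
}

lengths_at_rt <- apply(spike_mat, 1, interval_length, rt = rt)
lengths_at_rt   # one interval length (in bins) per trial
```

Repeating this for many random time points and pooling lengths_at_rt would be one way to look at the distribution of these lengths.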