Squid Game episode 7 with simulation - R

Last night I watched episode 7 of the Squid Game TV series. The episode features a bridge game that can be modelled with a binomial-type distribution.
Specifically, there are 16 players and a bridge with 18 pairs of glass panels (in each pair, one panel is fragile ordinary glass and one is safe tempered glass). If a player happens to choose the fragile panel, the glass cannot hold their weight and breaks, and the player falls. The next player has the advantage of starting from the position the previous player reached, and continues the crossing. In the end, 3 players made it across the bridge.
So I was wondering: it is as if I have 16 euros in my pocket and I play heads or tails with p = 1/2, always betting on heads. If the coin flip is heads I earn 0, and if it is tails I lose 1 euro. What is the probability of getting heads 18 times (consecutive or not) while being left with 3 euros in my pocket?
I tried to simulate this problem in R:
squid_bridge = function(a, n, p) {
  players = a
  while (position > 0 & position < n) {
    jump = sample(c(0, 1), 1, prob = c(1 - p, p))
    position = position + jump
  }
  if (position == 0)
    return(1)
  else
    return(0)
}
n = 18
trials = 100000
a = 16
p = 1/2
set.seed(1)
simlist = replicate(trials, squid_bridge(a, n, p))
It does not seem to work. Any help?

Here is a Monte Carlo experiment in R returning, for each of the 16 players, the probability of making it across.
apply(apply(matrix(rgeom(16*1e6, .5) + 1, nc = 16), 1, cumsum) > 18, 1, mean)
# with details:
#  rgeom(16*1e6, .5) + 1  for 16 x 10⁶ Geometric simulations, where
#    the outcome is the number of attempts till "success",
#    "success" included
#  , 1, cumsum)  for the number of steps till the 16th "success"
#  ) > 18  for counting the cases when a player manages to cross the bridge
#  , 1, mean)  for finding the probability of each of the players to cross
This is not a Binomial but a truncated Negative Binomial experiment in that the number of new steps made by each player is a Geometric Geom(1/2) variate unless the 18 steps have been made. The average number of survivors is thus
sum(1-pnbinom(size=1:16,q=17:2,prob=.5))
#Explanation:
#pnbinom is the Negative Binomial cdf
#with size the number of "successes"
#q the integer at which the cdf is computed
#prob is the Negative Binomial probability parameter
#Because nbinom() is calibrated as the number of attempts
#before "success", rather than until "success", the value of
#q decreases by one for each player in the game
whose value is 7.000076, rather than 16-18/2=7!
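To see where this expectation comes from, here is a small sketch (not part of the original answer) that unpacks the sum into per-player crossing probabilities under the same truncated Negative Binomial model:
# element i is the probability that player i crosses, i.e. that the first i
# players together need more than 18 moves; pnbinom() counts the moves in
# excess of the i falls, hence the threshold 18 - i
p_surv <- 1 - pnbinom(q = 17:2, size = 1:16, prob = 0.5)
round(p_surv, 4)
sum(p_surv)   # 7.000076, the expected number of survivors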

Here is how I think you can model the game in R. The first version is similar to what you have: there's a 50% chance of guessing correctly and if the guess is correct, the players advance a tile. Otherwise they do not, and the number of players decrements by 1. If the number of players reaches 0, or they advance to the end, the game ends. This is shown in squid_bridge1().
squid_bridge1 <- function(players, n, prob) {
  if (players == 0 | n == 18) {
    # All players have died or we have reached the end
    return(players)
  }
  jump <- rbinom(1, 1, prob)
  if (jump == 0) {
    # Player died
    return(squid_bridge1(players - 1, n, prob))
  }
  if (jump == 1 & n < 18) {
    # Player lives and advances 1 space
    return(squid_bridge1(players, n + 1, prob))
  }
}
However, this does not accurately depict the game since a wrong guess gives the remaining players additional information. If a player chooses wrong, the probability of the next guess being correct is not 50%, it's 100%. However, after that point the probability of a correct guess decreases to 50%. This can be accounted for with another argument to keep track of the correctness of the previous guess.
squid_bridge2 <- function(players, n, prob, previous) {
  if (players == 0 | n == 18) {
    # The game ends if there are no players or they have reached the end
    return(players)
  }
  if (previous == 0) {
    # The previous guess was wrong, but now the players know where to go next
    return(squid_bridge2(players, n + 1, prob, previous = 1))
  }
  jump <- rbinom(1, 1, prob)
  if (jump == 0) {
    # Player died
    return(squid_bridge2(players - 1, n, prob, previous = 0))
  }
  if (jump == 1 & n < 18) {
    # Move is correct. Advance 1 space
    return(squid_bridge2(players, n + 1, prob, previous = 1))
  }
}
However, there's a catch. It wasn't quite that simple in the show, and players fell for reasons other than an incorrect guess (being pushed, jumping on purpose, etc.). I don't know what a reasonable probability of doing something like this is, but it is likely low, let's say 10%.
not_stupid <- function() {
  x <- runif(1, 0, 1)
  if (x <= 0.1) {
    return(FALSE)
  } else {
    return(TRUE)
  }
}
Since emotions spike just before each move, we will test this prior to each move.
squid_bridge3 <- function(players, n, prob, previous) {
  if (players == 0 | n == 18) {
    # The game is over because there are no players left or they reached the end
    return(players)
  }
  if (previous == 0) {
    # The previous guess was wrong, but now the players know where to go next
    return(squid_bridge3(players, n + 1, prob, previous = 1))
  }
  if (!not_stupid()) {
    return(squid_bridge3(players - 1, n, prob, previous = 1))
  }
  jump <- rbinom(1, 1, prob)
  if (jump == 0) {
    # Player died because of either choosing wrong or a self-inflicted loss
    return(squid_bridge3(players - 1, n, prob, previous = 0))
  }
  if (jump == 1 & n < 18) {
    # Move is correct. Advance 1 space
    return(squid_bridge3(players, n + 1, prob, previous = 1))
  }
}
Then running some simulations:
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(123)
trials <- 10000
players <- 16
squid1 <- replicate(trials, squid_bridge1(players, 0, 0.5))
squid2 <- replicate(trials, squid_bridge2(players, 0, 0.5, 1))
squid3 <- replicate(trials, squid_bridge3(players, 0, 0.5, 1))
df <- tibble(squid1 = squid1,
             squid2 = squid2,
             squid3 = squid3) %>%
  pivot_longer(cols = c(squid1, squid2, squid3))
ggplot(data = df,
       aes(x = value)) +
  geom_histogram(bins = 10,
                 binwidth = 1,
                 fill = "cornflowerblue",
                 color = "black") +
  facet_wrap(~name,
             nrow = 3) +
  xlab("# of players to make it to the end") +
  scale_x_continuous(breaks = seq(0, 16, by = 1),
                     labels = seq(0, 16, by = 1))
As you can see below, the first situation is heavily skewed to the left. Since the players are essentially "blindly guessing" at each tile, it is unlikely that many will make it to the end. However, after accounting for the information gained from a wrong guess, the average is somewhere around 7 players making it. Adding a random chance of falling for another reason skews the distribution back to the left somewhat.
Average for first situation: 1.45
Average for second situation: 7.01
Average for third situation: 4.99
To answer the question of the probability of only 3 players making it, I get ~ 10.8% for the last case
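For reference, one way to read that probability off the simulation output (a small sketch, assuming the squid3 vector from the code above) is simply the empirical frequency:
mean(squid3 == 3)   # proportion of simulated games with exactly 3 survivors, ~10.8% here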
Edit: As requested, here is the code to generate the plots. I also fixed the various functions that had some naming issues (went through a few different names when I made them). It looks like it resulted in a slight bug for the 3rd function, but I have fixed it throughout.

○ △ □
##########
# Game ○ △ □
##########
squidd7 <- function(N_Fields, N_Players, p_new_field){
  Players <- data.frame(id = 1:N_Players, alive = rep(1, N_Players), Field = 0)
  for (i in 1:N_Players) {
    while (Players[i, "alive"] == TRUE && max(Players$Field) < N_Fields) {
      Players[i, "Field"] = Players[i, "Field"] + 1   # jump onto the next field
      Players[i, "alive"] = rbinom(1, 1, p_new_field) # fall or survive and repeat
    }
    Players[i + 1, "Field"] = Players[i, "Field"]     # next player starts where the prior player died
  }
  Players <- Players[1:N_Players, ]  # cosmetic, because of the i+1 in the prior line
  # Print some messages
  if (!any(Players$alive > 0)) {
    cat("Players lose!")
  } else {
    cat(" \n After", max(Players$Field), "steps the goal was reached! ")
    cat("Players", Players[Players$alive == 1, "id"], "survive")
  }
  return(Players)
}
squidd7(18,16,0.5)
###########
# simulation ○ △ □
###########
results <- data.frame(matrix(0, nrow = 100, ncol = 20))
for (x in 1:ncol(results)) {
  for (i in 1:nrow(results)) {
    Players <- squidd7(x + 7, 16, 0.5)
    results[i, x] <- sum(Players$alive)
  }
}
###########
## Results ○○□□○ △ □
sdt <- apply(results, 2, sd)    # standard deviation per number of steps
mn  <- apply(results, 2, mean)  # mean number of survivors per number of steps
boxplot(results, xlab = "n steps", names = 8:27, ylab = "N survivors of 16")
points(mn, type = "l")
points(sdt, type = "l")

colors <- colorRampPalette(c(rgb(0, 1, 0, 0.4),
                             rgb(1, 1, 0, 0.4),
                             rgb(1, 0, 0, 0.4)), alpha = TRUE)(20)
plot(density(results$X1), type = "n", xlim = c(-1, 17), ylim = c(0, 0.30),
     main = "○ △ □ ",
     sub = "○ △ □ ○ △ □ ○ △ □",
     xlab = "number of survivors")
for (i in 1:20) {
  polygon(density(results[, i]), col = colors[i])
}
legend(15, 0.31, title = "Steps", legend = 8:27, fill = colors, border = NA,
       y.intersp = 0.5,
       cex = 0.8, text.font = 0.3)

Well, to simulate this game you need a 50/50 chance at each step: code a 50% chance of not losing, run it for the 18 steps, and every time a player loses, subtract 1 from a variable that starts at 16. That gives a simple digital recreation of the Squid Game bridge.

If I'm understanding correctly, I believe the other answers are complicating the simulation.
We can simulate draws from a binomial distribution of size 18. Every 1 kills someone (assuming there is anyone left to kill). Thus we can calculate the number of survivors by subtracting the number of 1s drawn from the number of players, truncated at 0 (any negative results are counted as 0, via pmax()).
set.seed(47)
n_sim = 1e4
survivors = pmax(0, 16 - rbinom(n_sim, size = 18, prob = 0.5))
mean(survivors)
# [1] 7.009
The mean seems to approach Xi'an's answer of 7.000076 as n_sim increases. At 500M simulations (which still runs rather quickly with this method!) we get a mean of 7.000073.
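As a sanity check (a sketch, not from the original answer), the exact expectation under this binomial model can be computed directly and reproduces the 7.000076 figure:
# E[max(0, 16 - X)] with X ~ Binomial(18, 0.5); deaths are capped at the 16 players
sum(pmax(0, 16 - 0:18) * dbinom(0:18, 18, 0.5))
# [1] 7.000076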
Plotting these results, they appear basically identical to cazman's squid2 scenario.
ggplot(data.frame(survivors), aes(x = survivors)) + geom_bar() +
scale_y_continuous(limits = c(0, 6000))

Related

How to find n average number of trials before criteria are met with differing probabilities per outcome?

I've spent a few days trying to figure this out and looking up tutorials, but everything I've found so far seems close to what I need yet doesn't give the results I need.
I have a device that produces a single letter, A-F. For simplicity's sake, you can think of it like a die with letters. It will always produce one and only one letter each time it is used. However it has one major difference: each letter can have a differing known probability of being picked:
A: 25%
B: 5%
C: 20%
D: 15%
E: 20%
F: 15%
These probabilities remain constant throughout all attempts.
Additionally, I have a specific combination I must accrue before I am "successful":
As needed: 1
Bs needed: 3
Cs needed: 0
Ds needed: 1
Es needed: 2
Fs needed: 3
I need to find the average number of letter picks (i.e. rolls/trials/attempts) that have to happen for this combination of letters to be accrued. It's completely fine for any individual outcome to have more than the required number of letters, but success is only counted when each letter has been chosen at least its minimum amount of times.
I've looked at plenty of tutorials on multinomial probability distributions and similar things, but I haven't found anything that explains how to find the average number of trials for a scenario like this. Please kindly explain answers clearly, as I'm not a whiz with statistics.
In addition to Severin's answer, which looks logically sound to me but might be costly to evaluate (an infinite sum of multinomial terms), let me provide some intuition that should give a good approximation.
Consider each category one at a time (see this math.stackexchange question/answer). The expected number of tosses needed to get k(i) successes for category i can be calculated as k(i)/P(i):
Given,
p(A): 25% ; Expected number of tosses to get 1 A = 1/ 0.25 = 4
p(B): 5% ; Expected number of tosses to get 3 B's = 3/ 0.05 = 60
p(C): 20% ; Expected number of tosses to get 0 C = 0/ 0.20 = 0
p(D): 15% ; Expected number of tosses to get 1 D = 1/ 0.15 = 6.67 ~ 7
p(E): 20% ; Expected number of tosses to get 2 E's = 2/ 0.20 = 10
p(F): 15% ; Expected number of tosses to get 3 F's = 3/ 0.15 = 20
This gives you an idea that getting 3 B's is your bottleneck, so you can expect on average roughly 60 tosses for your scenario to play out.
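A quick sketch of that per-category arithmetic in R (not part of the original answer):
p    <- c(A = 0.25, B = 0.05, C = 0.20, D = 0.15, E = 0.20, F = 0.15)
need <- c(A = 1,    B = 3,    C = 0,    D = 1,    E = 2,    F = 3)
need / p   # expected tosses per category; B is the bottleneck at 60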
Well, the minimum number of throws is 10. The average would be the infinite sum
A=10•P(done in 10)+11•P(done in 11)+12•P(done in 12) + ...
For P(done in 10) we could use multinomial
P(10)=Pm(1,3,0,1,2,3|probs), where probs=[.25, .05, .20, .15, .20, .15]
For P(11) you have one more throw which you could distribute like this
P(11)=Pm(2,3,0,1,2,3|probs)+Pm(1,4,0,1,2,3|probs)+Pm(1,3,0,2,2,3|probs)+
Pm(1,3,0,1,3,3|probs)+Pm(1,3,0,1,2,4|probs)
For P(12) you have to distribute 2 more throws. Note that there are combinations of throws which are impossible to get, like Pm(2,3,0,2,2,3|probs), because you would have stopped earlier.
And so on and so forth
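For concreteness, the first term can be evaluated directly in R (a sketch, not part of the original answer); Pm(1,3,0,1,2,3|probs) is just a multinomial probability:
probs <- c(0.25, 0.05, 0.20, 0.15, 0.20, 0.15)
dmultinom(c(1, 3, 0, 1, 2, 3), prob = probs)   # P(done in exactly 10 throws)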
Your process can be described as a Markov chain with a finite number of states, and an absorbing state.
The number of steps before reaching the absorbing state is called the hitting time. The expected hitting time can be calculated easily from the transition matrix of the Markov chain.
Enumerate all possible states (a, b, c, d, e, f). Consider only a finite number of states, because "b >= 3" is effectively the same as "b = 3", etc. The total number of states is (1+1)*(3+1)*(0+1)*(1+1)*(2+1)*(3+1) = 192.
Make sure that in your enumeration, starting state (0, 0, 0, 0, 0, 0) comes first, with index 0, and absorbing state (1, 3, 0, 1, 2, 3) comes last.
Build the transition matrix P. It's a square matrix with one row and column per state. Entry P[i, j] in the matrix gives the probability of going from state i to state j when rolling a die. There should be at most 6 non-zero entries per row.
For example, if i is the index of state (1, 0, 0, 1, 2, 2) and j the index of state (1, 1, 0, 1, 2, 2), then P[i, j] = probability of rolling face B = 0.05. Another example: if i is the index of state (1,3,0,0,0,0), then P[i,i] = probability of rolling A, B or C = 0.25+0.05+0.2 = 0.5.
Call Q the square matrix obtained by removing the last row and last column of P.
Call I the identity matrix of the same dimensions as Q.
Compute matrix M = (I - Q)^-1, where ^-1 is matrix inversion.
In matrix M, the entry M[i, j] is the expected number of times that state j will be reached before the absorbing state, when starting from state i.
Since our experiment starts in state 0, we're particularly interested in row 0 of matrix M.
The sum of row 0 of matrix M is the expected total number of states reached before the absorbing state. That is exactly the answer we seek: the number of steps to reach the absorbing state.
To understand why this works, you should read a course on Markov chains! Perhaps this one: James Norris' course notes on Markov chains. The chapter about "hitting times" (which is the name for the number of steps before reaching target state) is chapter 1.3.
Below, an implementation in python.
from itertools import product, accumulate
from operator import mul
from math import prod
import numpy as np

dice_weights = [0.25, 0.05, 0.2, 0.15, 0.2, 0.15]
targets = [1, 3, 0, 1, 2, 3]

def get_expected_n_trials(targets, dice_weights):
    states = list(product(*(range(n+1) for n in targets)))
    base = list(accumulate([n+1 for n in targets[:0:-1]], mul, initial=1))[::-1]
    lookup = dict(map(reversed, enumerate(states)))
    P = np.zeros((len(states), len(states)))
    for i, s in enumerate(states):
        a, b, c, d, e, f = s
        for f, p in enumerate(dice_weights):
            # j = index of state reached from state i when rolling face f
            j = i + base[f] * (s[f] < targets[f])
            j1 = lookup[s[:f] + (min(s[f]+1, targets[f]),) + s[f+1:]]
            if (j != j1):
                print(i, s, f, ' --> ', j, j1)
            assert(j == j1)
            P[i, j] += p
    Q = P[:-1, :-1]
    I = np.identity(len(states)-1)
    M = np.linalg.inv(I - Q)
    return M[0, :].sum()
print(get_expected_n_trials(targets, dice_weights))
# 61.28361802372382
Explanations of code:
First we build the list of states using Cartesian product itertools.product
For a given state i and die face f, we need to calculate j = state reached from i when adding f. I have two ways of calculating that, either as j = i + base[f] * (s[f] < targets[f]) or as j = lookup[s[:f] + (min(s[f]+1, targets[f]),) + s[f+1:]]. Because I'm paranoid, I calculated it both ways and checked that the two ways gave the same result. But you only need one way. You can remove lines j1 = ... to assert(j == j1) if you want.
Matrix P begins filled with zeroes, and we fill up to six cells per row with P[i, j] += p where p is probability of rolling face f.
Then we compute matrices Q and M as I indicated above.
We return the sum of all the cells on the first row of M.
To help you better understand what is going on, I encourage you to examine the values of all variables. For instance you could replace return M[0, :].sum() with return states, base, lookup, P, Q, I, M and then write states, base, lookup, P, Q, I, M = get_expected_n_trials(targets, dice_weights) in the python interactive shell, so that you can look at the variables individually.
A Monte-Carlo simulation:
Actually roll the die until we hit the requirements;
Count how many rolls we did;
Repeat experiment 1000 times to get the empirical average value.
Implementation in python:
from collections import Counter
from random import choices
from itertools import accumulate
from statistics import mean, stdev

dice_weights = [0.25, 0.05, 0.2, 0.15, 0.2, 0.15]
targets = [1, 3, 0, 1, 2, 3]

def avg_n_trials(targets, dice_weights, n_experiments=1000):
    dice_faces = range(len(targets))
    target_state = Counter(dict(enumerate(targets)))
    cum_weights = list(accumulate(dice_weights))
    results = []
    for _ in range(n_experiments):
        state = Counter()
        while not state >= target_state:
            f = choices(dice_faces, cum_weights=cum_weights)[0]
            state[f] += 1
        results.append(state.total())  # python<3.10: sum(state.values())
    m = mean(results)
    s = stdev(results, xbar=m)
    return m, s
m, s = avg_n_trials(targets, dice_weights, n_experiments=10000)
print(m)
# 61.4044

Renewal Function for Weibull Distribution

The renewal function m(t) for the Weibull distribution (shape 2) at t = 10 is given by the series
m(t) = Sum(k=1..Infinity) (-1)^(k+1) A_k t^(2k) / Γ(2k+1),
where A_1 = Γ(3) and A_k = Γ(2k+1)/k! - Sum(j=1..k-1) (Γ(2j+1)/j!) A_(k-j).
I want to find the value of m(t). I wrote the following R code to compute m(t):
last_term = NULL
gamma_k = NULL
n = 50
for (k in 1:n) {
  gamma_k[k] = gamma(2*k + 1)/factorial(k)
}
for (j in 1:(n-1)) {
  prev = gamma_k[n-j]
  last_term[j] = gamma(2*j + 1)/factorial(j)*prev
}
final_term = NULL
find_value = function(n){
  for (i in 2:n) {
    final_term[i] = gamma_k[i] - sum(last_term[1:(i-1)])
  }
  return(final_term)
}
all_k = find_value(n)
af_sum = NULL
m_t = function(t){
  for (k in 1:n) {
    af_sum[k] = (-1)^(k-1) * all_k[k] * t^(2*k)/gamma(2*k + 1)
  }
  return(sum(na.omit(af_sum)))
}
m_t(20)
The output is m(t) = 2.670408e+93. Is my iterative procedure correct? Thanks.
I don't think it will work. First, let's move Γ(2k+1) from the denominator of m(t) into A_k. Thus, A_k will behave roughly as 1/k!.
In the numerator of the m(t) terms there is t^(2k), so roughly speaking you're computing a sum with terms
100^k / k!
From Stirling's formula
k! ~ k^k, making the terms roughly
(100/k)^k
so yes, they will start to decrease and converge to something, but only after roughly the 100th term.
Anyway, here is the code; you could try to improve it, but it breaks at k ~ 70.
N <- 20
A <- rep(0, N)
# compute A_k/gamma(2k+1) terms
ps <- 0.0  # previous sum
A[1] = 1.0
for (k in 2:N) {
  ps <- ps + A[k-1]*gamma(2*(k-1) + 1)/factorial(k-1)
  A[k] <- 1.0/factorial(k) - ps/gamma(2*k+1)
}
print(A)

t <- 10.0
t2 <- t*t
r <- 0.0
for (k in 1:N) {
  r <- r + (-t2)^k*A[k]
}
print(-r)
UPDATE
OK, I calculated A_k as in your question and got the same answer. I want to estimate the terms A_k/Γ(2k+1) from m(t); I believe they are pretty much dominated by the 1/k! term. To check that, I made another array, k!*A_k/Γ(2k+1), which should be close to one.
Code
N <- 20
A <- rep(0.0, N)

psum <- function(pA, k) {
  ps <- 0.0
  if (k >= 2) {
    jmax <- k - 1
    for (j in 1:jmax) {
      ps <- ps + (gamma(2*j+1)/factorial(j))*pA[k-j]
    }
  }
  ps
}

# compute A_k/gamma(2k+1) terms
A[1] = gamma(3)
for (k in 2:N) {
  A[k] <- gamma(2*k+1)/factorial(k) - psum(A, k)
}
print(A)

B <- rep(0.0, N)
for (k in 1:N) {
  B[k] <- (A[k]/gamma(2*k+1))*factorial(k)
}
print(B)
This shows that I got the same A_k values as you did, and that B_k is indeed very close to 1.
It means the term A_k/Γ(2k+1) could be replaced by 1/k! to get a quick estimate of what we might get (with that replacement):
m(t) ~= - Sum(k=1..Infinity) (-1)^k (t^2)^k / k! = 1 - Sum(k=0..Infinity) (-t^2)^k / k!
This is actually a well-known sum, equal to exp() with a negative argument (you just have to add the k=0 term), so
m(t) ~= 1 - exp(-t^2)
Conclusions
The approximate value is positive, and it will probably stay positive after all; A_k/Γ(2k+1) is only a bit different from 1/k!.
We're talking about 1 - exp(-100), which is 1 - 3.72*10^-44! And we're trying to compute it precisely by summing and subtracting values on the order of 10^100 or even higher. Even with MPFR I don't think this is possible.
Another approach is needed.
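As a rough illustration of that cancellation problem (a sketch, not part of the original answer), even the much tamer series for exp(-t^2) summed naively in double precision fails badly at t = 10, whereas expm1() evaluates 1 - exp(-t^2) stably:
t <- 10
k <- 0:150
sum((-t^2)^k / factorial(k))   # naive series for exp(-t^2): wildly wrong, terms reach ~1e42
exp(-t^2)                      # ~3.7e-44, what the series should converge to
-expm1(-t^2)                   # 1 - exp(-t^2), the rough approximation of m(t) above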
OK, so I ended up going down a pretty different road on this. I have implemented a simple discretization of the integral equation which defines the renewal function:
m(t) = F(t) + integrate (m(t - s)*f(s), s, 0, t)
The integral is approximated with the rectangle rule. Approximating the integral for different values of t gives a system of linear equations. I wrote a function to generate the equations and extract a matrix of coefficients from it. After looking at some examples, I guessed a rule to define the coefficients directly and used that to generate solutions for some examples. In particular I tried shape = 2, t = 10, as in OP's example, with step = 0.1 (so 101 equations).
I found that the result agrees pretty well with an approximate result which I found in a paper (Baxter et al., cited in the code). Since the renewal function is the expected number of events, for large t it is approximately equal to t/mu where mu is the mean time between events; this is a handy way to know if we're anywhere in the neighborhood.
I was working with Maxima (http://maxima.sourceforge.net), which is not efficient for numerical stuff, but which makes it very easy to experiment with different aspects. At this point it would be straightforward to port the final, numerical stuff to another language such as Python.
Thanks to OP for suggesting the problem, and S. Pappadeux for insightful discussions. Here is the plot I got comparing the discretized approximation (red) with the approximation for large t (blue). Trying some examples with different step sizes, I saw that the values tend to increase a little as step size gets smaller, so I think the red line is probably a little low, and the blue line might be more nearly correct.
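For readers who prefer R (the language of the original question), here is a minimal sketch of the same rectangle-rule discretization; it is not the answerer's code and assumes shape = 2, scale = 1, t = 10 with 100 steps:
shape <- 2; scale <- 1
t_max <- 10; n <- 100; dt <- t_max / n
tt <- seq(0, t_max, by = dt)
fw <- dweibull(tt, shape, scale)   # Weibull density on the grid
Fw <- pweibull(tt, shape, scale)   # Weibull cdf on the grid
# linear system (I - A) m = F from m(t) = F(t) + integral_0^t m(t - s) f(s) ds
A <- matrix(0, n + 1, n + 1)
for (i in 2:(n + 1)) {
  A[i, i] <- fw[1] * dt / 2                       # half weight at s = 0
  A[i, 1] <- fw[i] * dt / 2                       # half weight at s = t
  if (i > 2) A[i, 2:(i - 1)] <- fw[(i - 1):2] * dt
}
m <- solve(diag(n + 1) - A, Fw)
m[n + 1]   # approximate m(10); compare with t/mu + sigma^2/(2*mu^2) - 1/2 for large t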
Here is my Maxima code:
/* discretize weibull renewal function and formulate system of linear equations
* copyright 2020 by Robert Dodier
* I release this work under terms of the GNU General Public License
*
* This is a program for Maxima, a computer algebra system.
* http://maxima.sourceforge.net/
*/
"Definition of the renewal function m(t):" $
renewal_eq: m(t) = F(t) + 'integrate (m(t - s)*f(s), s, 0, t);
"Approximate integral equation with rectangle rule:" $
discretize_renewal (delta_t, k) :=
if equal(k, 0)
then m(0) = F(0)
else m(k*delta_t) = F(k*delta_t)
+ m(k*delta_t)*f(0)*(delta_t / 2)
+ sum (m((k - j)*delta_t)*f(j*delta_t)*delta_t, j, 1, k - 1)
+ m(0)*f(k*delta_t)*(delta_t / 2);
make_eqs (n, delta_t) :=
makelist (discretize_renewal (delta_t, k), k, 0, n);
make_vars (n, delta_t) :=
makelist (m(k*delta_t), k, 0, n);
"Discretized integral equation and variables for n = 4, delta_t = 1/2:" $
make_eqs (4, 1/2);
make_vars (4, 1/2);
make_eqs_vars (n, delta_t) :=
[make_eqs (n, delta_t), make_vars (n, delta_t)];
load (distrib);
subst_pdf_cdf (shape, scale, e) :=
subst ([f = lambda ([x], pdf_weibull (x, shape, scale)), F = lambda ([x], cdf_weibull (x, shape, scale))], e);
matrix_from (eqs, vars) :=
(augcoefmatrix (eqs, vars),
[submatrix (%%, length(%%) + 1), - col (%%, length(%%) + 1)]);
"Subsitute Weibull pdf and cdf for shape = 2 into discretized equation:" $
apply (matrix_from, make_eqs_vars (4, 1/2));
subst_pdf_cdf (2, 1, %);
"Just the right-hand side matrix:" $
rhs_matrix_from (eqs, vars) :=
(map (rhs, eqs),
augcoefmatrix (%%, vars),
[submatrix (%%, length(%%) + 1), col (%%, length(%%) + 1)]);
"Generate the right-hand side matrix, instead of extracting it from equations:" $
generate_rhs_matrix (n, delta_t) :=
[delta_t * genmatrix (lambda ([i, j], if i = 1 and j = 1 then 0
elseif j > i then 0
elseif j = i then f(0)/2
elseif j = 1 then f(delta_t*(i - 1))/2
else f(delta_t*(i - j))), n + 1, n + 1),
transpose (makelist (F(k*delta_t), k, 0, n))];
"Generate numerical right-hand side matrix, skipping over formulas:" $
generate_rhs_matrix_numerical (shape, scale, n, delta_t) :=
block ([f, F, numer: true], local (f, F),
f: lambda ([x], pdf_weibull (x, shape, scale)),
F: lambda ([x], cdf_weibull (x, shape, scale)),
[genmatrix (lambda ([i, j], delta_t * if i = 1 and j = 1 then 0
elseif j > i then 0
elseif j = i then f(0)/2
elseif j = 1 then f(delta_t*(i - 1))/2
else f(delta_t*(i - j))), n + 1, n + 1),
transpose (makelist (F(k*delta_t), k, 0, n))]);
"Solve approximate integral equation (shape = 3, t = 1) via LU decomposition:" $
fpprintprec: 4 $
n: 20 $
t: 1;
[AA, bb]: generate_rhs_matrix_numerical (3, 1, n, t/n);
xx_by_lu: linsolve_by_lu (ident(n + 1) - AA, bb, floatfield);
"Iterative solution of approximate integral equation (shape = 3, t = 1):" $
xx: bb;
for i thru 10 do xx: AA . xx + bb;
xx - (AA.xx + bb);
xx_iterative: xx;
"Should find iterative and LU give same result:" $
xx_diff: xx_iterative - xx_by_lu[1];
sqrt (transpose(xx_diff) . xx_diff);
"Try shape = 2, t = 10:" $
n: 100 $
t: 10 $
[AA, bb]: generate_rhs_matrix_numerical (2, 1, n, t/n);
xx_by_lu: linsolve_by_lu (ident(n + 1) - AA, bb, floatfield);
"Baxter, et al., Eq. 3 (for large values of t) compared to discretization:" $
/* L.A. Baxter, E.M. Scheuer, D.J. McConalogue, W.R. Blischke.
* "On the Tabulation of the Renewal Function,"
* Econometrics, vol. 24, no. 2 (May 1982).
* H(t) is their notation for the renewal function.
*/
H(t) := t/mu + sigma^2/(2*mu^2) - 1/2;
tx_points: makelist ([float (k/n*t), xx_by_lu[1][k, 1]], k, 1, n);
plot2d ([H(u), [discrete, tx_points]], [u, 0, t]), mu = mean_weibull(2, 1), sigma = std_weibull(2, 1);

R: rgeom(n, p) when n and p are not fixed

Situation: Assume a bag contains 1 blue ball and 1 red ball. At each turn, a ball is chosen randomly, then returned back along with another (new) ball of the same colour (So after n turns, there will always be n+2 balls in the bag).
The probability of drawing the first red ball on the nth turn is 1/(n(n+1)).
I want to write a loop that simulates the number of turns taken until the first red ball is drawn from the bag, using the rgeom(n, p) function. But since n is unknown and p changes every turn, I am confused about how to modify it. I tried the following code, but it won't print anything:
k = 0
success = 0
while (success <= 1) {
  k = k + 1
  if (rgeom(n, 1/(n+1) == 1)
    success = success + 1
}
k
How can I write the loop?
If I run it with a couple of small changes, I get a value for k:
set.seed(123)
k = 0
success = 0
while (success < 1) {            # rather than <= 1, which includes 0
  k = k + 1
  if (rgeom(n, 1/(n+1)) == 1)    # missed closing parenthesis
    success = success + 1
}
k
I got k=8
but there are lots of warnings!!
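As an alternative sketch (not from the answer above), you can also simulate the urn directly, without rgeom(), and check the empirical frequencies against 1/(n(n+1)):
set.seed(123)
first_red <- replicate(10000, {
  blue <- 1; turn <- 0
  repeat {
    turn <- turn + 1
    if (runif(1) < 1 / (blue + 1)) break   # red drawn: 1 red among blue + 1 balls
    blue <- blue + 1                       # blue drawn: add another blue ball
  }
  turn
})
mean(first_red == 1)   # should be close to 1/(1*2) = 0.5
mean(first_red == 3)   # should be close to 1/(3*4) ~ 0.083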

For Loop in R replacing Object Values at each iteration

I am struggling to figure out how to create a for loop in which some initial objects (u, l, h, and y) have their values updated and reported at the end of each iteration, so that each iteration builds on the values from the previous one (for example, after updating the objects above, the runif() call should use the updated values of u and l when drawing a new q). I keep getting the same result repeated with no variation, and I am unsure what the best way to resolve this is.
Apologies in advance as I am fairly new to R and coding in general.
reset = {
  l = 0.1  # lower bound of belief in theta
  u = 0.9  # upper bound of belief in theta
  h = 0.2  # lower legal threshold, below which an action is not liable
  y = 0.8  # upper legal threshold, above which an action is liable
}
### need 1-u <= h <= y <= 1-l for each t along every path of play
period = c(1:100) ## Number of periods in the iteration of the loop.
for (t in 1:length(period)) {
  q = runif(1, min = l, max = u)  ### 1 draw of q from a uniform distribution
  q
  probg = function(q, l, u) {(u - (1 - q))/(u - l)}  ### probability of being found guilty given q in the ambiguous region
  probg(q, l, u)
  probi = function(q, l, u) {1 - probg(q, l, u)}  ### probability of being found innocent given q in the ambiguous region
  probi(q, l, u)
  ruling = if (q >= y | probg(q, l, u) > 1) {print("Guilty")  ### Strict liability
  } else if (q <= h | probi(q, l, u) > 1) {print("Innocent")  ### Permissible
  } else if (q > h & q < y) {  ### Ambiguous region
    discovery = sample(c('guilty', 'not guilty'), size = 1, replace = TRUE,
                       prob = c(probg(q, l, u), probi(q, l, u)))  ### court discovering whether a particular ambiguous q is permissible or not
  }
  discovery
  ruling
  if (ruling == "not guilty") {u = 1 - q} else if (ruling == "guilty") {l = 1 - q} else (print("beliefs unchanged"))
  if (ruling == "not guilty") {h = 1 - u} else if (ruling == "guilty") {y = 1 - l} else (print("legal threshold unchanged"))  #### legal adjustment and updating of beliefs in ambiguous region after discovery of liability
  probg(q, l, u)
  probi(q, l, u)
  modelparam = c(l, u, h, y)
  show(modelparam)
}
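As a generic sketch of the pattern being asked about (with a purely hypothetical updating rule, not the model above): objects assigned inside a for loop persist into the next iteration, so later runif() draws see the updated bounds.
l <- 0.1; u <- 0.9
for (t in 1:5) {
  q <- runif(1, min = l, max = u)
  if (q > (l + u) / 2) u <- q else l <- q   # hypothetical rule: shrink the interval toward q
  cat(sprintf("t=%d q=%.3f l=%.3f u=%.3f\n", t, q, l, u))
}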

Newton's Method in R Precision/Output

So, I'm supposed to write the code to execute Newton's Method to calculate the square root of any arbitrary number to a specified precision (tolerance).
Here is my code:
MySqrt <- function(x, eps = 1e-6, itmax = 100, verbose = TRUE) {
  GUESS <- 11
  myvector <- integer(0)
  i <- 1
  if (x < 0) {
    stop("Square root of negative value")
  }
  else {
    myvector[i] <- GUESS
    while (i <= itmax) {
      GUESS <- (GUESS + (x/GUESS)) * 0.5
      myvector[i+1] <- GUESS
      if (abs(GUESS - myvector[i]) < eps) {
        break()
      }
      if (verbose) {
        cat("Iteration: ", formatC(i, width = 1), formatC(GUESS, digits = 10, width = 12), "\n")
      }
      i <- i + 1
    }
  }
  myvector[i]
}
eps is the tolerance. When I use the function to calculate the square root of, say, 21, I get this output:
> MySqrt(21, eps = 1e-1, verbose = TRUE)
Iteration: 1 6.454545455
Iteration: 2 4.854033291
Iteration: 3 4.59016621
I'm not sure if the function stops carrying out iterations when it is supposed to, however. Can someone verify if my code is correct? This would be greatly appreciated!
Your code is almost correct. It is iterating the correct number of times. The only bug is that you don't increment i until after the break statement, so you are not returning the most recent approximation. Instead you are returning the previous one.
In order to verify that it is stopping at the right time, you can move the tracing line up above the break. You can also add GUESS-myvector[i] to the trace, so you can watch it halt as soon as the difference gets small enough. If you do this and run the function, the fact that it is stopping at the right time, as well as the fact that it is returning the wrong value, will be obvious:
> MySqrt(21,eps=1e-1)
Iteration: 1 6.454545 -4.545455
Iteration: 2 4.854033 -1.600512
Iteration: 3 4.590166 -0.2638671
Iteration: 4 4.582582 -0.007584239
[1] 4.590166
While your code is (almost) correct, it is not written in very good R style. For example, unless you want to return the entire vector of estimates, there is no reason that you need to keep them all around. Also, rather than using a while loop, here it would make more sense to use a for loop. Here is one possible improved version of your function:
MySqrt <- function(x, eps = 1e-6, itmax = 100, verbose = TRUE) {
  GUESS <- 11
  if (x < 0) {
    stop("Square root of negative value")
  }
  for (i in 1:itmax) {
    nextGUESS <- (GUESS + (x/GUESS)) * 0.5
    if (verbose)
      cat("Iteration: ", i, nextGUESS, nextGUESS - GUESS, "\n")
    if (abs(GUESS - nextGUESS) < eps)
      break
    GUESS <- nextGUESS
  }
  nextGUESS
}
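A quick usage check of the improved version (output values are approximate):
MySqrt(21, eps = 1e-6, verbose = FALSE)   # ~4.582576
sqrt(21)                                  # 4.582576, base R, for comparison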
