How to repeat 1000 times this random walk simulation in R? [duplicate] - r

I'm simulating a one-dimensional and symmetric random walk procedure:
$$y_t=y_{t-1}+\varepsilon_t$$
where white noise is denoted by $\varepsilon_t \sim N(0,1)$ in time period $t$. There is no drift in this procedure.
Also, RW is symmetric, because $Pr(y_i=+1)=Pr(y_i=-1)=0.5$.
Here's my code in R:
set.seed(1)
t=1000
epsilon=sample(c(-1,1), t, replace = TRUE)
y<-c()
y[1]<-0
for (i in 2:t) {
y[i]<-y[i-1]+epsilon[i]
}
par(mfrow=c(1,2))
plot(1:t, y, type="l", main="Random walk")
outcomes <- sapply(1:1000, function(i) cumsum(y[i]))
hist(outcomes)
I would like to simulate 1000 different $y_{it}$ series (i=1,...,1000;t=1,...,1000). After that, I will check the probability of getting back to the origin ($y_1=0$) at $t=3$, $t=5$ and $t=10$.
Which function allows me to do this kind of repetition with the $y_t$ random walk time series?

Try the following:
length_of_time_series <- 1000
num_replications <- 1000
errors <- matrix(rnorm(length_of_time_series*num_replications),ncol=num_replications)
rw <- apply(errors, 2, cumsum)
This creates 1000 random walks simultaneously. The first step fills a matrix with white noise error terms drawn from a standard normal distribution. The second step computes the cumulative sum of each column, so each column is one random walk, assuming that $y_0=0$.
Note that I have ignored that your errors are either -1 or 1, since I am not sure that this is what you intended. Having said that, you can adjust the above code easily with the following line to create errors that are either 1 or -1:
errors2 <- ifelse(errors > 0, 1, -1)
If you really want to simulate the walks one at a time rather than simultaneously, you can define a function that returns a single random walk and call it with replicate. Note that you should not hardcode the seed inside the function, otherwise every call returns the same random walk.
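The function-plus-replicate route could look roughly like this. It is a minimal sketch, not the answerer's code: simulate_walk is a hypothetical helper name, the steps are assumed to be -1 or 1 as in the question, and each walk is assumed to start at $y_1=0$.

```r
# Hypothetical helper: one symmetric +/-1 random walk of length n, with y_1 = 0
simulate_walk <- function(n) {
  steps <- sample(c(-1, 1), n - 1, replace = TRUE)  # n - 1 steps after the start
  c(0, cumsum(steps))
}

set.seed(1)  # seed is set once, outside the function
walks <- replicate(1000, simulate_walk(1000))  # matrix: rows = time, columns = walks

# Share of walks that sit at the origin at t = 3, 5 and 10
p_return <- sapply(c(3, 5, 10), function(t) mean(walks[t, ] == 0))
p_return
```

With this indexing the walk has taken t - 1 steps at time t, so a return at $t=10$ (nine steps) is impossible for steps of -1 or 1 and that estimate is exactly zero.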

Related

Randomly generated numbers are linear combinations of each other even though none were specified

I am simulating some draws using random numbers. Unfortunately, the generated numbers are not as random as I would like: some of them turn out to be linear combinations of the others.
In detail, I have the following starting data:
start_vector = c(1,10,30,40,50,100) # length equal to 6
residual_of_model = 5
n = 1000 # Number of simulations
I try to simulate n observations from a normal distribution for each of the start_vector elements, treating them as "random noise" to add to the original value (the one in start_vector):
out_vec <- matrix(NA, nrow = n, ncol = length(start_vector))
for (h_aux in 1:length(start_vector))
{
random_noise <- rnorm(n, 0, residual_of_model)
out_vec[,h_aux] <- as.numeric(start_vector[h_aux]) + random_noise
}
At this point, I obtain a matrix of size 1000x6 (n rows, one column per element of start_vector). In theory, I assume all the columns and all the rows in the matrix are linearly independent.
If I check this using the findLinearCombos() function from the caret package, I find that all the columns are independent:
caret::findLinearCombos(out_vec)
If I try to evaluate the independence among the rows, using the following code:
caret::findLinearCombos(t(out_vec))
I obtain that all the rows from 7 to 1000 are a linear combination of the first 6 (the length of start_vector).
This seems really strange to me. I would expect to observe no dependencies at all, since the rows are generated by adding random numbers from rnorm.
What am I missing? Is there some bug? Thanks in advance!
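One way to look at the row result, offered as a sketch rather than a definitive diagnosis: a matrix with 6 columns has rank at most 6, so any seventh row can be written as a combination of six linearly independent rows, however random the entries are. The check below uses base R's qr() instead of caret, and the seed is an arbitrary choice.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
start_vector <- c(1, 10, 30, 40, 50, 100)
residual_of_model <- 5
n <- 1000

# Same construction as in the question, written with sapply (one column per start value)
out_vec <- sapply(start_vector, function(m) m + rnorm(n, 0, residual_of_model))

dim(out_vec)            # n rows, length(start_vector) columns
qr(out_vec)$rank        # rank of the columns
qr(t(out_vec))$rank     # rank of the rows: bounded by the number of columns
```

Both ranks are bounded by min(n, length(start_vector)), which is why the rows beyond the sixth are reported as dependent.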

Calculating lm() within a loop

Objective: The overall objective is to calculate confidence intervals (CIs) for samples of various sizes (n = 2, 4, ..., 1024) drawn with rnorm, 10,000 times each, and then count the number of times each one fails (this likely requires a counter and an if/else statement). Finally, the results are to be plotted.
I am trying to calculate the CI of the mean for several simulations at each sample size, but I am first trying to break down the code for one specific sample size, a = 8.
The problem I have is that I do not know how to generate a linear model for each row. Would anyone know how I can do this? Here is what I have so far:
a <- 8
n.sim.3 <- 10000
for (i in a) {
  # one simulated sample of size i per row
  r.mat <- matrix(rnorm(i*n.sim.3), nrow=n.sim.3, ncol=i)
  # The lm command is where I'm stuck; I don't think this is correct
  lm.tmp <- apply(r.mat, 1, lm(n.sim.3~1))
  confint.tmp <- confint(lm.tmp)
}
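A possible way to fit one intercept-only model per row is to pass an anonymous function to apply. This is a sketch under assumptions: "fails" is taken to mean that the 95% CI does not contain the true mean of 0, the names n_obs, n_sim, samples, ci and n_fail are made up for illustration, and the number of simulations is reduced to keep the run short.

```r
set.seed(123)    # arbitrary seed
n_obs <- 8       # sample size per simulation (the question's a)
n_sim <- 1000    # reduced from 10,000 only to keep this example fast

# One simulated sample per row
samples <- matrix(rnorm(n_obs * n_sim), nrow = n_sim, ncol = n_obs)

# Fit lm(x ~ 1) to each row and keep the CI of the intercept (the sample mean)
ci <- t(apply(samples, 1, function(x) confint(lm(x ~ 1))))
colnames(ci) <- c("lower", "upper")

# Count the intervals that miss the true mean of 0
n_fail <- sum(ci[, "lower"] > 0 | ci[, "upper"] < 0)
n_fail / n_sim   # should be in the neighbourhood of 0.05 for a 95% interval
```

Looping this over the other sample sizes would give one failure count per size, which can then be plotted.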

Preventing a Gillespie SSA Stochastic Model From Running Negative

I have produced a stochastic model of infection (parasitic worm) using a Gillespie SSA. The model uses the "GillespieSSA" package (https://cran.r-project.org/web/packages/GillespieSSA/index.html).
In short, the code models a population of discrete compartments. Movement between compartments depends on user-defined rate equations. The SSA algorithm calculates the number of events produced by each rate equation for a given time step (tau) and updates the population accordingly, and the process repeats up to a given time point. The problem is that the number of events is assumed to be Poisson distributed (Poisson(rate[i]*tau)), which produces an error when a rate is negative, as happens when population numbers become negative.
# Parameter Values
sir.parms <- c(deltaHinfinity=0.00299, CHi=0.00586, deltaH0=0.0854, aH=0.5,
muH=0.02, SigmaW=0.1, SigmaM =0.8, SigmaL=104, phi=1.15, f = 0.6674,
deltaVo=0.0166, CVo=0.0205, alphaVo=0.5968, beta=52, mbeta=7300 ,muV=52, g=0.0096, N=100)
# Initial Population Values
sir.x0 <- c(W=20,M=10,L=0.02)
# Rate Equations
sir.a <- c("((deltaH0+deltaHinfinity*CHi*mbeta*L)/(1+CHi*mbeta*L))*mbeta*L*N"
,"SigmaW*W*N", "muH*W*N", "((1/2)*phi*f)*W*N", "SigmaM*M*N", "muH*M*N",
"(deltaVo/(1+CVo*M))*beta*M*N", "SigmaL*L*N", "muV*L*N", "alphaVo*M*L*N", "(aH/g)*L*N")
# Population change for each event
sir.nu <- matrix(c(+0.01,0,0,
-0.01,0,0,
-0.01,0,0,
0,+0.01,0,
0,-0.01,0,
0,-0.01,0,
0,0,+0.01/230,
0,0,-0.01/230,
0,0,-0.01/230,
0,0,-0.01/230,
0,0,-0.01/32),nrow=3,ncol=11,byrow=FALSE)
runs <- 10
set.seed(1)
# Data Frame of output
sir.out <- data.frame(time=numeric(),W=numeric(),M=numeric(),L=numeric())
# Multiple runs and combining data and SSA methods
for(i in 1:runs){
sim <- ssa(sir.x0,sir.a,sir.nu,sir.parms, method="ETL", tau=1/12, tf=140, simName="SIR")
sim.out <- data.frame(time=sim$data[,1],W=sim$data[,2],M=sim$data[,3],L=sim$data[,4])
sim.out$run <- i
sir.out <- rbind(sir.out,sim.out)
}
Thus, rates are computed and the model updates the population values at each time step, with the data stored in a data frame and then appended to the previous runs. However, when population levels get very low, the number of events that reduce a population can exceed the number in the compartment. One remedy is to make the time step very small, but this greatly increases the run time of the simulation.
My question: is there a way to augment the code so that, as the data are calculated at each time step, any negative population values are converted to 0?
I have tried working on this problem, but only seem to be able to come up with methods that alter the values once the simulation is complete, with the negative values still causing issues in the runs themselves.
E.g.
sir.out$L[sir.out$L < 0] <- 0
Any help would be appreciated
I believe the problem is the method you set ("ETL") in the ssa function. The ETL method will eventually produce negative numbers. You can try the "OTL" method, based on the paper "Efficient step size selection for the tau-leaping simulation method", which has a few more parameters that you can tweak. The basic command is:
ssa(sir.x0,sir.a,sir.nu,sir.parms, method="OTL", tf=140, simName="SIR")
Or the direct method, which will not produce negative numbers at all:
ssa(sir.x0,sir.a,sir.nu,sir.parms, method="D", tf=140, simName="SIR")
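If clamping inside the time loop is still wanted, the idea can be illustrated with a hand-rolled explicit tau-leap. This is only a sketch in base R for a hypothetical one-compartment birth-death process. It is not part of GillespieSSA, and tau_leap_clamped and its arguments are made-up names.

```r
# Explicit tau-leaping for a hypothetical birth-death process, clamped at zero
tau_leap_clamped <- function(x0, birth, death, tau, tf) {
  times <- seq(0, tf, by = tau)
  x <- numeric(length(times))
  x[1] <- x0
  for (k in seq_along(times)[-1]) {
    rates <- c(birth, death * x[k - 1])   # propensities, non-negative since x >= 0
    events <- rpois(2, rates * tau)        # births and deaths during this step
    # Clamp at zero so the next step never sees a negative population
    x[k] <- max(x[k - 1] + events[1] - events[2], 0)
  }
  data.frame(time = times, X = x)
}

set.seed(1)
out <- tau_leap_clamped(x0 = 5, birth = 1, death = 0.5, tau = 1/12, tf = 140)
min(out$X)   # never below zero
```

Clamping discards the excess events, so it changes the process being simulated. The direct method avoids the issue without that side effect.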

Correct way of drawing random number for a simulation

I am trying to generate random numbers for a simulation (the example below uses the uniform distribution for simplicity). Why would these two methods produce different average values (a: 503.2999, b: 497.5372) when sampled 10k times with the same seed number:
set.seed(2)
a <- runif(10000, 1, 999)
draw <- function(x) {
runif(1, 1, 999)
}
b <- sapply(1:10000, draw)
print(c(mean(a), mean(b)))
In my model, the random number for the first method would be referenced within a simulation using a[sim_number] while in the second instance, the runif function would be placed inside the simulation function itself. Is there a correct way of doing it?
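For completeness, the answer is that you need to reset the seed before each method (once before creating a and again before creating b) if you want the two sets of draws to be the same. With a single set.seed call, b is built from the random numbers that follow the 10,000 already consumed by a.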
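A short check of this, reusing the question's setup: once the seed is reset before the second method, the vectorised call and the one-at-a-time calls should consume the same stream of random numbers.

```r
set.seed(2)
a <- runif(10000, 1, 999)                 # all draws in one vectorised call

set.seed(2)                               # reset before the second method
draw <- function(x) runif(1, 1, 999)      # one draw per call, as in the question
b <- sapply(1:10000, draw)

identical(a, b)                           # compares the two vectors element by element
print(c(mean(a), mean(b)))
```

Either approach is therefore usable inside a simulation. What matters for reproducibility is where the seed is set relative to the draws.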

Generate Poisson process using R

I want to generate a process in which every step draws a realisation of a Poisson random variable. This realisation should be saved, then the next Poisson random variable should be drawn and added to the sum of all previous realisations. Furthermore, there should be a chance at every step that the process stops. I hope that makes sense. Any thoughts are appreciated!
More compactly, pick a single geometrically distributed random number for the total number of steps achieved before stopping, then use cumsum to sum that many Poisson deviates:
stopping.prob <- 0.3 ## for example
lambda <- 3.5 ## for example
n <- rgeom(1,stopping.prob)+1 ## rgeom counts the steps survived before the first stop, at a constant stopping probability per step
cumsum(rpois(n,lambda))
You are very vague on the parameters of your simulation but how's this?
Lambda for random Poisson number.
lambda <- 5
This is the threshold value when the function exits.
th <- 0.999
Create a vector of length 1000.
bin <- numeric(1000)
Run the loop. At each step it rolls a "die" (a uniform value between 0 and 1). If the value is below th, a random Poisson number is stored. Otherwise the loop ends.
n_steps <- 0
for (i in seq_along(bin)) {
  if (runif(1) < th) {
    bin[i] <- rpois(1, lambda = lambda)
    n_steps <- i  # remember how many steps were completed
  } else {
    break  # criterion not met, leave the loop
  }
}
Keep only the steps that were actually drawn. (Filtering on bin != 0 would also discard genuine Poisson draws equal to zero.)
bin <- bin[seq_len(n_steps)]
You can use cumsum to cumulatively sum values.
cumsum(bin)

Resources