I am trying to simulate fBm via its integral representation. I know that there are faster methods out there, but I would like to play around with the kernel function inside the integral.
My approach was to simulate the stochastic integral cumulatively.
set.seed(100)
dt = 0.1 # step size; must match the 0.1 spacing of the grid T below
T = seq(-50, 10, 0.1) # time grid (note: T masks R's shorthand for TRUE)
n = length(T)
m = length(T)
Gamma = matrix(0, n, m)
H = 0.9
exponent <- function(a, pow) (abs(a)^pow)*sign(a)
for (j in 1:length(T)) {
  zeile = numeric(length(T)) # resetting our path for each j
  y = sqrt(dt)*rnorm(n = 1, mean = 0, sd = 1) # normally distributed r.v.
  zeile[1] = (max(exponent(T[j] - T[1], H - 0.5), 0) - max(exponent(-T[1], H - 0.5), 0))*y # first entry of one path
  for (i in 1:(length(T) - 1)) {
    y1 = sqrt(dt)*rnorm(n = 1, mean = 0, sd = 1)
    y2 = sqrt(dt)*rnorm(n = 1, mean = 0, sd = 1)
    zeile[i+1] = zeile[i] + max(exponent(T[j] - T[i], H - 0.5), 0)*y1 - max(exponent(-T[i], H - 0.5), 0)*y2
  }
  Gamma[j, ] = zeile
}
normalV = rnorm(length(T), mean = 0, sd = 1)
path = Gamma%*%normalV
plot(T, path, type = "l")
T is the interval over which we plot. The first for loop goes over each time point and fixes it for the second for loop, in which we fill up one row of the matrix. After we have our matrix, I thought I had to multiply it by an N(0,1) vector in order to get one path of fBm. Clearly I do not.
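For comparison, here is a minimal sketch of a direct Riemann-sum discretization of the Mandelbrot-Van Ness integral (an assumption about the intended construction, not a verified reference implementation): the Brownian increments are drawn once and shared across all evaluation times, and the kernel is applied to them as a matrix.
set.seed(100)
H  <- 0.9
ds <- 0.1
s  <- seq(-50, 10 - ds, by = ds)   # integration grid (left endpoints)
tt <- seq(0, 10, by = ds)          # times at which the path is evaluated
dW <- sqrt(ds) * rnorm(length(s))  # one Brownian increment per grid cell
pospow <- function(a, p) ifelse(a > 0, a^p, 0)  # truncated power (a)_+^p
# kernel matrix: K[j, k] = (t_j - s_k)_+^(H - 1/2) - (-s_k)_+^(H - 1/2)
K <- outer(tt, s, function(t, s) pospow(t - s, H - 0.5) - pospow(-s, H - 0.5))
path <- K %*% dW                   # Riemann-sum approximation of the integral
plot(tt, path, type = "l")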
I've been simulating a population dynamics model, and added in some environmental stochasticity by making the value of one of the parameters time-varying.
(To be more specific, I made thermal performance curves that relate the temperature of the system to the growth rates of two species within the system. I then randomly sampled some temperatures to create a vector of those temperatures and their corresponding growth rates. I then set up the simulation such that the value of the growth rate parameters could change with the temperature of the system over time.)
Now, I want to add some demographic stochasticity to my system using the Gillespie algorithm, specifically the GillespieSSA package in R. I've run into trouble trying to integrate my existing environmentally stochastic implementation with the arguments the ssa function takes.
This is my environmentally stochastic implementation:
library(deSolve) # for lsoda

ri <- QuadEqn_1(temp = temp_sequence); rj <- QuadEqn_2(temp = temp_sequence) # QuadEqn_1 and _2 are the thermal performance curves that give the growth rates for the system temperatures "temp_sequence", which is a random draw not shown here
k <- 0.001; p <- 1; o <- 1000
parms <- list(ri = ri, rj = rj, k = k, p = p, o = o)
Antia_3sp_Model <- function(t, y, p1) {
  tt <- floor(t) + 1
  Pi <- y[1]; Pj <- y[2]; I <- y[3]
  ri <- p1$ri[tt]; rj <- p1$rj[tt]; k <- p1$k; p <- p1$p; o <- p1$o # this line lets the growth rate parameters change with time, tt
  dPi = ri*Pi - k*Pi*I # the model
  dPj = rj*Pj - k*Pj*I
  dI = p*I*(Pi/(Pi + o) + Pj/(Pj + o))
  list(c(dPi, dPj, dI))
}
N0 <- c(Pi = 1, Pj = 0, I = 1) # initial values of the state variables
TT <- round(seq(0.1, 50, 0.1), 1)
eventdat <- data.frame(var = "Pj", time = 1, value = 1, method = "rep") # allows one species to be introduced at a chosen time point
results <- lsoda(N0, TT, Antia_3sp_Model, parms = parms, events = list(data = eventdat), verbose = TRUE)
Using the GillespieSSA package requires a propensity vector:
a <- c("ri*Pi",
"k*Pi*I",
"rj*Pj",
"k*Pj*I",
"p*I*(Pi/(Pi + o)",
"p*I*(Pj/(Pj + o)")
And a state change matrix:
nu <- matrix(c(+1, -1,  0,  0,  0,  0,
                0,  0, +1, -1,  0,  0,
                0,  0,  0,  0, +1, +1), nrow = 3, byrow = TRUE)
And is implemented using the ssa function, which should look something like this:
TestOutput <- ssa(x0, a, nu, parms1, TT, method, simName...)
So, I guess my question is: given that I pass GillespieSSA the model through the propensity vector and the state change matrix, how can I include a time-varying parameter, as I have in my environmentally stochastic implementation?
I'm pretty lost on this one, so any suggestions would be greatly appreciated :)
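One direction that might work (a hedged sketch rather than a verified answer; it assumes ri and rj are the per-day growth-rate vectors built above and that a one-day update interval is acceptable) is to run the SSA in one-day chunks and refresh the growth rates between chunks:
library(GillespieSSA)
x0 <- c(Pi = 1, Pj = 0, I = 1)
a  <- c("ri*Pi", "k*Pi*I", "rj*Pj", "k*Pj*I",
        "p*I*(Pi/(Pi + o))", "p*I*(Pj/(Pj + o))")
nu <- matrix(c(+1, -1,  0,  0,  0,  0,
                0,  0, +1, -1,  0,  0,
                0,  0,  0,  0, +1, +1), nrow = 3, byrow = TRUE)
out <- NULL
for (day in 1:50) {
  # refresh the time-varying parameters for this one-day chunk
  parms1 <- c(ri = ri[day], rj = rj[day], k = k, p = p, o = o)
  res <- ssa(x0 = x0, a = a, nu = nu, parms = parms1, tf = 1)
  dat <- res$data
  dat[, 1] <- dat[, 1] + (day - 1)              # chunk-local time -> global time
  out <- rbind(out, dat)
  x0 <- setNames(dat[nrow(dat), -1], names(x0)) # carry the final state forward
}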
I am trying to simulate the transmission of viruses in a population using the function ode from the deSolve package. The basis of my model is an SIR model, and I posted a much simpler demo of my model here, which consists of only three states: S (susceptible), I (infectious) and R (recovered). Each state is represented by an m*n matrix in my code, since I have m age groups and n subpopulations in my population.
The problem is: during the simulation period, there will be several vaccination activities that transfer people from state S to state I. Each vaccination activity is characterized by a begin date, an end date, its coverage rate and its duration. What I want to do is: once the time t falls between the begin date and end date of one vaccination activity, the code calculates the effective vaccination rate (also an m*n matrix, based on coverage rate and duration) and multiplies it by S (an m*n matrix) to get the matrix of people transferred to state I. Right now, I am using if() to decide whether time t is between a begin date and an end date:
# initialize the matrix of effective vaccination rates
irrate_matrix = matrix(data = rep(0, m*n), nrow = m, ncol = n)
for (i in 1:length(tbegin)) {
  if (t >= tbegin[i] & t <= tend[i]) {
    for (j in 1:n) {
      irrate_matrix[, j] = -log(1 - covir[(j-1)*length(tbegin) + i])/duration[i]
    }
  }
}
Here, irrate_matrix is the m*n effective vaccination rate matrix, m = 2 is the number of age groups, n = 2 is the number of subpopulations, tbegin = c(5, 20, 35) contains the begin dates of the 3 vaccination activities, tend = c(8, 23, 38) the end dates, covir = c(0.35, 0.25, 0.25, 0.225, 0.18, 0.13) the coverage rates of each vaccination for each subpopulation (e.g., covir[1] = 0.35 is the coverage rate of the first vaccination for subpopulation 1, while covir[4] = 0.225 is the coverage rate of the first vaccination for subpopulation 2), and duration = c(4, 4, 4) the duration of each vaccination (in days).
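A hedged side note (equivalent storage, using the objects defined above): the flat covir vector can be kept as a length(tbegin)-by-n matrix, which makes the loop's indexing easier to read, since covir[(j-1)*length(tbegin)+i] is then covir_mat[i, j].
covir_mat <- matrix(covir, nrow = length(tbegin), ncol = n)
covir_mat[1, 2] # coverage of the first vaccination in subpopulation 2: 0.225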
After calculating irrate_matrix, I plug it into the derivatives, so I have:
dS = as.matrix(b*N) - as.matrix(irrate_matrix*S) - as.matrix(mu*S)
dI = as.matrix(irrate_matrix*S) - as.matrix(gammaS*I) - as.matrix(mu*I)
dR = as.matrix(gammaS*I) - as.matrix(mu*R)
I want to run the simulation from day 0 to day 50 in 1-day steps, thus:
times = seq(0, 50, 1)
The current issue with my code is: every time t comes close to a tbegin[i] or tend[i], the simulation becomes much slower, since the solver iterates at that time point for many more rounds than at any other time point. For example, once t reaches tbegin[1] = 5, the model iterates at time point 5 for many rounds. I attached screenshots from printing out those iterations (screenshot1 and screenshot2). This is why my bigger model now takes a very long time to run.
I have tried the "events" mechanism of deSolve mentioned by tpetzoldt in this question: stackoverflow: change the value of a parameter as a function of time. However, I found it inconvenient to change a whole matrix of parameters, and to change it for every vaccination activity.
I am looking for solutions regarding:
1. How to change my irrate_matrix to a non-zero matrix when there is a vaccination activity, and let it be a zero matrix when there is none? (It has to be calculated for each vaccination.)
2. At the same time, how to make the code run faster by avoiding iterating at any tbegin[i] or tend[i] for many rounds? (I think I should not use if(), but I do not know what I should do in my case.)
3. If I need to use the "forcing" or "events" mechanism, could you please also tell me how to have multiple "forcings"/"events" in the model? Right now, I already use an "events" data frame in my bigger model to introduce a virus into the population, as shown below:
virusevents = data.frame(var = "I1", time = 2, value = 1, method = "add")
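On point 3, a hedged sketch (the vaccination event values here are hypothetical, not derived from the model above): deSolve takes a single events data frame, so several events can be combined by row-binding them.
virusevents <- data.frame(var = "I1", time = 2, value = 1, method = "add")
vaccevents  <- data.frame(var = "S1", time = c(5, 20, 35),  # hypothetical
                          value = 0.65, method = "mult")    # example values
allevents   <- rbind(virusevents, vaccevents)               # one combined table
# then pass events = list(data = allevents) to ode()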
Any good idea is welcome, and directly providing some code is much appreciated! Thank you in advance!
For reference, I post the whole demo here:
library(deSolve)
##################################
###(1) define the sir function####
##################################
sir_basic <- function(t, x, params) {
  # retrieve current states
  S = matrix(data = x[(0*m*n+1):(1*m*n)], nrow = m, ncol = n)
  I = matrix(data = x[(1*m*n+1):(2*m*n)], nrow = m, ncol = n)
  R = matrix(data = x[(2*m*n+1):(3*m*n)], nrow = m, ncol = n)
  with(as.list(params), {
    N = as.matrix(S + I + R)
    # print out the current iteration
    print(paste0("Total population at time ", t, " is ", sum(N)))
    # calculate irrate_matrix by checking time t
    irrate_matrix = matrix(data = rep(0, m*n), nrow = m, ncol = n)
    for (i in 1:length(tbegin)) {
      if (t >= tbegin[i] & t <= tend[i]) {
        for (j in 1:n) {
          irrate_matrix[, j] = -log(1 - covir[(j-1)*length(tbegin) + i])/duration[i]
        }
      }
    }
    # derivatives
    dS = as.matrix(b*N) - as.matrix(irrate_matrix*S) - as.matrix(mu*S)
    dI = as.matrix(irrate_matrix*S) - as.matrix(gammaS*I) - as.matrix(mu*I)
    dR = as.matrix(gammaS*I) - as.matrix(mu*R)
    derivatives <- c(dS, dI, dR)
    list(derivatives)
  })
}
##################################
###(2) characterize parameters####
##################################
m = 2 # the number of age groups
n = 2 # the number of sub-populations
tbegin = c(5, 20, 35) # begin dates
tend = c(8, 23, 38) # end dates
duration = c(4, 4, 4) # duration
covir = c(0.35, 0.25, 0.25, 0.225, 0.18, 0.13) # coverage rates
b = 0.0006 # daily birth rate
mu = 0.0006 # daily death rate
gammaS = 0.05 # transition rate from I to R
parameters = c(m = m, n = n,
               tbegin = tbegin, tend = tend, duration = duration, covir = covir,
               b = b, mu = mu, gammaS = gammaS)
##################################
#######(3) initial states ########
##################################
inits = c(
  S = c(20000, 40000, 10000, 20000),
  I = rep(0, m*n),
  R = rep(0, m*n)
)
##################################
#######(4) run simulations########
##################################
times = seq(0, 50, 1)
traj <- ode(func = sir_basic,
            y = inits,
            parms = parameters,
            times = times)
plot(traj)
Element-wise operations are the same for matrices and vectors, so the as.matrix conversions are redundant, as no true matrix multiplication is used. The same holds for rep: the zero is recycled anyway.
This alone cuts CPU time to about 50%. In contrast, using an external forcing with approxTime instead of the inner if and for loops made the model slower (not shown).
Simplified code
sir_basic2 <- function(t, x, params) {
  # retrieve current states
  S = x[(0*m*n+1):(1*m*n)]
  I = x[(1*m*n+1):(2*m*n)]
  R = x[(2*m*n+1):(3*m*n)]
  with(as.list(params), {
    N = S + I + R
    # print out the current iteration
    #print(paste0("Total population at time ", t, " is ", sum(N)))
    # calculate irrate_matrix by checking time t
    irrate_matrix = matrix(data = 0, nrow = m, ncol = n)
    for (i in 1:length(tbegin)) {
      if (t >= tbegin[i] & t <= tend[i]) {
        for (j in 1:n) {
          irrate_matrix[, j] = -log(1 - covir[(j-1)*length(tbegin) + i])/duration[i]
        }
      }
    }
    # derivatives
    dS = b*N - irrate_matrix*S - mu*S
    dI = irrate_matrix*S - gammaS*I - mu*I
    dR = gammaS*I - mu*R
    list(c(dS, dI, dR))
  })
}
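A further idea one could try (my suggestion, not part of the measured comparison here): since at most one vaccination activity is active at any time t, the if/for scan can be replaced by a single interval lookup. A sketch, assuming the activities do not overlap:
irrate_at <- function(t, tbegin, tend, covir, duration, m, n) {
  active <- which(t >= tbegin & t <= tend)
  if (length(active) != 1) return(matrix(0, nrow = m, ncol = n))
  i <- active
  rates <- -log(1 - covir[(seq_len(n) - 1)*length(tbegin) + i])/duration[i]
  matrix(rates, nrow = m, ncol = n, byrow = TRUE) # column j = rate for subpopulation j
}
irrate_at(6, tbegin, tend, covir, duration, m, n) # activity 1 is active on day 6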
Benchmark
Each model version is run 10 times. Model sir_basic is the original implementation, with the print line disabled for a fair comparison.
system.time(
  for (i in 1:10)
    traj <- ode(func = sir_basic,
                y = inits,
                parms = parameters,
                times = times)
)
system.time(
  for (i in 1:10)
    traj2 <- ode(func = sir_basic2,
                 y = inits,
                 parms = parameters,
                 times = times)
)
plot(traj, traj2)
summary(traj - traj2)
I observed another considerable speedup when using method = "adams" instead of the default lsoda solver, but this may differ for your full model.
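For reference, a sketch of that solver switch; it is the same call with only the integrator changed (whether "adams" is appropriate depends on how stiff the full model is):
traj3 <- ode(func = sir_basic2,
             y = inits,
             parms = parameters,
             times = times,
             method = "adams")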
I first want to sample 100 gamma-distributed numbers with shape = 2 and scale = 1/2. I wrote down the log-likelihood function and negated it, since I'm using a minimization tool to maximize. I also tried using optim, but to no avail: both optim and nlm gave me different answers. This is my code thus far:
N = 100
shape = 2
scale = 1/2
Data <- rgamma(N, shape, scale) # NB: rgamma's third positional argument is the rate, so this draws with rate = 1/2
LogL = function(x) {
  k = x[1]
  gamma = x[2]
  (-1)*(N*k*log(gamma) + (k - 1)*sum(log(Data)) - gamma*sum(Data))
}
nlm(LogL, c(1.5, 1))
logL <- function(x) -sum(dgamma(Data, x[1], x[2], log = TRUE)) # exact negative log-likelihood via dgamma()
N = 100
shape = 2
scale = 1/2
Data <- rgamma(N, shape, scale)
optim(c(1.5, 1), logL)$par
nlm(logL, c(1.5, 1))$estimate
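The discrepancy in the hand-coded version above is, as far as I can tell, a missing term: the gamma log-density contributes -N*log(Gamma(k)), which LogL drops. A sketch with the term restored (my reconstruction, in the rate parameterization):
LogL2 <- function(x) {
  k <- x[1]; rate <- x[2]
  # negated full log-likelihood of a Gamma(shape = k, rate) sample
  -(N*k*log(rate) - N*lgamma(k) + (k - 1)*sum(log(Data)) - rate*sum(Data))
}
nlm(LogL2, c(1.5, 1))$estimate # should now agree with the dgamma version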
I know there are many ways to calculate the arc length of a curve, but I am looking for an efficient way to calculate the arc length of a piecewise spline through irregularly spaced points.
The actual curve I'm trying to find the length of is quite complex (a contour line), so here is a quick example using a circle, where the true arc length is known to be 2*pi:
# Generate "random" data
set.seed(50)
theta = seq(0, 2*pi, length.out = 50) + runif(50, -0.05, 0.05)
theta = c(0, theta[theta >=0 & theta <= 2*pi], 2*pi)
data = data.frame(x = cos(theta), y = sin(theta))
# Bezier Curve fit
library("bezier")
bezierArcLength(data, t1=0, t2=1)$arc.length
# Calculate arc length using euclidean distance
library("dplyr")
data$eucdist = sqrt((data$x - lag(data$x))^2 + (data$y - lag(data$y))^2)
print(paste("Euclidean distance:", sum(data$eucdist[-1])))
print(paste("Actual distance:", 2*pi))
# Output
Bezier distance: 5.864282
Euclidean distance: 6.2779
Actual distance: 6.2831
The closest thing I have found is https://www.rdocumentation.org/packages/pracma/versions/1.9.9/topics/arclength, but I would have to parameterise my data as some function(t) ...spline(data, t)... to use arclength. I tried this, but the fitted spline ran along the middle of the circle rather than along the circumference.
Another alternative I have been (unsuccessfully) trying is to fit piecewise splines and determine the length of each spline.
Any help would be much appreciated!
EDIT: Added alternate method using the Bezier package, but the arc length found is even worse than just using the Euclidean method.
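For what it's worth, here is a hedged sketch of the pracma::arclength route mentioned above, parameterizing by cumulative chord length rather than fitting a spline through (x, y) directly (untested on the real contour data; interp1 and arclength are pracma functions):
library(pracma)
d <- c(0, cumsum(sqrt(diff(data$x)^2 + diff(data$y)^2))) # chord-length parameter
f <- function(t) c(interp1(d, data$x, t, method = "cubic"),
                   interp1(d, data$y, t, method = "cubic"))
arclength(f, d[1], d[length(d)])$length # should be close to 2*pi for the circle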
In lieu of community answers, I've cobbled together a solution which seems to work for what I was after! I'll leave my code here in case anyone has the same question and comes across this.
# Libraries
library("bezier")
library("pracma")
library("dplyr")
# Very slow for loops, sorry! Didn't write it as an apply function
output = data.frame()
for (i in 1:100) {
  # Generate "random" data
  # set.seed(50)
  theta = seq(0, 2*pi, length.out = 50) + runif(50, -0.1, 0.1)
  theta = sort(theta)
  theta = c(0, theta[theta >= 0 & theta <= 2*pi], 2*pi)
  data = data.frame(x = cos(theta), y = sin(theta))
  # Bezier curve fit
  b = bezierArcLength(data, t1 = 0, t2 = 1)$arc.length
  # Pracma piecewise cubic, parameterized by angle
  t = atan2(data$y, data$x)
  t = t + ifelse(t < 0, 2*pi, 0)
  csx <- cubicspline(t, data$x)
  csy <- cubicspline(t, data$y)
  dcsx = csx; dcsx$coefs = t(apply(csx$coefs, 1, polyder))
  dcsy = csy; dcsy$coefs = t(apply(csy$coefs, 1, polyder))
  ds <- function(t) sqrt(ppval(dcsx, t)^2 + ppval(dcsy, t)^2)
  s = integral(ds, t[1], t[length(t)])
  # Calculate arc length using euclidean distance
  data$eucdist = sqrt((data$x - lag(data$x))^2 + (data$y - lag(data$y))^2)
  e = sum(data$eucdist[-1])
  # Use path distance as the parametric variable
  data$d = c(0, cumsum(data$eucdist[-1]))
  csx <- cubicspline(data$d, data$x)
  csy <- cubicspline(data$d, data$y)
  dcsx = csx; dcsx$coefs = t(apply(csx$coefs, 1, polyder))
  dcsy = csy; dcsy$coefs = t(apply(csy$coefs, 1, polyder))
  ds <- function(t) sqrt(ppval(dcsx, t)^2 + ppval(dcsy, t)^2)
  d = integral(ds, data$d[1], data$d[nrow(data)])
  # Actual value
  a = 2*pi
  # Append to results
  output = rbind(
    output,
    data.frame(bezier = b, cubic.spline = s, cubic.spline.error = (s-a)/a*100,
               euclidean.dist = e, euclidean.dist.error = (e-a)/a*100,
               dist.spline = d, dist.spline.error = (d-a)/a*100))
}
# Summary
apply(output, 2, mean)
# Summary output
      bezier  cubic.spline  cubic.spline.error  euclidean.dist  euclidean.dist.error   dist.spline  dist.spline.error
5.857931e+00  6.283180e+00       -7.742975e-05    6.274913e+00         -1.316564e-01  6.283086e+00      -1.585570e-03
I still don't quite understand what bezierArcLength does, but I'm very happy with my solution using cubicspline from the pracma package as it is a lot more accurate.
Other solutions are still more than welcome!
I would like to write code to solve this kind of equation:
$$\int_0^{t_0}\int_{x_1}^{t_0} e^{-\mu(x_1+x_2)}\left(1-e^{-\lambda(x_1+x_2)}\right)\,dx_2\,dx_1$$
For that I wrote the code below; however, it does not solve the problem. Do you have any ideas on how to solve this kind of integral in R?
t_0 = 15
mu = 0.1
lambda = 0.8
f = function(x1,x2) exp(mu*(x1+x2))*dexp(log(lambda)*(x1+x2))
f_comp = function(x2) f(x1,x2)
f_1 = function(x1) {integrate(f_comp,upper = t_0, lower = x1)}
result = integrate(f = f_1, lower = 0, upper = t_0)$value
Edit: given the answer below, I adapted the code to my example, but I still think it is not correct; at the very least, the value 0 for the integral does not make sense.
integrate(function(x1) {
  sapply(x1, function(x1) {
    integrate(function(x2) exp(mu*(x1+x2))*dexp(log(lambda)*(x1+x2)),
              lower = x1, upper = t_0)$value
  })
}, 0, t_0)
By the way, I would like to build a general procedure for this (which is why I am not simply calculating the integral by hand). It is not only double integrals but also n-tuple integrals, so I need a general procedure for these kinds of calculations.
Make a picture of the domain of integration. This is a simplex (a triangle) with vertices (0,0), (0,t0), (t0,t0). To evaluate an integral on a simplex, the SimplicialCubature package is the way to go.
t0 = 15
mu = 0.1
lambda = 0.8
library(SimplicialCubature)
f <- function(xy){
x <- xy[1]; y <- xy[2]
exp(-mu*(x+y)) * (1-exp(-lambda*(x+y)))
}
S <- cbind(c(0,0), c(0,t0), c(t0,t0))
adaptIntegrateSimplex(f, S)$integral
# 29.55906
integrate(function(x1) {
  sapply(x1, function(x1) {
    integrate(function(x2) exp(-mu*(x1+x2))*(1 - exp(-lambda*(x1+x2))),
              lower = x1, upper = t0)$value
  })
}, 0, t0)$value
# 29.55906
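Since the question also asks about n-tuple integrals: the ordered region 0 < x1 < ... < xn < t0 is again a simplex, so the same package generalizes directly. A sketch (my construction of the vertex matrix; for n = 2 it reproduces the S above):
# vertex k (k = 0..n) has its last k coordinates equal to t0
ordered_simplex <- function(n, t0) {
  S <- matrix(0, nrow = n, ncol = n + 1)
  for (k in 1:n) S[(n - k + 1):n, k + 1] <- t0
  S
}
f3 <- function(x) exp(-mu*sum(x)) * (1 - exp(-lambda*sum(x)))
adaptIntegrateSimplex(f3, ordered_simplex(3, t0))$integral # the triple-integral analogue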