Plotting ruin in R - r

I'm trying to recreate something similar to an image in modern actuarial risk theory using R: https://www.academia.edu/37238799/Modern_Actuarial_Risk_Theory (page 89)
Click here for image
In my case, the drops are of size based on an exponential distribution with parameter 1/2000 and they are spaced apart with Poisson inter arrival times which means they are distributed exponentially with a rate parameter of 0.25 (in my model)
The value of U is given by an initial surplus plus a premium income (c) per unit time (for an amount of time determined by the inter arrival distribution) minus a claim amount which would be random from the exponential distribution mentioned above.
I have a feeling a loop will need to be used and this is what I have so far:
lambda <- 0.25
EX <- 2000
theta <- 0.5
c <- lambda*EX*(1+theta)
x <- rexp(1, 1/2000)
s <- function(t1){for(t1 in 1:10){v <- c(rep(rexp(t1,1/2000)))
print(sum(v))}}
u <- function(t){10000+c*t}
plot(u, xlab = "t", xlim = c(-1,10), ylim = c(0,20000))
abline(v=0)
for(t1 in 1:10){v <- c(rep(rexp(t1,1/2000)))
print(sum(v))}
The end goal is to run this simulation say 10,000 times over a 10 year span and use it as a visible representation as the rate of ruin for an insurance company.
Any help appreciated.

I think you're looking for something like this, all wrapped up in a neat function which by default draws the plot, but if wanted simply returns "ruin" or "safe" so you can run it in simulation:
simulate_ruin <- function(lambda = 0.25, EX = 2000,
theta = 0.5, initial_amount = 10000,
max_time = 10, draw = TRUE) {
income_per_year <- lambda * EX * (1 + theta)
# Simulate a Poisson process. Include the initial time 0,
# and replicate every other time point so we have values "before" and
# "after" each drop
times <- c(0, rep(cumsum(rexp(1000, lambda)), each = 2))
times <- c(times[times < max_time], max_time)
# This would be our income if there were no drops (a straight line)
total_without_drops <- initial_amount + (income_per_year * times)
# Now simulate some drops.
drop_size <- rexp((length(times) - 1) / 2, 1/2000)
# Starting from times[3], we apply our cumulative drops every second index:
payout_total <- rep(c(0, cumsum(drop_size)), each = 2)
total <- total_without_drops - payout_total
if(draw) {
plot(times, total, type = "l", ylim = c(-1000, 20000))
abline(h = 0, lty = 2)
} else {
if(any(total < 0))
return("ruin")
else
return("safe")
}
}
So we can call it once for a simulation:
simulate_ruin()
And again for a different simulation
simulate_ruin()
And table the results of 10,000 simulations to find the rate of ruin, which turns out to be around 3%
table(sapply(1:10000, function(x) simulate_ruin(draw = FALSE)))
#>
#> ruin safe
#> 305 9695
Created on 2022-04-06 by the reprex package (v2.0.1)

Related

Simulating ODE model for different initial conditions in R

I have a model, and I want to generate random initial conditions, run the model, and save the output so that each simulation is a replicate. But I have a hard time interpreting and implementing loops (and I also know they are not always the best to use in R), so I'm struggling.
My ultimate goal is to iterate the simulation across 10 different random initial conditions, and save the output of the ODE including a column for simulation number.
First I have my random initial conditions:
library(deSolve)
states <- c(r=runif(1, min=0.1, max=25), # resource state variable
c=runif(1, min=0.1, max=10)) # consumer state variable
Then I have my parameters and model:
parameters <- c(g=5, # resource growth rate )
K=25, # resource carrying capacity
a=1, # consumer attack rate
h=1, # consumer handling time
e=0.9, # consumer conversion efficiency
m=0.5, # consumer mortality rate
avgrain = 1500, # average rainfall
A = 1000,
w = 0.6,
phi = 8.5,
ropt1 = 1500, # optimal rainfall for resource growth
s1 = 1000, # standard deviation for plant growth rate as a function of rainfall
ropt2 = 1000, # optimal rainfall for herbivore attack (feeding) rate
s2 = 500, # standard deviation for herbivore attack rate as a function of rainfall
avgtemp = 20, # average temperature
A_temp = 7,
w_temp = 0.5,
phi_temp = 0.5,
topt1 = 13, # optimal temperature for resource growth
ts1 = 10 # standard deviation for plant growth rate as a function of temperature
)
model <- function(t, states, parameters) {
with(as.list(c(states, parameters)), {
# rainfall time series function
rain <- avgrain + (A*sin((w*t)+phi)) # rainfall function
# temperature time series function
temp = avgtemp + (A_temp*sin((w_temp*t)+phi_temp))
# dynamic g and a equations
dg_both <- (exp(-(rain - ropt1)^2/(s1^2))) + (exp(-(temp - topt1)^2/(ts1^2)))
da = exp(-(rain - ropt2)^2/(s2^2))
# rate of change of state variables
dr <- dg_both*r*(1-(r/K)) - ((c*da*r)/(1+(da*h*r)))
dc <- ((c*e*da*r)/(1+(da*h*r)))- c*m
# return rate of change
list(c(dr, dc), rain=rain, temp=temp, dg_both=dg_both, da=da)
})
}
times <- seq(0, 200, by = 1)
out <- ode(y = states, times = times, func = model, parms = parameters, method="lsoda")
Would I do this with a for loop? Thank you in advance!
Here one of the other approaches, mentioned by #Ben Bolker. Here we use replicate instead of a loop. This has the advantage, that we don't need to create a list() for the results beforehand.
N <- 10
res <- replicate(N, ode(y = c(r = runif(1, min = 0.1, max = 25),
c = runif(1, min = 0.1, max = 10)),
times = times, func = model,
parms = parameters, method="lsoda"),
simplify = FALSE)
plot(out, res)
As an additional goody, we can also plot the results using deSolve's built-in plotting function. This works of course also with res in Ben's approach. The resulting data structure can then be simplified to something like a matrix or array, either with do.call(rbind, res) as in Ben's example, or with option simplify directly in replicate.
Yes, a for loop will be fine. There are lots of other slightly fancier ways to do this (replicate or lapply from base R, purrr::map_dfr from tidyverse ...), but they won't save you any time or memory — they're just a slightly more elegant way to do the same thing.
set.seed(101)
N <- 10
res <- list()
for (i in 1:N) {
## pick new initial conditions
cur_states <- c(r=runif(1, min=0.1, max=25),
c=runif(1, min=0.1, max=10))
## run model and attach index column to the matrix
res[[i]] <-
cbind(run = i,
ode(y = cur_states, times = times, func = model,
parms = parameters, method="lsoda")
)
}
## combine individual runs into one long matrix
res_final <- do.call(rbind,res)

How to use a for loop over negative (and positive) values in R?

I am trying to use a for-loop over a range of positive and negative values and then plot the results. However, I'm having trouble getting R not to plot the correct values, since the negative values seem to screw up the index.
More precisely, the code I am running is:
# Setup objects
R = (1:20)
rejection = rep(NA, 20)
t = seq(from = -10, to = 10, by = 1)
avg_rej_freq = rep(NA, 21)
# Test a hypothesis for each possible value of x and each replication
for (x in t) {
for (r in R) {
# Generate 1 observation from N(x,1)
y = rnorm(1, x, 1)
# Take the average of this observation
avg_y = mean(y)
# Test this observation using the test we found in part a
if (avg_y >= 1 + pnorm(.95))
{rejection[r] = 1}
if (y < 1 + pnorm(.95))
{rejection[r] = 0}
}
# Calculate the average rejection frequency across the 20 samples
avg_rej_freq[x] = mean(rejection)
}
# Plot the different values of x against the average rejection frequency
plot(t, avg_rej_freq)
The resulting graph should look something like this
# Define the rejection probability for n=1
rej_prob = function(x)(1-pnorm(1-x+qnorm(0.95)))
# Plot it
curve(rej_prob,from = -10, to = 10, xlab = expression(theta),
ylab = "Rejection probability")
...but there's clearly something wrong with my code that is shifting the positive values on the graph over to the left.
Any help on how to fix this would be much appreciated!
Yep, as you suspected the negative indices are causing problems. R doesn't know how to store something as the "negative first" object in a vector, so it just drops them. Instead, try using seq_along to produce a vector of all positive indices and looping over those instead:
# Setup objects
R = (1:20)
rejection = rep(NA, 20)
t = seq(from = -10, to = 10, by = 1)
avg_rej_freq = rep(NA, 21)
# Test a hypothesis for each possible value of x and each replication
for (x in seq_along(t)) {
for (r in R) {
# Generate 1 observation from N(x,1)
# Now we ask for the value of t at index x rather than t directly
y = rnorm(1, t[x], 1)
# Take the average of this observation
avg_y = mean(y)
# Test this observation using the test we found in part a
if (avg_y >= 1 + pnorm(.95))
{rejection[r] = 1}
if (y < 1 + pnorm(.95))
{rejection[r] = 0}
}
# Calculate the average rejection frequency across the 20 samples
avg_rej_freq[x] = mean(rejection)
}
# Plot the different values of x against the average rejection frequency
plot(t, avg_rej_freq)
which produces the following plot:
Not sure why you want to simulate the vectorized function pnrom() using for loops, still correcting the mistakes in your code (check the comments):
# Test a hypothesis for each possible value of x and each replication
for (x in t) {
for (r in R) {
# Generate 1 observation from N(x,1)
y = rnorm(1, x, 1)
# no need to take average since you have a single observation
# Test this observation using the test we found in part a
rejection[r] = ifelse(y >= 1 + pnorm(.95), 1, 0)
}
# Calculate the average rejection frequency across the 20 samples
# `R` vector index starts from 1, transform your x values s.t., negative values become positive
avg_rej_freq[x-min(t)+1] = mean(rejection)
}
# Define the rejection probability for n=1
rej_prob = function(x)(1-pnorm(1-x+qnorm(0.95)))
# Plot it
curve(rej_prob,from = -10, to = 10, xlab = expression(theta),
ylab = "Rejection probability")
# plot your points
points(t, avg_rej_freq, pch=19, col='red')
Not sure why the for loops etc, what you are doing can be collapsed into a one line. The rest of the code taken from #Sandipan Dey:
R <- 20
t <- seq(from = -10, to = 10, by = 1)
#All the for-loops collapsed into this one line:
avg_rej_freq <- rowMeans(matrix(rnorm(R * length(t), t), 21) >= 1 + pnorm(.95))
rej_prob <- function(x) 1 - pnorm(1 - x + qnorm(0.95))
curve(rej_prob,from = -10, to = 10, xlab = expression(theta),
ylab = "Rejection probability")
# plot your points
points(t, avg_rej_freq, pch=19, col='red')

Realistic age structured model using ODE from the deSolve package

I am trying to simulate a realistic age structured model where all individuals could shift into the following age group at the end of the time step (and not age continuously at a given rate) using ODE from the deSolve package.
Considering for example a model with two states Susceptible (S) and Infectious (I), each state being divided in 4 age groups (S1, S2, S3, S4, and I1, I2, I3, I4), all individuals in S1 should go into S2 at the end of the time step, those in S2 should go into S3, and so on.
I tried to make this in two steps, the first by solving the ODE, the second by shifting individuals into the following age group at the end of the time step, but without success.
Below is one of my attempts :
library(deSolve)
times <- seq(from = 0, to = 100, by = 1)
n_agecat <- 4
#Initial number of individuals in each state
S_0 = c(999,rep(0,n_agecat-1))
I_0 = c(1,rep(0,n_agecat-1))
si_initial_state_values <- c(S = S_0,
I = I_0)
# Parameter values
si_parameters <- c(beta = 0.01) #contact rate assuming random mixing
si_model <- function(time, state, parameters) {
with(as.list(c(state, parameters)), {
n_agegroups <- 4
S <- state[1:n_agegroups]
I <- state[(n_agegroups+1):(2*n_agegroups)]
# Total population
N <- S+I
# Force of infection
lambda <- beta * I/N
# Solving the differential equations
dS <- -lambda * S
dI <- lambda * S
# Trying to shift all individuals into the following age group
S <- c(0,S[-n_agecat])
I <- c(0,I[-n_agecat])
return(list(c(dS, dI)))
})
}
output <- as.data.frame(ode(y = si_initial_state_values,
times = times,
func = si_model,
parms = si_parameters))
Any guidance will be much appreciated, thank you in advance!
I had a look at your model. Implementing the shift in an event function works, in principle, but the main model has still several problems:
die out: if the age groups are shifted per time step and the first element is just filled with zero, everything is shifted to the end within 4 time steps and the population dies out.
infection: in your case, the infected can only infect the same age group, so you need to summarize over the "age" groups before calculating lambda.
Finally, what is "age" group? Do you want the time since infection?
To sum up, there are several options: I would personally prefer a discrete model for such a simulation, i.e. difference equations, a age structured matrix model or an individual-based model.
If you want to keep it an ODE, I recommend to let the susceptible together as one state and to implement only the infected as stage structured.
Here a quick example, please check:
library(deSolve)
times <- seq(from = 0, to = 100, by = 1)
n_agegroups <- 14
n_agecat <- 14
# Initial number of individuals in each state
S_0 = c(999) # only one state
I_0 = c(1, rep(0,n_agecat-1)) # several stages
si_initial_state_values <- c(S = S_0,
I = I_0)
# Parameter values
si_parameters <- c(beta = 0.1) # set contact parameter to a higher value
si_model <- function(time, state, parameters) {
with(as.list(c(state, parameters)), {
S <- state[1]
I <- state[2:(n_agegroups + 1)]
# Total population
N <- S + sum(I)
# Force of infection
#lambda <- beta * I/N # old
lambda <- beta * sum(I) / N # NEW
# Solving the differential equations
dS <- -lambda * S
dI <- lambda * S
list(c(dS, c(dI, rep(0, n_agegroups-1))))
})
}
shift <- function(t, state, p) {
S <- state[1]
I <- state[2:(n_agegroups + 1)]
I <- c(0, I[-n_agecat])
c(S, I)
}
# output time steps (note: ode uses automatic simulation steps!)
times <- 1:200
# time step of events (i.e. shifting), not necessarily same as times
evt_times <- 1:200
output <- ode(y = si_initial_state_values,
times = times,
func = si_model,
parms = si_parameters,
events=list(func=shift, time=evt_times))
## default plot function
plot(output, ask=FALSE)
## plot totals
S <- output[,2]
I <- rowSums(output[, -(1:2)])
par(mfrow=c(1,2))
plot(times, S, type="l", ylim=c(0, max(S)))
lines(times, I, col="red", lwd=1)
## plot stage groups
matplot(times, output[, -(1:2)], col=rainbow(n=14), lty=1, type="l", ylab="S")
Note: This is just a technical demonstration, not a valid stage structured SIR model!

R - deSolve package (ode function): change a matrix of parameters in SIR model according to time

I am trying to simulate the transmission of viruses in a population using the function ode from the deSolve package. The basic of my model is a SIR model and I posted a much simpler demo of my model here, which consists of only three states S(susceptible), I(infectious) and R(recovered). Each state is represented by a m*n matrix in my code, since I have m age groups and n subpopulations in my population.
The problem is: during the simulation period, there will be several vaccination activities that transfer people in state S to state I. Each vaccination activity is characterized by a begin date, an end date, its coverage rate and duration. What I want to do is once the time t falls into the interval of begin date and end date of one vaccination activity, the code calculates the effective vaccination rate (also a m*n matrix, based on coverage rate and duration) and times it with S (m*n matrix), to get a matrix of people transited to state I. Right now, I am using if() to decide if time t is between a begin date and a end date:
#initialize the matrix of effective vaccination rate
irrate_matrix = matrix(data = rep(0, m*n), nrow = m, ncol = n)
for (i in 1:length(tbegin)){
if (t>=tbegin[i] & t<=tend[i]){
for (j in 1:n){
irrate_matrix[, j] = -log(1-covir[(j-1)*length(tbegin)+i])/duration[i]
}
}
}
Here, irrate_matrix is the m*n effective vaccination rate matrix, m = 2 is the number of age groups, n = 2 is the number of subpopulations, tbegin = c(5, 20, 35) is the begin date of 3 vaccination activities, tend = c(8, 23, 38) is the end date of 3 vaccination activities, covir = c(0.35, 0.25, 0.25, 0.225, 0.18, 0.13) is the coverage rate of each vaccination for each subpopulation (e.g., covir[1] = 0.35 is the coverage rate of the first vaccination for subpopulation1, while covir[4] = 0.225 is the coverage rate of the first vaccination for subpopulation2) and duration = c(4, 4, 4) is the duration of each vaccination (in days).
After calculating irrate_matrix, I take it into derivatives and therefore I have:
dS = as.matrix(b*N) - as.matrix(irrate_matrix*S) - as.matrix(mu*S)
dI = as.matrix(irrate_matrix*S) - as.matrix(gammaS*I) - as.matrix(mu*I)
dR = as.matrix(gammaS*I) - as.matrix(mu*R)
I want to do a simulation from day 0 to day 50, by 1-day step, thus:
times = seq(0, 50, 1)
The current issue with my code is: every time the time t comes to a time point close to a tbegin[i] or tend[i], the simulation becomes much slower since it iterates at this time point for much more rounds than at any other time point. For example, once the time t comes to tbegin[1] = 5, the model iterates at time point 5 for many rounds. I attached screenshots from printing out those iterations (screenshot1 and screenshot2). I find this is why my bigger model takes a very long running time now.
I have tried using the "events" function of deSolve mentioned by tpetzoldt in this question stackoverflow: change the value of a parameter as a function of time. However, I found it's inconvenient for me to change a matrix of parameters and change it every time there is a vaccination activity.
I am looking for solutions regarding:
How to change my irrate_matrix to non-zero matrix when there is a vaccination activity and let it be zero matrix when there is no vaccination? (it has to be calculated for each vaccination)
At the same time, how to make the code run faster by avoiding iterating at any tbegin[i] or tend[i] for many rounds? (I think I should not use if() but I do not know what I should do with my case)
If I need to use "forcing" or "events" function, could you please also tell me how to have multiple "forcing"/"events" in the model? Right now, I have had an "events" used in my bigger model to introduce a virus to the population, as:
virusevents = data.frame(var = "I1", time = 2, value = 1, method = "add")
Any good idea is welcome and directly providing some codes is much appreciated! Thank you in advance!
For reference, I post the whole demo here:
library(deSolve)
##################################
###(1) define the sir function####
##################################
sir_basic <- function (t, x, params)
{ # retrieve initial states
S = matrix(data = x[(0*m*n+1):(1*m*n)], nrow = m, ncol = n)
I = matrix(data = x[(1*m*n+1):(2*m*n)], nrow = m, ncol = n)
R = matrix(data = x[(2*m*n+1):(3*m*n)], nrow = m, ncol = n)
with(as.list(params), {
N = as.matrix(S + I + R)
# print out current iteration
print(paste0("Total population at time ", t, " is ", sum(N)))
# calculate irrate_matrix by checking time t
irrate_matrix = matrix(data = rep(0, m*n), nrow = m, ncol = n)
for (i in 1:length(tbegin)){
if (t>=tbegin[i] & t<=tend[i]){
for (j in 1:n){
irrate_matrix[, j] = -log(1-covir[(j-1)*length(tbegin)+i])/duration[i]
}
}
}
# derivatives
dS = as.matrix(b*N) - as.matrix(irrate_matrix*S) - as.matrix(mu*S)
dI = as.matrix(irrate_matrix*S) - as.matrix(gammaS*I) - as.matrix(mu*I)
dR = as.matrix(gammaS*I) - as.matrix(mu*R)
derivatives <- c(dS, dI, dR)
list(derivatives)
})
}
##################################
###(2) characterize parameters####
##################################
m = 2 # the number of age groups
n = 2 # the number of sub-populations
tbegin = c(5, 20, 35) # begin dates
tend = c(8, 23, 38) # end dates
duration = c(4, 4, 4) # duration
covir = c(0.35, 0.25, 0.25, 0.225, 0.18, 0.13) # coverage rates
b = 0.0006 # daily birth rate
mu = 0.0006 # daily death rate
gammaS = 0.05 # transition rate from I to R
parameters = c(m = m, n = n,
tbegin = tbegin, tend = tend, duration = duration, covir = covir,
b = b, mu = mu, gammaS = gammaS)
##################################
#######(3) initial states ########
##################################
inits = c(
S = c(20000, 40000, 10000, 20000),
I = rep(0, m*n),
R = rep(0, m*n)
)
##################################
#######(4) run simulations########
##################################
times = seq(0, 50, 1)
traj <- ode(func = sir_basic,
y = inits,
parms = parameters,
times = times)
plot(traj)
Element wise operations are the same for matrices and vectors, so the as.matrix conversions are redundant, as no true matrix multiplication is used. Same with the rep: the zero is recycled anyway.
In effect, CPU time reduces already to 50%. In contrast, use of an external forcing with approxTime instead of the inner if and for made the model slower (not shown).
Simplified code
sir_basic2 <- function (t, x, params)
{ # retrieve initial states
S = x[(0*m*n+1):(1*m*n)]
I = x[(1*m*n+1):(2*m*n)]
R = x[(2*m*n+1):(3*m*n)]
with(as.list(params), {
N = S + I + R
# print out current iteration
#print(paste0("Total population at time ", t, " is ", sum(N)))
# calculate irrate_matrix by checking time t
irrate_matrix = matrix(data = 0, nrow = m, ncol = n)
for (i in 1:length(tbegin)){
if (t >= tbegin[i] & t <= tend[i]){
for (j in 1:n){
irrate_matrix[, j] = -log(1-covir[(j-1) * length(tbegin)+i])/duration[i]
}
}
}
# derivatives
dS = b*N - irrate_matrix*S - mu*S
dI = irrate_matrix*S - gammaS*I - mu*I
dR = gammaS*I - mu*R
list(c(dS, dI, dR))
})
}
Benchmark
Each model version is run 10 times. Model sir_basic is the original implementation, where print line was disabled for a fair comparison.
system.time(
for(i in 1:10)
traj <- ode(func = sir_basic,
y = inits,
parms = parameters,
times = times)
)
system.time(
for(i in 1:10)
traj2 <- ode(func = sir_basic2,
y = inits,
parms = parameters,
times = times)
)
plot(traj, traj2)
summary(traj - traj2)
I observed another considerable speedup, when I use method="adams" instead of the default lsoda solver, but this may differ for your full model.

r deSolve - plotting time evolution pde

suppose that we have a pde that describes the evolution of a variable y(t,x) over time t and space x, and I would like to plot its evolution on a three dimensional diagram (t,x,y). With deSolve I can solve the pde, but I have no idea about how to obtain this kind of diagram.
The example in the deSolve package instruction is the following, where y is aphids, t=0,...,200 and x=1,...,60:
library(deSolve)
Aphid <- function(t, APHIDS, parameters) {
deltax <- c (0.5, rep(1, numboxes - 1), 0.5)
Flux <- -D * diff(c(0, APHIDS, 0)) / deltax
dAPHIDS <- -diff(Flux) / delx + APHIDS * r
list(dAPHIDS )
}
D <- 0.3 # m2/day diffusion rate
r <- 0.01 # /day net growth rate
delx <- 1 # m thickness of boxes
numboxes <- 60
Distance <- seq(from = 0.5, by = delx, length.out = numboxes)
APHIDS <- rep(0, times = numboxes)
APHIDS[30:31] <- 1
state <- c(APHIDS = APHIDS) # initialise state variables
times <-seq(0, 200, by = 1)
out <- ode.1D(state, times, Aphid, parms = 0, nspec = 1, names = "Aphid")
"out" produces a matrix containing all the data that we need, t, y(x1), y(x2), ... y(x60). How can I produce a surface plot to show the evolution and variability of y in (t,x)?
The ways change a bit depending on using package. But you can do it with little cost because out[,-1] is an ideal matrix form to draw surface. I showed two examples using rgl and plot3D package.
out2 <- out[,-1]
AphID <- 1:ncol(out2)
library(rgl)
persp3d(times, AphID, out2, col="gray50", zlab="y")
# If you want to change color with value of Z-axis
# persp3d(times, AphID, out2, zlab="y", col=topo.colors(256)[cut(c(out2), 256)])
library(plot3D)
mat <- mesh(times, AphID)
surf3D(mat$x, mat$y, out2, bty="f", ticktype="detailed", xlab="times", ylab="AphID", zlab="y")

Resources