Linear combinations of Zygote.Grads - julia

I am building and training a neural network model with Flux, and I am wondering if there is a way to take linear combinations of Zygote.Grads types.
Here is a minimalistic example. This is how it is typically done:
m = hcat(2.0); b = hcat(-1.0); # random 1 x 1 matrices
f(x) = m*x .+ b
ps = Flux.params(m, b) # parameters to be adjusted
inputs = [0.3 1.5] # random 1 x 2 matrix
loss(x) = sum( f(x).^2 )
gs = Flux.gradient(() -> loss(inputs), ps) # the typical way
@show gs[m], gs[b] # 5.76, 3.2
But I want to do the same calculation by computing gradients at a deeper level, and then assembling it at the end. For example:
input1 = hcat(inputs[1, 1]); input2 = hcat(inputs[1, 2]); # turn each input into a 1 x 1 matrix
grad1 = Flux.gradient(() -> f(input1)[1], ps) # df/dp using input1 (where p is m or b)
grad2 = Flux.gradient(() -> f(input2)[1], ps) # df/dp using input2 (where p is m or b)
predicted1 = f(input1)[1]
predicted2 = f(input2)[1]
myGrad_m = (2 * predicted1 * grad1[m]) + (2 * predicted2 * grad2[m]) # 5.76
myGrad_b = (2 * predicted1 * grad1[b]) + (2 * predicted2 * grad2[b]) # 3.2
Above, I used the chain rule and linearity of the derivative to decompose the gradient of the loss() function:
d(loss)/dp = d( sum(f^2) ) / dp = sum( d(f^2)/dp ) = sum( 2*f * df/dp )
Then, I calculated df/dp using Zygote.gradient, and then combined the results at the end.
But notice that I had to combine m and b separately. This was fine because there were only 2 parameters.
However, if there were 1000 parameters, I would want to do something like the following, which is a linear combination of the Zygote.Grads:
myGrad = (2 * predicted1 * grad1) + (2 * predicted2 * grad2)
But I get an error saying that the + and * operators are not defined for these types. How can I get this shortcut to work?

Just turn each * and + into .* and .+ (i.e., use broadcasting), or use map to apply a function to multiple Grads at once. This is described in the Zygote docs here. Note that in order for this to work, all the Grads must share the same keys (so they must correspond to the same parameters).
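Applied to the example above, a minimal sketch using map, which Zygote documents for combining several Grads key-by-key (the Grads must share the same keys):
myGrad = map((g1, g2) -> 2 * predicted1 * g1 + 2 * predicted2 * g2, grad1, grad2)
@show myGrad[m], myGrad[b] # 5.76, 3.2
The broadcast spelling of the same linear combination would be (2 * predicted1) .* grad1 .+ (2 * predicted2) .* grad2; whether plain scalars can participate in a Grads broadcast may depend on your Zygote version, so map is the safer form.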

Related

Multi-parameter optimization in R

I'm trying to estimate parameters that will maximize the likelihood of a certain event. My objective function looks like that:
event_prob = function(p1, p2) {
  x = ((1-p1-p2)^4)^67 *
    ((1-p1-p2)^3*p2)^5 *
    ((1-p1-p2)^3*p1)^2 *
    ((1-p1-p2)^2*p1*p2)^3 *
    ((1-p1-p2)^2*p1^2) *
    ((1-p1-p2)*p1^2*p2)^2 *
    (p1^3*p2) *
    (p1^4)
  return(x)
}
In this case, I'm looking for the p1 and p2 in [0,1] that will maximize this function. I tried using optim() in the following manner:
aaa = optim(c(0,0),event_prob)
but I'm getting an error "Error in fn(par, ...) : argument "p2" is missing, with no default".
Am I using optim() wrong? Or is there a different function (package?) I should be using for multi-parameter optimization?
This problem can in fact be solved analytically.
The objective function simplifies to
F(p1,p2) = (1-p1-p2)^299 * p1^18 * p2^11
which is to be maximised over the region
C = { (p1,p2) | 0<=p1, 0<=p2, p1+p2<=1 }
Note that F is 0 if p1 = 0 or p2 = 0 or p1+p2 = 1, while F is positive if none of those hold. Thus the maximum of F occurs in the interior of C.
Taking the log
f(p1,p2) = 299*log(1-p1-p2) + 18*log(p1) + 11*log(p2)
In fact it is as easy to solve the more general problem: maximise f over C where
f(p1,..,pN) = b*log(1-p1-..-pN) + Sum_j{ a[j]*log(p[j]) }
where b and each a[j] is positive, and
C = { (p1,..,pN) | 0 < p[j] for j=1..N, and p1+p2+..+pN < 1 }
The critical point occurs where all the partial derivatives of f are zero, which is at
-b/(1-p1-..-pN) + a[j]/p[j] = 0, j=1..N
which can be written as
b*p[j] + a[j]*(p1+..+pN) = a[j], j=1..N
or
M*p = a
where M = b*I + a*Ones', and Ones is a vector with each component 1.
The inverse of M is
inv(M) = (1/b)*(I - a*Ones'/(b + Ones'*a))
Thus the unique critical point is
p^ = inv(M)*a = a/(b + Sum_i{ a[i] })
Since there is a maximum, and only one critical point, the critical point must be the maximum.
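As a quick numerical check of the closed form (a sketch, using b = 299 and a = (18, 11) from the exponents above):
a <- c(18, 11); b <- 299
p_hat <- a / (b + sum(a)) # the analytic maximiser
p_hat
## [1] 0.05487805 0.03353659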
Based on Erwin Kalvelagen's comment: Redefine your function event_prob:
event_prob = function(p) {
  p1 = p[1]
  p2 = p[2]
  x = ((1-p1-p2)^4)^67 *
    ((1-p1-p2)^3*p2)^5 *
    ((1-p1-p2)^3*p1)^2 *
    ((1-p1-p2)^2*p1*p2)^3 *
    ((1-p1-p2)^2*p1^2) *
    ((1-p1-p2)*p1^2*p2)^2 *
    (p1^3*p2) *
    (p1^4)
  return(x)
}
You may want to set limits to ensure that p1 and p2 fulfill your constraints, and note that optim() minimizes by default, so tell it to maximize via fnscale:
optim(c(0.5,0.5), event_prob, method="L-BFGS-B", lower=0, upper=1, control=list(fnscale=-1))

deSolve ODE Not Working with Differential Equations (Calculates NA)

I have been using the deSolve::ode45 method, which had been working until I made a few necessary changes to my equations. Does anyone know why the ODE solver is not working? I have tried running with ode45 as well as the default ode method, and neither works. Please let me know if any further explanation would be helpful.
I have checked over the differential equations and I am confident they are correct.
The equations used are as follows:
CCHFModel = function(t,x,params)
{
  # get SIR values
  SH <- x[1]
  EH <- x[2]
  IA <- x[3]
  IS <- x[4]
  RH <- x[5]
  ST <- x[6]
  IT <- x[7]
  SC <- x[9]
  IC <- x[10]
  RC <- x[11]
  # Load values ----
  # Beta values
  betaHHA = params["betaHHA"]
  betaHHS = params["betaHHS"]
  betaTH = params["betaTH"]
  betaCH = params["betaCH"]
  betaTC = params["betaTC"]
  betaCT = params["betaCT"]
  betaTT = params["betaTT"]
  # Gamma value
  gamma = params["gamma"]
  # death rates
  muH = params["muH"]
  muT = params["muT"]
  muC = params["muC"]
  # birth rates
  piH = params["piH"]
  piT = params["piT"]
  piC = params["piC"]
  # incubation
  deltaHS = params["deltaHS"]
  deltaHA = params["deltaHA"]
  # recovery rate
  alphaA = params["alphaA"]
  alphaS = params["alphaS"]
  alphaC = params["alphaC"]
  # total population
  NH = (SH + IA + IS + EH + RH) + (piH * SH) - (muH * SH)
  NT = (ST + IT) + (piT * ST) - (muT * ST)
  NC = (SC + IC + RC) + (piC * SC) - (muH * SC)
  # tick carrying capacity
  # KT = NC * 130 # 130 ticks per carrier max
  # computations ----
  dSHdt <- (piH * NH) - (betaHHA * IA + betaHHS * IS + betaCH * IC + betaTH * IT)*(SH/NH) - (muH * SH)
  dEHdt <- (betaHHA * IA + betaHHS * IS + betaCH * IC + betaTH * IT)*(SH/NH) - ((deltaHA + muH)*EH)
  dIAdt <- (deltaHA * EH) - ((alphaA + muH + deltaHS) * IA)
  dISdt <- (deltaHS * IA) - ((alphaS + muH + gamma) * IS)
  dRHdt <- alphaA * IA + alphaS * IS - muH*RH
  dSTdt <- (piT * NT) - (betaTT * IT + betaCT * IC)*(ST/NT) - (muT * ST)
  dITdt <- (betaTT * IT + betaCT * IC)*(ST/NT) - (muT * IT)
  dSCdt <- (piC * NC) - (betaTC * IT)*(SC/NC) - (muC * SC)
  dICdt <- (betaTC * IT)*(SC/NC) - ((alphaC + muC) * IC)
  dRCdt <- (alphaC * IC) - (muC * RC)
  # return results
  list(c(dSHdt, dEHdt, dIAdt, dISdt, dRHdt, dSTdt, dITdt, dSCdt, dICdt, dRCdt))
}
I run the ODE solver using:
defaultParms = c(betaHHA = .0413,
betaHHS = .0413,
betaTH = .2891,
betaCH = .0826,
betaTC = (1/365),
betaCT = 59/365,
betaTT = ((1/(365 * 2)) * .04) * 280,
gamma = 1/10,
muH = (1/(365 * 73)),
muT = (1/(365 * 2)),
muC = (1/(11 * 365)),
piH = 1.25/(73 * 365),
piT = 4.5/730,
piC = 1/(11 * 365),
deltaHS = 1/3,
deltaHA = 1/2,
alphaA = 1/17,
alphaS = 1/17,
alphaC = 1/7)
# time to start solution
t = seq(from = 0, to = 365, by = 0.1)
#initialize initial conditions
initialConditions = c(SH = 10000, EH = 5, IA = 5, IS = 10, RH = 2, ST = 80000, IT = 50, SC = 30000, IC = 5, RC = 1)
dataSet = ode(y = initialConditions, times = t, func = CCHFModel, parms = defaultParms)%>%
as.data.frame()
After running this, all the output following the initial conditions is NA.
This is due to a typo - you misnumbered the translation of input values in the first section of your code (i.e., you skipped x[8]). I will go through two (hopefully) useful exercises, first explaining how I debugged this and then showing how to rewrite your function to make it less error-prone ...
debugging
Try running the gradient function for t=0, x=<initial conditions>:
CCHFModel(0,initialConditions, defaultParms)
## piH betaHHA deltaHA deltaHS alphaA piT
## -15.02882327 12.62349834 0.53902803 0.07805607 0.88227788 385.31052332
## betaTT piC betaTC alphaC
## 0.85526763 NA NA NA
Hmm, we already see we have a problem. Why are the last three elements of the computed gradients NA?
Add browser() near the end of the function (before the dSCdt <- ... line) so we can take a closer look. Redefine the function, and try computing the gradient again.
When we get there and print out some of the quantities involved in the computation we see that both NC and RC are NA ... we can also see that an NA value of RC will cause NC to be NA, so let's check the definition of RC ...
Aha! RC is defined as x[11], but length(initialConditions) is only 10 ... and a closer look shows that we skipped x[8]. Redefining properly gives non-NA values throughout (I don't know if they're correct, but at least they're not NA).
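The corrected unpacking for the last three states (a sketch; the rest stays the same):
SC <- x[8]
IC <- x[9]
RC <- x[10]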
error-proofing (1)
Although using [] or [[]] to extract elements of a vector usually gives equivalent answers, you should always use [[]] when you want to extract a single element (a scalar) from a vector. Here's why:
initialConditions[11] ## NA
initialConditions[[11]] ## Error in x[[11]] : subscript out of bounds
If you use [], the NA propagates through your code and you have to hunt down the original source. If you use [[]], R fails right away and tells you where the problem is. An additional benefit is that [] propagates the names of the vector elements in a way that doesn't usually make sense (take a look at the names of the output in the debugging section above ...)
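A tiny illustration of that name propagation (a sketch):
x <- c(a = 1, b = 2)
x[1]   # prints the name 'a' along with the value 1
x[[1]] # prints just 1; the name is dropped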
error-proofing (2)
You can avoid all of the tedious and error-prone unpacking of the parameter and state vectors by replacing the unpacking code (everything before the computation of total populations) with
comb <- c(as.list(x), as.list(params))
attach(comb)
on.exit(detach(comb))
Provided that your parameter and state vectors are properly named (and there are no names that overlap between them), this will create a named list and allow the elements to be looked up by name within your function; on.exit(detach(comb)) makes sure that everything gets cleaned up properly at the end. (You will see recommendations to use with() to do this; I prefer the strategy here because it makes debugging within the function [if necessary] easier. But as @tpetzoldt notes in comments, you should always pair attach(...) with on.exit(detach(...)); otherwise things get very confusing and messy ...)
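For comparison, here is what the with() version looks like on a toy one-state model (a sketch, not your CCHF system):
library(deSolve)
toyModel <- function(t, x, params) {
  with(c(as.list(x), as.list(params)), {
    dSdt <- -beta * S # 'S' and 'beta' are found by name in the combined list
    list(c(S = dSdt))
  })
}
ode(y = c(S = 1), times = 0:5, func = toyModel, parms = c(beta = 0.1))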
At the end of the function I would use
g <- c(dSHdt, dEHdt, dIAdt, dISdt, dRHdt, dSTdt, dITdt, dSCdt, dICdt, dRCdt)
names(g) <- names(x)
list(g)
to make sure the gradient vector is properly labeled, which makes troubleshooting easier.

Parameter estimation of multiple datasets in julia DifferentialEquations

I have been looking and I could not find a direct way of using the DifferentialEquations parameter estimation in julia to fit multiple datasets. So, let's say we have this simple differential equation with two parameters:
f1 = function (du,u,p,t)
du[1] = - p[1]*p[2] * u[1]
end
We have experimental datasets of u[1] vs t. Each dataset has a different value of p[2] and/or different initial conditions. p[1] is the parameter we want to estimate.
I can do this by solving the differential equation in a for loop that iterates over the different initial conditions and p[2] values, storing the solutions in an array, and creating a loss function against the experimental data. I wonder if there is a way of doing this in fewer lines of code, using, for instance, DiffEqBase.problem_new_parameters to set the conditions of each dataset. This is a very common situation when fitting models to experimental data, but I could not find a good example in the documentation.
Thank you in advance,
Best regards
EDIT 1
The case expressed above is just a simplified example. To make it a practical case we could create some fake experimental data from the following code:
using DifferentialEquations
# ODE function
f1 = function (du,u,p,t)
du[1] = - p[1]*p[2] * u[1]
end
# Initial conditions and parameter values.
# p1 is the parameter to be estimated.
# p2 and u0 are experimental parameters known for each dataset.
u0 = [1.,2.]
p1 = .3
p2 = [.1,.2]
tspan = (0.,10.)
t=linspace(0,10,5)
# Creating data for 1st experimental dataset.
## Experimental conditions: u0 = 1. ; p[2] = .1
prob1 = ODEProblem(f1,[u0[1]],tspan,[p1,p2[1]])
sol1=solve(prob1,Tsit5(),saveat=t)
# Creating data for 2nd experimental dataset.
## Experimental conditions: u0 = 1. ; p[2] = .2
prob2 = ODEProblem(f1,[u0[1]],tspan,[p1,p2[2]])
sol2=solve(prob2,Tsit5(),saveat=t)
# Creating data for 3rd experimental dataset.
## Experimental conditions: u0 = 2. ; p[2] = .1
prob3 = ODEProblem(f1,[u0[2]],tspan,[p1,p2[1]])
sol3=solve(prob3,Tsit5(),saveat=t)
sol1, sol2 and sol3 are now our experimental data, each dataset using a different combination of initial conditions and p[2] (which represents some experimental variable, e.g., temperature, flow, ...).
The objective is to estimate the value of p[1] using the experimental data sol1, sol2 and sol3 letting DiffEqBase.problem_new_parameters or another alternative iterate over the experimental conditions.
What you can do is create a MonteCarloProblem that solves all three of the problems at once:
function prob_func(prob,i,repeat)
  u0 = i < 3 ? [1.0] : [2.0]
  p = i == 2 ? (prob.p[1],0.2) : (prob.p[1],0.1)
  ODEProblem{true}(f1,u0,(0.0,10.0),p)
end
prob = MonteCarloProblem(prob1,prob_func = prob_func)
@time sol = solve(prob,Tsit5(),saveat=t,parallel_type = :none,
                  num_monte = 3)
Then create a loss function that compares each of the solutions against the 3 datasets and adds their loss together.
loss1 = L2Loss(t,data1)
loss2 = L2Loss(t,data2)
loss3 = L2Loss(t,data3)
loss(sol) = loss1(sol[1]) + loss2(sol[2]) + loss3(sol[3])
Finally, you need to tell it how to relate the optimization parameter(s) to the problem it's solving. Here, our MonteCarloProblem holds a prob that it's pulling p[1] from whenever it's generating a problem. The value that we want to optimize is that p[1], so:
function my_problem_new_parameters(prob,p)
  prob.prob.p[1] = p[1]
  prob
end
Now our objective is exactly those pieces put together:
obj = build_loss_objective(prob,Tsit5(),loss,
prob_generator = my_problem_new_parameters,
num_monte = 3,
saveat = t)
Now let's throw that to Optim.jl's Brent method:
using Optim
res = optimize(obj,0.0,1.0)
Results of Optimization Algorithm
* Algorithm: Brent's Method
* Search Interval: [0.000000, 1.000000]
* Minimizer: 3.000000e-01
* Minimum: 2.004680e-20
* Iterations: 10
* Convergence: max(|x - x_upper|, |x - x_lower|) <= 2*(1.5e-08*|x|+2.2e-16): true
* Objective Function Calls: 11
It found that the overall best value is 0.3, which is the parameter we used to generate the data.
Here's the code in full:
using DifferentialEquations
# ODE function
f1 = function (du,u,p,t)
du[1] = - p[1]*p[2] * u[1]
end
# Initial conditions and parameter values.
# p1 is the parameter to be estimated.
# p2 and u0 are experimental parameters known for each dataset.
u0 = [1.,2.]
p1 = .3
p2 = [.1,.2]
tspan = (0.,10.)
t=linspace(0,10,5)
# Creating data for 1st experimental dataset.
## Experimental conditions: u0 = 1. ; p[2] = .1
prob1 = ODEProblem(f1,[u0[1]],tspan,[p1,p2[1]])
sol1=solve(prob1,Tsit5(),saveat=t)
data1 = Array(sol1)
# Creating data for 2nd experimental dataset.
## Experimental conditions: u0 = 1. ; p[2] = .2
prob2 = ODEProblem(f1,[u0[1]],tspan,[p1,p2[2]])
sol2=solve(prob2,Tsit5(),saveat=t)
data2 = Array(sol2)
# Creating data for 3rd experimental dataset.
## Experimental conditions: u0 = 2. ; p[2] = .1
prob3 = ODEProblem(f1,[u0[2]],tspan,[p1,p2[1]])
sol3=solve(prob3,Tsit5(),saveat=t)
data3 = Array(sol3)
function prob_func(prob,i,repeat)
  u0 = i < 3 ? [1.0] : [2.0]
  p = i == 2 ? (prob.p[1],0.2) : (prob.p[1],0.1)
  ODEProblem{true}(f1,u0,(0.0,10.0),p)
end
prob = MonteCarloProblem(prob1,prob_func = prob_func)
# Just to show what this looks like
sol = solve(prob,Tsit5(),saveat=t,parallel_type = :none,
num_monte = 3)
loss1 = L2Loss(t,data1)
loss2 = L2Loss(t,data2)
loss3 = L2Loss(t,data3)
loss(sol) = loss1(sol[1]) + loss2(sol[2]) + loss3(sol[3])
function my_problem_new_parameters(prob,p)
  prob.prob.p[1] = p[1]
  prob
end
obj = build_loss_objective(prob,Tsit5(),loss,
prob_generator = my_problem_new_parameters,
num_monte = 3,
saveat = t)
using Optim
res = optimize(obj,0.0,1.0)
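Note that this code uses an older API (the Julia 0.6-era linspace and the MonteCarloProblem interface). A hedged sketch of the same ensemble setup under the renamed interface in current DifferentialEquations.jl, where MonteCarloProblem became EnsembleProblem, num_monte became trajectories, and parallel_type = :none became EnsembleSerial():
# assumed modern equivalents of the calls above
t = range(0, 10, length = 5)                       # replaces linspace(0,10,5)
prob = EnsembleProblem(prob1, prob_func = prob_func)
sol = solve(prob, Tsit5(), EnsembleSerial(), trajectories = 3, saveat = t)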

Hand over the current loop number to a function

Since it's a short question I'll leave out regular background information (if you need, I'll add it).
Finally, there is a data frame called Coefficients:
Serial_number Fixed_effects_beta_0 Fixed_effects_beta_1 Fixed_effects_beta_2 Fixed_effects_beta_3 Random_effects_beta_0 Random_effects_beta_1 Random_effects_beta_2 Random_effects_beta_3 p0_fixed p1_fixed p2_fixed p3_fixed p0_random p1_random p2_random p3_random Fitted_Voltage
1 912009913 1.238401 13.19572 -0.08379988 1.366747 -0.039642999 -0.40767221 -0.25476169 -0.11315457 -11.92334 0.1177605 -0.0003777831 4.328852e-07 0.56414753 -0.006946270 2.736287e-05 -3.583906e-08 352.9476
(...)
and for each row I want to apply the function
inverse = function (f, lower = lower_limit, upper = 450) {
  function (y) uniroot((function (x) f(x) - y), lower = lower_limit, upper = upper)[1]
  # function (y) polyroot((function (x) f(x) - y), lower = lower_limit, upper = upper)[1]
}
function_to_observe = inverse((function(x=150)
  exp(
    exp(
      sum(
        Coefficients[running_row,"p0_fixed"] * x^0,
        Coefficients[running_row,"p1_fixed"] * x^1,
        Coefficients[running_row,"p2_fixed"] * x^2,
        Coefficients[running_row,"p3_fixed"] * x^3
      ))
    )
  ), 50, 450)
by making use of values stored in each row and in certain columns of the data frame as follows:
for(i in 1:nrow(Coefficients)){
Coefficients[i,"Fitted_Voltage"]<- function_to_observe(150)
}
Unfortunately this does not work since Coefficients[i,"Fitted_Voltage"]<- function_to_observe(150) does not take care of the different rows of Coefficients.
What's a remedy? And why can I not do the following?
for(i in 1:nrow(Coefficients)){
  Coefficients[i,"Fitted_Voltage"] <- inverse((function(x=150)
    exp(
      exp(
        sum(
          Coefficients[i,"p0_fixed"] * x^0,
          Coefficients[i,"p1_fixed"] * x^1,
          Coefficients[i,"p2_fixed"] * x^2,
          Coefficients[i,"p3_fixed"] * x^3
        ))
      )
    ), 50, 450)
}
This yields:
Error in x[[jj]][iseq] <- vjj :
incompatible types (from closure to double) in subassignment type fix
Thanks a lot in advance for any help!
Update:
With the help of mathdotrandom I tried a bit and got the following:
lower_limit <- 0
function_to_observe <- inverse((function(x=150)
  exp(
    exp(
      sum(
        Coefficients[i,"p0_fixed"] * x^0,
        Coefficients[i,"p1_fixed"] * x^1,
        Coefficients[i,"p2_fixed"] * x^2,
        Coefficients[i,"p3_fixed"] * x^3
      ))))
, 50, 550
)
inverse = function (f, lower = lower_limit, upper = 450) {
  function (y) uniroot((function (x) f(x) - y), lower = lower_limit, upper = upper)[1]
}
for(i in 1:nrow(Coefficients)){
  Coefficients[i, "Fitted_Voltage"] <- function_to_observe(150)
}
Coefficients["Fitted_Voltage"]
which yields reasonable values:
Fitted_Voltage
1 352.9476
2 352.9476
3 352.9476
4 352.9476
5 352.9476
6 352.9476
7 352.9476
8 352.9476
9 352.9476
10 352.9476
11 352.9476
12 352.9476
13 352.9476
14 352.9476
15 352.9476
Though I do not understand the syntax, I guess this is correct since it does what it should.
function(x=150) does not run the function but sets x as a default parameter of 150. So you are trying to put a function definition into your data.frame; that's why it complains about the type closure (function). The easiest fix is to give the function a name, define it outside of the for loop, and then call it.
If you really want to use it as a lambda function, check out this question and lebatsnok's answer: lambda-like functions in R?
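A tiny illustration of the closure-versus-number distinction (a sketch):
f <- function(x = 150) x^2
class(f) # "function" -- assigning f itself into a data.frame cell triggers the closure error
f()      # 22500      -- calling it (here with the default x = 150) returns a number you can store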
The inverse function should not return a function but a number. The uniroot function expects a function, so f should be a function, as you did. R will actually look up the value of i or running_row from the enclosing environment if you don't pass it as a parameter.
Coefficients <- data.frame("Fitted_Voltage"=c(0,0), "p0_fixed"=c(10^-1, 10^-2),
                           "p1_fixed"=c(10^-2, 10^-3), "p2_fixed"=c(10^-3, 10^-4),
                           "p3_fixed"=c(10^-4, 10^-5))
f <- function(x=150) exp(exp(sum(Coefficients[running_row,"p0_fixed"] * x^0,
                                 Coefficients[running_row,"p1_fixed"] * x^1,
                                 Coefficients[running_row,"p2_fixed"] * x^2,
                                 Coefficients[running_row,"p3_fixed"] * x^3)))
inverse = function (f, lower_limit, upper = 450) {
  y = (f(lower_limit) + f(upper))/2
  uniroot(function(x)(f(x)-y), lower = lower_limit, upper = upper)[1]
}
for(running_row in 1:nrow(Coefficients)){
  Coefficients[running_row, "Fitted_Voltage"] <- inverse(f,-1,1) # loop variable is running_row, not i
}
But your function is always positive because you used exp, and exp(x) > 0 for all x, so uniroot can not find a zero of that function. Also, polyroot can only find zeros of polynomials, but you are using an exponential function. Are you sure that your function should look like e^(e^(c_0 + c_1*x + c_2*x^2 + c_3*x^3))?
I subtract a value in inverse to make it have a root, but I don't know if this makes any sense in your context. Also, because of the double exponential, the function gets big really fast, so even for small Coefficients it returns infinity for lower limit 50 and upper 450; I needed to use -1 and 1 as limits to get some results. But this should be somehow similar to how you want it.
Following mathdotrandom's suggestion, you can define the function outside. Try this:
inner.f <- function(x=150, i){
  exp(
    exp(
      sum(
        Coefficients[i,"p0_fixed"] * x^0,
        Coefficients[i,"p1_fixed"] * x^1,
        Coefficients[i,"p2_fixed"] * x^2,
        Coefficients[i,"p3_fixed"] * x^3
      ))
  )
}
then, since inverse expects a function as its first argument (not the number inner.f(150, i)), wrap inner.f in a one-argument closure and call the returned function at y = 150:
Coefficients[i,"Fitted_Voltage"] <- inverse(function(x) inner.f(x, i), 50, 450)(150)

Solving a system of differential equations in R

I have a simple flux model in R. It boils down to two differential equations that model two state variables within the model, we'll call them A and B. They are calculated as simple difference equations of four component fluxes flux1-flux4, 5 parameters p1-p5, and a 6th parameter, of_interest, that can take on values between 0-1.
parameters<- c(p1=0.028, p2=0.3, p3=0.5, p4=0.0002, p5=0.001, of_interest=0.1)
state <- c(A=28, B=1.4)
model <- function(t,state,parameters){
  with(as.list(c(state,parameters)),{
    # fluxes
    flux1 = (1-of_interest) * p1*(B / (p2 + B))*p3
    flux2 = p4 * A # microbial death
    flux3 = of_interest * p1*(B / (p2 + B))*p3
    flux4 = p5 * B
    # differential equations of component fluxes
    dAdt <- flux1 - flux2
    dBdt <- flux3 - flux4
    list(c(dAdt, dBdt))
  })
}
I would like to write a function to take the derivative of dAdt with respect to of_interest, set the derived equation to 0, then rearrange and solve for the value of of_interest. This will be the value of the parameter of_interest that maximizes the function dAdt.
So far I have been able to solve the model at steady state, across the possible values of of_interest to demonstrate there should be a maximum.
require(rootSolve)
range <- seq(0, 1, by=0.01)
out <- c()
for(i in range){
  of_interest = i
  parameters <- c(p1=0.028, p2=0.3, p3=0.5, p4=0.0002, p5=0.001, of_interest=of_interest)
  state <- c(A=28, B=1.4)
  ST <- stode(y=state, func=model, parms=parameters, pos=TRUE)
  out <- c(out, ST$y[1])
}
Then plotting:
plot(out~range, pch=16,col='purple')
lines(smooth.spline(out~range,spar=0.35), lwd=3,lty=1)
How can I analytically solve for the value of of_interest that maximizes dAdt in R? If an analytical solution is not possible, how can I know, and how can I go about solving this numerically?
Update: I think this problem can be solved with the deSolve package in R, linked here; however, I am having trouble implementing it using my particular example.
Your equation in B(t) is just-about separable since you can divide out B(t), from which you can get that
B(t) = C * exp{-p5 * t} * (p2 + B(t)) ^ {of_interest * p1 * p3}
This is an implicit solution for B(t) which we'll solve point-wise.
You can solve for C given your initial value of B. I suppose t = 0 initially? In which case
C = B_0 / (p2 + B_0) ^ {of_interest * p1 * p3}
This also gives a somewhat nicer-looking expression for A(t):
dA(t) / dt = B_0 / (p2 + B_0) * p1 * p3 * (1 - of_interest) *
exp{-p5 * t} * ((p2 + B(t)) / (p2 + B_0)) ^
{of_interest * p1 * p3 - 1} - p4 * A(t)
This can be solved by integrating factor (= exp{p4 * t}), via numerical integration of the term involving B(t). We specify the lower limit of the integral as 0 so that we never have to evaluate B outside the range [0, t], which means the integrating constant is simply A_0 and thus:
A(t) = (A_0 + integral_0^t { f(tau; parameters) d tau}) * exp{-p4 * t}
The basic gist is B(t) is driving everything in this system -- the approach will be: solve for the behavior of B(t), then use this to figure out what's going on with A(t), then maximize.
First, the "outer" parameters; we also need nleqslv to get B:
library(nleqslv)
t_min <- 0
t_max <- 10000
t_N <- 10
#we'll only solve the behavior of A & B over t_rng
t_rng <- seq(t_min, t_max, length.out = t_N)
#I'm calling of_interest ttheta
ttheta_min <- 0
ttheta_max <- 1
ttheta_N <- 5
tthetas <- seq(ttheta_min, ttheta_max, length.out = ttheta_N)
B_0 <- 1.4
A_0 <- 28
#No sense storing this as a vector when we'll only ever use it as a list
parameters <- list(p1 = 0.028, p2 = 0.3, p3 = 0.5,
p4 = 0.0002, p5 = 0.001)
From here, the basic outline is:
Given the parameter values (in particular ttheta), solve for BB over t_rng via non-linear equation solving
Given BB and the parameter values, solve for AA over t_rng by numerical integration
Given AA and your expression for dAdt, plug & maximize.
derivs <-
  sapply(tthetas, function(th){
    # append current ttheta
    params <- c(parameters, ttheta = th)
    # declare a function we'll use to solve for B (see above)
    b_slv <- function(b, t)
      with(params, b - B_0 * ((p2 + b)/(p2 + B_0)) ^
                     (ttheta * p1 * p3) * exp(-p5 * t))
    # solving point-wise (this is pretty fast)
    # **See below for a note**
    BB <- sapply(t_rng, function(t) nleqslv(B_0, function(b) b_slv(b, t))$x)
    # this is f(tau; params) that I mentioned above;
    # we have to do linear interpolation since the
    # numerical integrator isn't constrained to the grid.
    # **See below for note**
    a_int <- function(t){
      # approximate t to the grid (t_rng)
      # (assumes B is monotonic, which seems to be true)
      # (also, if t ends up negative, just assign t_rng[1])
      t_n <- max(1L, which.max(t_rng - t >= 0) - 1L)
      idx <- t_n:(t_n+1)
      ts <- t_rng[idx]
      # distance-weighted average of the local B values
      B_app <- sum((-1) ^ (0:1) * (t - ts) / diff(ts) * BB[idx])
      # finally, f(tau; params)
      with(params, (1 - ttheta) * p1 * p3 * B_0 / (p2 + B_0) *
             ((p2 + B_app)/(p2 + B_0)) ^ (ttheta * p1 * p3 - 1) *
             exp((p4 - p5) * t))
    }
    # a_int only works on scalars; the numeric integrator
    # requires a version that works on vectors
    a_int_v <- function(t) sapply(t, a_int)
    AA <- exp(-params$p4 * t_rng) *
      sapply(t_rng, function(tt)
        # I found the subdivisions constraint binding in some cases
        # at the default value; no trouble at 1000.
        A_0 + integrate(a_int_v, 0, tt, subdivisions = 1000L)$value)
    # using the explicit version of dAdt given as flux1 - flux2
    max(with(params, (1 - ttheta) * p1 * p3 * BB / (p2 + BB) - p4 * AA))
  })
Finally, simply run `tthetas[which.max(derivs)]` to get the maximizer.
Note:
This code is not optimized for efficiency. There are a few places where there are some potential speed-ups:
Probably faster to run the equation solver recursively, as it'll converge faster with better initial guesses -- using the previous value instead of the initial value is surely better.
Will be faster to simply use Riemann sums to integrate; the tradeoff is in accuracy, but should be fine if you have a dense enough grid. One beauty of Riemann is you won't have to interpolate at all, and numerically they're simple linear algebra (see the sketch below). I ran this with t_N == ttheta_N == 1000L and it ran within a few minutes.
Probably possible to vectorize a_int directly instead of just sapplying over it, with a concomitant speed-up from a more direct appeal to BLAS.
Loads of other small stuff. Pre-compute ttheta * p1 * p3 since it's re-used so much, etc.
I didn't bother including any of that stuff, though, because you're honestly probably better off porting this to a faster language -- Julia is my own pet favorite, but of course R speaks well with C++, C, Fortran, etc.
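For concreteness, here is a hedged sketch of the Riemann/trapezoid replacement for the per-point integrate() calls (it assumes the uniform grid t_rng and the gridded integrand a_int_v from the code above):
# one vectorized cumulative-trapezoid pass over the grid
dt <- t_rng[2] - t_rng[1]
ff <- a_int_v(t_rng) # integrand evaluated on the grid
cum_int <- cumsum(c(0, (ff[-1] + ff[-t_N]) / 2)) * dt # approximates integral_0^{t_k}
AA <- exp(-params$p4 * t_rng) * (A_0 + cum_int)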
