Determine proportion of cheating students using MCMC implemented in Turing.jl - julia

I am trying to solve the following problem using Julia:
Determine the frequency of students cheating during an exam. Let N be the total number of students who took the exam, and assume each student is interviewed post-exam using the following privacy-preserving algorithm:
"In the interview process for each student, the student flips a coin, hidden from the interviewer. The student agrees to answer honestly if the coin comes up heads. Otherwise, if the coin comes up tails, the student (secretly) flips the coin again, and answers “Yes, I did cheat” if the coin flip lands heads, and “No, I did not cheat”, if the coin flip lands tails. This way, the interviewer does not know if a “Yes” was the result of a guilty plea, or a Heads on a second coin toss. Thus privacy is preserved and the researchers receive honest answers."
I have written the following code to determine the posterior distribution of cheating students:
using DataStructures
using Distributions

function simulate(N, YES_ANSWERS, SAMPLES)
    d = DefaultDict{Int64, Int64}(0)
    for i in 1:SAMPLES
        # draw a number of cheaters from a uniform prior over 0:N
        cheating = rand(0:N)
        not_cheating = N - cheating
        # cheaters: heads on the first toss -> honest "yes"
        cheating_honest_yes = rand(Binomial(cheating, 0.5))
        cheating_second_toss = cheating - cheating_honest_yes
        cheating_random_yes = rand(Binomial(cheating_second_toss, 0.5))
        cheating_yes_answers = cheating_honest_yes + cheating_random_yes
        # non-cheaters can only answer "yes" via the second toss
        not_cheating_second_toss = rand(Binomial(not_cheating, 0.5))
        not_cheating_random_yes = rand(Binomial(not_cheating_second_toss, 0.5))
        total_yes = cheating_yes_answers + not_cheating_random_yes
        # keep the draw only if it reproduces the observed count (rejection sampling)
        if total_yes == YES_ANSWERS
            d[cheating] += 1
        end
    end
    d
end
const N = 100
const YES_ANSWERS = 35
const SAMPLES = 1000000
d = simulate(N, YES_ANSWERS, SAMPLES)
using Plots
cheaters = [k/N for k in keys(d)]
probability = [v/sum(values(d)) for v in values(d)]
g = scatter(cheaters, probability)
gui(g)
Essentially I draw samples, count the matches, and calculate the resulting probabilities (rejection sampling).
I then tried to use Turing.jl but got stuck:
using DataStructures
using Distributions
using Turing

@model function CheatingDistribution(N)
    cheating ~ DiscreteUniform(0, N)
    not_cheating = N - cheating
    cheating_honest_yes ~ Binomial(cheating, 0.5)
    cheating_second_toss = cheating - cheating_honest_yes
    cheating_random_yes ~ Binomial(cheating_second_toss, 0.5)
    cheating_yes_answers = cheating_honest_yes + cheating_random_yes
    not_cheating_second_toss ~ Binomial(not_cheating, 0.5)
    not_cheating_random_yes ~ Binomial(not_cheating_second_toss, 0.5)
    total_yes = cheating_yes_answers + not_cheating_random_yes
    total_yes
end
I cannot infer cheating from total_yes.
That said, I could use this model to reproduce the results that I got with the previous code:
using DataStructures
using Distributions
using Turing

@model function CheatingDistribution(N)
    cheating ~ DiscreteUniform(0, N)
    not_cheating = N - cheating
    cheating_honest_yes ~ Binomial(cheating, 0.5)
    cheating_second_toss = cheating - cheating_honest_yes
    cheating_random_yes ~ Binomial(cheating_second_toss, 0.5)
    cheating_yes_answers = cheating_honest_yes + cheating_random_yes
    not_cheating_second_toss ~ Binomial(not_cheating, 0.5)
    not_cheating_random_yes ~ Binomial(not_cheating_second_toss, 0.5)
    total_yes = cheating_yes_answers + not_cheating_random_yes
    total_yes
end
function simulate(N, actual_yes_answers, n_samples)
    d = DefaultDict{Int64, Int64}(0)
    cheatDist = CheatingDistribution(N)
    for _ in 1:n_samples
        result = rand(cheatDist)
        total_yes = result.cheating_honest_yes + result.cheating_random_yes + result.not_cheating_random_yes
        if total_yes == actual_yes_answers
            d[result.cheating] += 1
        end
    end
    cheaters = [k/N for k in keys(d)]
    probability = [v/sum(values(d)) for v in values(d)]
    return cheaters, probability
end
const N = 100 # Number of students
const YES_ANSWERS = 35 # Number of students that answered yes (both cheating and honest students)
const SAMPLES = 1000000
cheaters, probability = simulate(N, YES_ANSWERS, SAMPLES)
using Plots
g = scatter(cheaters, probability)
gui(g)
I then wrote a new version that sums exact probabilities instead of counting random draws. It gives the same result but is more accurate, especially when the number of "yes" answers in the interviews is low (for example, when only 3 students answered "yes, I cheated"). The following is the code using probabilities:
using DataStructures
using Distributions

function simulate(N, YES_ANSWERS)
    d = DefaultOrderedDict{Int64, Float64}(0)
    for cheating in 0:N
        p1 = 1.0 / (N + 1)   # uniform prior over 0:N (the constant cancels after normalization anyway)
        not_cheating = N - cheating
        for cheating_honest_yes in 0:cheating
            p2 = p1 * pdf(Binomial(cheating, 0.5), cheating_honest_yes)
            cheating_second_toss = cheating - cheating_honest_yes
            for cheating_random_yes in 0:cheating_second_toss
                p3 = p2 * pdf(Binomial(cheating_second_toss, 0.5), cheating_random_yes)
                cheating_yes_answers = cheating_honest_yes + cheating_random_yes
                for not_cheating_second_toss in 0:not_cheating
                    p4 = p3 * pdf(Binomial(not_cheating, 0.5), not_cheating_second_toss)
                    for not_cheating_random_yes in 0:not_cheating_second_toss
                        p5 = p4 * pdf(Binomial(not_cheating_second_toss, 0.5), not_cheating_random_yes)
                        total_yes = cheating_yes_answers + not_cheating_random_yes
                        if total_yes == YES_ANSWERS
                            d[cheating] += p5
                        end
                    end
                end
            end
        end
    end
    d
end
const N = 100
const YES_ANSWERS = 35
d = simulate(N, YES_ANSWERS)
using Plots
cheaters = [k/N for k in keys(d)]
probability = [v/sum(values(d)) for v in values(d)]
g = bar(cheaters, probability)
gui(g)
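As an aside, the five nested loops can be collapsed analytically: a cheater answers “yes” with probability 3/4 (1/2 for an honest “yes” plus 1/2 · 1/2 from the second toss), a non-cheater with probability 1/4, so conditional on cheating = c the observed count is the sum of a Binomial(c, 3/4) draw and a Binomial(N − c, 1/4) draw. A sketch of that shortcut (my own rewrite; it gives the same posterior after normalization):
using Distributions

function exact_posterior(N, YES_ANSWERS)
    posterior = zeros(N + 1)
    for c in 0:N
        p = 0.0
        # k = number of "yes" answers coming from the c cheaters
        for k in max(0, YES_ANSWERS - (N - c)):min(c, YES_ANSWERS)
            p += pdf(Binomial(c, 0.75), k) * pdf(Binomial(N - c, 0.25), YES_ANSWERS - k)
        end
        posterior[c + 1] = p   # the uniform prior cancels in the normalization
    end
    return posterior ./ sum(posterior)
end
Plotting exact_posterior(100, 35) against (0:100) ./ 100 reproduces the bar chart above.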
I got this toy problem from the book "Bayesian Methods for Hackers". This is what its solution looks like:
[figure from the book: posterior distribution of the cheating frequency]
Note that the bin size in the solution is 10 while I am showing the PMF, so the densities I show are one tenth of those in the solution above.
My question: How do I solve this problem using MCMCChains.jl?
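A sketch of one way to do it, marginalizing the coin flips out by hand using the “yes” probability p/2 + 1/4 noted above (as far as I can tell, this is essentially the book's formulation):
using Turing, Distributions

# A minimal sketch, assuming the marginalized formulation above;
# the latent variable is now the continuous cheating frequency p.
@model function CheatingMarginalized(N, yes_answers)
    p ~ Uniform(0, 1)                  # prior on the frequency of cheaters
    p_yes = p / 2 + 1 / 4              # P("yes") under the coin protocol
    yes_answers ~ Binomial(N, p_yes)   # observed number of "yes" answers
end

chain = sample(CheatingMarginalized(100, 35), NUTS(), 10_000)
Because yes_answers is passed in as data, Turing treats that ~ statement as an observation; chain is an MCMCChains.Chains object holding posterior samples of p.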

Related

How can I plot frequency response for a vibratory system with nonlinear differential equations?

I have a system of nonlinear differential equations for a 3 degree of freedom vibratory system.
[image: the system of differential equations]
First I want to plot y, y_L and y_R against time (for a given value for Omega) and then I want to plot the domains (max values of y, y_L and y_R) against various amounts of Omega.
Unfortunately, I am not good at Octave. I have written the following code in Octave (based on a sample given by one of the users), but it ends with this error: "anonymous function bodies must be single expressions".
I would be grateful if anyone can help me.
Here is the code:
Me = 4000;
me = 20;
c = 2000;
c1 = 700;
c2 = 700;
k = 20000;
k1 = 250000;
k2 = 20000;
a0 = 0.01;
om = 25;
mu1 = (c+2*c2)/(Me);
mu2 = (c2)/(Me);
mu3 = (c1+c2)/(me);
mu4 = (c2)/(me);
w12 = (2*k2)/(Me);
w22 = (k1+k2)/(me);
a1 = (k2)/(me);
a2 = (k)/(Me);
F0 = (k1*a0)/(Me);
couplode = @(t,y) [y(2); mu4*y(4) - mu3*y(2) - w22*y(1) + a1*y(3) + F0*cos(om*t); y(4); mu2*(y(2)+y(6)) - mu1*y(4) - w12*y(3) + 0.5*w12*(y(1)+y(5)) + a2((y(3)).^3; y(6); mu4*y(4) - mu3*y(6) - w22*y(5) + a1*y(3) + F0*cos(om*t)];
[t,y] = ode45(couplode, [0 0.49*pi], [1;1;1;1;1;1]*1E-8);
figure(1)
plot(t, y)
grid
str = {'$$ \dot{y_L} $$', '$$ y_L $$', '$$ \dot{y} $$', '$$ y $$', '$$ \dot{y_R} $$', '$$ y_R $$'};
legend(str, 'Interpreter','latex', 'Location','NW')
You have a strange term near the end of the vector definition
... + a2((y(3)).^3
You certainly meant
... + a2*y(3).^3
You get better visibility and easier debugging by breaking that into separate lines
couplode = @(t,y) [ y(2);
                    mu4*y(4)-mu3*y(2)-w22*y(1)+a1*y(3)+F0*cos(om*t);
                    y(4);
                    mu2*(y(2)+y(6)) - mu1*y(4) - w12*y(3) + 0.5*w12*(y(1)+y(5)) + a2*y(3).^3;
                    y(6);
                    mu4*y(4)-mu3*y(6)-w22*y(5)+a1*y(3)+F0*cos(om*t)];
At least in this form, spaces or no spaces make no difference. In general in Matlab/Octave [a +b -c] is the same as [a, +b, -c], so one has to be careful that the expression is not interpreted as a matrix row. Spaces on both sides of the operator switch back to the single-expression interpretation.
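Incidentally, Julia inherits the same whitespace pitfall inside brackets, so the advice carries over (a tiny illustration, not from the original question):
a, b = 1, 2
[a -b]    # space only before the minus: a 1×2 row matrix, [1 -2]
[a - b]   # spaces on both sides of the operator: one expression, the 1-element vector [-1]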

Parameter estimation of multiple datasets in julia DifferentialEquations

I have been looking and I could not find a direct way of using the DifferentialEquations parameter estimation in julia to fit multiple datasets. So, let's say we have this simple differential equation with two parameters:
f1 = function (du,u,p,t)
    du[1] = - p[1]*p[2] * u[1]
end
We have experimental datasets of u[1] vs t. Each dataset has a different value of p[2] and/or different initial conditions. p[1] is the parameter we want to estimate.
I can do this by solving the differential equation in a for loop that iterates over the different initial conditions and p[2] values, storing the solutions in an array, and creating a loss function against the experimental data. I wonder if there is a way of doing this in fewer lines of code, using, for instance, DiffEqBase.problem_new_parameters to set the conditions of each dataset. This is a very common situation when fitting models to experimental data, but I could not find a good example in the documentation.
Thank you in advance,
Best regards
EDIT 1
The case expressed above is just a simplified example. To make it a practical case we could create some fake experimental data from the following code:
using DifferentialEquations
# ODE function
f1 = function (du,u,p,t)
    du[1] = - p[1]*p[2] * u[1]
end
# Initial conditions and parameter values.
# p1 is the parameter to be estimated.
# p2 and u0 are experimental parameters known for each dataset.
u0 = [1.,2.]
p1 = .3
p2 = [.1,.2]
tspan = (0.,10.)
t=linspace(0,10,5)
# Creating data for 1st experimental dataset.
## Experimental conditions: u0 = 1. ; p[2] = .1
prob1 = ODEProblem(f1,[u0[1]],tspan,[p1,p2[1]])
sol1=solve(prob1,Tsit5(),saveat=t)
# Creating data for 2nd experimental dataset.
## Experimental conditions: u0 = 1. ; p[2] = .2
prob2 = ODEProblem(f1,[u0[1]],tspan,[p1,p2[2]])
sol2=solve(prob2,Tsit5(),saveat=t)
# Creating data for 3rd experimental dataset.
## Experimental conditions: u0 = 2. ; p[2] = .1
prob3 = ODEProblem(f1,[u0[2]],tspan,[p1,p2[1]])
sol3=solve(prob3,Tsit5(),saveat=t)
sol1, sol2 and sol3 are now our experimental data, each dataset using a different combination of initial conditions and p[2], which represents some experimental variable (e.g., temperature, flow...).
The objective is to estimate the value of p[1] using the experimental data sol1, sol2 and sol3 letting DiffEqBase.problem_new_parameters or another alternative iterate over the experimental conditions.
What you can do is create a MonteCarloProblem that solves all three of the problems at once:
function prob_func(prob,i,repeat)
    i < 3 ? u0 = [1.0] : u0 = [2.0]
    i == 2 ? p = (prob.p[1],0.2) : p = (prob.p[1],0.1)
    ODEProblem{true}(f1,u0,(0.0,10.0),p)
end
prob = MonteCarloProblem(prob1,prob_func = prob_func)
@time sol = solve(prob,Tsit5(),saveat=t,parallel_type = :none,
                  num_monte = 3)
Then create a loss function that compares each of the solutions against the 3 datasets and adds their loss together.
loss1 = L2Loss(t,data1)
loss2 = L2Loss(t,data2)
loss3 = L2Loss(t,data3)
loss(sol) = loss1(sol[1]) + loss2(sol[2]) + loss3(sol[3])
Finally, you need to tell it how to relate the optimization parameter(s) to the problem it's solving. Here, our MonteCarloProblem holds a prob that it's pulling p[1] from whenever it's generating a problem. The value that we want to optimize is that p[1], so:
function my_problem_new_parameters(prob,p)
    prob.prob.p[1] = p[1]
    prob
end
Now our objective is exactly those pieces together:
obj = build_loss_objective(prob,Tsit5(),loss,
                           prob_generator = my_problem_new_parameters,
                           num_monte = 3,
                           saveat = t)
Now let's throw that to Optim.jl's Brent method:
using Optim
res = optimize(obj,0.0,1.0)
Results of Optimization Algorithm
* Algorithm: Brent's Method
* Search Interval: [0.000000, 1.000000]
* Minimizer: 3.000000e-01
* Minimum: 2.004680e-20
* Iterations: 10
* Convergence: max(|x - x_upper|, |x - x_lower|) <= 2*(1.5e-08*|x|+2.2e-16): true
* Objective Function Calls: 11
It found that the overall best value is 0.3 which is the parameter we used to generate the data.
Here's the code in full:
using DifferentialEquations
# ODE function
f1 = function (du,u,p,t)
    du[1] = - p[1]*p[2] * u[1]
end
# Initial conditions and parameter values.
# p1 is the parameter to be estimated.
# p2 and u0 are experimental parameters known for each dataset.
u0 = [1.,2.]
p1 = .3
p2 = [.1,.2]
tspan = (0.,10.)
t=linspace(0,10,5)
# Creating data for 1st experimental dataset.
## Experimental conditions: u0 = 1. ; p[2] = .1
prob1 = ODEProblem(f1,[u0[1]],tspan,[p1,p2[1]])
sol1=solve(prob1,Tsit5(),saveat=t)
data1 = Array(sol1)
# Creating data for 2nd experimental dataset.
## Experimental conditions: u0 = 1. ; p[2] = .2
prob2 = ODEProblem(f1,[u0[1]],tspan,[p1,p2[2]])
sol2=solve(prob2,Tsit5(),saveat=t)
data2 = Array(sol2)
# Creating data for 3rd experimental dataset.
## Experimental conditions: u0 = 2. ; p[2] = .1
prob3 = ODEProblem(f1,[u0[2]],tspan,[p1,p2[1]])
sol3=solve(prob3,Tsit5(),saveat=t)
data3 = Array(sol3)
function prob_func(prob,i,repeat)
    i < 3 ? u0 = [1.0] : u0 = [2.0]
    i == 2 ? p = (prob.p[1],0.2) : p = (prob.p[1],0.1)
    ODEProblem{true}(f1,u0,(0.0,10.0),p)
end
prob = MonteCarloProblem(prob1,prob_func = prob_func)
# Just to show what this looks like
sol = solve(prob,Tsit5(),saveat=t,parallel_type = :none,
            num_monte = 3)
loss1 = L2Loss(t,data1)
loss2 = L2Loss(t,data2)
loss3 = L2Loss(t,data3)
loss(sol) = loss1(sol[1]) + loss2(sol[2]) + loss3(sol[3])
function my_problem_new_parameters(prob,p)
    prob.prob.p[1] = p[1]
    prob
end
obj = build_loss_objective(prob,Tsit5(),loss,
                           prob_generator = my_problem_new_parameters,
                           num_monte = 3,
                           saveat = t)
using Optim
res = optimize(obj,0.0,1.0)

Gradient descent implementation is not working in Julia

I am trying to implement the gradient descent algorithm from scratch to find the slope and intercept values for my linear fit line.
Using a package to calculate the slope and intercept, I get slope = 0.04 and intercept = 7.2, but when I use my gradient descent algorithm for the same problem, both the slope and intercept come out as (-Inf, -Inf).
Here is my code
x= [1,2,3,4,5,6,7,8,9,10,11,12,13,141,5,16,17,18,19,20]
y=[2,3,4,5,6,7,8,9,10,11,12,13,141,5,16,17,18,19,20,21]
function GradientDescent()
    m=0
    c=0
    for i=1:10000
        for k=1:length(x)
            Yp = m*x[k] + c
            E = y[k]-Yp # error in predicted value
            dm = 2*E*(-x[k]) # partial derivative of cost function w.r.t. slope (m)
            dc = 2*E*(-1) # partial derivative of cost function w.r.t. intercept (c)
            m = m + (dm * 0.001)
            c = c + (dc * 0.001)
        end
    end
    return m,c
end
Values = GradientDescent() # after running values = (-inf,-inf)
I have not done the math, but instead wrote the tests. It seems you got a sign error when assigning m and c.
Also, writing the tests really helps, and Julia makes it simple :)
function GradientDescent(x, y)
    m=0.0
    c=0.0
    for i=1:10000
        for k=1:length(x)
            Yp = m*x[k] + c
            E = y[k]-Yp
            dm = 2*E*(-x[k])
            dc = 2*E*(-1)
            m = m - (dm * 0.001)
            c = c - (dc * 0.001)
        end
    end
    return m,c
end
using Base.Test
@testset "gradient descent" begin
    @testset "slope $slope" for slope in [0, 1, 2]
        @testset "intercept for $intercept" for intercept in [0, 1, 2]
            x = 1:20
            y = broadcast(x -> slope * x + intercept, x)
            computed_slope, computed_intercept = GradientDescent(x, y)
            @test slope ≈ computed_slope atol=1e-8
            @test intercept ≈ computed_intercept atol=1e-8
        end
    end
end
I can't get your exact numbers, but this is close. Perhaps it helps?
# 141 ?
datax = [1,2,3,4,5,6,7,8,9,10,11,12,13,141,5,16,17,18,19,20]
datay = [2,3,4,5,6,7,8,9,10,11,12,13,141,5,16,17,18,19,20,21]

function gradientdescent()
    m = 0
    b = 0
    learning_rate = 0.00001
    for n in 1:10000
        for i in 1:length(datay)
            x = datax[i]
            y = datay[i]
            guess = m * x + b
            error = y - guess
            dm = 2error * x
            dc = 2error
            m += dm * learning_rate
            b += dc * learning_rate
        end
    end
    return m, b
end

gradientdescent()
(-0.04, 17.35)
It seems that adjusting the learning rate is critical...
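A quick way to see that with the data above (my own sketch, not from the answer) is to rerun the same update loop at a few rates and check which ones stay finite:
# Hypothetical helper: same update rule as above, learning rate as a parameter.
function descend(x, y, lr; iters = 10_000)
    m, c = 0.0, 0.0
    for _ in 1:iters, k in eachindex(x)
        E = y[k] - (m * x[k] + c)   # residual at point k
        m += 2 * E * x[k] * lr      # gradient step for the slope
        c += 2 * E * lr             # gradient step for the intercept
    end
    return m, c
end

for lr in (1e-3, 1e-4, 1e-5)
    m, c = descend(datax, datay, lr)
    println("lr = $lr -> m = $m, c = $c")   # the larger rates blow up to Inf/NaN on this data
end
Roughly, the per-point update is stable only when 2*lr*x^2 < 2, so with x values as large as 141 the rate has to be below about 1/141^2 ≈ 5e-5.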

Calculate more time points from ode() of the "deSolve" R package without increasing runtime

I wrote a function, ODEsystem = function(t, states, parameters), which contains an ODE system, and solved it with the well-documented R package "deSolve" written by Karline Soetaert, Thomas Petzoldt and R. Woodrow Setzer. The documentation of the package is comprehensive, with many examples. It gives me confidence in their programming and memory-optimization skills.
However, when solving the ODE system with daily intervals instead of monthly intervals, the time it takes to calculate the state values for the specified moments increases tenfold. There might be some additional calculation to reach the exact required moments in time, but in both cases roughly the same internal dynamic time steps should have been made. I did not expect such a large increase in runtime.
The call to ode() in “deSolve” looks like this:
out <- as.data.frame(ode(states, t=times, func=ODEsystem, parms=parameters, method="ode45"))
I used two variants for times
times = seq(0, 100*365, by=365/12) # 100 years, one time point per month
times = seq(0, 100*365, by=1) # 100 years, one time point per day
Calling with data points per month
user system elapsed
4.59 0.00 4.58
Calling with data points per month and cmpfun() on the function containing the ODEsystem
user system elapsed
4.39 0.00 4.38
Calling with data points per day
user system elapsed
44.41 0.00 44.46
Calling with data points per day and cmpfun() on the function containing the ODEsystem
user system elapsed
43.01 0.00 43.17
The runtime measured with system.time() increases by a factor of ten when switching from monthly to daily intervals. Matters do not improve much by using cmpfun() on the function containing the ODE system.
(The output "out" is only assigned when the function call to ode() is done. Thus pre-assigning "out" yields no performance gain.)
Question 1: I am looking for the reason for this decrease in performance.
(I expect it to be in the internals of the deSolve package.)
Question 2: Given the answer to Question 1, how can I improve the runtime without resorting to dynamic link libraries?
Pre-assigning some memory for what will become "out" might help (using knowledge of the time steps in "times"), but I do not know which internal variable in ode() to affect.
#### Clear current lists from memory
rm(list=ls())
### Load libraries
# library(rootSolve);library(ggplot2);
library(base);library(deSolve);library(stringr);library(compiler);library(data.table);
#### constants
dpy=365; durX1 = 40*dpy; rH = 1/durX1; durX4 = 365/12; rX4 = 1/durX4; durX6 = 365/12; rX6 = 1/durX6
durX2 = 80; rX2 = 1/durX2; durX3 = 31; rX3 = 1/durX3; durX7 = 20*365/12; rX7 = 1/durX7; durX5 = 29; rX5 = 1/durX5; durX8 = 200; rX8 = 1/durX8
fS = 0.013; fR = 8/100; fL = .03; fP = .03; fF = .05; X1zero = 1000; UDdur = 365/12*5; rK = rX3*(1/UDdur); fD1 = .05; fD2 = .05
durbt = 4; bt = 1/durbt; LX11 = 14; rF = 1/LX11; durX11 = 5; rX11 = 1/durX11; iniX12 = 0; pH = 1; frac_Im = 0; durX9 = dpy*5; ini_X2 = 1
sp = .90; fpX1 = 5; NF = fpX1*X1zero; rT1 = fD1*rX4; rT2 = fD2*rX6
pX1 = 0*sp; pX2 = 1/80*sp; pX3 = .50*sp; pX4 = .5*sp; pX6 = .5*sp; pX7 = 1/100*sp; pX5 = pX3; pX9 = 0*sp; pX8 = 1*sp; rX9 = 1/durX9
#### vector with parameters
parameters = c(rH, rX3, rX4, rX6, rX2, rX8, rX7, rX5, rK, rT1, rT2, bt, rF, NF, rX11, pX1, pX2, pX3, pX4, pX6, pX7, pX5, pX9, pX8, rX9, X1zero)
### States contains initial conditions
states = c( X1 =X1zero-1,X2=1,X3=0, X4=0, X5=0,X6=0, X7=0, X8=0, X9=0, X10=NF,X11=0,X12=0, X13 = 0)
### function with ODE system
ODEsystem = function(t,states,parameters){
  with(as.list(c(states,parameters)),{
    ### functions
    X1part = (pX2*X2 + pX3*X3 + pX4*X4 + pX6*X6 + pX7*X7 + pX5*X5 + pX9*X9 + pX8*X8)
    prob1 = bt * X12 / X1zero
    lF = bt * X1part/X1zero
    AD = rK*(X3+X5+X4+X6)+rT1*X4+rT2*X6
    ### fluxes
    J1 = prob1*X1; J2 = fS*rX2*X2; J3 = (1-fS)*rX2*X2; J4 = (1-fP)*rX3*X3; J5 = fP*rX3*X3
    J6 = (1-fF)*rX4*X4; J7 = fF*rX4*X4; J8 = rX6*X6; J9 = fR*rX7*X7; J10 = rX5*X5
    J11 = (1-fR)*(1-fL)*rX7*X7; J12 = (1-fR)*fL*rX7*X7; J13 = rX8*X8; J14 = rH*X3; J15 = rH*X1
    J16 = rH*X2; J17 = rH*X4; J18 = rH*X6; J19 = rH*X5; J20 = rH*X8; J21 = rH*X7; J22 = rH*X9
    J23 = rK*X3; J24 = rK*X4; J25 = rK*X6; J26 = rT2*X6; J27 = rH*X1zero; J28 = rT1*X4; J29 = AD
    J30 = rK*X5; J31 = rF*X12; J32 = rF*X11; J33 = rF*X10; J34 = lF*X10; J35 = rX11*X11
    J36 = rF*NF; J37 = rX9*X9; J38 = 0; J39 = 0; J40 = 0; J41 = 0; J42 = 0; J43 = 0
    flux1 = J4/X1zero*1e4*dpy; flux2 = J12/X1zero*1e4*dpy
    # rate of change
    dX1 = - J1 - J15 + J27 + J29 + J37
    dX2 = + J1 - J2 - J3 - J16 - J40
    dX3 = + J2 - J4 - J5 - J14 - J23 - J41
    dX4 = + J4 - J6 - J7 - J17 - J24 - J28
    dX5 = + J9 - J10 - J19 - J30 - J43
    dX6 = + J7 - J8 + J10 - J18 - J25 - J26
    dX7 = + J5 + J6 + J8 - J9 - J11 - J12 - J21
    dX8 = + J12 - J13 - J20 - J42
    dX9 = + J3 + J11 + J13 - J22 - J37 + J40 + J41 + J42 + J43
    dX10 = - J33 - J34 + J36
    dX11 = - J32 + J34 - J35
    dX12 = - J31 + J35
    dX13 = + J38 - J39
    # return the rate of change
    list(c(dX1,dX2,dX3,dX4,dX5,dX6,dX7,dX8,dX9,dX10,dX11,dX12,dX13),flux1,flux2,prob1)
  })
}
## compiled version of ODE system function
cfODEsystem=cmpfun(ODEsystem)
#### time points to be calculated
times = seq(0, 100*365,by=365/12) # 100 year, time points per month
#times = seq(0, 100*365,by=1) # 100 year, time points per day
### calculations
system.time(out <- as.data.frame(ode(states, t=times, func=ODEsystem, parms=parameters, method="ode45")))
#system.time(out <- as.data.frame(ode(states, t=times, func=cfODEsystem, parms=parameters, method="ode45")))
### longitudinal plots of each variable, flux1 and 2 and prob1
for (i in seq(from=2, to=dim(out)[2], by=1)) {
  tempdata <- out[c("time",names(out)[i])]
  tempdata$time = tempdata$time/365
  templabel <- names(out)[i]
  plot(tempdata, col = "black", "l", xlab="time (years)", ylab=templabel,
       xlim=c(0, max(tempdata$time)), ylim=c(0, signif(max(tempdata[2]),2)))
}
Thanks for writing this question; it prompted me to look into the deSolve internals and learn a bit (and also maybe speed up my own code).
Question 1
The ODE function is called a number of times to solve the system (possibly fewer than the number of timepoints), but it is also called once per timepoint to evaluate the additional algebraic equations. So if you use 30x the timepoints, you will always add to the runtime, but by less than a factor of 30 due to things like setup and teardown (consistent with the roughly 10x you measured for ~30x the timepoints).
Question 2
There are a few things you can do to speed things up without resorting to C code (though that is an excellent option)
Use a different solver (e.g. lsoda, which on my system is around 5x faster than ode45)
When using lsoda, increase the hmax to allow the adaptive timestepping more freedom to integrate ahead (another speedup)
Rewrite the code to avoid the use of with(), instead using array accesses (possibly assigned to temporary named variables).

How can I calculate the polynomial that has the following asymptotes?

How can I calculate the polynomial that has the tangent lines (1) y = x at x = 1, and (2) y = 1 at x = 365?
I realize this may not be the proper forum, but I figured somebody here could answer this in a jiffy.
Also, I am not looking for an algorithm to answer this. I'd just like to see the process.
Thanks.
I guess I should have mentioned that I'm writing an algorithm for scaling the y-axis of a flotr graph.
The specification of the curve can be expressed as four constraints:
y(1) = 1, y'(1) = 1 => tangent is (y=x) when x=1
y(365) = 1, y'(365) = 0 => tangent is (y=1) when x=365
We therefore need a family of curves with at least four degrees of freedom to match these constraints; the simplest type of polynomial is a cubic,
y = a*x^3 + b*x^2 + c*x + d
y' = 3*a*x^2 + 2*b*x + c
and the constraints give the following equations for the parameters:
a + b + c + d = 1
3*a + 2*b + c = 1
48627125*a + 133225*b + 365*c + d = 1
399675*a + 730*b + c = 0
I'm too old and too lazy to solve these myself, so I googled a linear equation solver to give the answer:
a = 1/132496, b = -731/132496, c = 133955/132496, d = -729/132496
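Just to double-check those fractions, here is a small sketch in Julia that solves the same 4×4 system numerically (any linear solver gives the same thing):
using LinearAlgebra

# Each row is one of the four constraints on y = a*x^3 + b*x^2 + c*x + d
A = [1        1      1    1;   # y(1)    = 1
     3        2      1    0;   # y'(1)   = 1
     365^3    365^2  365  1;   # y(365)  = 1
     3*365^2  2*365  1    0]   # y'(365) = 0
a, b, c, d = A \ [1.0, 1.0, 1.0, 0.0]
# ≈ 1/132496, -731/132496, 133955/132496, -729/132496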
I will post this type of question in mathoverflow.net next time. thanks
my solution in JavaScript was to adapt the equation of a circle:
var r = Math.pow(2 * Math.pow(365, 2), 1/2); // radius
var t = 365; // offset
this.tMax = (Math.pow(Math.pow(r, 2) - Math.pow(x, 2), 1/2) - t) * (t / (r - t)) + 1;
The above equation has the asymptotes specified above. It is part of a step polynomial for scaling an axis for a flotr graph.
Well, you are missing data (you need another point to determine the polynomial):
a*(x-1)^2 + b*(x-1) + c = y - 1
a*(x-365)^2 + b*(x-365) + c = y - 1
You can solve exactly for b, but a depends on c (or vice versa). Your question is off topic anyway, and you need to revise your algebra.
