I'm attempting to code the method described here to estimate production functions of metal manufacturers. I've done this in Python and Matlab, but am trying to learn Julia.
spain_clean.csv is a dataset of log capital (lnk), log labor (lnl), log output (lnva), and log materials (lnm) that I am loading. Lagged variables are denoted with an "l" before them.
Code is at the bottom. I am getting an error:
ERROR: LoadError: MethodError: no method matching parseNLExpr_runtime(::JuMP.Model, ::JuMP.GenericQuadExpr{Float64,JuMP.Variable}, ::Array{ReverseDiffSparse.NodeData,1}, ::Int32, ::Array{Float64,1})
I think it has to do with the use of vector sums and arrays going into the non-linear objective, but I do not understand Julia enough to debug this.
using JuMP # Need to say it whenever we use JuMP
using Clp, Ipopt # Load the solver packages (Ipopt is the one used below)
using CSV # csv reader
# read data
df = CSV.read("spain_clean.csv")
acf = Model(solver=IpoptSolver())
@variable(acf, -10<= b0 <= 10) #
@variable(acf, -5 <= bk <= 5 ) #
@variable(acf, -5 <= bl <= 5 ) #
@variable(acf, -10<= g1 <= 10) #
const g = sum(df[:phihat]-b0-bk* df[:lnk]-bl* df[:lnl]-g1* (df[:lphihat]-b0-bk* df[:llnk]-bl* df[:llnl]))
const gllnk = sum((df[:phihat]-b0-bk* df[:lnk]-bl* df[:lnl]-g1* (df[:lphihat]-b0-bk* df[:llnk]-bl* df[:llnl])).*df[:llnk])
const gllnl = sum((df[:phihat]-b0-bk* df[:lnk]-bl* df[:lnl]-g1* (df[:lphihat]-b0-bk* df[:llnk]-bl* df[:llnl])).*df[:llnl])
const glphihat = sum((df[:phihat]-b0-bk* df[:lnk]-bl* df[:lnl]-g1* (df[:lphihat]-b0-bk* df[:llnk]-bl* df[:llnl])).*df[:lphihat])
@NLobjective(acf, Min, g* g + gllnk* gllnk + gllnl* gllnk + glphihat* glphihat)
status = solve(acf) # solves the model
println("Objective value: ", getobjectivevalue(acf)) # getObjectiveValue(model_name) gives the optimum objective value
println("b0 = ", getvalue(b0))
println("bk = ", getvalue(bk))
println("bl = ", getvalue(bl))
println("g1 = ", getvalue(g1))

Not an expert in Julia, but I think a couple of things are wrong with your code.
First, constants are not supposed to change during iteration, yet you are making them functions of the control variables. Second, what you want to use there are nonlinear expressions instead of constants. So instead of the constants, what you want to write is:
N = size(df, 1)
@NLexpression(acf, g, sum(df[i, :phihat]-b0-bk* df[i, :lnk]-bl* df[i, :lnl]-g1* (df[i, :lphihat]-b0-bk* df[i, :llnk]-bl* df[i, :llnl]) for i=1:N))
@NLexpression(acf, gllnk, sum((df[i,:phihat]-b0-bk* df[i,:lnk]-bl* df[i,:lnl]-g1* (df[i,:lphihat]-b0-bk* df[i,:llnk]-bl* df[i,:llnl]))*df[i,:llnk] for i=1:N))
@NLexpression(acf,gllnl,sum((df[i,:phihat]-b0-bk* df[i,:lnk]-bl* df[i,:lnl]-g1* (df[i,:lphihat]-b0-bk* df[i,:llnk]-bl* df[i,:llnl]))*df[i,:llnl] for i=1:N))
@NLexpression(acf,glphihat,sum((df[i,:phihat]-b0-bk* df[i,:lnk]-bl* df[i,:lnl]-g1* (df[i,:lphihat]-b0-bk* df[i,:llnk]-bl* df[i,:llnl]))*df[i,:lphihat] for i=1:N))
I tested this and it seems to work.
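The point about constants versus expressions can be illustrated outside JuMP. The following is a minimal Python sketch, not the poster's code: the moments must be functions of the parameters so that the optimizer can re-evaluate them at every trial point. The data generator `make_rows`, the parameter values, and the choice of a plain sum of squared moments as the objective are assumptions made for illustration only.

```python
import random

def residual(theta, row):
    """Innovation implied by theta for one observation (same formula as in the post)."""
    b0, bk, bl, g1 = theta
    omega = row["phihat"] - b0 - bk * row["lnk"] - bl * row["lnl"]
    lomega = row["lphihat"] - b0 - bk * row["llnk"] - bl * row["llnl"]
    return omega - g1 * lomega

def objective(theta, rows):
    """Sum of squared moments; instruments are 1, llnk, llnl and lphihat."""
    instruments = [
        lambda r: 1.0,
        lambda r: r["llnk"],
        lambda r: r["llnl"],
        lambda r: r["lphihat"],
    ]
    total = 0.0
    for z in instruments:
        # Each moment is recomputed from theta, it is not a fixed constant.
        moment = sum(residual(theta, r) * z(r) for r in rows)
        total += moment ** 2
    return total

def make_rows(theta, n=50, seed=0):
    """Hypothetical noiseless data that satisfies the model exactly at theta."""
    b0, bk, bl, g1 = theta
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        lnk, lnl, llnk, llnl = (rng.uniform(0, 5) for _ in range(4))
        lomega = rng.uniform(-1, 1)
        omega = g1 * lomega  # zero innovation by construction
        rows.append({
            "lnk": lnk, "lnl": lnl, "llnk": llnk, "llnl": llnl,
            "phihat": b0 + bk * lnk + bl * lnl + omega,
            "lphihat": b0 + bk * llnk + bl * llnl + lomega,
        })
    return rows

true_theta = (1.0, 0.4, 0.6, 0.8)
rows = make_rows(true_theta)
at_truth = objective(true_theta, rows)             # essentially zero
elsewhere = objective((0.0, 0.1, 0.1, 0.1), rows)  # clearly positive
print(at_truth, elsewhere)
```

Because `objective` takes `theta` as an argument, any optimizer can call it repeatedly, which is the role the nonlinear expressions play inside the JuMP model.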


Setting initial (state) values for ODE system in compiled model (deSolve, Rcpp)

I am struggling with a probably minor problem while calling compiled ODEs to be solved
via the R package 'deSolve', and I am seeking advice from more expert users.
I have a couple of ODE systems to be solved with 'deSolve'. I have defined the ODEs in separate C++ functions (one for each model) I am calling through R in conjunction with 'Rcpp'. The initial values of the system change if the function takes input from another model (so basically to have a cascade).
This works quite nicely, however, for one model I have to set the initial parameters for t < 2. I've tried to do this in the C++ function, but it does not seem to work.
Running code example
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export("set_ODE")]]
SEXP set_ODE(double t, NumericVector state, NumericVector parameters) {
  List dn(3);
  double tau2 = parameters["tau2"];
  double Ae2_4 = parameters["Ae2_4"];
  double d2 = parameters["d2"];
  double N2 = parameters["N2"];
  double n2 = state["n2"];
  double m4 = state["m4"];
  double ne = state["ne"];
  // change starting conditions for t < 2
  if(t < 2) {
    n2 = (n2 * m4) / N2;
    m4 = n2;
    ne = 0;
  }
  dn[0] = n2*d2 - ne*Ae2_4 - ne/tau2;
  dn[1] = ne/tau2 - n2*d2;
  dn[2] = -ne*Ae2_4;
  // deSolve expects a list whose first element holds the derivatives
  return List::create(dn);
}
/*** R
state <- c(ne = 10, n2 = 0, m4 = 0)
parameters <- c(N2 = 5e17, tau2 = 1e-8, Ae2_4 = 5e3, d2 = 0)
results <- deSolve::lsoda(
  y = state,
  times = 1:10,
  func = set_ODE,
  parms = parameters
)
print(results)
*/
The output reads (here only the first two rows):
time ne n2 m4
1 1 1.000000e+01 0.000000e+00 0.000000e+00
2 2 1.000000e+01 2.169236e-07 -1.084618e-11
Just in case: How to run this code example?
My example was tested using RStudio:
Copy the code into a file with the ending *.cpp
Click on the 'Source' button (or <shift> + <cmd> + <s>)
It should work also without RStudio present, but the packages 'Rcpp' and 'deSolve' must be installed and to compile the code it needs Rtools on Windows, GNU compilers on Linux and Xcode on macOS.
From my understanding, ne should be 0 for time = 1 (or t < 2). Unfortunately, the solver does not seem to consider what I have provided in the C++ function, except for the ODEs. If I change state in R to another value, however, it works. Somehow the if-condition I have defined in C++ is ignored, but I don't understand why and how I can calculate the initial values in C++ instead of R.
I was able to reproduce your code. It seems to me that this approach is indeed elegant, even if it does not leverage the full power of the solver. The reason is that Rcpp creates an interface to the compiled model via an ordinary R function, so back-calls from the solvers (e.g. lsoda) to R are necessary in each time step. Such back-calls are not needed for the "plain" C/Fortran interface, where communication between solver and model takes place at the machine code level.
With this information, I can see that we don't need to suspect initialization issues at the C/C++ level; rather, this looks like a typical case. The model function is simply the derivative of the model (and only this). The integration is done by the solver "from outside": it always calls the model with the actual integration state, derived from the time step before (roughly speaking). Therefore, it is not possible to force the state variables to fixed values within the model function.
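To make this concrete, here is a small Python sketch with a hand-written fixed-step Euler integrator (an assumption for illustration, not what lsoda does internally). The derivative function overrides its local copy of the state for t < 2, as in the question, but the solver's own state is untouched: the override only changes the derivative that is returned.

```python
def euler(func, y0, times, steps_per_unit=1000):
    """Fixed-step explicit Euler; returns (time, state) at each requested time."""
    y = y0
    t = times[0]
    out = [(t, y)]
    for t_next in times[1:]:
        n = int(round((t_next - t) * steps_per_unit))
        h = (t_next - t) / n
        for _ in range(n):
            # The solver owns y; func only supplies dy/dt for the state it is given.
            y = y + h * func(t, y)
            t += h
        t = t_next
        out.append((t, y))
    return out

def decay_with_override(t, y):
    """dy/dt = -y, with a local 'reset' of the state for t < 2."""
    if t < 2:
        y = 0.0  # changes only the local copy used on the next line
    return -y

result = euler(decay_with_override, 10.0, [1, 2, 3])
for t, y in result:
    print(t, y)
```

The state stays at its initial value of 10 over the first interval (the derivative is zero there) instead of becoming 0, which mirrors the behaviour reported for `ne` in the question.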
However, there are several options how to resolve this:
chaining of lsoda calls
use of events
The following shows a chained approach, but I am not yet sure about the initialization of the parameters in the first time segment, so it may only be part of the solution.
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export("set_ODE")]]
SEXP set_ODE(double t, NumericVector state, NumericVector parameters) {
  List dn(3);
  double tau2 = parameters["tau2"];
  double Ae2_4 = parameters["Ae2_4"];
  double d2 = parameters["d2"];
  double N2 = parameters["N2"];
  double n2 = state["n2"];
  double m4 = state["m4"];
  double ne = state["ne"];
  dn[0] = n2*d2 - ne*Ae2_4 - ne/tau2;
  dn[1] = ne/tau2 - n2*d2;
  dn[2] = -ne*Ae2_4;
  // deSolve expects a list whose first element holds the derivatives
  return List::create(dn);
}
/*** R
state <- c(ne = 10, n2 = 0, m4 = 0)
parameters <- c(N2 = 5e17, tau2 = 1e-8, Ae2_4 = 5e3, d2 = 0)
## the following is not yet clear to me !!!
## especially as it is essentially zero
y1 <- c(ne = 0,
  n2 = unname(state["n2"] * state["m4"]/parameters["N2"]),
  m4 = unname(state["n2"]))
## first segment, started from the modified initial state y1
results1 <- deSolve::lsoda(
  y = y1,
  times = 1:2,
  func = set_ODE,
  parms = parameters
)
## last time step, except "time" column
y2 <- results1[nrow(results1), -1]
results2 <- deSolve::lsoda(
  y = y2,
  times = 2:10,
  func = set_ODE,
  parms = parameters
)
## omit 1st time step in results2
results <- rbind(results1, results2[-1, ])
*/
The code also has another potential problem: the parameters span many orders of magnitude, from 1e-8 to 5e17. This can lead to numerical issues, as the relative precision of most software, including R, covers only about 16 orders of magnitude. Could this be the reason why the results are all zero? Here it may help to re-scale the model.
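The precision limit can be checked directly. This short Python sketch uses IEEE 754 doubles, the same number format R uses; the magnitudes are taken from the question's parameters, and the rescaling by 1e17 is only a hypothetical choice of unit. Whether rescaling actually fixes this particular model is not shown here.

```python
import sys

eps = sys.float_info.epsilon   # relative spacing of doubles, about 2.2e-16
big = 5e17                     # magnitude of N2 in the question
small = 1e-8                   # magnitude of tau2 in the question

# Adding a small number to a huge one: the small term is lost entirely.
lost_small = (big + small) == big
lost_one = (big + 1.0) == big  # even 1.0 is below the resolution at 5e17

# After rescaling (hypothetical unit: 1e17), the same small term survives.
scaled = big / 1e17
kept = (scaled + small) != scaled

print(eps, lost_small, lost_one, kept)
```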

Creating a stochastic SIR model in Julia

I am new to julia and want to create a Stochastic SIR model by following: http://epirecip.es/epicookbook/chapters/sir-stochastic-discretestate-continuoustime/julia
I have written my own interpretation which is nearly the same:
# Following the Gillespie algorithm:
# 1. Initialization of states & parameters
# 2. Monte-carlo step. Random process/step selection.
# 3. Update all states. e.g., I = I + 1 (increase of infected by 1 person). Note: only +/- by 1.
# 4. Repeat until stopTime.
# p - Parameter array: β, ɣ for infected rate and recovered rate, resp.
# initialState - initial states of S, I, R information.
# stopTime - Total run time.
using Plots, Distributions
function stochasticSIR(p, initialState, stopTime)
    # Hold the states of S,I,R separately w/ a NamedTuple. See '? NamedTuple' in the REPL for details
    # Populate the data storage arrays with the initial data and initialize the run time
    sirData = (dataₛ = [initialState[1]], dataᵢ = [initialState[2]], dataᵣ = [initialState[3]], time = [0]);
    while sirData.time[end] < stopTime
        if sirData.dataᵢ[end] == 0 # If somehow # of infected = 0, break the loop.
            break
        end
        # Probabilities of each process (infection, recovery). p[1] = β and p[2] = ɣ
        probᵢ = p[1] * sirData.dataₛ[end] * sirData.dataᵢ[end];
        probᵣ = p[2] * sirData.dataᵣ[end];
        probₜ = probᵢ + probᵣ; # Total reaction rate
        # When the next process happens
        k = rand(Exponential(1/probₜ));
        push!(sirData.time, sirData.time[end] + k) # time step by k
        # Probability that the reaction is:
        # probᵢ, probᵣ resp. is: probᵢ / probₜ, probᵣ / probₜ
        randNum = rand();
        # Update the states by randomly picking process (Gillespie algo.)
        if randNum < (probᵢ / probₜ)
            push!(sirData.dataₛ, sirData.dataₛ[end] - 1);
            push!(sirData.dataᵢ, sirData.dataᵢ[end] + 1);
        else
            push!(sirData.dataᵢ, sirData.dataᵢ[end] - 1);
            push!(sirData.dataᵣ, sirData.dataᵣ[end] +1)
        end
    end
end
sirOutput = stochasticSIR([0.0001, 0.05], [999,1,0], 200)
#plot(hcat(sirData.dataₛ, sirData.dataᵢ, sirData.dataᵣ), sirData.time)
InexactError: Int64(2.508057234147307)
Stacktrace: [1] Int64 at .\float.jl:709 [inlined] [2] convert at
.\number.jl:7 [inlined] [3] push! at .\array.jl:868 [inlined] [4]
stochasticSIR(::Array{Float64,1}, ::Array{Int64,1}, ::Int64) at
.\In[9]:33 [5] top-level scope at In[9]:51
Could someone please explain why I receive this error? It does not tell me what line (I am using Jupyter notebook) and I do not understand it.
First error
You have to qualify your references to time as sirData.time
The error message is a bit confusing because time is a function in Base as well, so it is automatically in scope.
Second error
You need your data to be represented as Float64, so you have to explicitly type your input array:
sirOutput = stochasticSIR([0.0001, 0.05], Float64[999,1,0], 200)
Alternatively, you can create the array with float literals: [999.0,1,0]. If you create an array with only literal integers, Julia will create an integer array.
I'm not sure StackOverflow is the best venue for this, as you seem to be editing the original post with new errors as you go along.
Your current error at the time of writing (InexactError: Int(2.50805)) tells you that you are trying to create an integer from a Float64 floating point number, which you can't do without rounding explicitly.
I would highly recommend reading the Julia docs to get the hang of basic usage, and maybe use the Julia Discourse forum for more interactive back-and-forth debugging with the community.
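For comparison, the Gillespie steps listed in the question can be sketched in Python as follows. This is an illustrative sketch, not a translation of the posted Julia code: the function name, the `seed` argument, and the tuple-based history are assumptions, and the recovery rate is taken proportional to the number of infected, as in the standard SIR model. Time is stored as a float from the start, so adding a float waiting time never requires an integer conversion.

```python
import random

def stochastic_sir(beta, gamma, initial_state, stop_time, seed=None):
    """Gillespie simulation of an SIR model; returns a list of (t, S, I, R)."""
    rng = random.Random(seed)
    s, i, r = initial_state
    t = 0.0  # float time, so float increments can be added directly
    history = [(t, s, i, r)]
    while t < stop_time and i > 0:
        rate_infect = beta * s * i
        rate_recover = gamma * i
        rate_total = rate_infect + rate_recover
        # Waiting time to the next event is exponential with rate rate_total.
        t += rng.expovariate(rate_total)
        # Pick which event happens, in proportion to its rate.
        if rng.random() < rate_infect / rate_total:
            s, i = s - 1, i + 1
        else:
            i, r = i - 1, r + 1
        history.append((t, s, i, r))
    return history

history = stochastic_sir(0.0001, 0.05, (999, 1, 0), 200, seed=1)
print(len(history), history[-1])
```

Each event changes exactly one individual's compartment, so S + I + R stays constant throughout the run.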

Julia JuMP Multivariate ML Estimation

I am trying to perform a ML-Estimation of a normally distributed variable in a linear regression setting in Julia using JuMP and the NLopt solver.
There exists a good working example here; however, if I try to estimate the regression parameters (slope), the code becomes quite tedious to write, in particular if the parameter space increases.
Maybe someone has an idea how to write it more concisely. Here is my code:
#type definition to store data
#(field names taken from the constructor call below; field types omitted)
type data
    n
    A
    β
    y
    ls
    err
end
#generate regression data
function Data( n = 1000 )
    A = [ones(n) rand(n, 2)]
    β = [2.1, 12.9, 3.7]
    y = A*β + rand(Normal(), n)
    ls = inv(A'A)A'y
    err = y - A * ls
    data(n, A, β, y, ls, err)
end
#initialize data
d = Data()
println( var(d.y) )
function ml( )
m = Model( solver = NLoptSolver( algorithm = :LD_LBFGS ) )
@defVar( m, b[1:3] )
@defVar( m, σ >= 0, start = 1.0 )
#this is the working example.
#As you can see it's quite tedious to write
#and becomes rather infeasible if there are more then,
#let's say 10, slope parameters to estimate
@setNLObjective( m, Max,-(d.n/2)*log(2π*σ^2) \\cont. next line
-sum{(d.y[i]-d.A[i,1]*b[1] \\
-d.A[i,2]*b[2] \\
-d.A[i,3]*b[3])^2, i=1:d.n}/(2σ^2) )
#julia returns:
> slope: [2.14,12.85,3.65], variance: 1.04
#which is what is to be expected
#this is what I would like the code to look like:
@setNLObjective( m, Max,-(d.n/2)*log(2π*σ^2) \\
-sum{(d.y[i]-(d.A[i,j]*b[j]))^2, \\
i=1:d.n, j=1:3}/(2σ^2) )
#I also tried:
@setNLObjective( m, Max,-(d.n/2)*log(2π*σ^2) \\
-sum{sum{(d.y[i]-(d.A[i,j]*b[j]))^2, \\
i=1:d.n}, j=1:3}/(2σ^2) )
#but unfortunately it returns:
> slope: [10.21,18.89,15.88], variance: 54.78
println( getValue(b), " ", getValue(σ^2) )
Any ideas?
As noted by Reza a working example is:
@setNLObjective( m, Max,-(d.n/2)*log(2π*σ^2) \\
i=1:d.n}/(2σ^2) )
The sum{} syntax is a special syntax that only works inside JuMP macros, and is the preferred syntax for sums.
So your example would be written as:
function ml( )
m = Model( solver = NLoptSolver( algorithm = :LD_LBFGS ) )
@variable( m, b[1:3] )
@variable( m, σ >= 0, start = 1.0 )
@NLobjective(m, Max,
-(d.n/2)*log(2π*σ^2)
- sum{
(d.y[i] - sum{d.A[i,j]*b[j], j=1:3})^2,
i=1:d.n}/(2σ^2) )
where I've expanded it across multiple lines to be as clear as possible.
Reza's answer isn't technically wrong, but isn't idiomatic JuMP and won't be as efficient for larger models.
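The difference between the nested sum and the flat double sum from the question can be checked numerically. The following Python sketch uses a small hypothetical design matrix `A` and noiseless `y` (both assumptions made for illustration). The nested form squares the residual y_i - Σ_j A_ij b_j, while the flat form squares y_i - A_ij b_j separately for each j, which is a different objective and explains the wrong estimates.

```python
# Hypothetical data: 4 observations, intercept plus 2 regressors, no noise.
A = [[1.0, 0.2, 0.7],
     [1.0, 0.9, 0.1],
     [1.0, 0.5, 0.5],
     [1.0, 0.3, 0.8]]
beta = [2.1, 12.9, 3.7]
y = [sum(A[i][j] * beta[j] for j in range(3)) for i in range(len(A))]

def sse_nested(b):
    """Sum over i of (y_i - sum_j A_ij b_j)^2: the regression residuals."""
    return sum((y[i] - sum(A[i][j] * b[j] for j in range(3))) ** 2
               for i in range(len(y)))

def sse_flat(b):
    """Sum over i and j of (y_i - A_ij b_j)^2: not the regression residuals."""
    return sum((y[i] - A[i][j] * b[j]) ** 2
               for i in range(len(y)) for j in range(3))

print(sse_nested(beta))  # zero at the true coefficients
print(sse_flat(beta))    # strictly positive even at the true coefficients
```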
I didn't trace your code, but anyway, I hope the following works for you:
sum([(d.y[i]-sum([d.A[i,j]*b[j] for j=1:3]))^2 for i=1:d.n])
As @IainDunning mentioned, the JuMP package has a special syntax for summation inside its macros, so the more efficient and abstract way to do this is:
sum{(d.y[i]-sum{d.A[i,j]*b[j], j=1:3})^2, i=1:d.n}

How to call numerical results to integrate a ODE using Runge-Kutta-4 in Python 3?

I'm trying to solve (for m_0) numerically the following ordinary differential equation:
dm0/dx=(((1-x)*(x*(2-x))**(1.5))/(k+x)**2)*(((x*(2-x))/3.0)*(dw/dx)**2 + ((8*(k+1))/(3*(k+x)))*w**2)
The values of w and dw/dx have already been found numerically using the 4th-order Runge-Kutta method, and k is a fixed factor. I wrote code that reads the values of w and dw/dx from external files, stores them in arrays, uses the arrays inside the function, and then runs the integration. The outcome is not what is expected :( and I don't know what is wrong. If anyone could give me a hand, it would be highly appreciated. Thank you!
from math import sqrt
from numpy import array,zeros,loadtxt
from printSoln import *
from run_kut4 import *
m = 1.15 # Just a constant.
k = 3.0*sqrt(1.0-(1.0/m))-1.0 # k in terms of m.
omegas = loadtxt("omega.txt",float) # Import values of w
domegas = loadtxt("domega.txt",float) # Import values of dw/dx
w = [] # Defines the array w to store the values w^2
s = 0.0
for s in omegas:
    w.append(s**2) # Calculates the values w**2
omeg = array(w,float) # Array to store the value of w**2
dw = [] # Defines the array dw to store the values dw**2
t = 0.0
for t in domegas:
    dw.append(t**2) # Calculates the values for dw**2
domeg = array(dw,float) # Array to store the values of dw**2
x = 1.0e-12 # Starting point of integration
xStop = (2.0 - k)/3.0 # Final point of integration
def F(x,y): # Define function to be integrated
    F = zeros(1)
    for i in domeg: # Loop to call w^2, (dw/dx)^2
        for j in omeg:
            F[0] = (((1.0-x)*(x*(2.0-x))**(1.5))/(k+x)**2)*((1.0/3.0)*x* (2.0-x)*domeg[i] + (8.0*(k+1.0)*omeg[j])/(3.0*(k+x)))
    return F
y = array([((32.0*sqrt(2.0)*(k+1.0)*(x**2.5))/(15.0*(k**3)))]) # Initial condition for m_{0}
h = 1.0e-5 # Integration step
freq = 0 # Prints only initial and final values
X,Y = integrate(F,x,y,xStop,h) # Calls Runge-Kutta 4
printSoln(X,Y,freq) # Prints solution
Interpreting your verbal description, there is an ODE for omega, w'=F(x,w), and a coupled ODE for m0, m'=G(x,m,w,w'). The almost always optimal way to solve this is to treat it as a system of ODEs,
import numpy as np

def ODEfunc(x,y):
    w,m = y
    dw = F(x,w)
    dm = G(x,m,w,dw)
    return np.array([dw, dm])
which you can then insert in the ODE solver of your choice, e.g., the fictitious
ODEintegrate(ODEfunc, xsamples, y0)
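As a runnable illustration of the coupled-system idea, here is a self-contained sketch with a hand-written classical RK4 stepper. The right-hand sides `F` and `G` are hypothetical stand-ins chosen because they have a known solution (w' = w and m' = w·w'), not the equations from the question; `rk4` is likewise an assumed helper, not the `run_kut4` module.

```python
import math

def F(x, w):
    """Hypothetical ODE for w: w' = w, so w(x) = exp(x) when w(0) = 1."""
    return w

def G(x, m, w, dw):
    """Hypothetical ODE for m that needs both w and w': m' = w * w'."""
    return w * dw

def ode_func(x, y):
    """Right-hand side of the coupled system y = [w, m]."""
    w, m = y
    dw = F(x, w)
    dm = G(x, m, w, dw)
    return [dw, dm]

def rk4(func, x0, y0, x_stop, n):
    """Classical 4th-order Runge-Kutta with n equal steps from x0 to x_stop."""
    h = (x_stop - x0) / n
    x, y = x0, list(y0)
    for step in range(n):
        k1 = func(x, y)
        k2 = func(x + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = func(x + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = func(x + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        x = x0 + (step + 1) * h
    return x, y

x_end, (w_end, m_end) = rk4(ode_func, 0.0, [1.0, 0.0], 1.0, 100)
print(w_end, math.e)                  # w(1) should be close to e
print(m_end, (math.e ** 2 - 1) / 2)   # m(1) should be close to (e^2 - 1)/2
```

Solving both equations together means w and w' are available at exactly the intermediate points where the stepper needs them, so no interpolation of tabulated values is required.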

(in R) Why is result of ksvm using user-defined linear kernel different from that of ksvm using "vanilladot"?

I wanted to use a user-defined kernel function for ksvm in R.
So, as practice, I tried to write a vanilladot kernel and compare it with the "vanilladot" kernel that is built into "kernlab".
I wrote my kernel as follows.
###vanilla kernel with class "kernel"
kfunction.k <- function(){
  k <- function (x,y){crossprod(x,y)}
  class(k) <- "kernel"
  k
}
l<-0.1 ; C<-1/(2*l)
###use kfunction.k
tmp<-ksvm(x,factor(y),scaled=FALSE, type = "C-svc", kernel=kfunction.k(), C = C)
x.s<-x[ind,] ; y.s<-y[ind]
I thought the result of this operation would be equal to that of the following.
However, it isn't.
###use "vanilladot"
l<-0.1 ; C<-1/(2*l)
tmp1<-ksvm(x,factor(y),scaled=FALSE, type = "C-svc", kernel="vanilladot", C = C)
x.s<-x[ind1,] ; y.s<-y[ind1]
I think this problem may be related to the kernel class.
The problem occurs when the class is set to "kernel".
However, when the class is set to "vanillakernel", the result of ksvm using the user-defined kernel is equal to that of ksvm using "vanilladot", which is built into kernlab.
###vanilla kernel with class "vanillakernel"
kfunction.v.k <- function(){
  k <- function (x,y){crossprod(x,y)}
  class(k) <- "vanillakernel"
  k
}
# The only difference between kfunction.k and kfunction.v.k is "class(k)".
l<-0.1 ; C<-1/(2*l)
###use kfunction.v.k
tmp<-ksvm(x,factor(y),scaled=FALSE, type = "C-svc", kernel=kfunction.v.k(), C = C)
x.s<-x[ind,] ; y.s<-y[ind]
I don't understand why the result is different from "vanilladot", when setting the class to "kernel".
Is there an error in my operation?
First, it seems like a really good question!
Now to the point. In the sources of ksvm we can find where the line is drawn between using a user-defined kernel and the built-ins:
if (type(ret) == "spoc-svc") {
if (!is.null(class.weights))
weightedC <- class.weights[weightlabels] * rep(C,
nclass(ret))
else weightedC <- rep(C, nclass(ret))
yd <- sort(y, method = "quick", index.return = TRUE)
xd <- matrix(x[yd$ix, ], nrow = dim(x)[1])
count <- 0
if (ktype == 4)
K <- kernelMatrix(kernel, x)
resv <- .Call("tron_optim", as.double(t(xd)), as.integer(nrow(xd)),
as.integer(ncol(xd)), as.double(rep(yd$x - 1,
2)), as.double(K), as.integer(if (sparse) xd@ia else 0),
as.integer(if (sparse) xd@ja else 0), as.integer(sparse),
as.integer(nclass(ret)), as.integer(count), as.integer(ktype),
as.integer(7), as.double(C), as.double(epsilon),
as.double(sigma), as.integer(degree), as.double(offset),
as.double(C), as.double(2), as.integer(0), as.double(0),
as.integer(0), as.double(weightedC), as.double(cache),
as.double(tol), as.integer(10), as.integer(shrinking),
PACKAGE = "kernlab")
reind <- sort(yd$ix, method = "quick", index.return = TRUE)$ix
alpha(ret) <- t(matrix(resv[-(nclass(ret) * nrow(xd) +
1)], nclass(ret)))[reind, , drop = FALSE]
coef(ret) <- lapply(1:nclass(ret), function(x) alpha(ret)[,
x][alpha(ret)[, x] != 0])
names(coef(ret)) <- lev(ret)
alphaindex(ret) <- lapply(sort(unique(y)), function(x) which(alpha(ret)[,
x] != 0))
xmatrix(ret) <- x
obj(ret) <- resv[(nclass(ret) * nrow(xd) + 1)]
names(alphaindex(ret)) <- lev(ret)
svindex <- which(rowSums(alpha(ret) != 0) != 0)
b(ret) <- 0
param(ret)$C <- C
Two things are important here. If we provide ksvm with our own kernel, then ktype=4 (while for vanillakernel, ktype=0), which makes two changes:
in the case of a user-defined kernel, the kernel matrix is computed instead of actually using the kernel
the tron_optim routine is run with the information regarding the kernel
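To see what "the kernel matrix is computed" means, here is a conceptual Python sketch. It is not kernlab's implementation; `linear_kernel` and `kernel_matrix` are hypothetical helper names. For a user-defined kernel the solver only ever sees the matrix of pairwise kernel values, whereas a built-in linear kernel lets the solver work with the raw data instead.

```python
def linear_kernel(x, y):
    """User-defined linear kernel: the plain dot product of two vectors."""
    return sum(a * b for a, b in zip(x, y))

def kernel_matrix(kernel, X):
    """Pairwise kernel values K[i][j] = kernel(X[i], X[j])."""
    return [[kernel(xi, xj) for xj in X] for xi in X]

# Hypothetical data: three points in two dimensions.
X = [[1.0, 2.0], [0.0, -1.0], [3.0, 0.5]]
K = kernel_matrix(linear_kernel, X)
for row in K:
    print(row)
```

Mathematically this matrix carries the same information as the dot products a linear solver would compute itself, so any difference in results has to come from the optimization route taken, not from the kernel definition.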
Now, in svm.cpp we can find the tron routines, and in tron_run (called from tron_optim) we can see that the LINEAR kernel has a separate optimization routine:
if (param->kernel_type == LINEAR)
/* lots of code here */
while (Cpj < Cp)
totaliter += s.Solve(l, prob->x, minus_ones, y, alpha, w,
Cpj, Cnj, param->eps, sii, param->shrinking,
/* lots of code here */
totaliter += s.Solve(l, prob->x, minus_ones, y, alpha, w, Cp, Cn,
param->eps, sii, param->shrinking, param->qpsize);
delete[] w;
Solver_B s;
s.Solve(l, BSVC_Q(*prob,*param,y), minus_ones, y, alpha, Cp, Cn,
param->eps, sii, param->shrinking, param->qpsize);
As you can see, the linear case is treated in a more complex, more detailed way. There is an inner optimization loop calling the solver many times. It would require a really deep analysis of the actual optimization being performed here, but at this point one can answer your question in the following way:
There is no error in your operation
kernlab's svm has a separate routine for training an SVM with a linear kernel, which is selected based on the type of kernel passed to the code. Changing "kernel" to "vanillakernel" made ksvm think it was actually working with vanillakernel, so it performed this separate optimization routine
This does not seem to be a bug, as the linear SVM is in fact very different from the kernelized version in terms of efficient optimization techniques. The number of heuristics and numerical issues that have to be taken care of is large. As a result, some approximations are required, and these can lead to different results. While for a rich feature space (like those induced by the RBF kernel) it should not really matter, for simple kernels like the linear one these simplifications can lead to significant output changes.
