Generalizing the inputs of the nlsolve function in Julia

This question has already been asked on another platform, but I haven't got an answer yet.
https://discourse.julialang.org/t/generalizing-the-inputs-of-the-nlsolve-function-in-julia/
After an extensive process using SymPy in Julia, I generated a system of nonlinear equations. My system is stored in an NN×S matrix, something like this (NN = 2, S = 2).
I would like to adapt the system to use the NLsolve package. I put together a makeshift solution for the case NN = 1 and S = 1. The system_equations2 function gives me the nonlinear system, as in the figure.
using SymPy
using Plots
using NLsolve
res = system_equations2()
In order to simulate the output, I do this:
NN = 1
S = 1
p= [Sym("p$i$j") for i in 1:NN,j in 1:S]
res = [ Eq( -331.330122303069*p[i,j]^(1.0) + p[i,j]^(2.81818181818182) - 1895.10478893046/(p[i,j]^(-1.0))^(2.0),0 ) for i in 1:NN,j in 1:S]
resf = convert( Function, lhs( res[1,1] ) )
plot(resf, 0 ,10731)
Now
resf = convert( Function, lhs( res[1,1] ) )
# Wrapper with the scalar signature used by the nlsolve residual below
function resf2(p)
    p = Tuple(p)[1]
    r = resf(p)
    return r
end
Now, I find the zeros
function K(F, p)
    F[1] = resf2(p[1])
end
nlsolve(K, [7500.8])
I would like to generalize this procedure to any NN and any S. I believe there is a simpler way to do this.
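A minimal sketch of one way to generalize (my assumption, not a tested recipe): build the symbolic residuals for arbitrary NN and S, lambdify each entry into an ordinary Julia function (lambdify is SymPy.jl's converter from a symbolic expression to a callable), and fill the residual vector inside the function handed to nlsolve:
using SymPy, NLsolve
NN, S = 2, 2
p = [Sym("p$i$j") for i in 1:NN, j in 1:S]
# residual expressions (the lhs of the equations, so roots are where they vanish)
res = [ -331.330122303069*p[i,j]^(1.0) + p[i,j]^(2.81818181818182) -
        1895.10478893046/(p[i,j]^(-1.0))^(2.0) for i in 1:NN, j in 1:S ]
# one callable per entry; each expression has a single free symbol
fs = [lambdify(res[i,j]) for i in 1:NN, j in 1:S]
function K!(F, x)
    # x and F are flat vectors of length NN*S, matching fs in column-major order
    for k in 1:NN*S
        F[k] = fs[k](x[k])
    end
end
sol = nlsolve(K!, fill(7500.8, NN*S))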

Related

Defining a function with the solution of a differential equation in Julia

I want to use the solution of a differential equation in a piecewise function, but when plotting the result Julia returns the error MethodError: no method matching Float64(::LinearAlgebra.Transpose{Float64, Vector{Float64}}). With the code below, you can see that the plot works fine until r = 50 and that the error shows up for r > 50. How can I use the result of the differential equation in a function?
using Plots, DifferentialEquations
const global Ωₘ = 0.23
const global Ωᵣ = 0.0001
const global Ωl = 0.67
const global H₀ = 70.
function lum(du,u,p,z)
    # H(z) is assumed to be defined elsewhere in the asker's full code (not shown)
    du[1] = -u[1]/(1+z) + ((1/H(z))-1/H₀)/(1+z)
end
da0 = [0.]
zspan = (0.,10.)
lumprob = ODEProblem(lum,da0,zspan)
lumsol = solve(lumprob)
α1(r) = π- asin(3*sqrt(3)*sqrt(1-2/r)/r)
α2(r) = asin(3*sqrt(3)*sqrt(1-2/r)/r)
α3(r) = 3*sqrt(3)/r
α4(r) = 3*sqrt(3)/lumsol(r)
function αtot(r)
    if 1.99 < r < 3.0
        α1(r)
    elseif 3.0 < r <= 10.0
        α2(r)
    elseif 10.0 < r <= 50.0
        α3(r)
    elseif 50.0 < r
        α4(r)
    end
end
p2=plot(αtot,xlims=(2.0,51.0),xaxis=:log,xlabel="Rₒ/m",label="α")
In your case lumsol returns a 1-element vector, not a number, so you need to change the line
α4(r) = 3*sqrt(3)/lumsol(r)
to
α4(r) = 3*sqrt(3)/lumsol(r)[1]
(or refactor the code to take that fact into account).
For this use, plot should be given a function that takes a number and returns a number; your αtot function returns a vector when 50 < r.
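As a sketch of the refactor option (assuming H(z) is defined as in the asker's full code): solving the ODE in scalar, out-of-place form makes lumsol(r) itself a number, so no indexing is needed:
lum(u, p, z) = -u/(1+z) + ((1/H(z)) - 1/H₀)/(1+z)   # scalar right-hand side
lumprob = ODEProblem(lum, 0.0, (0.0, 10.0))         # scalar initial condition
lumsol = solve(lumprob)
α4(r) = 3*sqrt(3)/lumsol(r)                         # lumsol(r) is now a scalar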

Julia - Optim - Gradient per Observation

I am developing an ad hoc multinomial logistic model using Julia.
It works fine (although I am sure it could be improved!)
I have written the likelihood function and use Optim to estimate the parameters and the standard errors.
I would now like to develop some robust estimates. Coming from R, I would use the sandwich package; I did not find any equivalent in Julia, so I guess I could develop something specific.
For this I would need the value of the gradient for each observation (row). I cannot find a way to do that using Optim. (When I use gradient!(func, x), I get the sum of the gradients across rows, which is not what I am looking for.)
Is there a way to do that using OnceDifferentiable or TwiceDifferentiable?
Alternatively, is there a package equivalent to R's sandwich that escaped my Google searches?
The code I have developed so far:
using Optim, Distributions, LinearAlgebra   # imports needed by this snippet

# X1, X2, X3, choice1..choice3 and beta_ini are assumed to be defined from the data
LLIK_MNL = function(init::Array{Float64})
    b = init
    u1 = X1*b
    u2 = X2*b
    u3 = X3*b
    umax = max.(u1, u2, u3)
    umin = min.(u1, u2, u3)
    ucensor = (umax + umin)/2   # centre the utilities to avoid overflow in exp
    exu1 = exp.(u1 - ucensor)
    exu2 = exp.(u2 - ucensor)
    exu3 = exp.(u3 - ucensor)
    sexu = exu1 .+ exu2 .+ exu3
    Pr = (choice1 .* exu1 + choice2 .* exu2 + choice3 .* exu3) ./ sexu
    LL = sum(log.(Pr))
    return -LL
end
func = TwiceDifferentiable(var -> LLIK_MNL(var), beta_ini)
opt = optimize(func, beta_ini)
est_MNL = Optim.minimizer(opt)
numerical_hessian = hessian!(func, est_MNL)
var_cov_matrix = inv(numerical_hessian)
temp = diag(var_cov_matrix)
t_stats = est_MNL ./ sqrt.(temp)
pp = 2 * cdf.(Normal(), -abs.(t_stats))   # two-sided p-values
hcat(est_MNL, sqrt.(temp), t_stats, round.(pp, digits=4))
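Since Optim does not expose per-observation gradients, one possible approach (my sketch, not Optim's API) is to differentiate an observation-level log-likelihood directly with ForwardDiff; loglik_row below is a hypothetical helper returning the log-likelihood contribution of row i:
using ForwardDiff
# stack the gradient of each observation's log-likelihood into one row per observation
function score_matrix(loglik_row, b::Vector{Float64}, n::Int)
    G = zeros(n, length(b))
    for i in 1:n
        G[i, :] = ForwardDiff.gradient(bb -> loglik_row(bb, i), b)
    end
    return G
end
# A sandwich ("robust") covariance could then be inv(H) * (G'G) * inv(H),
# with H the Hessian at the optimum (numerical_hessian above).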

Julia MethodError: no method matching parseNLExpr_runtime(

I'm attempting to code the method described here to estimate production functions of metal manufacturers. I've done this in Python and Matlab, but am trying to learn Julia.
spain_clean.csv is a dataset of log capital (lnk), log labor (lnl), log output (lnva), and log materials (lnm) that I am loading. Lagged variables are denoted with an "l" before them.
Code is at the bottom. I am getting an error:
ERROR: LoadError: MethodError: no method matching parseNLExpr_runtime(::JuMP.Model, ::JuMP.GenericQuadExpr{Float64,JuMP.Variable}, ::Array{ReverseDiffSparse.NodeData,1}, ::Int32, ::Array{Float64,1})
I think it has to do with the use of vector sums and arrays going into the nonlinear objective, but I do not understand Julia well enough to debug this.
using JuMP # Need to say it whenever we use JuMP
using Clp, Ipopt # solvers; Ipopt is the one used below
using CSV # csv reader
# read data
df = CSV.read("spain_clean.csv")
#MODEL CONSTRUCTION
#--------------------
acf = Model(solver=IpoptSolver())
@variable(acf, -10 <= b0 <= 10)
@variable(acf, -5 <= bk <= 5)
@variable(acf, -5 <= bl <= 5)
@variable(acf, -10 <= g1 <= 10)
const g = sum(df[:phihat]-b0-bk* df[:lnk]-bl* df[:lnl]-g1* (df[:lphihat]-b0-bk* df[:llnk]-bl* df[:llnl]))
const gllnk = sum((df[:phihat]-b0-bk* df[:lnk]-bl* df[:lnl]-g1* (df[:lphihat]-b0-bk* df[:llnk]-bl* df[:llnl])).*df[:llnk])
const gllnl = sum((df[:phihat]-b0-bk* df[:lnk]-bl* df[:lnl]-g1* (df[:lphihat]-b0-bk* df[:llnk]-bl* df[:llnl])).*df[:llnl])
const glphihat = sum((df[:phihat]-b0-bk* df[:lnk]-bl* df[:lnl]-g1* (df[:lphihat]-b0-bk* df[:llnk]-bl* df[:llnl])).*df[:lphihat])
#OBJECTIVE
@NLobjective(acf, Min, g*g + gllnk*gllnk + gllnl*gllnk + glphihat*glphihat)
#SOLVE IT
status = solve(acf) # solves the model
println("Objective value: ", getobjectivevalue(acf)) # getObjectiveValue(model_name) gives the optimum objective value
println("b0 = ", getvalue(b0))
println("bk = ", getvalue(bk))
println("bl = ", getvalue(bl))
println("g1 = ", getvalue(g1))
Not an expert in Julia, but I think a couple of things are wrong with your code.
First, constants are not supposed to change during the iterations, and you are making them functions of the decision variables. Second, what you want there are nonlinear expressions instead of constants. So instead of the constants, what you want to write is
N = size(df, 1)
@NLexpression(acf, g, sum(df[i, :phihat]-b0-bk* df[i, :lnk]-bl* df[i, :lnl]-g1* (df[i, :lphihat]-b0-bk* df[i, :llnk]-bl* df[i, :llnl]) for i=1:N))
@NLexpression(acf, gllnk, sum((df[i,:phihat]-b0-bk* df[i,:lnk]-bl* df[i,:lnl]-g1* (df[i,:lphihat]-b0-bk* df[i,:llnk]-bl* df[i,:llnl]))*df[i,:llnk] for i=1:N))
@NLexpression(acf, gllnl, sum((df[i,:phihat]-b0-bk* df[i,:lnk]-bl* df[i,:lnl]-g1* (df[i,:lphihat]-b0-bk* df[i,:llnk]-bl* df[i,:llnl]))*df[i,:llnl] for i=1:N))
@NLexpression(acf, glphihat, sum((df[i,:phihat]-b0-bk* df[i,:lnk]-bl* df[i,:lnl]-g1* (df[i,:lphihat]-b0-bk* df[i,:llnk]-bl* df[i,:llnl]))*df[i,:lphihat] for i=1:N))
I tested this and it seems to work.
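For completeness, a sketch of the objective stated over those expressions, in the question's JuMP 0.x syntax (assuming the gllnl*gllnk in the question's objective was a typo for gllnl*gllnl):
@NLobjective(acf, Min, g*g + gllnk*gllnk + gllnl*gllnl + glphihat*glphihat)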

Julia JuMP Multivariate ML Estimation

I am trying to perform an ML estimation of a normally distributed variable in a linear regression setting in Julia, using JuMP and the NLopt solver.
A good working example exists here; however, if I try to estimate the regression parameters (slopes), the code becomes quite tedious to write, in particular as the parameter space increases.
Maybe someone has an idea how to write it more concisely. Here is my code:
using JuMP, NLopt, Distributions   # imports needed by this snippet

#type definition to store data
type data
    n::Int
    A::Matrix
    β::Vector
    y::Vector
    ls::Vector
    err::Vector
end
#generate regression data
function Data( n = 1000 )
    A = [ones(n) rand(n, 2)]
    β = [2.1, 12.9, 3.7]
    y = A*β + rand(Normal(), n)
    ls = inv(A'A)A'y   # least-squares estimate
    err = y - A * ls
    data(n, A, β, y, ls, err)
end
#initialize data
d = Data()
println( var(d.y) )
function ml( )
    m = Model( solver = NLoptSolver( algorithm = :LD_LBFGS ) )
    @defVar( m, b[1:3] )
    @defVar( m, σ >= 0, start = 1.0 )

    #this is the working example.
    #As you can see it's quite tedious to write
    #and becomes rather infeasible if there are more than,
    #let's say 10, slope parameters to estimate
    @setNLObjective( m, Max, -(d.n/2)*log(2π*σ^2)
        - sum{(d.y[i]-d.A[i,1]*b[1]
                     -d.A[i,2]*b[2]
                     -d.A[i,3]*b[3])^2, i=1:d.n}/(2σ^2) )
    #julia returns:
    # slope: [2.14,12.85,3.65], variance: 1.04
    #which is what is to be expected

    #however, this is what I would like the code to look like:
    #@setNLObjective( m, Max, -(d.n/2)*log(2π*σ^2)
    #    - sum{(d.y[i]-(d.A[i,j]*b[j]))^2, i=1:d.n, j=1:3}/(2σ^2) )

    #I also tried:
    #@setNLObjective( m, Max, -(d.n/2)*log(2π*σ^2)
    #    - sum{sum{(d.y[i]-(d.A[i,j]*b[j]))^2, i=1:d.n}, j=1:3}/(2σ^2) )
    #but unfortunately it returns:
    # slope: [10.21,18.89,15.88], variance: 54.78

    solve(m)
    println( getValue(b), " ", getValue(σ^2) )
end
ml()
Any ideas?
EDIT
As noted by Reza, a working example is:
@setNLObjective( m, Max, -(d.n/2)*log(2π*σ^2)
    - sum{(d.y[i]-sum{d.A[i,j]*b[j], j=1:3})^2,
          i=1:d.n}/(2σ^2) )
sum{} is a special syntax that only works inside JuMP macros, and is the preferred way to write sums.
So your example would be written as:
function ml( )
    m = Model( solver = NLoptSolver( algorithm = :LD_LBFGS ) )
    @variable( m, b[1:3] )
    @variable( m, σ >= 0, start = 1.0 )
    @NLobjective(m, Max,
        -(d.n/2)*log(2π*σ^2)
        - sum{
            (d.y[i] - sum{d.A[i,j]*b[j], j=1:3})^2,
            i=1:d.n}/(2σ^2) )
where I've expanded it across multiple lines to be as clear as possible.
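A small extension of the same idea (my sketch; the question's data happens to have 3 columns): letting the data drive the index ranges removes the hard-coded 3, so the model scales to any number of slope parameters:
@variable( m, b[1:size(d.A,2)] )
@NLobjective(m, Max,
    -(d.n/2)*log(2π*σ^2)
    - sum{
        (d.y[i] - sum{d.A[i,j]*b[j], j=1:size(d.A,2)})^2,
        i=1:d.n}/(2σ^2) )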
Reza's answer isn't technically wrong, but isn't idiomatic JuMP and won't be as efficient for larger models.
I didn't trace your code, but anyway, I hope the following works for you:
sum([(d.y[i]-sum([d.A[i,j]*b[j] for j=1:3]))^2 for i=1:d.n])
as @IainDunning mentioned, the JuMP package has a special syntax for summation inside its macros, so the more efficient and abstract way to do this is:
sum{(d.y[i]-sum{d.A[i,j]*b[j], j=1:3})^2, i=1:d.n}

(in R) Why is result of ksvm using user-defined linear kernel different from that of ksvm using "vanilladot"?

I wanted to use a user-defined kernel function for ksvm in R,
so as practice I tried to write a vanilladot kernel and compare it with the "vanilladot" that is built into "kernlab".
I wrote my kernel as follows.
#
###vanilla kernel with class "kernel"
#
kfunction.k <- function(){
    k <- function (x,y){crossprod(x,y)}
    class(k) <- "kernel"
    k
}
l<-0.1 ; C<-1/(2*l)
###use kfunction.k
tmp<-ksvm(x,factor(y),scaled=FALSE, type = "C-svc", kernel=kfunction.k(), C = C)
alpha(tmp)[[1]]
ind<-alphaindex(tmp)[[1]]
x.s<-x[ind,] ; y.s<-y[ind]
w.class.k<-t(alpha(tmp)[[1]]*y.s)%*%x.s
w.class.k
I thought the result of this operation would be equal to that of the following.
However, it isn't.
#
###use "vanilladot"
#
l<-0.1 ; C<-1/(2*l)
tmp1<-ksvm(x,factor(y),scaled=FALSE, type = "C-svc", kernel="vanilladot", C = C)
alpha(tmp1)[[1]]
ind1<-alphaindex(tmp1)[[1]]
x.s<-x[ind1,] ; y.s<-y[ind1]
w.tmp1<-t(alpha(tmp1)[[1]]*y.s)%*%x.s
w.tmp1
I think this problem may be related to the kernel class.
When the class is set to "kernel", this problem occurs.
However, when the class is set to "vanillakernel", the result of ksvm using the user-defined kernel is equal to that of ksvm using "vanilladot" built into kernlab.
#
###vanilla kernel with class "vanillakernel"
#
kfunction.v.k <- function(){
    k <- function (x,y){crossprod(x,y)}
    class(k) <- "vanillakernel"
    k
}
# The only difference between kfunction.k and kfunction.v.k is "class(k)".
l<-0.1 ; C<-1/(2*l)
###use kfunction.v.k
tmp<-ksvm(x,factor(y),scaled=FALSE, type = "C-svc", kernel=kfunction.v.k(), C = C)
alpha(tmp)[[1]]
ind<-alphaindex(tmp)[[1]]
x.s<-x[ind,] ; y.s<-y[ind]
w.class.v.k<-t(alpha(tmp)[[1]]*y.s)%*%x.s
w.class.v.k
I don't understand why the result is different from "vanilladot" when the class is set to "kernel".
Is there an error in my operation?
First, it seems like a really good question!
Now to the point. In the sources of ksvm we can find where the line is drawn between using a user-defined kernel and the built-ins:
if (type(ret) == "spoc-svc") {
if (!is.null(class.weights))
weightedC <- class.weights[weightlabels] * rep(C,
nclass(ret))
else weightedC <- rep(C, nclass(ret))
yd <- sort(y, method = "quick", index.return = TRUE)
xd <- matrix(x[yd$ix, ], nrow = dim(x)[1])
count <- 0
if (ktype == 4)
K <- kernelMatrix(kernel, x)
resv <- .Call("tron_optim", as.double(t(xd)), as.integer(nrow(xd)),
as.integer(ncol(xd)), as.double(rep(yd$x - 1,
2)), as.double(K), as.integer(if (sparse) xd@ia else 0),
as.integer(if (sparse) xd@ja else 0), as.integer(sparse),
as.integer(nclass(ret)), as.integer(count), as.integer(ktype),
as.integer(7), as.double(C), as.double(epsilon),
as.double(sigma), as.integer(degree), as.double(offset),
as.double(C), as.double(2), as.integer(0), as.double(0),
as.integer(0), as.double(weightedC), as.double(cache),
as.double(tol), as.integer(10), as.integer(shrinking),
PACKAGE = "kernlab")
reind <- sort(yd$ix, method = "quick", index.return = TRUE)$ix
alpha(ret) <- t(matrix(resv[-(nclass(ret) * nrow(xd) +
1)], nclass(ret)))[reind, , drop = FALSE]
coef(ret) <- lapply(1:nclass(ret), function(x) alpha(ret)[,
x][alpha(ret)[, x] != 0])
names(coef(ret)) <- lev(ret)
alphaindex(ret) <- lapply(sort(unique(y)), function(x)
which(alpha(ret)[,
x] != 0))
xmatrix(ret) <- x
obj(ret) <- resv[(nclass(ret) * nrow(xd) + 1)]
names(alphaindex(ret)) <- lev(ret)
svindex <- which(rowSums(alpha(ret) != 0) != 0)
b(ret) <- 0
param(ret)$C <- C
}
The important parts are two things. First, if we provide ksvm with our own kernel, then ktype = 4 (while for vanillakernel, ktype = 0), which makes two changes:
in the case of a user-defined kernel, the kernel matrix is computed up front instead of the kernel being used directly
the tron_optim routine is run with the information regarding the kernel
Now, in svm.cpp we can find the tron routines, and in tron_run (called from tron_optim) we see that the LINEAR kernel has a separate optimization routine:
if (param->kernel_type == LINEAR)
{
/* lots of code here */
while (Cpj < Cp)
{
totaliter += s.Solve(l, prob->x, minus_ones, y, alpha, w,
Cpj, Cnj, param->eps, sii, param->shrinking,
param->qpsize);
/* lots of code here */
}
totaliter += s.Solve(l, prob->x, minus_ones, y, alpha, w, Cp, Cn,
param->eps, sii, param->shrinking, param->qpsize);
delete[] w;
}
else
{
Solver_B s;
s.Solve(l, BSVC_Q(*prob,*param,y), minus_ones, y, alpha, Cp, Cn,
param->eps, sii, param->shrinking, param->qpsize);
}
As you can see, the linear case is treated in a more complex, more detailed way. There is an inner optimization loop calling the solver many times. It would require a really deep analysis of the actual optimization being performed here, but at this step one can answer your question as follows:
There is no error in your operation
kernlab's SVM has a separate routine for training an SVM with a linear kernel, selected based on the type of kernel passed to the code; changing "kernel" to "vanillakernel" made ksvm think it was actually working with vanillakernel, and so it performed this separate optimization routine
It does not in fact seem to be a bug, as a linear SVM is genuinely very different from the kernelized version in terms of efficient optimization techniques. The number of heuristics as well as numerical issues that have to be taken care of is really big. As a result, some approximations are required and can lead to different results. While for a rich feature space (like those induced by the RBF kernel) it should not really matter, for simple kernels like linear ones these simplifications can lead to significant output changes.
