Custom Loss Function in Flux.jl

I am trying to implement a model with a custom loss function in the Flux.jl package. I include the code for a simplified model below, but the error stays the same.
I have an interpolator that takes a scalar value and returns a 2x2 matrix. The goal of my model is to use 3 observations to find the best point at which to evaluate the interpolator. For this I wrote a custom loss function that computes the suggested evaluation point and evaluates the interpolator at this point. The interpolated result is then compared to the true solution from the dataset.
using Flux, Zygote
using LinearAlgebra
using Interpolations
##
# create interpolator
x = LinRange(0,1,10)
y = [rand(2,2) for i in 1:10]
itp = interpolate(y, BSpline(Linear())) |> i -> scale(i, x)
# create training set
training_set = [(rand(3), rand(2,2)) for i in 0:0.2:1]
#build the model
model = Chain(Dense(3,1),i-> clamp(i[1],0,1))
opt = Descent()
ps = Flux.params(model)
function loss(evaluation_point, solution)
    interpolated = itp(model(evaluation_point))
    return norm(interpolated - solution)
end
# training NOK
n_epochs = 100
for epoch in 1:n_epochs
    Flux.train!(loss, ps, training_set, opt)
    println(sum([loss(i[1], i[2]) for i in training_set]))
end
This returns the following error:
ERROR: DimensionMismatch("matrix A has dimensions (2,2), vector B has length 1")
Stacktrace:
[1] generic_matvecmul!(C::Vector{Matrix{Float64}}, tA::Char, A::Matrix{Float64}, B::StaticArrays.SVector{1, Matrix{Float64}}, _add::LinearAlgebra.MulAddMul{true, true, Bool, Bool})
@ LinearAlgebra C:\Users\thega\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\LinearAlgebra\src\matmul.jl:713
[2] mul!
@ C:\Users\thega\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\LinearAlgebra\src\matmul.jl:81 [inlined]
[3] mul!
@ C:\Users\thega\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\LinearAlgebra\src\matmul.jl:275 [inlined]
[4] *
@ C:\Users\thega\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\LinearAlgebra\src\matmul.jl:51 [inlined]
[5] interpolate_pullback
@ C:\Users\thega\.julia\packages\Interpolations\Glp9h\src\chainrules\chainrules.jl:13 [inlined]
[6] ZBack
@ C:\Users\thega\.julia\packages\Zygote\H6vD3\src\compiler\chainrules.jl:204 [inlined]
[7] Pullback
@ c:\Users\thega\Desktop\Question\main.jl:21 [inlined]
[8] (::typeof(∂(loss)))(Δ::Float64)
@ Zygote C:\Users\thega\.julia\packages\Zygote\H6vD3\src\compiler\interface2.jl:0
[9] #212
@ C:\Users\thega\.julia\packages\Zygote\H6vD3\src\lib\lib.jl:203 [inlined]
[10] #1750#back
@ C:\Users\thega\.julia\packages\ZygoteRules\AIbCs\src\adjoint.jl:67 [inlined]
[11] Pullback
@ C:\Users\thega\.julia\packages\Flux\0c9kI\src\optimise\train.jl:102 [inlined]
[12] (::typeof(∂(λ)))(Δ::Float64)
@ Zygote C:\Users\thega\.julia\packages\Zygote\H6vD3\src\compiler\interface2.jl:0
[13] (::Zygote.var"#93#94"{Params, typeof(∂(λ)), Zygote.Context})(Δ::Float64)
@ Zygote C:\Users\thega\.julia\packages\Zygote\H6vD3\src\compiler\interface.jl:357
[14] gradient(f::Function, args::Params)
@ Zygote C:\Users\thega\.julia\packages\Zygote\H6vD3\src\compiler\interface.jl:76
[15] macro expansion
@ C:\Users\thega\.julia\packages\Flux\0c9kI\src\optimise\train.jl:101 [inlined]
[16] macro expansion
@ C:\Users\thega\.julia\packages\Juno\n6wyj\src\progress.jl:134 [inlined]
[17] train!(loss::Function, ps::Params, data::Vector{Tuple{Vector{Float64}, Matrix{Float64}}}, opt::Descent; cb::Flux.Optimise.var"#40#46")
@ Flux.Optimise C:\Users\thega\.julia\packages\Flux\0c9kI\src\optimise\train.jl:99
[18] train!(loss::Function, ps::Params, data::Vector{Tuple{Vector{Float64}, Matrix{Float64}}}, opt::Descent)
@ Flux.Optimise C:\Users\thega\.julia\packages\Flux\0c9kI\src\optimise\train.jl:97
[19] top-level scope
@ c:\Users\thega\Desktop\Question\main.jl:28
So there is some dimension mismatch, yet evaluating the loss function itself works fine:
loss(training_set[1][1], training_set[1][2])
I played around a bit and found that the problem is the gradient computation:
gradient(loss, training_set[1][1], training_set[1][2])
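A minimal check one could run (a sketch; it assumes the failure sits in the Interpolations.jl pullback for matrix-valued knots rather than in the model) is to differentiate through the interpolator alone, using the itp defined above:
# If this call already throws the DimensionMismatch, the model and Flux.train! are not the culprits.
Zygote.gradient(t -> norm(itp(t)), 0.5)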

The problem can be seen in the shapes of your training set. With your provided example, check this out:
julia> training_set[1][1]
3-element Vector{Float64}:
0.5093876656425886
0.05272770864628318
0.7651982428671759
julia> training_set[1][2]
2×2 Matrix{Float64}:
0.0691616 0.55414
0.5153 0.654379
So for the model, the input x is a 3-element vector, and the model should learn to return a 2x2 matrix as y. However, your model does not do that:
julia> model(training_set[1][1])
0.6585413f0 (tracked)
It returns only a single value, because of how the model is defined: model = Chain(Dense(3,1), i -> clamp(i[1],0,1)) is just a Chain(Dense(3, 1), #7), which takes a 3-element vector as input and returns 1 (and only 1) value.
Solutions:
redefine your y as a 1-element output for each x, or
redefine your model (it will be more complicated, since you want to return a 2x2 matrix). An example of such a model would be the following:
julia> model = Chain(
Dense(3, 4),
x -> reshape(x, (2, 2))
)
But then you would still have to figure out how to adapt your interpolation code to work with it.
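If you go that second route, here is a minimal sketch of how the reshaped model could be trained (it bypasses the interpolator entirely and fits the 2x2 targets directly; the mse loss and the Descent optimizer are illustrative assumptions, and training_set is the one defined in the question):
using Flux

model = Chain(Dense(3, 4), x -> reshape(x, (2, 2)))   # model from above
loss(x, y) = Flux.mse(model(x), y)                    # compare the 2x2 output to the 2x2 target

ps  = Flux.params(model)
opt = Descent()
Flux.train!(loss, ps, training_set, opt)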

I was not able to fix the problem. My guess is that Interpolations.jl is not compatible with Zygote.jl. A possible workaround I found was writing a custom interpolator struct and function. I include a working example in case anyone is interested:
using Flux, Zygote
using LinearAlgebra
using Interpolations
# create a custom linear spline interpolator
struct CustomInterpolator
    x::Vector
    y::Vector
    function CustomInterpolator(x, y)
        @assert issorted(x)
        return new(x, y)
    end
end

function custom_interpolate(citp::CustomInterpolator, x::Number)
    left_value, right_value = 0, 0
    left_index, right_index = 1, 1
    # check bounds
    if x > citp.x[end] || x < citp.x[1]
        throw(DomainError(x, "Out of bounds"))
    end
    # find the neighbouring indices
    for (i, v) in enumerate(citp.x)
        if v > x
            right_value = v
            right_index = i
            break
        end
        left_value = v
        left_index = i
    end
    # do a linear interpolation between the two selected grid points
    w = (x - left_value) / (right_value - left_value)
    interpolated_value = (1 - w) * citp.y[left_index] + w * citp.y[right_index]
    return interpolated_value
end
##
# create custom interpolator
x = LinRange(0,1,2)
y = [zeros(2,2), ones(2,2)]
citp = CustomInterpolator(x,y)
# create training set
training_set = [(ones(3)*i, ones(2,2) - i*ones(2,2)) for i in 0:0.2:1]
#build the model
model = Chain(Dense(3,3), Dense(3,1), i-> clamp(i[1],0,1), i->custom_interpolate(citp,i))
opt = ADAM()
ps = Flux.params(model)
loss(x,y) = Flux.mse(model(x), y)
# training
n_epochs = 1000
for epoch in 1:n_epochs
    Flux.train!(loss, ps, training_set, opt)
    println(sum([loss(i[1], i[2]) for i in training_set]))
end
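As a quick sanity check that gradients actually flow through the hand-rolled interpolator (a sketch reusing citp, loss, ps, and training_set from above):
# Gradient of the interpolated value with respect to the query point;
# this should return a gradient tuple rather than throwing.
Zygote.gradient(t -> sum(custom_interpolate(citp, t)), 0.3)

# Gradient of the loss with respect to the model parameters, as Flux.train! computes it.
Zygote.gradient(() -> loss(training_set[1][1], training_set[1][2]), ps)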

Related

Solving a heat equation using Julia

I am a new user of Julia and I want to use it for solving PDEs and ODEs numerically. I am trying to run examples that are available on the Julia website or GitHub, but I get an error.
For instance I want to run this example:
using OrdinaryDiffEq, ModelingToolkit, DiffEqOperators
# Method of Manufactured Solutions: exact solution
u_exact = (x,t) -> exp.(-t) * cos.(x)
# Parameters, variables, and derivatives
@parameters t x
@variables u(..)
Dt = Differential(t)
Dxx = Differential(x)^2
# 1D PDE and boundary conditions
eq = Dt(u(t,x)) ~ Dxx(u(t,x))
bcs = [u(0,x) ~ cos(x),
u(t,0) ~ exp(-t),
u(t,1) ~ exp(-t) * cos(1)]
# Space and time domains
domains = [t ∈ IntervalDomain(0.0,1.0),
x ∈ IntervalDomain(0.0,1.0)]
# PDE system
pdesys = PDESystem(eq,bcs,domains,[t,x],[u(t,x)])
# Method of lines discretization
dx = 0.1
order = 2
discretization = MOLFiniteDifference([x=>dx],t)
# Convert the PDE problem into an ODE problem
prob = discretize(pdesys,discretization)
# Solve ODE problem
using OrdinaryDiffEq
sol = solve(prob,Tsit5(),saveat=0.2)
# Plot results and compare with exact solution
x = (0:dx:1)[2:end-1]
t = sol.t
using Plots
plt = plot()
for i in 1:length(t)
    plot!(x, sol.u[i], label="Numerical, t=$(t[i])")
    scatter!(x, u_exact(x, t[i]), label="Exact, t=$(t[i])")
end
display(plt)
savefig("plot.png")
But I get this error:
UndefKeywordError: keyword argument name not assigned
Stacktrace:
[1] PDESystem(eqs::Equation, bcs::Vector{Equation}, domain::Vector{Symbolics.VarDomainPairing}, ivs::Vector{Num}, dvs::Vector{Num}, ps::SciMLBase.NullParameters) (repeats 2 times)
@ ModelingToolkit C:\Users\rm18124\.julia\packages\ModelingToolkit\57XKa\src\systems\pde\pdesystem.jl:75
[2] top-level scope
@ In[32]:22
[3] eval
@ .\boot.jl:373 [inlined]
[4] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
@ Base .\loading.jl:1196
I double checked the PDESystem and it looks fine, any help please?
Thanks
You forgot to ensure the pdesys was named as in the docs, i.e. @named pdesys = PDESystem(eq,bcs,domains,[t,x],[u(t,x)])

How can I solve easy linear equations with minimization problem in R?

I have the following Inputs:
Inputs <- seq(2,7.7,0.3)
Weights <- paste("w",sep="_",seq(1:20))
And the following equations:
sum(Weights * Inputs) == 4.8
sum(Weights) == 1
min(sum(Weights^2))
Can someone explain how I get a solution for Weights? Thanks!
You can use the optim function. This relies on being able to specify a function that produces a single scalar output which is minimized when the conditions are met. In your case, the function might look like this:
constraints <- function(W) (sum(W * Inputs) - 4.8)^2 + (sum(W) - 1)^2
So to solve it we can do:
Weights <- optim(rep(0.05, 20), constraints, method = "BFGS")$par
Which gives us the following result:
Weights
#> [1] 0.04981143 0.04978314 0.04975486 0.04972657 0.04969828 0.04967000 0.04964171
#> [8] 0.04961343 0.04958514 0.04955685 0.04952857 0.04950028 0.04947200 0.04944371
#> [15] 0.04941543 0.04938714 0.04935885 0.04933057 0.04930228 0.04927400
sum(Weights * Inputs)
#> [1] 4.8
sum(Weights)
#> [1] 0.9908542
Obviously, this is a numeric optimization with a 20-dimensional input, so it doesn't perfectly converge to a sum of 1 with the given starting values.

Optimization function gives incorrect results for 2 similar data sets

I have 2 datasets that are not very different from each other. Each dataset has 27 rows of actual and forecast values. When tested against Solver in Excel for minimization of the absolute error, abs(actual - par * forecast), they both give nearly equal values for the parameter 'par'. However, when each of these data sets is passed to the same optimization function that I have written, it only works for one of them. For the other data set, the objective always evaluates to zero (0), with 'par' assigned the upper bound value.
This is definitely incorrect. What I am not able to understand is why R is doing so.
Here are the 2 data sets :-
test
dateperiod,usage,fittedlevelusage
2019-04-13,16187.24,17257.02
2019-04-14,16410.18,17347.49
2019-04-15,18453.52,17246.88
2019-04-16,18113.1,17929.24
2019-04-17,17712.54,17476.67
2019-04-18,15098.13,17266.89
2019-04-19,13026.76,15298.11
2019-04-20,13689.49,13728.9
2019-04-21,11907.81,14122.88
2019-04-22,13078.29,13291.25
2019-04-23,15823.23,14465.34
2019-04-24,14602.43,15690.12
2019-04-25,12628.7,13806.44
2019-04-26,15064.37,12247.59
2019-04-27,17163.32,16335.43
2019-04-28,17277.18,16967.72
2019-04-29,20093.13,17418.99
2019-04-30,18820.68,18978.9
2019-05-01,18799.63,17610.66
2019-05-02,17783.24,17000.12
2019-05-03,17965.56,17818.84
2019-05-04,16891.25,18002.03
2019-05-05,18665.49,18298.02
2019-05-06,21043.86,19157.41
2019-05-07,22188.93,21092.36
2019-05-08,22358.08,21232.56
2019-05-09,22797.46,22229.69
Optimization result from R
$minimum
[1] 1.018188
$objective
[1] 28031.49
test1
dateperiod,Usage,fittedlevelusage
2019-04-13,16187.24,17248.29
2019-04-14,16410.18,17337.86
2019-04-15,18453.52,17196.25
2019-04-16,18113.10,17896.74
2019-04-17,17712.54,17464.45
2019-04-18,15098.13,17285.82
2019-04-19,13026.76,15277.10
2019-04-20,13689.49,13733.90
2019-04-21,11907.81,14152.27
2019-04-22,13078.29,13337.53
2019-04-23,15823.23,14512.41
2019-04-24,14602.43,15688.68
2019-04-25,12628.70,13808.58
2019-04-26,15064.37,12244.91
2019-04-27,17163.32,16304.28
2019-04-28,17277.18,16956.91
2019-04-29,20093.13,17441.80
2019-04-30,18820.68,18928.29
2019-05-01,18794.10,17573.40
2019-05-02,17779.00,16969.20
2019-05-03,17960.16,17764.47
2019-05-04,16884.77,17952.23
2019-05-05,18658.16,18313.66
2019-05-06,21036.49,19149.12
2019-05-07,22182.11,21103.37
2019-05-08,22335.57,21196.23
2019-05-09,22797.46,22180.51
Optimization result from R
$minimum
[1] 1.499934
$objective
[1] 0
The optimization function used is shown below :-
optfn <- function(x) {
  act <- x$usage
  fcst <- x$fittedlevelusage
  fn <- function(par) {
    sum(abs(act - (fcst * par)))
  }
  adjfac <- optimize(fn, c(0.5, 1.5))
  return(adjfac)
}
adjfacresults <- optfn(test)
adjfacresults <- optfn(test1)
Optimization result from R
adjfacresults <- optfn(test)
$minimum
[1] 1.018188
$objective
[1] 28031.49
Optimization result from R
adjfacresults <- optfn(test1)
$minimum
[1] 1.499934
$objective
[1] 0
Can anyone help identify why R is not doing the same process over the 2 data sets and outputting the correct results in both cases?
The corresponding results using Excel Solver for the 2 datasets are as follows :-
For 'test' data set
par value = 1.018236659
objective function value (min): 28031
For 'test1' data set
par value = 1.01881062927878
objective function value (min): 28010
Best regards
Deepak
That's because the second column of test1 is named Usage, not usage. Therefore, act = x$usage is NULL, and the function fn returns sum(abs(NULL - something)) = sum(NULL) = 0. You have to rename this column to usage.

Using for loop variable to access element in array yielding NA in R

I'm using a nested for loop to create a greedy algorithm in R.
z = 0
for (j in 1:length(t))
  for (i in 1:(length(t) - j))
    if ((t[j + i] - t[j]) >= 30) {
      z <- c(z, j + i - 1)
      j <- j + i - 1
      break
    }
z
Where t is a vector such as:
[1] 12.01485 26.94091 33.32458 49.46742 65.07425 76.05700
[7] 87.11043 100.64116 111.72977 125.72649 139.46460 153.67292
[13] 171.46393 184.54244 201.20850 214.05093 224.16196 237.12485
[19] 251.51753 258.45865 273.95466 285.42704 299.01869 312.35587
[25] 326.26289 339.78724 353.81854 363.15847 378.89307 390.66134
[31] 402.22007 412.86049 424.23181 438.50462 448.88005 462.59917
[37] 473.65289 487.20678 499.80053 509.14141 526.03873 540.17209
[43] 550.69941 565.74602 576.06882 589.07297 598.53208 614.20677
[49] 627.44605 648.08346 665.49614 681.46445 691.01806 704.05762
[55] 714.09172 732.04124 745.90960 758.52628 769.80519 779.41537
[61] 788.35732 805.78547 818.75262 832.71196 844.97859 856.08608
[67] 865.72998 875.55945 887.20862 900.00000
The goal for the function is to find the indexes whose differences are as close to 30 as possible and save them in z.
For example, with the vector t provided, I would expect z to be [0, 2, 4, 6, 8, 10,...70]
The functionality is not my concern right now, as I am running into the error:
Error in if ((t[j + i] - t[j]) >= 30) { :
missing value where TRUE/FALSE needed
I'm new to R so I know I'm not utilizing the vectorization that R is known for. I simply want to have 'j' and 'i' as "counter variables" that I can use to access specific elements of vector t, but for a reason unknown to me, the if statement is not yielding a T/F value.
Any suggestions?
I know you want to learn how to use a for-loop, but it is difficult to help you because you did not provide a reproducible example. On the other hand, many functions in R are vectorized, meaning that you can often avoid a for-loop and achieve the same task more efficiently.
Based on the description in your post, "The goal for the function is to find the indexes whose differences are as close to 30 as possible and save them in z.", I provide the following example that addresses your question without a for-loop.
z <- which.min(abs(diff(vec) - 30))
z
# [1] 49
vec[c(z, z + 1)]
# [1] 627.4461 648.0835
Based on the data you provided, the index where the difference to the next number is closest to 30 is 49. The numbers are 627.4461 and 648.0835.
Data
vec <- c("12.01485 26.94091 33.32458 49.46742 65.07425 76.05700 87.11043
100.64116 111.72977 125.72649 139.46460 153.67292 171.46393
184.54244 201.20850 214.05093 224.16196 237.12485 251.51753
258.45865 273.95466 285.42704 299.01869 312.35587 326.26289
339.78724 353.81854 363.15847 378.89307 390.66134 402.22007
412.86049 424.23181 438.50462 448.88005 462.59917 473.65289
487.20678 499.80053 509.14141 526.03873 540.17209 550.69941
565.74602 576.06882 589.07297 598.53208 614.20677 627.44605
648.08346 665.49614 681.46445 691.01806 704.05762 714.09172
732.04124 745.90960 758.52628 769.80519 779.41537 788.35732
805.78547 818.75262 832.71196 844.97859 856.08608 865.72998
875.55945 887.20862 900.00000")
vec <- strsplit(vec, split = "\\s+")[[1]]
vec <- as.numeric(grep("[0-9]+\\.[0-9]+", vec, value = TRUE))

Julia: Cannot `convert` an object of type Array{Number,1} to an object of type GLM.LmResp

I am building a DataFrame row by row and then running a regression on it. For simplicity, the code is:
using DataFrames
using GLM
df = DataFrame(response = Number[])
for i in 1:10
    df = vcat(df, DataFrame(response = rand()))
end
fit(LinearModel, @formula(response ~ 1), df)
I get the error:
ERROR: LoadError: MethodError: Cannot `convert` an object of type Array{Number,1} to an object of type GLM.LmResp
This may have arisen from a call to the constructor GLM.LmResp(...),
since type constructors fall back to convert methods.
Stacktrace:
[1] fit(::Type{GLM.LinearModel}, ::Array{Float64,2}, ::Array{Number,1}) at ~/.julia/v0.6/GLM/src/lm.jl:140
[2] #fit#44(::Dict{Any,Any}, ::Array{Any,1}, ::Function, ::Type{GLM.LinearModel}, ::StatsModels.Formula, ::DataFrames.DataFrame) at ~/.julia/v0.6/StatsModels/src/statsmodel.jl:72
[3] fit(::Type{GLM.LinearModel}, ::StatsModels.Formula, ::DataFrames.DataFrame) at ~/.julia/v0.6/StatsModels/src/statsmodel.jl:66
[4] include_from_node1(::String) at ./loading.jl:576
[5] include(::String) at ./sysimg.jl:14
while loading ~/test.jl, in expression starting on line 10
The call to the linear regression is very similar to regression in "Introducing Julia":
linearmodel = fit(LinearModel, @formula(Y1 ~ X1), anscombe)
What is the problem?
After a few hours, I realized that GLM requires concrete types and Number is an abstract type (even though the documentation for GLM.LmResp says little about this at the time of this writing, only "Encapsulates the response for a linear model"). The solution is to change the declaration to a concrete type, such as Float64:
using DataFrames
using GLM
df = DataFrame(response = Float64[])
for i in 1:10
    df = vcat(df, DataFrame(response = rand()))
end
fit(LinearModel, @formula(response ~ 1), df)
Output:
StatsModels.DataFrameRegressionModel{GLM.LinearModel{GLM.LmResp{Array{Float64,1}},GLM.DensePredChol{Float64,Base.LinAlg.Cholesky{Float64,Array{Float64,2}}}},Array{Float64,2}}
Formula: response ~ +1
Coefficients:
Estimate Std.Error t value Pr(>|t|)
(Intercept) 0.408856 0.0969961 4.21518 0.0023
The type has to be concrete; for example, the abstract type Real with df = DataFrame(response = Real[]) also fails, but with a more helpful error message:
ERROR: LoadError: `float` not defined on abstractly-typed arrays; please convert to a more specific type
Alternatively, you can convert to Real after building the dataframe:
using DataFrames
using GLM
df = DataFrame(response = Number[])
for i in 1:10
    df = vcat(df, DataFrame(response = rand()))
end
df2 = DataFrame(response = map(Real, df[:response]))
fit(LinearModel, @formula(response ~ 1), df2)
This works because converting to Real actually converts to Float64:
julia> typeof(df2[:response])
Array{Float64,1}
I filed an issue with GLM to improve the error message.
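An equivalent route (a sketch; it assumes the same abstractly-typed df built above) is to convert the column in place before fitting, instead of building df2:
# Replace the Vector{Number} column with a concrete Vector{Float64}, then fit as usual.
df[:response] = Float64.(df[:response])
fit(LinearModel, @formula(response ~ 1), df)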
