How can I solve easy linear equations with a minimization problem in R?

I have the following Inputs:
Inputs <- seq(2,7.7,0.3)
Weights <- paste("w", 1:20, sep = "_")
And the following equations:
sum(Weights * Inputs) == 4.8
sum(Weights) == 1
min(sum(Weights^2))
Can someone explain how I get a solution for Weights? Thanks!

You can use the optim function. This relies on being able to specify a function that produces a single scalar output which is minimized when the conditions are met. In your case, the function might look like this:
constraints <- function(W) (sum(W * Inputs) - 4.8)^2 + (sum(W) - 1)^2
So to solve it we can do:
Weights <- optim(rep(0.05, 20), constraints, method = "BFGS")$par
Which gives us the following result:
Weights
#> [1] 0.04981143 0.04978314 0.04975486 0.04972657 0.04969828 0.04967000 0.04964171
#> [8] 0.04961343 0.04958514 0.04955685 0.04952857 0.04950028 0.04947200 0.04944371
#> [15] 0.04941543 0.04938714 0.04935885 0.04933057 0.04930228 0.04927400
sum(Weights * Inputs)
#> [1] 4.8
sum(Weights)
#> [1] 0.9908542
Obviously, this is a numeric optimization with a 20-dimensional input, so it doesn't perfectly converge to a sum of 1 with the given starting values.
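Since both constraints are linear, the minimum of sum(W^2) subject to them also has an exact closed form (the minimum-norm solution of an underdetermined linear system), which avoids the convergence issue entirely. A sketch in base R:
# Exact minimum-norm solution: min sum(W^2) subject to A %*% W == b
# is W = t(A) %*% solve(A %*% t(A), b) when A has full row rank.
A <- rbind(Inputs, rep(1, length(Inputs)))
b <- c(4.8, 1)
W <- drop(t(A) %*% solve(A %*% t(A), b))
sum(W * Inputs)  # 4.8, exactly
sum(W)           # 1, exactly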

Related

Solving a heat equation using Julia

I am a new user of Julia and I want to use it for solving PDEs and ODEs numerically. I am trying to run examples that are available on the Julia website or GitHub, but I get an error.
For instance I want to run this example:
using OrdinaryDiffEq, ModelingToolkit, DiffEqOperators
# Method of Manufactured Solutions: exact solution
u_exact = (x,t) -> exp.(-t) * cos.(x)
# Parameters, variables, and derivatives
@parameters t x
@variables u(..)
Dt = Differential(t)
Dxx = Differential(x)^2
# 1D PDE and boundary conditions
eq = Dt(u(t,x)) ~ Dxx(u(t,x))
bcs = [u(0,x) ~ cos(x),
u(t,0) ~ exp(-t),
u(t,1) ~ exp(-t) * cos(1)]
# Space and time domains
domains = [t ∈ IntervalDomain(0.0,1.0),
x ∈ IntervalDomain(0.0,1.0)]
# PDE system
pdesys = PDESystem(eq,bcs,domains,[t,x],[u(t,x)])
# Method of lines discretization
dx = 0.1
order = 2
discretization = MOLFiniteDifference([x=>dx],t)
# Convert the PDE problem into an ODE problem
prob = discretize(pdesys,discretization)
# Solve ODE problem
using OrdinaryDiffEq
sol = solve(prob,Tsit5(),saveat=0.2)
# Plot results and compare with exact solution
x = (0:dx:1)[2:end-1]
t = sol.t
using Plots
plt = plot()
for i in 1:length(t)
    plot!(x, sol.u[i], label="Numerical, t=$(t[i])")
    scatter!(x, u_exact(x, t[i]), label="Exact, t=$(t[i])")
end
display(plt)
savefig("plot.png")
But I get this error:
UndefKeywordError: keyword argument name not assigned
Stacktrace:
[1] PDESystem(eqs::Equation, bcs::Vector{Equation}, domain::Vector{Symbolics.VarDomainPairing}, ivs::Vector{Num}, dvs::Vector{Num}, ps::SciMLBase.NullParameters) (repeats 2 times)
  @ ModelingToolkit C:\Users\rm18124\.julia\packages\ModelingToolkit\57XKa\src\systems\pde\pdesystem.jl:75
[2] top-level scope
  @ In[32]:22
[3] eval
  @ .\boot.jl:373 [inlined]
[4] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
  @ Base .\loading.jl:1196
I double-checked the PDESystem and it looks fine. Any help, please?
Thanks
You forgot to name the pdesys as in the docs, i.e. @named pdesys = PDESystem(eq,bcs,domains,[t,x],[u(t,x)])

Optimization function gives incorrect results for 2 similar data sets

I have 2 datasets that are not very different from each other. Each dataset has 27 rows of actual and forecast values. When tested against Solver in Excel for minimization of the absolute error, abs(actual - par * forecast), they both give nearly equal values for the parameter 'par'. However, when each of these data sets is passed to the same optimization function that I have written, it only works for one of them. For the other data set, the objective always evaluates to zero (0), with 'par' assigned the upper bound value.
This is definitely incorrect. What I am not able to understand is why R is doing so.
Here are the 2 data sets :-
test
dateperiod,usage,fittedlevelusage
2019-04-13,16187.24,17257.02
2019-04-14,16410.18,17347.49
2019-04-15,18453.52,17246.88
2019-04-16,18113.1,17929.24
2019-04-17,17712.54,17476.67
2019-04-18,15098.13,17266.89
2019-04-19,13026.76,15298.11
2019-04-20,13689.49,13728.9
2019-04-21,11907.81,14122.88
2019-04-22,13078.29,13291.25
2019-04-23,15823.23,14465.34
2019-04-24,14602.43,15690.12
2019-04-25,12628.7,13806.44
2019-04-26,15064.37,12247.59
2019-04-27,17163.32,16335.43
2019-04-28,17277.18,16967.72
2019-04-29,20093.13,17418.99
2019-04-30,18820.68,18978.9
2019-05-01,18799.63,17610.66
2019-05-02,17783.24,17000.12
2019-05-03,17965.56,17818.84
2019-05-04,16891.25,18002.03
2019-05-05,18665.49,18298.02
2019-05-06,21043.86,19157.41
2019-05-07,22188.93,21092.36
2019-05-08,22358.08,21232.56
2019-05-09,22797.46,22229.69
Optimization result from R
$minimum
[1] 1.018188
$objective
[1] 28031.49
test1
dateperiod,Usage,fittedlevelusage
2019-04-13,16187.24,17248.29
2019-04-14,16410.18,17337.86
2019-04-15,18453.52,17196.25
2019-04-16,18113.10,17896.74
2019-04-17,17712.54,17464.45
2019-04-18,15098.13,17285.82
2019-04-19,13026.76,15277.10
2019-04-20,13689.49,13733.90
2019-04-21,11907.81,14152.27
2019-04-22,13078.29,13337.53
2019-04-23,15823.23,14512.41
2019-04-24,14602.43,15688.68
2019-04-25,12628.70,13808.58
2019-04-26,15064.37,12244.91
2019-04-27,17163.32,16304.28
2019-04-28,17277.18,16956.91
2019-04-29,20093.13,17441.80
2019-04-30,18820.68,18928.29
2019-05-01,18794.10,17573.40
2019-05-02,17779.00,16969.20
2019-05-03,17960.16,17764.47
2019-05-04,16884.77,17952.23
2019-05-05,18658.16,18313.66
2019-05-06,21036.49,19149.12
2019-05-07,22182.11,21103.37
2019-05-08,22335.57,21196.23
2019-05-09,22797.46,22180.51
Optimization result from R
$minimum
[1] 1.499934
$objective
[1] 0
The optimization function used is shown below :-
optfn <- function(x) {
  act <- x$usage
  fcst <- x$fittedlevelusage
  fn <- function(par) {
    sum(abs(act - (fcst * par)))
  }
  adjfac <- optimize(fn, c(0.5, 1.5))
  return(adjfac)
}
adjfacresults <- optfn(test)
adjfacresults <- optfn(test1)
Optimization result from R
adjfacresults <- optfn(test)
$minimum
[1] 1.018188
$objective
[1] 28031.49
Optimization result from R
adjfacresults <- optfn(test1)
$minimum
[1] 1.499934
$objective
[1] 0
Can anyone help identify why R is not doing the same process over the 2 data sets and outputting the correct results in both cases?
The corresponding results using Excel Solver for the 2 datasets are as follows :-
For 'test' data set
par value = 1.018236659
objective function value (min): 28031
For 'test1' data set
par value = 1.01881062927878
objective function value (min): 28010
Best regards
Deepak
That's because the second column of test1 is named Usage, not usage. Therefore, act <- x$usage is NULL, and fn returns sum(abs(NULL - fcst * par)), which is sum(numeric(0)) = 0 for every value of par, so optimize just drifts to the upper bound. You have to rename this column to usage.
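If the capitalization of the columns can vary, a defensive sketch (assuming the columns are always some casing of usage and fittedlevelusage) would normalize the names and fail fast otherwise:
optfn <- function(x) {
  names(x) <- tolower(names(x))  # "Usage" -> "usage"
  stopifnot(all(c("usage", "fittedlevelusage") %in% names(x)))
  act <- x$usage
  fcst <- x$fittedlevelusage
  fn <- function(par) sum(abs(act - fcst * par))
  optimize(fn, c(0.5, 1.5))
}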

How to adjust an odd-behaving Hessian to calculate standard errors with optim

I am using a Kalman filter to estimate various Dynamic and Arbitrage-Free Nelson-Siegel models for yield curves. I give some starting values to optim and the algorithm converges just fine. However, when I want to calculate standard errors using the Hessian supplied by optim, I get NaNs due to nonpositive values on the diagonal of the variance-covariance matrix. I think it is because I have a highly nonlinear function with many local optima, but it keeps happening for all starting values I try.
The function I use is optim together with the default Nelder-Mead algorithm.
The command I use is
opt_para <- optim(par = par0, fn = Kalman_filter, y = y,
                  maturities = maturities, control = list(maxit = 20000),
                  hessian = TRUE)
The starting values are given in par0, which is
> par0
[1] 9.736930e-01 1.046646e+00 5.936238e-01 4.444669e-02 2.889251e-07 6.646960e+00 7.715964e-01 9.945551e-01 9.663361e-01
[10] 6.000000e-01 6.000000e-01 6.000000e-01 6.000000e-02 5.000000e-01 5.000000e-01 5.000000e-01 5.000000e-01
The optim output that I get is
$par
[1] 0.833208307 1.373442068 0.749313983 0.646577154 0.237102069 6.882644818 0.788775982 0.918378263 0.991982038
[10] 0.748509055 0.005115171 0.392213941 0.717186499 0.121525623 0.386227284 0.001970431 0.845279611
$value
[1] 575.7886
$counts
function gradient
5225 NA
$convergence
[1] 0
$message
NULL
I then use the following command to produce the standard errors of the estimates.
hessian <- opt_para$hessian
fish_info <- solve(hessian, tol = 1e-100)
st_errors <- sqrt(diag(fish_info))
st_errors
I get the following output
st_errors
[1] NaN NaN 2.9170315888 NaN NaN NaN 0.0294300357 0.0373614751 NaN
[10] 0.0785349634 0.0005656580 NaN 0.0470600219 0.0053255251 0.0408666177 0.0001561243 0.4540428740
The NaNs are being produced by negative values on the diagonal, which should be impossible in a variance-covariance matrix. However, I suspect it is due to the optimization procedure not being correct.
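One quick way to confirm this diagnosis (a sketch): inspect the eigenvalues of the Hessian that optim returned. If any are zero or negative, the matrix is not positive definite and its inverse cannot be a valid variance-covariance matrix:
# Nonpositive eigenvalues here imply solve(hessian) is not a valid
# variance-covariance matrix.
eigen(opt_para$hessian, symmetric = TRUE, only.values = TRUE)$values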
To be clear, I also include the function I want to optimize. It is a Kalman filter with updating equations and some restrictions built in.
Kalman_filter <- function(par, y, maturities) {
  b0 <- c(par[1], par[2], par[3])
  P0 <- diag(c(par[4], par[5], par[6]))
  Phi <- diag(c(par[7], par[8], par[9]))
  mu <- c(par[10], par[11], par[12])
  lambda <- par[13]
  sigma11 <- par[14]
  sigma21 <- par[15]
  sigma22 <- par[16]
  sigma33 <- par[17]
  m <- length(b0)
  n <- nrow(y)
  d <- ncol(y)
  sigma_eps <- sigma11 * diag(d)
  sigma_nu <- diag(c(sigma21^2, sigma22^2, sigma33^2)) * (1/12)
  colnames(sigma_nu) <- c("level", "slope", "Curvat")
  X <- matrix(cbind(rep(1, length(maturities)),
                    slope_factor(lambda, maturities),
                    curv_factor(lambda, maturities)), ncol = 3)
  colnames(X) <- c("level", "slope", "Curvature")
  bt <- matrix(NA, nrow = m, ncol = n + 1)
  Pt <- array(NA, dim = c(m, m, n + 1))
  btt <- matrix(NA, nrow = m, ncol = n + 1)
  Ptt <- array(NA, dim = c(m, m, n + 1))
  vt <- matrix(NA, nrow = d, ncol = n)
  eigen_values <- eigen(Phi, only.values = TRUE)$values
  if (eigen_values[1] >= 1 || eigen_values[2] >= 1 || eigen_values[3] >= 1) {
    loglike <- -70000000
  } else {
    c <- (diag(3) - Phi) %*% mu
    loglike <- 0
    btt[, 1] <- b0
    Ptt[, , 1] <- P0
    for (i in 1:n) {
      bt[, i] <- c + Phi %*% btt[, i]
      Pt[, , i] <- Phi %*% tcrossprod(Ptt[, , i], Phi) + sigma_nu
      vt[, i] <- y[i, ] - X %*% bt[, i]
      ft <- X %*% tcrossprod(Pt[, , i], X) + sigma_eps
      det_f <- det(ft)
      if (is.nan(det_f) || is.na(det_f) || is.infinite(det_f)) {
        loglike <- -700000000
      } else if (det_f < 0) {
        loglike <- -700000000
      } else if (abs(det_f) > 1e-20) {  # note: was abs(det_f > 1e-20), a bug
        logdet_f <- log(det_f)
        f_inv <- solve(ft, tol = 1e-200)
        Kt <- tcrossprod(Pt[, , i], X) %*% f_inv
        btt[, i + 1] <- bt[, i] + Kt %*% vt[, i]
        Ptt[, , i + 1] <- (diag(3) - Kt %*% X) %*% Pt[, , i]
        loglike_contr <- -0.5 * d * log(2 * pi) - 0.5 * logdet_f -
          0.5 * crossprod(vt[, i], f_inv) %*% vt[, i]
        loglike <- loglike + loglike_contr
      } else {
        loglike <- -700000
      }
    }
  }
  return(-loglike)
}
Any help would be appreciated.
I have just solved the problem. I programmed the likelihood function once more, with the only input parameters being the likelihood estimates from optim. After this, I used the hessian function from the numDeriv package. This produces viable estimates for the standard errors.
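For reference, a sketch of that approach, assuming Kalman_filter, y, maturities and the optim result opt_para are defined as above:
library(numDeriv)
# numerical Hessian of the negative log-likelihood at the optimum
H <- hessian(Kalman_filter, opt_para$par, y = y, maturities = maturities)
vc <- solve(H)              # inverse Hessian as variance-covariance matrix
st_errors <- sqrt(diag(vc))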

Error using optim with the Birnbaum-Saunders distribution

I don't know why I get an error message when I'm trying to use optim in R.
I have the following data:
x <- c(6.0401209, 7.2888217, 0.4868070,
1.1699703, 51.5998419, 11.8766734,
2.3873264, 16.9583702, 21.6142835,
0.3133089, 3.4178360, 4.4367427,
2.0205100, 10.5798884, 0.4890031,
1.6734176, 10.2809820, 6.4705424,
5.6801965, 0.9438700)
And the following log-likelihood function:
log.lik.bs <- function(gamma, betha, z) {
  n <- length(z)
  -n * log(gamma) - n * log(2 * sqrt(2 * 2 * pi)) - sum(log(z)) +
    sum(log(sqrt(z / betha)) + sqrt(betha / z)) -
    (1/2 * gamma^2) * sum((sqrt(z / betha) - sqrt(betha / z))^2)
}
What I'm trying to do is following:
optim(c(2, 6), log.lik.bs, control=list(fnscale=-1), x=x, method="BFGS")$par
But I get an error message:
Error in fn(par, ...) :
unused argument (x = c(6.04012089885228, 7.28882174812723, 0.486806990614708, 1.1699703323488, 51.5998418613029, 11.8766733963947, 2.38732637900487, 16.9583701851951, 21.6142834611592, 0.313308870127425, 3.41783600439905, 4.43674270859797, 2.02051001746263, 10.5798883747597, 0.489003100259996, 1.67341757119939, 10.2809820486722, 6.4705423816332, 5.68019649178721, 0.943869996033357))
It is not quite clear which parameters you are trying to optimize. I assume you want to optimize log.lik.bs with respect to gamma and betha for given z with initial values 2 and 6. In that case you have two errors in your code:
log.lik.bs is expecting an argument named z but you are providing an argument x. That is the error you are getting. Fix: z = x
When using optim your target function must take a single argument for the parameters. From ?optim:
A function to be minimized (or maximized), with first argument the vector of parameters over which minimization is to take place. It should return a scalar result.
Combining this I get:
x <- c(6.0401209, 7.2888217, 0.4868070,
1.1699703, 51.5998419, 11.8766734,
2.3873264, 16.9583702, 21.6142835,
0.3133089, 3.4178360, 4.4367427,
2.0205100, 10.5798884, 0.4890031,
1.6734176, 10.2809820, 6.4705424,
5.6801965, 0.9438700)
log.lik.bs <- function(x, z) {
  gamma <- x[1]
  betha <- x[2]
  n <- length(z)
  # keep operators at the end of each line so R parses one expression
  -n * log(gamma) - n * log(2 * sqrt(2 * 2 * pi)) - sum(log(z)) +
    sum(log(sqrt(z / betha)) + sqrt(betha / z)) -
    (1/2 * gamma^2) * sum((sqrt(z / betha) - sqrt(betha / z))^2)
}
optim(c(2, 6), log.lik.bs, control=list(fnscale=-1), z=x, method="BFGS")$par
Unfortunately this still throws an error:
Error in optim(c(2, 6), log.lik.bs, control = list(fnscale = -1), z = x, :
non-finite finite-difference value [1]
In addition there are several warnings that NaNs were introduced by sqrt and log. So my interpretation of your question might be wrong. After all, the function goes to infinity as gamma goes to zero.
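A possible culprit is the likelihood itself: in R, (1/2*gamma^2) parses as gamma^2/2 rather than 1/(2*gamma^2), and sum(log(sqrt(z/betha)) + sqrt(betha/z)) closes the log before the sum of the two square roots. A sketch of the standard two-parameter Birnbaum-Saunders log-likelihood (an assumption about what was intended, not a fix confirmed by the asker):
# Sketch: textbook Birnbaum-Saunders log-likelihood (shape gamma, scale betha)
log.lik.bs2 <- function(p, z) {
  gamma <- p[1]
  betha <- p[2]
  n <- length(z)
  sum(log(sqrt(z / betha) + sqrt(betha / z))) -       # log of the whole sum
    n * log(2 * gamma * sqrt(2 * pi)) - sum(log(z)) -
    1 / (2 * gamma^2) * sum((sqrt(z / betha) - sqrt(betha / z))^2)
}
# L-BFGS-B with lower bounds keeps both parameters positive
optim(c(2, 6), log.lik.bs2, control = list(fnscale = -1), z = x,
      method = "L-BFGS-B", lower = c(1e-6, 1e-6))$par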

Solve simple equation in R

I have a probably really basic question about solving equations in R, but knowing the answer would really help me understand R better.
I have the following equation:
0 = -100/(1+r) + (100-50)/(1+r)^2 + (100-50)/(1+r)^3 + ... + (100-50)/(1+r)^10
How can I solve this equation in R finding the variable r?
I tried something like this:
n <- c(2:10)
0 = -100/(r+1)+sum((100-50)/((1+r)^n))
But got an error message:
Error in 0 = -100/(r + 1) + sum((100 - 50)/((1 + r)^n)) :
invalid (do_set) left-hand side to assignment
What's the problem and how can I find r?
There are plenty of optimization and root-finding libraries for R (link here). But in native R:
fnToFindRoot = function(r) {
  n <- c(2:10)
  return(abs(-100/(r+1) + sum((100-50)/((1+r)^n))))
}
# arbitrary starting value
r0 = 0
# minimise the function to get the parameter estimate
rootSearch = optim(r0, fnToFindRoot, method = 'BFGS', hessian = TRUE)
str(rootSearch)
fnToFindRoot(rootSearch$par)
That function is very volatile. If you are willing to bracket the root, you are probably better off with uniroot:
fnToFindRoot = function(r, a) {
  n <- c(2:10)
  return((-100/(r+1) + sum((100-50)/((1+r)^n))) - a)
}
str(xmin <- uniroot(fnToFindRoot, c(-1E6, 1E6), tol = 0.0001, a = 0))
The a argument is there so you can look for a root to any arbitrary value.
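For example (a sketch), the same bracketed search can target any value, say 10, over an interval where the function changes sign:
# find r where -100/(1+r) + sum(50/(1+r)^n) equals 10 instead of 0
str(uniroot(fnToFindRoot, c(0, 1), tol = 0.0001, a = 10))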
Try bisection. This converges to r = 0.4858343 in 25 iterations:
library(pracma)
bisect(function(r) -100/(1+r) + sum(50/(r+1)^seq(2, 10)), 0, 1)
giving:
$root
[1] 0.4858343
$f.root
[1] 8.377009e-07
$iter
[1] 25
$estim.prec
[1] 1.490116e-08
Let x = 1/(1+r); then your equation becomes:
-100x + 50x^2 + 50x^3 + ... + 50x^10 = 0
then in R:
x <- polyroot(c(0, -100, rep(50, 9)))
(r <- 1/x - 1)
Here is the answer:
[1] Inf+ NaNi 0.4858344-0.0000000i -1.7964189-0.2778635i
[4] -0.3397136+0.6409961i -0.3397136-0.6409961i -1.4553556-0.7216708i
[7] -0.9014291+0.8702213i -0.9014291-0.8702213i -1.7964189+0.2778635i
[10] -1.4553556+0.7216708i
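Only the finite real root is meaningful for r here; a small sketch to extract it from that output:
ok <- is.finite(r) & abs(Im(r)) < 1e-8  # drops Inf+NaNi and the complex pairs
Re(r[ok])
#> [1] 0.4858344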
