How can I optimize a formula on data in R?

I have an analytically derived formula that looks something like this:
y = ((b/x)^n )* e^(-c(x-b)) * D
In the formula, x is my independent variable. All the other letters on the right-hand side of the equation are parameters that I want to estimate.
I have a dataset with y and x. Is there a way I can get the optimal values of the parameters b, c, D and n that minimize the squared error in y?
I also have a lot of constraints, such as b being between 0 and 10 and c being between 0 and 0.5, etc.
Is there a way I can perform such an optimization, preferably in R? Furthermore, is it possible to have the parameters b, c and D predicted by other variables in my dataset? That turns this into more of an ML problem, but I want the model to keep the same form as the formula above.
Thanks in advance
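One way to sketch this in R is nls(), which minimizes the residual sum of squares and, with the "port" algorithm, supports box constraints on the parameters. This is a minimal sketch, assuming a data frame dat with columns x and y; the starting values are guesses, and the parameter c is renamed cc to avoid clashing with R's c() function:

## Minimal sketch, assuming a data frame `dat` with columns x and y.
## Starting values are guesses; `cc` stands in for the parameter c.
fit <- nls(y ~ ((b/x)^n) * exp(-cc * (x - b)) * D,
           data  = dat,
           start = list(b = 5, cc = 0.1, D = 1, n = 1),
           lower = c(b = 0,  cc = 0,   D = -Inf, n = -Inf),
           upper = c(b = 10, cc = 0.5, D = Inf,  n = Inf),
           algorithm = "port")   ## "port" supports box constraints
coef(fit)

For the second part of the question, one option is to substitute a linear predictor for a parameter directly in the formula, e.g. replacing b with b0 + b1*z for some covariate z in your dataset, so the fitted model keeps the same functional form.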

Related

Adding Constraints in Z and D (MARSS package)

I'm estimating a Phillips curve model and, as such, need to take into account the unemployment gap, which is the difference between actual unemployment and the NAIRU (here, the NAIRU is the unobservable variable).
I need to impose the following constraint: some of the coefficients (say, beta 1 and beta 2) in the Z matrix (which relates to the NAIRU) must be the same as those in the D matrix (which accounts for unemployment). However, it doesn't seem possible to impose such a constraint on the Z and D matrices simultaneously.
Can you guys help me?
I've tried setting the same "names" for the coefficients in the Z and D matrices, but that didn't work.

Simple Orthographic Structure from Motion using R -- Determining Metric Constraints

I would like to build a simple structure from motion program according to Tomasi and Kanade [1992]. The article can be found below:
https://people.eecs.berkeley.edu/~yang/courses/cs294-6/papers/TomasiC_Shape%20and%20motion%20from%20image%20streams%20under%20orthography.pdf
This method seems elegant and simple, however, I am having trouble calculating the metric constraints outlined in equation 16 of the above reference.
I am using R and have outlined my work thus far below:
Given a set of images (not reproduced here), I want to track the corners of the three cabinet doors and the one picture (the black points on the images). First we read in the points as a 2F x P matrix w, where the first F rows hold the x-coordinates of the P tracked points in each of the F frames and the last F rows hold the y-coordinates (here F = 8 frames and P = 16 points, so w is 16 x 16).
Ultimately, we want to factorize w into a rotation matrix R and a shape matrix S that describe the 3-dimensional points. I will spare you as many details as I can, but a complete description of the maths can be gleaned from the Tomasi and Kanade [1992] paper.
I supply w below:
w.vector=c(0.2076,0.1369,0.1918,0.1862,0.1741,0.1434,0.176,0.1723,0.2047,0.233,0.3593,0.3668,0.3744,0.3593,0.3876,0.3574,0.3639,0.3062,0.3295,0.3267,0.3128,0.2811,0.2979,0.2876,0.2782,0.2876,0.3838,0.3819,0.3819,0.3649,0.3913,0.3555,0.3593,0.2997,0.3202,0.3137,0.31,0.2718,0.2895,0.2867,0.825,0.7703,0.742,0.7251,0.7232,0.7138,0.7345,0.6911,0.1937,0.1248,0.1723,0.1741,0.1657,0.1313,0.162,0.1657,0.8834,0.8118,0.7552,0.727,0.7364,0.7232,0.7288,0.6892,0.4309,0.3798,0.4021,0.3965,0.3844,0.3546,0.3695,0.3583,0.314,0.3065,0.3989,0.3876,0.3857,0.3781,0.3989,0.3593,0.5184,0.4849,0.5147,0.5193,0.5109,0.4812,0.4979,0.4849,0.3536,0.3517,0.4121,0.3951,0.3951,0.3781,0.397,0.348,0.5175,0.484,0.5091,0.5147,0.5128,0.4784,0.4905,0.4821,0.7722,0.7326,0.7326,0.7232,0.7232,0.7119,0.7402,0.7006,0.4281,0.3779,0.3918,0.3863,0.3825,0.3472,0.3611,0.3537,0.8043,0.7628,0.7458,0.7288,0.727,0.7213,0.7364,0.6949,0.5789,0.5491,0.5761,0.5817,0.5733,0.5444,0.5537,0.5379,0.3649,0.3536,0.4177,0.3951,0.3857,0.3819,0.397,0.3461,0.697,0.671,0.6821,0.6821,0.6719,0.6412,0.6468,0.6235,0.3744,0.3649,0.4159,0.3819,0.3781,0.3612,0.3763,0.314,0.7008,0.6691,0.6794,0.6812,0.6747,0.6393,0.6412,0.6235,0.7571,0.7345,0.7439,0.7496,0.7402,0.742,0.7647,0.7213,0.5817,0.5463,0.5696,0.5779,0.5761,0.5398,0.551,0.5398,0.7665,0.7326,0.7439,0.7345,0.7288,0.727,0.7515,0.7062,0.8301,0.818,0.8571,0.8878,0.8766,0.8561,0.858,0.8394,0.4121,0.3876,0.4347,0.397,0.38,0.3631,0.3668,0.2971,0.912,0.8962,0.9185,0.939,0.9259,0.898,0.8887,0.8571,0.3989,0.3781,0.4215,0.3725,0.3612,0.3461,0.3423,0.2782,0.9092,0.8952,0.9176,0.9399,0.925,0.8971,0.8887,0.8571,0.4743,0.4536,0.4894,0.4517,0.446,0.4328,0.4385,0.3706,0.8273,0.8171,0.8571,0.8878,0.8766,0.8543,0.8561,0.8394,0.4743,0.4554,0.4969,0.4668,0.4536,0.4404,0.4536,0.3857)
w=matrix(w.vector,ncol=16,nrow=16,byrow=FALSE)
Then create the registered measurement matrix wm according to equation 2, by subtracting the row means:
wm = w - rowMeans(w)
We can decompose wm into a 2F x P matrix o1, a diagonal P x P matrix e, and a P x P matrix o2 by using a singular value decomposition:
svdwm <- svd(wm)
o1 <- svdwm$u
e <- diag(svdwm$d)
o2 <- t(svdwm$v) ## don't forget the transpose!
However, because of noise, we only keep the first 3 columns of o1, the first 3 singular values of e, and the first 3 rows of o2:
o1p <- svdwm$u[,1:3]
ep <- diag(svdwm$d[1:3])
o2p <- t(svdwm$v)[1:3,] ## don't forget the transpose!
Now we can solve for our rhat and shat in equation (14):
rhat <- o1p%*%ep^(1/2)
shat <- ep^(1/2) %*% o2p
However, these results are not unique: we still need to solve for the true R and S via equation (15), using the metric constraints of equation (16).
Now I need to find Q. I believe there are two potential methods but am unclear how to employ either.
Method 1 involves solving for B, where B = Q %*% t(Q), and then using a Cholesky decomposition of B to find Q. Method 1 appears to be the common choice in the literature; however, little detail is given as to how to actually solve the linear system. It is apparent that B is a 3x3 symmetric matrix with 6 unknowns. However, given the metric constraints (equation 16), I don't know how to solve for 6 unknowns given only 3 equations per frame. Am I forgetting a property of symmetric matrices?
Method 2 involves using non-linear methods to estimate Q and is less commonly used in the structure-from-motion literature.
Can anyone offer some advice as to how to go about solving this problem? Thanks in advance and let me know if I need to be more clear in my question.
Write i_f^T for row f of rhat and j_f^T for row F + f (f = 1, ..., F). Then |i_f^T Q|^2 = 1 can be written as i_f^T Q Q^T i_f = 1, |j_f^T Q|^2 = 1 can be written as j_f^T Q Q^T j_f = 1, and the orthogonality constraint can be written as i_f^T Q Q^T j_f = 0. So, with B = Q %*% t(Q), our equations are:
i_f^T B i_f = 1
j_f^T B j_f = 1
i_f^T B j_f = 0
B is a 3x3 symmetric matrix, so it has only 6 independent entries, which we collect in the vector b = (B11, B12, B13, B22, B23, B33)^T. So the first equation, i_f^T B i_f = 1, can be written as a single linear equation in b, which is equivalent to g(i_f, i_f)^T b = 1. To keep it short we define now, for any vectors u and v:
g(u, v) = (u1*v1, u1*v2 + u2*v1, u1*v3 + u3*v1, u2*v2, u2*v3 + u3*v2, u3*v3)^T
so that u^T B v = g(u, v)^T b. So for all equations in all different frames f, we can write one big linear system G b = c: the rows of G are the 3F vectors g(i_f, i_f)^T, g(j_f, j_f)^T and g(i_f, j_f)^T, and c holds the corresponding right-hand sides 1, 1 and 0. With F >= 2 frames the system is (over)determined and can be solved by least squares; this also answers the 6-unknowns-from-3-equations worry above, since every frame contributes 3 equations. Now you just need to reshape b into the symmetric matrix B and recover Q from B = Q %*% t(Q) using a Cholesky decomposition (or an eigendecomposition, if B turns out not to be positive definite).
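A rough R sketch of that linear system, continuing from the rhat and shat computed above. The function gvec() and all variable names here are my own; it assumes rows 1..F of rhat are the i_f and rows F+1..2F the j_f, matching the construction of w:

## Build and solve G b = c for the 6 free entries of B = Q %*% t(Q).
nF <- nrow(rhat) / 2
ivecs <- rhat[1:nF, ]
jvecs <- rhat[(nF + 1):(2 * nF), ]

gvec <- function(u, v) {     ## row of G for the constraint u' B v = c
  c(u[1]*v[1],
    u[1]*v[2] + u[2]*v[1],
    u[1]*v[3] + u[3]*v[1],
    u[2]*v[2],
    u[2]*v[3] + u[3]*v[2],
    u[3]*v[3])
}

G <- matrix(0, 3 * nF, 6)
cvec <- rep(c(1, 1, 0), each = nF)
for (f in 1:nF) {
  G[f, ]          <- gvec(ivecs[f, ], ivecs[f, ])  ## |i_f' Q|^2 = 1
  G[nF + f, ]     <- gvec(jvecs[f, ], jvecs[f, ])  ## |j_f' Q|^2 = 1
  G[2 * nF + f, ] <- gvec(ivecs[f, ], jvecs[f, ])  ## i_f' Q . j_f' Q = 0
}

b <- qr.solve(G, cvec)       ## least-squares solution of G b = c
B <- matrix(c(b[1], b[2], b[3],
              b[2], b[4], b[5],
              b[3], b[5], b[6]), 3, 3)
Q <- t(chol(B))              ## B = Q %*% t(Q); chol() needs B positive definite
R <- rhat %*% Q
S <- solve(Q) %*% shat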

univariate nonlinear optimization with quadratic constraint in R

I have a quadratic function f, where f = function(x) {2 + .1*x + .23*(x*x)}. Let's say I have another quadratic function g, where g = function(x) {3 + .4*x - .60*(x*x)}.
Now I want to maximize f given the constraints: 1. g(x) > 0 and 2. 600 < x < 650.
I have tried the packages optim, constrOptim and optimize. optimize does one-dimensional optimization, but without constraints, and I couldn't understand constrOptim. I need to do this in R. Please help.
P.S. In this example the values may be erratic, as I have given two random quadratic functions, but basically I want to maximize a quadratic function subject to a quadratic constraint.
If you solve g(x) = 0 for x by the usual quadratic formula, then that just gives you another set of bounds on x. If your x^2 coefficient is negative then g(x) > 0 between the solutions; otherwise g(x) > 0 outside the solutions, i.e. within (-Inf, x1) and (x2, Inf).
In this case, g(x) > 0 for -1.927 < x < 2.59, so both of your constraints cannot be simultaneously satisfied (g(x) is LESS THAN 0 for 600 < x < 650).
But supposing your second condition was 1 < x < 5, then you'd just combine the solution from g(x)>0 with that interval to get 1 < x < 2.59, and then maximise f in that interval using standard univariate optimisation.
And you don't even need to run an optimisation algorithm. Your target f is quadratic. If the coefficient of x^2 is positive, the maximum is going to be at one of your limits of x, so you only have a small number of values to try. If the coefficient of x^2 is negative, then the maximum is either at a limit or at the point where f(x) peaks (solve f'(x) = 0), if that lies within your limits.
So you can do this precisely, there's just a few conditions to test and then some intervals to compute and then some values of f at those interval limits to calculate.
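For completeness, that recipe can be coded directly. A minimal sketch; the function name and the c(a0, a1, a2) coefficient encoding are mine, and it only handles the concave-g case from this question, where the x^2 coefficient of g is negative:

maximize_quadratic <- function(fc, gc, lo, hi) {
  ## fc, gc: coefficients c(a0, a1, a2) of a0 + a1*x + a2*x^2
  disc <- gc[2]^2 - 4 * gc[3] * gc[1]
  if (disc < 0 || gc[3] >= 0) stop("sketch assumes concave g with real roots")
  roots <- sort((-gc[2] + c(-1, 1) * sqrt(disc)) / (2 * gc[3]))
  lo <- max(lo, roots[1]); hi <- min(hi, roots[2])  ## g > 0 between the roots
  if (lo >= hi) stop("constraints cannot be satisfied simultaneously")
  cand <- c(lo, hi)                                 ## interval endpoints
  if (fc[3] < 0) {                                  ## concave f: add the vertex
    vertex <- -fc[2] / (2 * fc[3])
    if (vertex > lo && vertex < hi) cand <- c(cand, vertex)
  }
  fval <- fc[1] + fc[2] * cand + fc[3] * cand^2
  c(x = cand[which.max(fval)], f = max(fval))
}

maximize_quadratic(c(2, .1, .23), c(3, .4, -.60), 1, 5)  ## feasible: 1 < x < 2.59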

Converting Optim to constrOptim in R

I am trying to determine the weights of 9 metrics which will return the highest accuracy ratio. Since they are weights, the values need to sum to 1 and lie between 0 and 1. I am currently using the optim function but, due to the constraints, I think I need to switch to constrOptim. I was wondering the best way to do this. Below I have included the code I am currently using. x.matrix is a 20,000 by 9 matrix of values ranked between 1 and 10.
pars<-c(w1=(1/9),w2=(1/9),w3=(1/9),w4=(1/9),w5=(1/9),w6=(1/9),w7=(1/9),w8=(1/9),w9=(1/9))
OptPars<-function(pars){-rcorr.cens(x.matrix%*%pars,f)["Dxy"]}
opt<-optim(pars,OptPars)
Say you have values x in the range (-Inf, Inf) and you need values p in the range [0, 1] that sum to 1; you can use the following transformation:
p <- exp(x)/sum(exp(x))
If you apply that transformation inside your optimization function, and apply the same transformation to the best set of parameters it finds, you should get what you want.
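Applied to the code from the question, that looks something like this (a sketch; it assumes x.matrix and the outcome f from the question, and that rcorr.cens comes from the Hmisc package):

library(Hmisc)                           ## for rcorr.cens

softmax <- function(x) exp(x) / sum(exp(x))

OptPars <- function(x) {
  w <- softmax(x)                        ## weights in [0, 1] summing to 1
  -rcorr.cens(x.matrix %*% w, f)["Dxy"]
}

opt <- optim(rep(0, 9), OptPars)         ## unconstrained search in x-space
weights <- softmax(opt$par)              ## transform the optimum back to weights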

Can I force two components in a three-way linear regression to be positive?

I'm sorry if I'm not using the correct mathematical terms, but I hope you'll understand what I'm trying to accomplish.
My problem:
I'm using linear regression (currently the least-squares method) on the values from two vectors x and y against the result z. This is to be done in matlab, and I'm using the \-operator to perform the regression. My dataset will contain a few thousand observations (up to about 50,000 at most).
The x-values will be in the range 10-300 (most between 60 and 100) and the y-values in the 1-3 range.
My code looks like this:
X = [ones(size(x,1),1) x y];
parameters = X\z;
The output "parameters" then contains the three factors a0, a1 and a2, which are used in this formula:
a0 * 1 + a1 * x_i + a2 * y_i = z_i
This works as expected, although I want the two parameters a1 and a2 to ALWAYS be positive, even when the vector z is negative (which means a0 will be negative, of course), since this is what the real model looks like (z is always positively correlated with x and y). Is this possible using the least-squares method? I'm also open to other algorithms for linear regression.
Let me try to rephrase to clarify. According to your model, z is always positively correlated with x and y. However, sometimes when you solve the linear regression, a coefficient comes out negative.
If you are right about the data, this should only happen when the true coefficient is small and noise happens to push it negative. You could just set it to zero, but then the means wouldn't match properly.
In which case the correct solution is as jpalacek says, but explained in more detail here:
Regress z against x and y. If both coefficients are positive, take the result.
If a1 is negative, assume it should be zero. Regress z against y alone. If a2 is positive, then take a1 as 0, and a0 and a2 from this regression.
If a2 is negative, assume it should be zero too. Regress z against 1 alone, and take this as a0. Let a1 and a2 be 0.
This should give you what you want; a short sketch of this procedure follows below.
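Since the rest of this page uses R, here is the same stepwise logic sketched in R rather than matlab. A rough sketch, assuming plain numeric vectors x, y and z; the symmetric case where a2 is the negative one is handled the same way with x and y swapped:

fit <- lm(z ~ x + y)
a <- coef(fit)                       ## (Intercept), x, y = a0, a1, a2
if (a[2] < 0) {                      ## a1 came out negative: force it to zero
  fit <- lm(z ~ y)
  if (coef(fit)[2] >= 0) {
    a <- c(coef(fit)[1], 0, coef(fit)[2])
  } else {                           ## a2 negative too: intercept-only model
    a <- c(mean(z), 0, 0)            ## lm(z ~ 1) just fits the mean of z
  }
}
a                                    ## a0, a1, a2 with the sign constraints applied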
The simple solution is to use a tool designed to solve it. That is, use lsqlin from the Optimization Toolbox, and set a lower-bound constraint for two of the three parameters.
Thus, assuming x, y, and z are all COLUMN vectors,
A = [ones(length(x),1),x,y];
lb = [-inf, 0, 0];
a = lsqlin(A,z,[],[],[],[],lb);
This will constrain only the second and third unknown parameters.
Without the Optimization Toolbox, use lsqnonneg, which is part of matlab itself. Here too the solution is easy enough.
A = [ones(length(x),1),x,y];
a = lsqnonneg(A,z);
Your model will be
z = a(1) + a(2)*x + a(3)*y
If a(1) is essentially zero, i.e., it is within a tolerance of zero, then assume that the first parameter was constrained by the bound at zero. In that case, solve a second problem by changing the sign on the column of ones in A.
A(:,1) = -1;
a = lsqnonneg(A,z);
If this solution has a(1) significantly non-zero, then the second solution must be better than the first. Your model will now be
z = -a(1) + a(2)*x + a(3)*y
It costs you at most two calls to lsqnonneg, and the second call is only ever made some fraction of the time (lacking any information about your problem, the odds are 50% for the second call).
