I am trying to solve the following optimization problem using cvxpy:
x and delta_x are (1, N) row vectors, A is an (N, N) symmetric matrix, and b is a scalar. I am trying to find the y that minimizes the sum of squares of (y - delta_x) subject to the constraint (x+y).A.(x+y).T - b = 0. Below is my attempt to solve it.
x = np.reshape(np.ravel(x_data.T), (1, -1))
delta_x = np.reshape(np.ravel(delta.T), (1, -1))
y = cp.Variable(delta_x.shape)
objective = cp.Minimize(cp.sum_squares(y - delta_x))
constraints = [cp.matmul(cp.matmul(x + y, A), (x + y).T) == (b*b)]
prob = cp.Problem(objective, constraints)
result = prob.solve()
I keep getting the error 'cvxpy.error.DCPError: Problem does not follow DCP rules'.
I followed the rules stated in the answer here, but I don't understand how to construct the proper cvxpy minimization Problem. Any help would be greatly appreciated.
Thanks!
I am trying to solve the following ODE using DifferentialEquations.jl:
Where P is a matrix used for a projection. I am having a hard time imagining how to solve this problem. Is there a way to directly solve it using Julia? Or should I try and rearrange the equation by hand (which I already tried) to fit the usual differential equation format?
I already started by writing down some equations which can be found below but I am not getting very far.
function ODE(u, p, t)
g,N = p
Jacg = ForwardDiff.jacobian(g, u)
sum = zeros(size(N,1))
for i in 1:size(Jacg,1)
sum = sum + Jacg[i,:] .* (u / norm(u)) .* N[:,i]
end
Proj_N(N) * sum
nothing
end
prob = ODEProblem(ODE, u0, (0.0, 3.0), (g, N))
sol = solve(prob)
Any help is appreciated and thanks in advance.
If you want to use the out-of-place form, you have to return the derivative, i.e.
function ODE(u, p, t)
g,N = p
Jacg = ForwardDiff.jacobian(g, u)
sum = zeros(size(N,1))
for i in 1:size(Jacg,1)
sum = sum + Jacg[i,:] .* (u / norm(u)) .* N[:,i]
end
Proj_N(N) * sum
end
I think you were just mixing up the mutating and non-mutating derivative forms.
I'm reading Deep Learning by Goodfellow et al. and am trying to implement gradient descent as shown in Section 4.5 Example: Linear Least Squares. This is page 92 in the hard copy of the book.
The algorithm can be viewed in detail at https://www.deeplearningbook.org/contents/numerical.html with R implementation of linear least squares on page 94.
I've tried implementing in R, and the algorithm as implemented converges on a vector, but this vector does not seem to minimize the least squares function as required. Adding epsilon to the vector in question frequently produces a "minimum" less than the minimum outputted by my program.
options(digits = 15)
dim_square = 2 ### set dimension of square matrix
# Generate random vector, random matrix, and
set.seed(1234)
A = matrix(nrow = dim_square, ncol = dim_square, byrow = T, rlnorm(dim_square ^ 2)/10)
b = rep(rnorm(1), dim_square)
# having fixed A & B, select X randomly
x = rnorm(dim_square) # vector length of dim_square--supposed to be arbitrary
f = function(x, A, b){
total_vector = A %*% x + b # this is the function that we want to minimize
total = 0.5 * sum(abs(total_vector) ^ 2) # L2 norm squared
return(total)
}
f(x,A,b)
# how close do we want to get?
epsilon = 0.1
delta = 0.01
value = (t(A) %*% A) %*% x - t(A) %*% b
L2_norm = (sum(abs(value) ^ 2)) ^ 0.5
steps = vector()
while(L2_norm > delta){
x = x - epsilon * value
value = (t(A) %*% A) %*% x - t(A) %*% b
L2_norm = (sum(abs(value) ^ 2)) ^ 0.5
print(L2_norm)
}
minimum = f(x, A, b)
minimum
minimum_minus = f(x - 0.5*epsilon, A, b)
minimum_minus # less than the minimum found by gradient descent! Why?
Following page 94 of the PDF at https://www.deeplearningbook.org/contents/numerical.html, I am trying to find the values of the vector x such that f(x) is minimized. However, as the values of minimum and minimum_minus in my code show, minimum is not the actual minimum, since it exceeds minimum_minus.
Any idea what the problem might be?
Original Problem
Finding the value of x such that the L2 norm of Ax - b is minimized is equivalent, when A is invertible, to finding the value of x such that Ax - b = 0, or x = (A^-1)*b. This is because the L2 norm is the Euclidean norm, more commonly known as the distance formula. By definition, distance cannot be negative, so its minimum is zero.
This algorithm, as implemented, actually comes quite close to estimating x. However, because of recursive subtraction and rounding one quickly runs into the problem of underflow, resulting in massive oscillation, below:
[Figure: Value of L2 norm as a function of step size]
[Figure: Above algorithm vs. solve function in R]
Above we have the results of A %*% x followed by A %*% min_x, with x estimated by the implemented algorithm and min_x estimated by the solve function in R.
The problem of underflow, well known to those familiar with numerical analysis, is probably best tackled by the programmers of lower-level libraries best equipped to tackle it.
To summarize, the algorithm appears to work as implemented. Important to note, however, is that not every function will have a minimum (think of a straight line), and also be aware that this algorithm should only be able to find a local, as opposed to a global minimum.
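As a cross-check, here is a minimal sketch of the same gradient descent loop in plain Python. The matrix, vector, step size, and tolerance are made-up illustration values, not the data from the question. The sketch evaluates the objective 0.5 * ||Ax - b||^2 with the same sign that the gradient A^T A x - A^T b assumes. The f in the question computes A %*% x + b instead, which is a different function from the one the update rule descends, and that mismatch may be part of why a perturbed x can give a smaller value there.

```python
def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def residual(x, A, b):
    """Return A x - b."""
    return [ax - bi for ax, bi in zip(matvec(A, x), b)]

def objective(x, A, b):
    """0.5 * ||A x - b||^2, the function being minimized."""
    return 0.5 * sum(r * r for r in residual(x, A, b))

def gradient(x, A, b):
    """A^T (A x - b), which equals A^T A x - A^T b."""
    A_t = [list(col) for col in zip(*A)]
    return matvec(A_t, residual(x, A, b))

def gradient_descent(A, b, x0, step=0.1, tol=1e-8, max_iter=100_000):
    """Step against the gradient until its L2 norm drops to tol or below."""
    x = list(x0)
    for _ in range(max_iter):
        g = gradient(x, A, b)
        if sum(gi * gi for gi in g) ** 0.5 <= tol:
            break
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

# Illustrative values only (assumed, not taken from the question).
A = [[2.0, 0.0], [0.0, 1.0]]
b = [2.0, 3.0]
x_min = gradient_descent(A, b, [0.0, 0.0])
print(x_min, objective(x_min, A, b))
```

With the objective and gradient matched like this, shifting x_min by a small amount in any direction should not produce a smaller objective value.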
I'm trying to solve the problem
d = 0.5 * ||X - \Sigma||_{F} + 0.01 * ||XX||_{1},
where ||.||_{F} is the Frobenius norm, X is a symmetric positive definite matrix, and all the diagonal elements of X should be 1. XX is the same as X except that its diagonal is 0. \Sigma is known; I want to minimize d over X.
My code is as follows:
using Convex
m = 5;
A = randn(m, m);
x = Semidefinite(5);
xx=x;
xx[diagind(xx)].=0;
obj=vecnorm(A-x,2)+sumabs(xx)*0.01;
pro= minimize(obj, [x >= 0]);
pro.constraints+=[x[diagind(x)].=1];
solve!(pro)
MethodError: no method matching diagind(::Convex.Variable)
I am trying to solve the optimization problem by constraining the diagonal elements of the matrix, but it seems the diagind function does not work here. How can I solve the problem?
I think the following does what you want:
m = 5
Σ = randn(m, m)
X = Semidefinite(m)
XX = X - diagm(diag(X))
obj = 0.5 * vecnorm(X - Σ, 2) + 0.01 * sum(abs(XX))
constraints = [X >= 0, diag(X) == 1]
pro = minimize(obj, constraints)
solve!(pro)
For the types of operations:
diag extracts the diagonal of a matrix, as a vector
diagm constructs a diagonal matrix out of a vector
So, to have XX be X with zero diagonal, we subtract the diagonal of X from it. And to constrain X having diagonal 1, we compare its diagonal with 1, using ==.
It is a good idea to keep immutable values as far as possible, instead of trying to modify things. I don't know whether Convex even supports that.
I have written the code below for minimization of error by changing the value of alpha (using iteration method).
set.seed(16)
npoints = 10000
Y = round(runif(npoints), 3)
OY = sample(c(0, 1, 0.5), npoints, replace = T)
minimizeAlpha = function(Y, OY, alpha) {
PY = alpha*Y
error = OY - PY
squaredError = sapply(error, function(x) x*x)
sse = sum(squaredError)
return(sse)
}
# # Iterate for 10000 values
alphas = seq(0.0001, 1, 0.0001)
sse = sapply(alphas, function(x) minimizeAlpha(Y, OY, x))
print(alphas[sse == min(sse)])
I have used sapply for basic optimization. But if the number of points is more than 10000, this code runs forever. So, is there a better way to implement this, or a standard technique for the optimization (like bisection)? If so, can you please help me optimize the code?
Note: I need the value of alpha with at least 4 decimals.
Any help is appreciated.
Using sapply instead of a for loop isn't more efficient; that's a misconception. It's merely often simpler code.
However, you can actually take advantage of vectorisation in your code — and that would be faster.
For instance, sapply(error, function(x) x*x) can simply be replaced by error * error. The sum of squared errors in R is thus simply sum((OY - PY) ** 2).
Your whole function thus boils down to:
minimizeAlpha = function(Y, OY, alpha)
sum((OY - alpha * Y) ** 2)
This should be more efficient — but first and foremost it’s better code and more readable.
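Going one step further, the grid search itself may be unnecessary. Because the sum of squared errors is a quadratic in alpha, setting its derivative to zero gives alpha = sum(OY * Y) / sum(Y * Y) directly. Below is a minimal sketch of that idea in plain Python; the helper names are hypothetical, and the data is regenerated with Python's own random module, so the numbers will not match the R session above. It also assumes alpha is unconstrained, whereas the original grid only covers (0, 1].

```python
import random

def sse(Y, OY, alpha):
    """Sum of squared errors for predictions alpha * Y."""
    return sum((oy - alpha * y) ** 2 for y, oy in zip(Y, OY))

def best_alpha(Y, OY):
    """Closed-form minimizer of sse: sum(OY * Y) / sum(Y * Y)."""
    return sum(oy * y for y, oy in zip(Y, OY)) / sum(y * y for y in Y)

# Illustrative data in the spirit of the question (not the same draws as R).
random.seed(16)
npoints = 10000
Y = [round(random.random(), 3) for _ in range(npoints)]
OY = [random.choice([0, 1, 0.5]) for _ in range(npoints)]

alpha = best_alpha(Y, OY)
print(round(alpha, 4))
```

This needs a single pass over the data instead of one pass per candidate alpha, and the result is not limited to four decimals.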
I hope this is the right place for such a basic question. I found this and this solution quite elaborate, so they do not help me understand the fundamentals of the procedure.
Consider a random dataset:
x <- c(1.38, -0.24, 1.72, 2.25)
w <- c(3, 2, 4, 2)
How can I find the value of μ that minimizes the weighted least squares expression sum(w * (x - μ)^2)?
The manipulate package lets you change the model manually with a slider for different values of μ, but I am looking for a more precise procedure than "try manually until you find the best fit".
Note: If the question is not correctly posted, I would welcome constructive criticism.
You could proceed as follows:
optim(mean(x), function(mu) sum(w * (x - mu)^2), method = "BFGS")$par
# [1] 1.367273
Here mean(x) is an initial guess for mu.
I'm not sure if this is what you want, but here's a little algebra:
We want to find mu to minimise
S = Sum{ i | w[i]*(x[i] - mu)*(x[i] - mu) }
Expand the square and rearrange into three summations, bringing things that don't depend on i outside the sums:
S = Sum{i|w[i]*x[i]*x[i]} - 2*mu*Sum{i|w[i]*x[i]} + mu*mu*Sum{i|w[i]}
Define
W = Sum{i|w[i]}
m = Sum{i|w[i]*x[i]} / W
Q = Sum{i|w[i]*x[i]*x[i]}/W
Then
S = W*(Q -2*mu*m + mu*mu)
= W*( (mu-m)*(mu-m) + Q - m*m)
(The second step is 'completing the square', a simple but very useful technique).
In the final equation, since a square is always non-negative, the value of mu to minimise S is m.
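The algebra above can be checked numerically. Here is a small sketch in plain Python using the x and w from the question; the function name weighted_sse is just an illustrative choice. It computes m as the weighted mean and confirms that the completed-square form agrees with the direct sum.

```python
x = [1.38, -0.24, 1.72, 2.25]
w = [3, 2, 4, 2]

def weighted_sse(mu):
    """S(mu) = sum of w[i] * (x[i] - mu)^2."""
    return sum(wi * (xi - mu) ** 2 for wi, xi in zip(w, x))

W = sum(w)                                          # total weight
m = sum(wi * xi for wi, xi in zip(w, x)) / W        # weighted mean
Q = sum(wi * xi * xi for wi, xi in zip(w, x)) / W   # weighted mean of squares

def completed_square(mu):
    """Same S(mu), written as W * ((mu - m)^2 + Q - m^2)."""
    return W * ((mu - m) ** 2 + Q - m * m)

print(round(m, 6))  # 1.367273, matching the optim() result above
```

Since the (mu - m)^2 term is the only part that depends on mu, any mu other than m gives a larger S.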