Equal weights constraints - R

I have the following nonlinear constrained optimization problem, which I am solving in R using solnp:
max F(w)
w
s.t.
w_i >= 0 for all i
sum(w) = 1
However, I would like to add an extra constraint, but I'm not sure it is even possible. I would like all the w's bigger than 0 to have equal weights. Something like:
max F(w)
w
s.t.
w_i >= 0 for all i
sum(w) = 1
w_i = w_j for all i, j where w_i, w_j > 0
Does anyone know if it is possible, and if so, how to do it?
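For reference, here is a minimal sketch of how the base problem (without the extra constraint) could be set up with Rsolnp; F below is a hypothetical stand-in for the real objective, and since solnp minimises, the negative of F is passed in.
library(Rsolnp)

F <- function(w) -sum((w - 1 / length(w))^2)  # hypothetical objective for illustration
n <- 4

res <- solnp(
  pars  = rep(1 / n, n),             # starting point: equal weights
  fun   = function(w) -F(w),         # minimise -F(w) to maximise F(w)
  eqfun = function(w) sum(w),        # equality constraint function ...
  eqB   = 1,                         # ... required to equal 1
  LB    = rep(0, n),                 # w_i >= 0
  UB    = rep(1, n)
)
res$pars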

I am not sure this is necessarily a hard optimization problem, given that your search space is completely determined. Essentially, for a finite number of dimensions n (the length of w), there is only a finite number of points in R^n that you need to search over. These are:
c(1, 0, 0, ..., 0)
c(0, 1, 0, ..., 0)
...
c(0, 0, ..., 1)
c(1/2, 1/2, 0, ..., 0)
c(1/2, 0, 1/2, ..., 0)
...
c(1/3, 1/3, 1/3, 0, ..., 0)
...
c(1/n, 1/n, ..., 1/n)
and so on; you get the idea. This means that you can simply evaluate your function over these points and pick the combination that maximizes F.
Does that sound about right or have I missed something critical?
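For concreteness, here is a minimal brute-force sketch along those lines; it assumes F is your objective and n = length(w), and the helper name equal_weight_max is just illustrative. Each candidate point is determined by its support (the set of nonzero positions), so there are 2^n - 1 points, which is only practical for modest n.
equal_weight_max <- function(F, n) {
  best_val <- -Inf
  best_w <- NULL
  for (k in 1:n) {                  # k = number of nonzero weights
    supports <- combn(n, k)         # all supports of size k
    for (s in seq_len(ncol(supports))) {
      w <- numeric(n)
      w[supports[, s]] <- 1 / k     # equal weights summing to 1
      val <- F(w)
      if (val > best_val) {
        best_val <- val
        best_w <- w
      }
    }
  }
  list(maximum = best_val, weights = best_w)
}

# Example with a toy objective (hypothetical):
# equal_weight_max(function(w) -sum((w - c(0.5, 0.5, 0, 0))^2), 4)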


Optimization problems linked together by time

Let's say I'm trying to minimize a function f1(x), with x a vector. This is a classic optimization problem and I get a solution, say the vector x_opt = (0, 700, 0, 1412, 0, 5466).
Now I have another function to minimize, f2(x), and I know this function relates to the same individual, so its solution should be close to the first one. So if I get (700, 0, 0, 1454, 0, 5700) I won't be happy with the solution, but I would be happy if the first solution had been (700, 0, ...) or if the second one were (0, 700, ...).
Is it OK to minimize f1(x1) + f2(x2) + lambda * || x2 - x1 ||?
What norm should I use, and should I set lambda to one?
What if I have more than two functions, and I know f1 and f2 are more closely related than f3 and f2, or f3 and f1?
Is there any literature on this topic, or a name for it? I don't even know where to look.
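For what it is worth, here is a minimal sketch in R of the coupled objective from the question; f1 and f2 are toy quadratic stand-ins (hypothetical), the penalty is a squared L2 norm, and lambda = 1 is purely for illustration.
f1 <- function(x) sum((x - c(0, 700))^2)    # hypothetical objective 1
f2 <- function(x) sum((x - c(10, 690))^2)   # hypothetical objective 2
lambda <- 1

joint <- function(z) {
  n <- length(z) / 2
  x1 <- z[1:n]
  x2 <- z[(n + 1):(2 * n)]
  f1(x1) + f2(x2) + lambda * sum((x2 - x1)^2)  # coupling penalty keeps x1 and x2 close
}

res <- optim(par = rep(0, 4), fn = joint, method = "BFGS")
matrix(res$par, nrow = 2, byrow = TRUE)        # row 1 = x1, row 2 = x2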

Why does this idea to solve an absorbing Markov chain not work?

Edit: Seems like my method works.
I encountered a programming question that required me to calculate the probability of reaching terminal states.
After a few painstaking hours trying to solve it traditionally, I googled and found that this is called an absorbing Markov chain, and there is a formula for it.
However, I am trying to figure out what is missing from my solution, because it seems correct.
Pardon the crude drawing. Basically there are 4 nodes in this graph; the black lines show the original transitions and probabilities, while the coloured lines show the paths to termination.
The steps are something like this:
1. Trace all possible paths to a termination point and sum up the probability of every path to that termination node. That sum is the probability of reaching the node.
2. Ignore cyclical paths. Meaning that the "1/3" transition from 4 to 1 is essentially ignored.
Reason for (2): we can assume that going back will increase the probability of every possible path in such a way that they still maintain the same relative probability to each other! For example, if I were to go back to 1 from 4, then the chances of going to 2, 3 and 4 will each increase by 1/27 (1/3 * 1/3 * 1/3), so their relative probabilities stay equal to each other!
I hope the above makes sense.
3. Calculate the final probability of each terminal node as "probability of reaching the node" / "probability of terminating", because by ignoring cyclical paths, the total probability of terminating is no longer 1.
So given the above algorithm, here are the values found:
Red path: 1/3
Green path: 1/3
Blue path: 1/3 * 2/3 = 2/9
Probability to reach 3: 1/3
Probability to reach 2: 2/9 + 1/3 = 5/9
Total probability to terminate: 1/3 + 5/9 = 8/9
Hence, final probability to reach 3:
(1/3) / (8/9) = 3/8
Final probability to reach 2:
(5/9) / (8/9) = 5/8
If you are unsure about step (2), we can try it again!
Assume that we went from 1 to 4 and back to 1 again; this has a probability of 1/9.
From here, we can follow each coloured path again, multiplying its probability by 1/9.
When combined with the probabilities calculated earlier, this gives us:
10/27 probability to reach 3.
50/81 probability to reach 2.
Total terminating probability of 80/81.
New probability of terminating at 3 is now (10/27) / (80/81) = 3/8 (SAME)
New probability of terminating at 2 is now (50/81) / (80/81) = 5/8 (SAME)
However, the actual probabilities are 2/5 and 3/5 for 3 and 2 respectively, using an algorithm I found online (there is a slim chance it is wrong though). Edit: it turns out I used the online solution wrongly.
I realised my answer is actually pretty close, and I am not sure why it is wrong.
We can represent the transitions of the Markov chain with a matrix M. In Python notation, this would look like:
M = [[  0, 1/3, 1/3, 1/3],
     [  0,   1,   0,   0],
     [  0,   0,   1,   0],
     [1/3, 2/3,   0,   0]]
And the probabilities with a vector S, initially with 100% in state 1.
S = [1, 0, 0, 0]
Multiplying S by M gives the new probabilities:
S*M = [0, 1/3, 1/3, 1/3]
S*M**2 = [1/9, 5/9, 1/3, 0]
S*M**3 = [0, 16/27, 10/27, 1/27]
S*M**4 = [1/81, 50/81, 10/27, 0]
S*M**n = [3**(-n)*((-1)**n + 1)/2,
          3**(-n)*((-1)**n + 5*3**n - 6)/8,
          3**(-n)*(-(-1)**n + 3*3**n - 2)/8,
          3**(-n)*(1 - (-1)**n)/2]
In the limit as n goes to infinity, this gives
[0, 5/8, 3/8, 0]
Also starting with 1, 2, 3 and 4 with equal probability:
S = [1/4, 1/4, 1/4, 1/4]
S*M = [1/12, 1/2, 1/3, 1/12]
S*M**2 = [1/36, 7/12, 13/36, 1/36]
S*M**n = [3**(-n)/4, 5/8 - 3*3**(-n)/8, 3/8 - 3**(-n)/8, 3**(-n)/4]
leading to the same limit [0, 5/8, 3/8, 0].
Starting with 1 and 4 with equal probability:
S = [1/2, 0, 0, 1/2]
S*M = [1/6, 1/2, 1/6, 1/6]
S*M**2 = [1/18, 2/3, 2/9, 1/18]
S*M**n = [3**(-n)/2, 3/4 - 3*3**(-n)/4, 1/4 - 3**(-n)/4, 3**(-n)/2]
gives another limit for n going to infinity:
[0, 3/4, 1/4, 0]
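These limits can be cross-checked in R with the standard fundamental-matrix formula for absorbing chains; below is a minimal sketch with transient states {1, 4} and absorbing states {2, 3}, where the absorption probabilities are B = (I - Q)^(-1) R.
Q <- matrix(c(0,   1/3,    # transient -> transient: 1 -> 4
              1/3, 0),     #                          4 -> 1
            nrow = 2, byrow = TRUE)
Rmat <- matrix(c(1/3, 1/3, # transient -> absorbing: 1 -> 2, 1 -> 3
                 2/3, 0),  #                          4 -> 2, 4 -> 3
               nrow = 2, byrow = TRUE)
B <- solve(diag(2) - Q) %*% Rmat
B
#       [,1]  [,2]
# [1,] 0.625 0.375   # from state 1: P(absorb in 2) = 5/8, P(absorb in 3) = 3/8
# [2,] 0.875 0.125   # from state 4: P(absorb in 2) = 7/8, P(absorb in 3) = 1/8
Averaging the two rows reproduces the [0, 3/4, 1/4, 0] limit above.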

Sage polynomial coefficients including zeros

If we have a multivariate polynomial in Sage, for instance
f=3*x^2*y^2+x*y+3
how can I display the full list of coefficients, including the zero ones from missing terms between the maximum degree term and the constant?
P.<x,y> = PolynomialRing(ZZ, 2, order='lex')
f=3*x^2*y^2+x*y+3
f.coefficients()
gives me the list
[3, 1, 3]
but I'd like the "full" list to put into a matrix. In the above example it should be
[3, 0, 0, 1, 0, 0, 0, 0, 3]
corresponding to terms:
x^2*y^2, x^2*y, x*y^2, x*y, x^2, y^2, x, y, constant
Am I missing something?
Your desired output isn't quite well defined, because the monomials you listed are not in lexicographic order (which is the order you chose in the first line of your code). Anyway, using a double loop you can arrange the coefficients in any specific way you want. Here is a natural way to do this:
coeffs = []
for i in range(f.degree(x), -1, -1):
    for j in range(f.degree(y), -1, -1):
        coeffs.append(f.coefficient({x: i, y: j}))
Now coeffs is [3, 0, 0, 0, 1, 0, 0, 0, 3], corresponding to
x^2*y^2, x^2*y, x^2, x*y^2, x*y, x, y^2, y, constant
The built-in .coefficients() method is only useful if you also use .monomials(), which provides a matching list of the monomials that have those coefficients.

Square Root of a Singular Matrix in R

I need to compute the matrix A raised to the power of -1/2, which basically means the square root of the initial matrix's inverse.
If A is singular, the Moore-Penrose generalized inverse is computed with the ginv function from the MASS package; otherwise the regular inverse is computed using the solve function.
Matrix A is defined below:
A <- structure(c(604135780529.807, 0, 58508487574887.2, 67671936726183.9,
0, 0, 0, 1, 0, 0, 0, 0, 58508487574887.2, 0, 10663900590720128,
10874631465443760, 0, 0, 67671936726183.9, 0, 10874631465443760,
11315986615387788, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1), .Dim = c(6L,
6L))
I check singularity by comparing the rank with the dimension (rankMatrix comes from the Matrix package).
library(Matrix)
rankMatrix(A) == nrow(A)
The above code returns FALSE, so I have to use ginv to get the inverse. The inverse of A is as follows:
library(MASS)
A_inv <- ginv(A)
The square-root of the inverse matrix is computed with the sqrtm function from the expm package.
library(expm)
sqrtm(A_inv)
The function returns the following error:
Error in solve.default(X[ii, ii] + X[ij, ij], S[ii, ij] - sumU) :
Lapack routine zgesv: system is exactly singular
So how can we compute the square root in this case? Please note that matrix A is not always singular so we have to provide a general solution for the problem.
Your question relates to two distinct problems:
Inverse of a matrix
Square root of a matrix
Inverse
The inverse does not exist for singular matrices. In some applications, the Moore-Penrose or some other generalised inverse may be taken as a suitable substitute for the inverse. However, note that computer numerics will incur rounding errors in most cases; and these errors may make a singular matrix appear regular to the computer or vice versa.
If A always exhibits the block structure of the matrix you give, I suggest considering only its non-trivial block
A3 = A[ c( 1, 3, 4 ), c( 1, 3, 4 ) ]
A3
[,1] [,2] [,3]
[1,] 6.041358e+11 5.850849e+13 6.767194e+13
[2,] 5.850849e+13 1.066390e+16 1.087463e+16
[3,] 6.767194e+13 1.087463e+16 1.131599e+16
instead of all of A, for better efficiency and fewer rounding issues. The remaining diagonal entries equal to 1 would remain 1 in the inverse of the square root, so there is no need to clutter the calculation with them. To get an impression of the impact of this simplification, note that R can calculate
A3inv = solve(A3)
while it could not calculate
Ainv = solve(A)
But we will not need A3inv, as will become evident below.
Square root
As a general rule, the square root of a matrix A will only exist if the matrix has a diagonal Jordan normal form (https://en.wikipedia.org/wiki/Jordan_normal_form). Hence, there is no truly general solution of the problem as you require.
Fortunately, just as “most” (real or complex) matrices are invertible, “most” (real or complex) matrices have a diagonal complex Jordan normal form. In this case, the Jordan normal form
A3 = T·J·T⁻¹
can be calculated in R as such:
library(Matrix)  # for Diagonal()
X = eigen(A3)
T = X$vectors
J = Diagonal( x=X$values )
To test this recipe, compare
Tinv = solve(T)
T %*% J %*% Tinv
with A3. They should match (up to rounding errors) if A3 has a diagonal Jordan normal form.
Since J is diagonal, its square root is simply the diagonal matrix of the square roots of its entries
Jsqrt = Diagonal( x=sqrt( X$values ) )
so that Jsqrt·Jsqrt = J. Moreover, this implies
(T·Jsqrt·T⁻¹)² = T·Jsqrt·T⁻¹·T·Jsqrt·T⁻¹ = T·Jsqrt·Jsqrt·T⁻¹ = T·J·T⁻¹ = A3
so that in fact we obtain
√A3 = T·Jsqrt·T⁻¹
or in R code
A3sqrt = T %*% Jsqrt %*% Tinv
To test this, calculate
A3sqrt %*% A3sqrt
and compare with A3.
Square root of the inverse
The square root of the inverse (or, equally, the inverse of the square root) can be calculated easily once a diagonal Jordan normal form has been calculated. Instead of J, use
Jinvsqrt = Diagonal( x=1/sqrt( X$values ) )
and calculate, analogously to above,
A3invsqrt = T %*% Jinvsqrt %*% Tinv
and observe
A3·A3invsqrt² = … = T·(J·Jinvsqrt·Jinvsqrt)·T⁻¹ = 1,
the unit matrix, so that A3invsqrt is the desired result.
In case A3 is not invertible, a generalised inverse (not necessarily the Moore-Penrose one) can be calculated by replacing all undefined entries in Jinvsqrt by 0, but as I said above, this should be done with suitable care in the light of the overall application and its stability against rounding errors.
In case A3 does not have a diagonal Jordan normal form, there is no square root, so the above formulas will yield some other result. To avoid running into this case at an unlucky moment, it is best to implement a test of whether
A3invsqrt %*% A3 %*% A3invsqrt
is close enough to what you would consider an identity matrix (this only applies if A3 was invertible in the first place).
PS: Note that you can prefix a sign ± to each diagonal entry of Jinvsqrt to your liking.
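Putting the pieces together, here is a minimal consolidated sketch, assuming A is symmetric positive semi-definite (as in the example) so that eigen(..., symmetric = TRUE) applies; near-zero eigenvalues are zeroed out to obtain a generalised inverse, and the tolerance tol is a hypothetical choice to be tuned for the application.
inv_sqrtm <- function(A, tol = 1e-12) {
  X <- eigen(A, symmetric = TRUE)        # for symmetric A, the inverse of the eigenvector matrix is its transpose
  vals <- X$values
  inv_sqrt_vals <- numeric(length(vals))
  keep <- vals > tol * max(vals)         # treat tiny eigenvalues as zero
  inv_sqrt_vals[keep] <- 1 / sqrt(vals[keep])
  X$vectors %*% diag(inv_sqrt_vals) %*% t(X$vectors)
}

# Usage on the non-trivial block from above:
# A3 <- A[c(1, 3, 4), c(1, 3, 4)]
# B <- inv_sqrtm(A3)
# B %*% A3 %*% B   # should be close to the identity when A3 is invertible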

Minimum across cumulative sums with different starting indices

Question: Given a vector, I want to know the minimum of a series of cumulative sums, where each cumulative sum is calculated for an increasing starting index of the vector and a fixed ending index (1:5, 2:5, ..., 5:5). Specifically, I am wondering if this can be calculated without a for() loop, and whether there is a term for this algorithm/calculation. I am working in R.
Context: The vector of interest contains a time series of pressure changes. I want to know of the largest (or smallest) net change in pressure across a range of starting points but with a fixed end point.
Details + Example:
#Example R code
diffP <- c(0, -1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, -1, 0, 0)
minNet1 <- min(cumsum(diffP))
minNet1 #over the whole vector, the "biggest net drop" (largest magnitude with negative sign) is -1.
#However, if I started a cumulative sum in the second half of diffP, I would get a net pressure change of -2.
hold <- list()
nDiff <- length(diffP)
for (j in 1:nDiff) {
  hold[[j]] <- cumsum(diffP[j:nDiff])
}
answer <- min(unlist(hold)) #this gives the answer that I ultimately want
Hopefully my example above has helped to articulate my question. answer contains the correct answer, but I'd rather do this without a for() loop in R. Is there a better way to do this calculation, or maybe a name I can put to it?
This is known as the maximum subarray problem (http://en.wikipedia.org/wiki/Maximum_subarray_problem) and is a typical interview question!
Most people (me included) would solve it using an O(n^2) algorithm, but there is in fact a much better algorithm with O(n) complexity. Here is an R implementation of Kadane's algorithm from the link above:
max_subarray <- function(A) {
  max_ending_here <- 0
  max_so_far <- 0
  for (x in A) {
    max_ending_here <- max(0, max_ending_here + x)
    max_so_far <- max(max_so_far, max_ending_here)
  }
  max_so_far
}
Since in your case, you are looking for the minimum sub-array sum, you would have to call it like this:
-max_subarray(-diffP)
[1] -2
(Or you can also rewrite the function above and replace max with min everywhere.)
Note that, yes, the implementation still uses a for loop, but since the complexity of the algorithm is O(n) (meaning the number of operations is of the same order as length(diffP)), it should be rather quick. Also, it won't consume any extra memory, since it only stores and updates a couple of variables.
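If the goal is specifically to avoid an explicit loop in R, one vectorized alternative (a sketch of the same minimum-subarray idea using prefix sums) relies on cumsum and cummax: for each end index j, subtract from the prefix sum S_j the largest prefix sum that can be cut off before j, then take the minimum over j.
min_subarray_vec <- function(A) {
  S <- cumsum(A)
  prefix_max <- cummax(c(0, head(S, -1)))  # max(0, S_1, ..., S_{j-1}) for each end index j
  min(S - prefix_max)
}

min_subarray_vec(diffP)
# [1] -2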
