When a problem requires multiplying a matrix inverse by a vector, as in:
x = inv(A) * b
one can take a Cholesky decomposition of A and back-substitute b to find the resulting vector x. However, an explicit matrix inverse is sometimes needed when the problem is not formulated as above. My question is: what is the best way to handle such a situation? Below, I have compared various ways (using numpy) to invert a positive definite matrix:
Firstly, generate the matrix:
>>> import numpy as np
>>> A = np.random.rand(5,5)
>>> A
array([[ 0.13516074, 0.2532381 , 0.61169708, 0.99678563, 0.32895589],
[ 0.35303998, 0.8549499 , 0.39071336, 0.32792806, 0.74723177],
[ 0.4016188 , 0.93897663, 0.92574706, 0.93468798, 0.90682809],
[ 0.03181169, 0.35059435, 0.10857948, 0.36422977, 0.54525 ],
[ 0.64871162, 0.37809219, 0.35742865, 0.7154568 , 0.56028468]])
>>> A = np.dot(A.transpose(), A)
>>> A
array([[ 0.72604206, 0.96959581, 0.82773451, 1.10159817, 1.05327233],
[ 0.96959581, 1.94261607, 1.53140854, 1.80864185, 1.9766411 ],
[ 0.82773451, 1.53140854, 1.52338262, 1.89841402, 1.59213299],
[ 1.10159817, 1.80864185, 1.89841402, 2.61930178, 2.01999385],
[ 1.05327233, 1.9766411 , 1.59213299, 2.01999385, 2.10012097]])
The results for the method of direct inversion are as follows:
>>> np.linalg.inv(A)
array([[ 5.49746838, -1.92540877, 2.24730018, -2.20242449,
-0.53025806],
[ -1.92540877, 95.34219156, -67.93144606, 50.16450952,
-85.52146331],
[ 2.24730018, -67.93144606, 57.0739859 , -40.56297863,
58.55694127],
[ -2.20242449, 50.16450952, -40.56297863, 30.6441555 ,
-44.83400183],
[ -0.53025806, -85.52146331, 58.55694127, -44.83400183,
79.96573405]])
When using the Moore-Penrose pseudoinverse, the results are as follows (you might notice that, to the displayed precision, the results are the same as direct inversion):
>>> np.linalg.pinv(A)
array([[ 5.49746838, -1.92540877, 2.24730018, -2.20242449,
-0.53025806],
[ -1.92540877, 95.34219156, -67.93144606, 50.16450952,
-85.52146331],
[ 2.24730018, -67.93144606, 57.0739859 , -40.56297863,
58.55694127],
[ -2.20242449, 50.16450952, -40.56297863, 30.6441555 ,
-44.83400183],
[ -0.53025806, -85.52146331, 58.55694127, -44.83400183,
79.96573405]])
Finally, when solving with the identity matrix:
>>> np.linalg.solve(A, np.eye(5))
array([[ 5.49746838, -1.92540877, 2.24730018, -2.20242449,
-0.53025806],
[ -1.92540877, 95.34219156, -67.93144606, 50.16450952,
-85.52146331],
[ 2.24730018, -67.93144606, 57.0739859 , -40.56297863,
58.55694127],
[ -2.20242449, 50.16450952, -40.56297863, 30.6441555 ,
-44.83400183],
[ -0.53025806, -85.52146331, 58.55694127, -44.83400183,
79.96573405]])
Again, you might notice that on a cursory inspection, the result is the same as the previous two methods.
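Rather than comparing the printed arrays by eye, the three results can be compared numerically. The following is a minimal sketch, not the original session: the fixed seed and the helper names max_abs_diff and residual are assumptions added for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed (an assumption, for reproducibility)
M = rng.random((5, 5))
A = M.T @ M                      # symmetric positive definite (almost surely)

inv_direct = np.linalg.inv(A)
inv_pinv = np.linalg.pinv(A)
inv_solve = np.linalg.solve(A, np.eye(5))

def max_abs_diff(X, Y):
    """Largest element-wise difference between two candidate inverses."""
    return float(np.max(np.abs(X - Y)))

def residual(X):
    """How far A @ X is from the identity, element-wise."""
    return float(np.max(np.abs(A @ X - np.eye(5))))

print("inv  vs solve:", max_abs_diff(inv_direct, inv_solve))
print("pinv vs solve:", max_abs_diff(inv_pinv, inv_solve))
print("residuals:", residual(inv_direct), residual(inv_pinv), residual(inv_solve))
```

The residual is usually the more informative number, since it measures each result against the defining property A * X = I rather than against another approximation.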
It is well known that explicit matrix inversion can be numerically unstable and should be avoided where possible. However, in situations where it appears unavoidable, what is the preferable approach and why? To clarify, I am referring to the best approach when implementing such equations in software.
An example of such a problem is given in another of my questions.
The reason for avoiding inverting matrices has only to do with efficiency. It is faster to solve the linear systems directly. If you think of the problem in your linked question a bit differently, you can apply the same principles.
To find the matrix inv(K) * Y * T(Y) * inv(K) - D * inv(K), note that the first term is the matrix R that satisfies:
K * R * K = Y * T(Y)
You can solve this in two parts:
K * R1 = Y * T(Y)
R2 * K = R1
So you first solve for R1 with your usual method, then solve for R2, which is the R you are after (recognise that you can solve T(K) * T(R2) = T(R1) if you have to).
However, at this point I don't know if this will be more efficient than computing the inverse explicitly unless K is symmetric. (There may be a way to efficiently get the decomposition of T(K) from K, but I don't know offhand)
If K is symmetric then you can compute your decomposition on K once and reuse it for the two back-substitution steps and it might be more efficient than computing the inverse explicitly.
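The two-step solve can be sketched in numpy as follows. This is an illustration under stated assumptions: K and Y are hypothetical random matrices (K is made symmetric positive definite on purpose), and the explicit inverse is computed only to check the result.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.random((4, 4))
K = M.T @ M + 4 * np.eye(4)   # hypothetical symmetric positive definite K
Y = rng.random((4, 2))        # hypothetical Y
B = Y @ Y.T                   # right-hand side Y * T(Y)

# Step 1: solve K * R1 = Y * T(Y) for R1.
R1 = np.linalg.solve(K, B)

# Step 2: solve R2 * K = R1 by transposing: T(K) * T(R2) = T(R1).
R2 = np.linalg.solve(K.T, R1.T).T

# Reference value using the explicit inverse, for comparison only.
K_inv = np.linalg.inv(K)
R_ref = K_inv @ B @ K_inv

print(np.max(np.abs(R2 - R_ref)))
```

Note that np.linalg.solve factorises K again on each call. To factor once and reuse the factorisation for both steps when K is symmetric positive definite, scipy.linalg.cho_factor and scipy.linalg.cho_solve are one option.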
I have a piece of code:
change <- round((table_subset$positive[1][1] - avg) * 100 / table$positive[1][1], 2)
but I am not sure what the [1][1] parts are doing or what they are calling. Can someone explain this to me, or suggest another way to do it?
I'm new to Julia programming. I managed to solve some first-order DDEs (delay differential equations) and ODEs. I now need to solve a second-order delay differential equation, but I couldn't find documentation about that (I previously used DifferentialEquations.jl).
The equation (where F is a function and τ the delay):
How can I do this?
Here is my code using the given information. It seems that the system stays at rest, which is incorrect. I probably did something wrong.
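using DifferentialEquations, Plots
# γ, σ, Q and ω0 are assumed to be defined beforehand (their values are not given here)
function bc_model(du,u,h,p,t)
# state u = [u(t), u'(t)], so du = [u'(t), u''(t)]; Julia indices start at 1
γ,σ,Q = p
ud = h(p, t-σ)[1] # delayed value u(t-σ)
# assign element-wise: `du = [...]` would only rebind the local name
# and leave the solver's du untouched
du[1] = u[2]
du[2] = Q^2*(γ/Q*tanh(ud)-u[1]) - u[2]
end
u0 = [0.1, 0]
h(p, t) = u0
lags = [σ]
tspan = (0.0,σ*100.0)
alg = MethodOfSteps(Tsit5())
p = (γ,σ,Q,ω0)
prob = DDEProblem(bc_model,u0,h,tspan,p; constant_lags=lags)
sol = solve(prob,alg)
plot(sol)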
The code is in fact working! It seems that it is my normalization constants that are not consistent. Thank you!
You get a state space of dimension 2, containing u = [u(t),u'(t)]. Consequently the return vector of the right-side function is [u'(t),u''(t)]. Then if ud is the delayed state [u(t-τ),u'(t-τ)], the right-side function can be formulated as (with zero-based indices)
[ u'(t), u''(t) ] = [ u[1], -u[1] + F(ud[0],u[0]) ]
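The reduction to a first-order system can be sketched in plain Python. This is only an illustration: it swaps in a fixed-step forward Euler scheme instead of MethodOfSteps(Tsit5()), it takes F from the question's code, and the parameter values (gamma, Q, tau, dt) are hypothetical.

```python
import math

def rhs(u, ud, gamma, Q):
    """Return [u'(t), u''(t)] for the state u = [u(t), u'(t)].

    ud is the delayed state [u(t - tau), u'(t - tau)].
    """
    F = Q**2 * (gamma / Q * math.tanh(ud[0]) - u[0])
    return [u[1], -u[1] + F]

def integrate(u0, tau, t_end, dt, gamma, Q):
    """Forward Euler with a stored history; the history for t <= 0 is the constant u0."""
    n_delay = round(tau / dt)          # number of steps spanned by the delay
    history = [list(u0)]               # history[k] is the state at t = k * dt
    for k in range(round(t_end / dt)):
        u = history[k]
        ud = history[k - n_delay] if k >= n_delay else u0
        du = rhs(u, ud, gamma, Q)
        history.append([u[0] + dt * du[0], u[1] + dt * du[1]])
    return history

# Hypothetical parameter values, chosen only to exercise the code.
trajectory = integrate(u0=[0.1, 0.0], tau=1.0, t_end=5.0, dt=0.01, gamma=2.0, Q=1.0)
print(trajectory[-1])
```

Forward Euler is used here only because it keeps the delayed-state lookup easy to follow; an adaptive solver would be preferable for real work.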
I'm currently switching from Matlab to Python and I have a problem with understanding numpy arrays.
The following code (copied from Numpy documentation) creates a [2x3] array
np.array([[1, 2, 3], [4, 5, 6]], np.int32).
Which behaves as expected.
Now I tried to adapt this to my case and tried
myArray = np.array([\
[-0.000847283, 0.000000000, 0.141182070, 2.750000000],
[ 0.000876414, -0.025855453, 0.270459334, 2.534537894],
[-0.000098373, 0.003388169, -0.021976882, 3,509325279],
[ 0.000077079, -0.004507202, 0.096453685, 2,917172446],
[-0.000049944, 0.003114201, -0.055974372, 3,933359490],
[ 0.000042697, -0.003833862, 0.117727186, 2.485846507],
[-0.000000843, 0.000084733, 0.000169340, 3.661424974],
[ 0.000000676, -0.000074756, 0.005751451, 3.596300338],
[-0.000001860, 0.000229543, -0.006420507, 3.758593109],
[ 0.000006764, -0.000934745, 0.045972458, 2.972698644],
[ 0.000014803, -0.002140505, 0.106260454, 1.967898711],
[-0.000025975, 0.004587858, -0.263799480, 8.752330828],
[ 0.000009098, -0.001725357, 0.114993424, 1.176472749],
[-0.000010418, 0.002080207, -0.132368251, 6.535975709],
[ 0.000032572, -0.006947575, 0.499576502, -8.209401868],
[-0.000039870, 0.009351884, -0.722882956, 22.352084596],
[ 0.000046909, -0.011475011, 0.943268640, -22.078624629],
[-0.000067764, 0.017766572, -1.542265901, 48.344854010],
[ 0.000144148, -0.039449875, 3.607214322,-106.139552662],
[-0.000108830, 0.032648910, -3.242170215, 110.757624352]
])
But, contrary to my expectation, the shape is (20,). I expected the shape (20, 4).
Question 1: Can anyone tell me why? And how do I create the array correctly?
Question 2: When I add the datatype, dtype=np.float, I get the following error:
*TypeError: float() argument must be a string or a number, not 'list'*
but the array isn't intended to be a list.
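I found the mistake on my own after trying to np.vstack all the row vectors.
The resulting error said that the arrays at row indices 2, 3 and 4 do not have size 4 as expected.
In those rows a , (comma) had been typed instead of a decimal point (for example 3,509325279), so each of them has five elements. Replacing the comma with a dot solved the problem.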
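A quick way to locate such rows is to check the row lengths before building the array. The sketch below uses three rows from the question (one of them with the comma typo); the names rows, lengths and bad_rows are illustrative.

```python
# Three rows taken from the question; the middle one contains the comma typo.
rows = [
    [-0.000847283, 0.000000000, 0.141182070, 2.750000000],
    [-0.000098373, 0.003388169, -0.021976882, 3,509325279],  # parsed as 5 elements
    [0.000876414, -0.025855453, 0.270459334, 2.534537894],
]

expected_columns = 4
lengths = [len(row) for row in rows]

# Indices of rows whose length differs from the expected column count.
bad_rows = [i for i, n in enumerate(lengths) if n != expected_columns]

print("row lengths:", lengths)
print("rows to fix:", bad_rows)
```

Python reads 3,509325279 as two separate list elements, the integers 3 and 509325279, which is why the affected rows end up one element too long.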
I am using both SageMath and Wolfram Alpha to entertain myself over the weekend.
I found this SageMath demo of solving simultaneous equations:
var('x y p q')
eq1 = p+q==9
eq2 = q*y+p*x==-6
eq3 = q*y^2+p*x^2==24
solve([eq1,eq2,eq3,p==1],p,q,x,y)
And it gave me this result:
[
[ p == 1
, q == 8
, x == -4/3*sqrt(10) - 2/3
, y == 1/6*sqrt(10) - 2/3
]
,[p == 1
, q == 8
, x == 4/3*sqrt(10) - 2/3
, y == -1/6*sqrt(10) - 2/3
]
]
I tried this syntax on Alpha:
solve p+q==9 , q*y+p*x==-6 , q*y^2+p*x^2==24 , p==1
It works well.
Question:
How can I use Alpha so that I assign each equation to a variable and then pass those variables to solve as parameters?
I want to simplify my call to solve() so it looks like this:
solve eq1, eq2, eq3, p==1
instead of this:
solve p+q==9 , q*y+p*x==-6 , q*y^2+p*x^2==24 , p==1
So, Crazy Ivan's comment really answers this. WolframAlpha is not a programming language. It is a glorified online smart calculator. You can only do very limited computations in it. Check out the Wolfram Language for the full-fledged programming language.
I wrote the following code:
using JuMP
m = Model()
const A =
[ :a0 ,
:a1 ,
:a2 ]
const T = [1:5]
const U =
[
:a0 => [9 9 9 9 999],
:a1 => [11 11 11 11 11],
:a2 => [1 1 1 1 1]
]
@defVar(m, x[A,T], Bin)
@setObjective(m, Max, sum{sum{x[i,j] * U[i,j], i=A}, j=T} )
print(m)
status = solve(m)
println("Objective value: ", getObjectiveValue(m))
println("x = ", getValue(x))
When I run it I get the following error
ERROR: `*` has no method matching *(::Variable)
in anonymous at /home/username/.julia/v0.3/JuMP/src/macros.jl:71
in include at ./boot.jl:245
in include_from_node1 at loading.jl:128
in process_options at ./client.jl:285
in _start at ./client.jl:354
while loading /programs/julia-0.2.1/models/a003.jl, in expression starting on line 21
What's the correct way of doing this?
As the manual says:
There is one key restriction on the form of the expression in the second case: if there is a product between coefficients and variables, the variables must appear last. That is, Coefficient times Variable is good, but Variable times Coefficient is bad
Let me know if there is another place I could put this that would have helped you out.
This situation isn't desirable but unfortunately we haven't got a good solution yet that retains the fast model construction capabilities of JuMP.
I believe the problem with U is that it is a dictionary of arrays, so you first need to index into the dictionary to get the correct array, then index into that array. JuMP's variables have more powerful indexing, so they allow you to do it in one set of [].
I resolved my problem: constants must precede variables, as I read somewhere. Moreover, it seems that an array of constants must be used as an array of arrays, while variables can be used as matrices.
Here's the correct line:
@setObjective(m, Max, sum{sum{U[i][j]*x[i,j], i=A}, j=T} )
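The difference between the two indexing styles can be illustrated with a plain Python analogy (this is not JuMP code). A dict of lists plays the role of U, and a dict keyed by pairs plays the role of x; the all-ones assignment for x is hypothetical.

```python
# Coefficients stored as a dict of lists, mirroring U in the question.
U = {
    "a0": [9, 9, 9, 9, 999],
    "a1": [11, 11, 11, 11, 11],
    "a2": [1, 1, 1, 1, 1],
}
A = ["a0", "a1", "a2"]
T = range(5)  # zero-based here, unlike Julia's 1:5

# Two-step indexing: first the dict key, then the position in the list.
last_a0 = U["a0"][4]

# One-step indexing looks up the tuple key ("a0", 4), which does not exist.
try:
    U["a0", 4]
    pair_lookup_ok = True
except KeyError:
    pair_lookup_ok = False

# Objective value for a hypothetical assignment x[i, j] = 1 everywhere,
# with the coefficient written before the variable.
x = {(i, j): 1 for i in A for j in T}
objective = sum(U[i][j] * x[i, j] for j in T for i in A)

print(last_a0, pair_lookup_ok, objective)
```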