Differentiating a scalar with respect to matrix

I have a scalar function that is obtained by iterative calculations. I wish to differentiate it (find the directional derivative) with respect to a matrix, elementwise. How should I employ the finite difference approximation in this case? Do diff or gradient help here? Note that I only want numerical derivatives.
The typical code that I would work on is:
n=4;
for i=1:n
for x(i)=-2:0.04:4;
for y(i)=-2:0.04:4;
A(:,:,i)=[sin(x(i)), cos(y(i));2sin(x(i)),sin(x(i)+y(i)).^2];
B(:,:,i)=[sin(x(i)), cos(x(i));3sin(y(i)),cos(x(i))];
R(:,:,i)=horzcat(A(:,:,i),B(:,:,i));
L(i)=det(B(:,:,i)'*A(:,:,i)B)(:,:,i));
%how to find gradient of L with respect to x(i), y(i)
grad_L=tr((diff(L)/diff(R)')*(gradient(R))
endfor;
endfor;
endfor;
I know that the last line, for grad_L, would raise an error saying the dimensions don't match. How do I proceed to solve this? Note that the gradient or directional derivative of a scalar function f of a matrix variable X is given by nabla(f) = trace((partial f/partial x_{ij}) * X_dot), where x_{ij} denotes the elements of the matrix X and X_dot denotes the gradient of the matrix X.

Both your code and explanation are very confusing. You're using an iteration of n = 4, but you don't do anything with your inputs or outputs, and you overwrite everything. So I will ignore the n aspect for now since you don't seem to be making any use of it. Furthermore you have many syntactical mistakes which look more like maths or pseudocode, rather than any attempt to write valid Matlab / Octave.
But, essentially, you seem to be asking, "I have a function which, for each (x,y) coordinate on a 2D grid, calculates a scalar output L(x,y)", where the calculation leading to L involves multiplying matrices and then taking the determinant of the product. Here's how to produce such an array L:
X = -2 : 0.04 : 4;
Y = -2 : 0.04 : 4;
X_indices = 1 : length(X);
Y_indices = 1 : length(Y);
for Ind_x = X_indices
for Ind_y = Y_indices
x = X(Ind_x); y = Y(Ind_y);
A = [sin(x), cos(y); 2 * sin(x), sin(x+y)^2];
B = [sin(x), cos(x); 3 * sin(y), cos(x) ];
L(Ind_y, Ind_x) = det (B.' * A * B); % rows follow y, columns follow x, as gradient(L, X, Y) expects
end
end
You then want the gradient of L, which is a vector at each grid point. Setting aside the maths you mentioned for a moment, if you simply want to use the gradient function correctly, apply it directly to L, pass the grid vectors X and Y so that it knows the spacing between the elements of L, and collect two outputs so that you capture both the x and y components of the gradient:
[gLx, gLy] = gradient(L, X, Y);
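
For readers who want to see what such a grid gradient does under the hood, here is a minimal pure-Python sketch of the same idea. It assumes central differences in the interior and one-sided differences at the edges, which is how gradient-style routines are commonly described. The helper names (det2, matmul2, scalar_L, diff_along) and the L[ix][iy] index order are choices made for this illustration only, not part of the answer above.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def matmul2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scalar_L(x, y):
    """L(x, y) = det(B.' * A * B) for the A and B defined in the answer."""
    A = [[math.sin(x), math.cos(y)], [2 * math.sin(x), math.sin(x + y) ** 2]]
    B = [[math.sin(x), math.cos(x)], [3 * math.sin(y), math.cos(x)]]
    Bt = [[B[0][0], B[1][0]], [B[0][1], B[1][1]]]
    return det2(matmul2(matmul2(Bt, A), B))

def diff_along(values, h):
    """Central differences inside, one-sided differences at the two ends."""
    n = len(values)
    out = [(values[1] - values[0]) / h]
    out += [(values[i + 1] - values[i - 1]) / (2 * h) for i in range(1, n - 1)]
    out.append((values[-1] - values[-2]) / h)
    return out

h = 0.04
grid = [-2 + h * i for i in range(151)]              # same as -2 : 0.04 : 4
n = len(grid)
L = [[scalar_L(x, y) for y in grid] for x in grid]   # here L[ix][iy]

# dL/dy: differentiate each row (fixed x) along y.
dL_dy = [diff_along(row, h) for row in L]
# dL/dx: differentiate each column (fixed y) along x, then transpose back.
cols = [diff_along([L[ix][iy] for ix in range(n)], h) for iy in range(n)]
dL_dx = [[cols[iy][ix] for iy in range(n)] for ix in range(n)]
```

Both dL_dx and dL_dy have the same shape as L, so the gradient at grid point (ix, iy) is the pair (dL_dx[ix][iy], dL_dy[ix][iy]).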

Related

Scilab: How to use ´numderivative´ function

I am a new user of Scilab and I am not a mathematician.
As my end goal, I want to calculate (and plot) the derivative of a piece-wise defined function, see here.
I tried to start small and just use a simple (continuous) function: f(x) = 3*x.
My Google-Fu led me to the numderivative function.
Problem: It seems that I do not understand how the argument x works since the result is not a 1D-array, instead, it is a matrix.
Update 1: Maybe I use the wrong function and diff is the way to go. But what is then the purpose of numderivative?
PS: Is this the right place to ask Scilab-related questions? It seems that there are several StackOverflow communities where Scilab-related questions are asked.
// Define limits
x0 = 0;
x1 = 2;
// Define array x for which the derivative will be calculated.
n = 100;
x = linspace (x0, x1, n);
// Define function f(x)
deff('y=f(x)','y=3*x');
// Calculate derivative of f(x) at the positions x
myDiff = numderivative(f,x)
(I expect the result 3 3 and not a matrix.)

numderivative(f,x) will give you the approximated derivative/Jacobian of f at the single vector x. For your example it yields 3 times the identity matrix, which is the expected result since f(x)=3*x. If you rather need the derivative of f considered as a function of a single scalar variable at x=1 and x=2, then numderivative is not convenient as you would have to make an explicit loop. Just code the formula yourself (here first order formula) :
// Define limits
x0 = 0;
x1 = 2;
// Define array x for which the derivative will be calculated.
n = 100;
x = linspace (x0, x1, n);
// Define function f(x)
deff('y=f(x)','y=3*x');
// Evaluate the derivative only at x = 1 and x = 2
x = [1 2];
h = sqrt(%eps);
d = (f(x+h)-f(x))/h;
The formula can be improved (second order or complex step formula).
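
As a rough illustration of those improvements, here is a small Python sketch that puts the first-order formula next to a second-order (central) formula and the complex-step formula. The step sizes are common rules of thumb rather than anything taken from Scilab, and the cubic test function g is a hypothetical example added only to show a case where the three formulas differ.

```python
import sys

def forward_diff(f, x, h=None):
    """First-order formula (f(x+h) - f(x)) / h, as in the snippet above."""
    if h is None:
        h = sys.float_info.epsilon ** 0.5          # same choice as sqrt(%eps)
    return (f(x + h) - f(x)) / h

def central_diff(f, x, h=None):
    """Second-order formula (f(x+h) - f(x-h)) / (2h)."""
    if h is None:
        h = sys.float_info.epsilon ** (1.0 / 3.0)  # usual rule of thumb
    return (f(x + h) - f(x - h)) / (2 * h)

def complex_step(f, x, h=1e-20):
    """Complex-step formula Im(f(x + i*h)) / h; f must accept complex input."""
    return f(complex(x, h)).imag / h

def f(x):
    return 3 * x          # the function from the question

def g(x):
    return x * x * x      # hypothetical extra test function, g'(x) = 3*x**2

for x in (1.0, 2.0):
    print(x, forward_diff(f, x), central_diff(f, x), complex_step(f, x))
```

The complex-step variant has no subtraction of nearly equal numbers, so h can be made tiny, but it only applies when the function can be evaluated with complex arguments.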

I accidentally (but fortunately) found an elegant solution to your end goal (and mine) of calculating and plotting derivatives in the Scilab documentation itself: the splin (https://help.scilab.org/docs/6.1.1/en_US/splin.html) and interp (https://help.scilab.org/docs/6.1.1/en_US/interp.html) functions. This works not only for piecewise-defined functions but also for functions defined numerically by data.
The former will give you the corresponding first derivative of your array values while the latter will give you up to the third derivative. Both are actually meant to be used in conjunction because the interp function requires the corresponding derivatives for your y values, which you can easily get through the splin function. Using this method is better than using the diff function because the output array is of the same size as the original one.
Illustrating this through your code example:
// Define limits
x0 = 0;
x1 = 2;
// Define array x for which the derivatives will be calculated.
n = 100;
x = linspace (x0, x1, n);
// Define function f(x)
deff('y=f(x)','y=3*x');
// Calculate derivative of f(x) at the positions x
myDiff = splin(x, f(x));
// Calculate up to the third derivative of f(x) at the positions x
// The first derivatives obtained earlier are required by this function
[yp, yp1, yp2, yp3]=interp(x, x, f(x), myDiff);
// Plot and label all the values in one figure with gridlines
plot(x', [yp', yp1', yp2', yp3'], 'linewidth', 2)
xgrid();
// Put the legend on the upper-left by specifying 2 after the list of legend names
legend(['yp', 'yp1', 'yp2', 'yp3'], 2);

plotting multiple function on octave. I already looked for an answer but something is not working

Let f be a continuous real function defined on the interval [a,b]. I want to approximate this function by a piecewise quadratic polynomial. I have already created a matrix that summarizes these polynomials. Let's say that I'm considering a uniform partition of the interval into N pieces (and therefore N+1 points).
I have a matrix A of size N times 3, where the k-th row represents the quadratic polynomial associated with the k-th interval of this partition in the natural form (the row [a b c] represents the polynomial a+bx+cx^2). I have already created a method to find this matrix (obviously it depends on the choice of my interpolation points inside each interval, but that doesn't matter for this question).
I'm trying to plot the corresponding function but I'm having some problems. I used the same idea given in Similar question. This is what I wrote
x=zeros(N+1,1);
%this is the set of points defining the uniform partition
for i=1:N+1
x(i)=a+(i-1)*((b-a)/(N));
end
%this is the length of my linspace for plotting the functions
l=100
And now I plot the functions:
figure;
hold on;
%first the original function
u=linspace(a,b,l*N);
v=arrayfun( f , u);
plot(u,v,'b')
% this is for plotting the other functions
for k=1:N
x0=linspace(x(k),x(k+1));
y0=arrayfun(@(t) [1,t,t^2]*A(k,:)',x0);
plot(x0, y0, 'r');
end
The problem is that the for loop is plotting the same function f and I don't know why. I tried with multiple different functions. I'm pretty sure that my matrix A is correct.

Please write a minimal working example that can be run as standalone code or copy/pasted from people here to check where you might have a bug -- often in the process of reducing your code to its bare principles in this manner, you end up figuring out what is the problem yourself in the first place. But, in any case, I have written one myself and cannot replicate the problem.
figure;
hold on;
# arbitrary values for Minimal Working Example
N = 10;
x = [10:10:110]; # size (1, N+1)
A = randn( N, 3 ); # size (N, 3)
a = 100; b = 200; l = 3;
f = @(t) t.^2 .* sin(t);
%first the original function
u = linspace(a,b,l*N);
v = arrayfun( f , u);
plot(u,v,'b')
for k = 1 : N
x0 = linspace( x(k), x(k+1) )
y0 = arrayfun( @(t) ([1, t, t.^2]) * (A(k, :).'), x0 )
x0, y0
plot(x0, y0, 'r');
endfor
hold off;
Output:
Are you doing something different?
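
As a cross-check on the evaluation step inside the loop, here is a minimal Python sketch of how one row [a b c] per interval turns into function values. The function name eval_piecewise_quadratic and the small coefficient table are hypothetical and chosen only so the result is easy to verify by hand.

```python
def eval_piecewise_quadratic(coeffs, knots, t):
    """Evaluate a piecewise quadratic at t.

    coeffs[k] = (a, b, c) stands for a + b*t + c*t**2 on [knots[k], knots[k+1]],
    the same row layout as the matrix A in the question.
    """
    for k, (a, b, c) in enumerate(coeffs):
        if knots[k] <= t <= knots[k + 1]:
            return a + b * t + c * t * t
    raise ValueError("t lies outside the partition")

# Hypothetical example: t**2 on [0, 1], then the constant 1 on [1, 2].
knots = [0.0, 1.0, 2.0]
coeffs = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
samples = [eval_piecewise_quadratic(coeffs, knots, t) for t in (0.5, 1.0, 1.5)]
print(samples)
```

If every row of the coefficient matrix produces the same curve as f, the rows themselves are the first thing to inspect, since the evaluation only ever uses row k on interval k.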

How to draw graph of Gauss function?

The Gauss function has an infinite number of jump discontinuities, at x = 1/n for positive integers n.
I want to draw a diagram of the Gauss function.
Using the Maxima CAS I can draw it with a simple command:
f(x):= 1/x - floor(1/x); plot2d(f(x),[x,0,1]);
but the result is not good ( near x=0 it should be like here)
Also Maxima claims:
plot2d: expression evaluates to non-numeric value somewhere in plotting
range.
I can define a piecewise function (with jump discontinuities at x = 1/n, for positive integers n), so I tried:
define( g(x), for i:2 thru 20 step 1 do if (x=i) then x else (1/x) - floor(1/x));
but it doesn't work.
I could also use Chebyshev polynomials to approximate the function (as in: A Graduate Introduction to Numerical Methods From the Viewpoint of Backward Error Analysis by Corless, Robert, Fillion, Nicolas).
How to do it properly ?

For plot2d you can set the adapt_depth and nticks parameters. The default values are 5 and 29, respectively. set_plot_option() (i.e. with no argument) returns the current list of option values. If you increase adapt_depth and/or nticks, then plot2d will use more points for plotting. Perhaps that makes the figure look good enough.
Another way is to use the draw2d function (in the draw package) and explicitly tell it to plot each segment. We know that there are discontinuities at 1/k, for k = 1, 2, 3, .... We have to decide how many segments to plot. Let's say 20.
(%i6) load (draw) $
(%i7) f(x):= 1/x - floor(1/x) $
(%i8) makelist (explicit (f, x, 1/(k + 1), 1/k), k, 1, 20);
(%o8) [explicit(f,x,1/2,1),explicit(f,x,1/3,1/2),
explicit(f,x,1/4,1/3),explicit(f,x,1/5,1/4),
explicit(f,x,1/6,1/5),explicit(f,x,1/7,1/6),
explicit(f,x,1/8,1/7),explicit(f,x,1/9,1/8),
explicit(f,x,1/10,1/9),explicit(f,x,1/11,1/10),
explicit(f,x,1/12,1/11),explicit(f,x,1/13,1/12),
explicit(f,x,1/14,1/13),explicit(f,x,1/15,1/14),
explicit(f,x,1/16,1/15),explicit(f,x,1/17,1/16),
explicit(f,x,1/18,1/17),explicit(f,x,1/19,1/18),
explicit(f,x,1/20,1/19),explicit(f,x,1/21,1/20)]
(%i9) apply (draw2d, %);
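
The same segment-by-segment idea can be sketched outside Maxima. The Python snippet below only builds the data for each branch between consecutive discontinuities and leaves the plotting to whatever tool is at hand. The names gauss_map, segments and sample_segment are made up for this illustration, and sampling strictly inside each interval is an assumption made to keep samples off the jumps.

```python
import math

def gauss_map(x):
    """f(x) = 1/x - floor(1/x)."""
    return 1.0 / x - math.floor(1.0 / x)

def segments(count):
    """Intervals [1/(k+1), 1/k] for k = 1..count, as in the makelist call."""
    return [(1.0 / (k + 1), 1.0 / k) for k in range(1, count + 1)]

def sample_segment(lo, hi, points=50):
    """Sample strictly inside (lo, hi) so that no sample sits on a jump."""
    step = (hi - lo) / points
    xs = [lo + (j + 0.5) * step for j in range(points)]
    return xs, [gauss_map(x) for x in xs]

# One (xs, ys) pair per continuous branch; draw each pair as its own curve.
curves = [sample_segment(lo, hi) for lo, hi in segments(20)]
```

Drawing each pair as a separate curve avoids the vertical connecting lines that a single sampled curve would show at every jump.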

I have made a list of segments with ending points. The result is :
and full code is here
Edit: smaller size with shorter lists in case of almost straight lines,
if (n>20) then iMax:10 else iMax : 250,
in the GivePart function

Add random spread to directional vector

Let's say I have a unit vector a = Vector(0,1,0) and I want to add a random spread of something between x = Vector(-0.2,0,-0.2) and y = Vector(0.2,0,0.2), how would I go about doing that?
If I were to simply generate a random vector between x and y, I'd get a value somewhere in the bounds of a square:
What I'd like instead is a value within the circle made up by x and y:
This seems like a simple problem but I can't figure out the solution right now. Any help would be appreciated.
(I didn't ask this on mathoverflow since this isn't really a 'research level mathematics question')

If I read your question correctly, you want a vector in a random direction that's within a particular length (the radius of your circle).
The formula for a circle is: x^2 + y^2 = r^2
So, if you have a maximum radius, r, that constrains the vector length, perhaps proceed something like this:
Choose a random value for x, that lies between -r and +r
Calculate a limit for randomising y, based on your chosen x, so ylim = sqrt(r^2 - x^2)
Finally, choose a random value of y between -ylim and +ylim
That way, you get a random direction in x and a random direction in y, but the vector length will remain within 0 to r and so will be constrained within a circle of that radius.
In your example, it seems that r should be sqrt(0.2^2 + 0.2^2) which is approximately 0.28284.
UPDATE
As a 3D vector has length (or magnitude) sqrt(x^2+y^2+z^2), you could extend the technique to 3D, although I would probably favour a different approach (which would also work for 2D).
Choose a random direction by choosing any x, y and z
Calculate the magnitude m = sqrt(x^2+y^2+z^2)
Normalise the direction vector (by dividing each element by the magnitude), so x = x/m, y = y/m, z = z/m
Now choose a random length, L between 0 and r
Scale the direction vector by the random length. So x = x * L, y = y * L, z = z * L
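
Both recipes above can be sketched in a few lines of Python. This is only an illustration of the listed steps: the function names are invented here, the components of the random direction are drawn from [-1, 1] as one possible reading of "any x, y and z", and the retry on a near-zero direction is an added safeguard that the answer does not mention.

```python
import math
import random

def spread_in_circle(r, rng=random):
    """2D recipe: pick x in [-r, r], then y within the chord at that x."""
    x = rng.uniform(-r, r)
    y_lim = math.sqrt(max(0.0, r * r - x * x))
    y = rng.uniform(-y_lim, y_lim)
    return x, y

def spread_in_sphere(r, rng=random):
    """3D recipe: random direction, normalised, scaled by a random length."""
    while True:
        x, y, z = (rng.uniform(-1.0, 1.0) for _ in range(3))
        m = math.sqrt(x * x + y * y + z * z)
        if m > 1e-12:          # added guard against dividing by ~0
            break
    length = rng.uniform(0.0, r)
    return x / m * length, y / m * length, z / m * length

# The question's example: spread a = (0, 1, 0) in the x/z plane.
a = (0.0, 1.0, 0.0)
r = math.hypot(0.2, 0.2)       # about 0.28284, as computed in the answer
dx, dz = spread_in_circle(r)
spread = (a[0] + dx, a[1], a[2] + dz)
```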

Computing two vectors that are perpendicular to third vector in 3D

What is the best (fastest) way to compute two vectors that are perpendicular to a third vector (X) and also perpendicular to each other?
This is how I am computing these vectors right now:
// HELPER - unit vector that is NOT parallel to X
x_axis = normalize(X);
y_axis = crossProduct(x_axis, HELPER);
z_axis = crossProduct(x_axis, y_axis);
I know there is an infinite number of solutions to this, and I don't care which one I get.
What is behind this question: I need to construct a transformation matrix where I know which direction the X axis (the first column of the matrix) should point. I need to calculate the Y and Z axes (the second and third columns). As we know, all axes must be perpendicular to each other.

What I have done, provided that X<>0 or Y<>0 is
A = [-Y, X, 0]
B = [-X*Z, -Y*Z, X*X+Y*Y]
and then normalize the vectors.
[ X,Y,Z]·[-Y,X,0] = -X*Y+Y*X = 0
[ X,Y,Z]·[-X*Z,-Y*Z,X*X+Y*Y] = -X*X*Z-Y*Y*Z+Z*(X*X+Y*Y) = 0
[-Y,X,0]·[-X*Z,-Y*Z,X*X+Y*Y] = Y*X*Z-X*Y*Z = 0
This is called the nullspace of your vector.
If X=0 and Y=0 then A=[1,0,0], B=[0,1,0].
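
A short Python sketch of these closed-form vectors, including the fallback case, might look as follows. The function names are invented for this illustration, and the input is assumed to be a non-zero vector.

```python
import math

def normalize(v):
    """Scale a 3-tuple to unit length."""
    m = math.sqrt(sum(c * c for c in v))
    return tuple(c / m for c in v)

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def perpendicular_pair(x, y, z):
    """Return unit vectors A, B with A.V = B.V = A.B = 0 for V = (x, y, z)."""
    if x == 0.0 and y == 0.0:
        # V lies along the z axis, so the x and y axes are perpendicular to it.
        return (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
    a = (-y, x, 0.0)
    b = (-x * z, -y * z, x * x + y * y)
    return normalize(a), normalize(b)

v = (1.0, 2.0, 3.0)
A, B = perpendicular_pair(*v)
print(dot(v, A), dot(v, B), dot(A, B))
```

The three printed dot products should all be zero up to rounding.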

This is the way to do it.
It's also probably the only way to do it. Any other way would be mathematically equivalent.
It may be possible to save a few cycles by opening the crossProduct computation and making sure you're not doing the same multiplications more than once but that's really far into micro-optimization land.
One thing you should be careful about is, of course, the HELPER vector. Not only does it have to be non-parallel to X, it is also a good idea for it to be VERY non-parallel to X. If X and HELPER are even somewhat parallel, your floating point calculation is going to be unstable and inaccurate. You can test and see what happens if the dot product of X and HELPER is something like 0.9999.

There is a method to find a good HELPER (really - it is ready to be your y_axis).
Let X = (ax, ay, az). Take the two elements with the larger magnitudes, exchange them, and negate one of them. Set the third element (the one with the smallest magnitude) to zero. The resulting vector is perpendicular to X.
Example:
if (|ax| <= |ay|) and (|ax| <= |az|) then HELPER = (0, -az, ay) (or (0, az, -ay))
X*HELPER = ax*0 - ay*az + az*ay = 0
if (|ay| <= |ax|) and (|ay| <= |az|) then HELPER = (az, 0, -ax)
otherwise HELPER = (-ay, ax, 0)

For a good HELPER vector: find the coordinate of X with the smallest absolute value, and use that coordinate axis:
absX = abs(X.x); absY = abs(X.y); absZ = abs(X.z);
if(absX < absY) {
if(absZ < absX)
HELPER = vector(0,0,1);
else // absX <= absZ
HELPER = vector(1,0,0);
} else { // absY <= absX
if(absZ < absY)
HELPER = vector(0,0,1);
else // absY <= absZ
HELPER = vector(0,1,0);
}
Note: this is effectively very similar to @MBo's answer: taking the cross-product with the smallest coordinate axis is equivalent to setting the smallest coordinate to zero, exchanging the larger two, and negating one.
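
Putting the helper selection together with the two cross products from the question gives a complete basis. The Python sketch below is one way to write that down. The function names are made up for the example, the input is assumed to be non-zero, and y_axis is normalised explicitly because the helper is generally not perpendicular to X.

```python
import math

def normalize(v):
    """Scale a 3-tuple to unit length."""
    m = math.sqrt(sum(c * c for c in v))
    return tuple(c / m for c in v)

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def basis_from_x(X):
    """Orthonormal (x_axis, y_axis, z_axis) with x_axis along X."""
    x_axis = normalize(X)
    ax, ay, az = (abs(c) for c in x_axis)
    # Helper = coordinate axis belonging to the smallest |component| of X.
    if ax < ay:
        helper = (0.0, 0.0, 1.0) if az < ax else (1.0, 0.0, 0.0)
    else:
        helper = (0.0, 0.0, 1.0) if az < ay else (0.0, 1.0, 0.0)
    y_axis = normalize(cross(x_axis, helper))
    z_axis = cross(x_axis, y_axis)   # already unit length
    return x_axis, y_axis, z_axis

x_axis, y_axis, z_axis = basis_from_x((1.0, 2.0, 3.0))
```

The three returned tuples can be used directly as the columns of the transformation matrix.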

I think the largest-magnitude element of a unit vector is always at least 1/sqrt(3) (about 0.577), so you may be able to get away with this:
-> Reduce the problem of finding a vector perpendicular to a 3D vector to the 2D case: find any element whose magnitude is greater than, say, 0.5, ignore a different element (use 0 in its place), and apply the 2D perpendicular formula to the remaining elements (in 2D, x-axis = (ax,ay) -> y-axis = (-ay,ax))
let x-axis be represented by (ax,ay,az)
if (abs(ay) > 0.5) {
y-axis = normalize((-ay,ax,0))
} else if (abs(az) > 0.5) {
y-axis = normalize((0,-az,ay))
} else if (abs(ax) > 0.5) {
y-axis = normalize((az,0,-ax))
} else {
error("Impossible unit vector")
}
