Step size selection for gradient descent without access to function evaluations - convex-optimization

Is there a way to select the step size for gradient descent when you only have access to gradient evaluations, but not function evaluations?
I know the function being optimized is convex, and given a point x I have access to f'(x) but not f(x). Can I do anything other than a fixed step size rule for gradient descent in this case?
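One classical possibility that needs only gradient evaluations is the Barzilai-Borwein step size, which estimates curvature from the last two iterates and their gradients. Here is a minimal NumPy sketch (my illustration, not a definitive answer: BB steps are nonmonotone, and their guarantees are strongest for quadratic problems, so safeguards are common in practice):

import numpy as np

def gradient_descent_bb(grad, x0, n_iters=100, alpha0=1e-3):
    # Gradient descent with Barzilai-Borwein (BB2) step sizes.
    # Only the gradient oracle `grad` is needed -- no function values.
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad(x_prev)
    x = x_prev - alpha0 * g_prev                # bootstrap with a small fixed step
    for _ in range(n_iters):
        g = grad(x)
        s = x - x_prev                          # change in iterates
        y = g - g_prev                          # change in gradients
        denom = y @ y
        alpha = (s @ y) / denom if denom > 0 else alpha0   # BB2 formula
        x_prev, g_prev = x, g
        x = x - alpha * g
    return x

# Toy usage on a convex quadratic f(x) = ||Ax - b||^2 with f'(x) = 2 A^T (Ax - b):
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
print(gradient_descent_bb(lambda x: 2 * A.T @ (A @ x - b), np.zeros(2)))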

Related

2D noise with normalized gradient

Question :
I'm looking for a 2D noise whose gradient always has a norm of 1, which is equivalent to saying that its isolines are always evenly spaced. It can be any type of noise, but its gradient must be continuous (and, if possible, the second derivative too). My goal is to implement it as a function in a fragment shader, but just having the mathematical principle would be enough.
To explain more graphically what I want, here is a classic gradient noise with isolines and simple lighting:
As you can see, the isoline density varies because the slope isn't constant.
In this second picture, you can see the exact same noise but with different lighting, which I made by normalizing the gradient of the first one:
This looks much more like what I'm looking for; however, as you can see, the isolines are still wrong. I just cheated to get the lighting I wanted, but I still don't have the noise itself.
Ways of thought :
During my research, I tried to do something similar to gradient noise (where the gradient is defined by a random vector at each grid point). I focused on a square grid, but a simplex grid would work too. I came across two main potential ways to solve the problem:
Finding the gradient of the noise first:
It is possible to reconstruct a function from its gradient and one fixed value. The reason this doesn't work with the normalized gradient I used for the lighting in the second picture is that the curl of the gradient must be 0 everywhere (otherwise the function can't be continuous). So the gradient I'm looking for must have a curl of 0, have a norm of 1, and integrate to zero from one node to the next (because all nodes of a gradient noise have a value of 0); a numerical sanity check of these conditions is sketched after this list.
norm of 1 :
I found three ways to express this constraint: define the gradient as (cos(a(x, y)), sin(a(x, y))); require that the dot product between the gradient and its derivative is 0; or simply require that the dot product of the gradient with itself is 1.
curl :
The derivative of the x component of the gradient with respect to y must equal the derivative of the y component with respect to x (which, with the trigonometric parametrization above, becomes: cos(a)*da/dx = -sin(a)*da/dy).
integral from a node to the next one :
I haven't investigated that part yet.
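To make the constraints of this first approach concrete, here is a small numerical sanity check (NumPy; my illustration, not part of the original question). It uses f(x, y) = sqrt(x^2 + y^2), a known solution of |grad f| = 1 away from the origin; it is a plain distance function rather than a noise, but any candidate gradient field would have to pass the same tests:

import numpy as np

# Sample the gradient of f = sqrt(x^2 + y^2) on a grid away from the origin.
xs, ys = np.meshgrid(np.linspace(1, 5, 400), np.linspace(1, 5, 400))
a = np.arctan2(ys, xs)                     # angle field a(x, y)
gx, gy = np.cos(a), np.sin(a)              # gradient (cos a, sin a), norm 1 by construction

h = xs[0, 1] - xs[0, 0]                    # grid spacing (equal in x and y)
# Curl test: d(gy)/dx - d(gx)/dy must vanish for a true gradient field.
curl = np.gradient(gy, h, axis=1) - np.gradient(gx, h, axis=0)
print(np.abs(curl).max())                  # ~0, up to finite-difference error
print(np.abs(gx**2 + gy**2 - 1).max())     # exactly 0: unit norm everywhere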
Finding the noise itself:
It solves the nodes-equal-zero problem easily, but the main constraint is still there: the norm of the gradient must be 1 everywhere.
Conclusion :
Of course, these are just ideas, and if your answer is completely different from them, I'll take it anyway (and with a big smile).

Custom gradient descent algorithm in Flux

I am trying to implement the gradient descent algorithm defined in https://arxiv.org/pdf/1808.03856.pdf.
The gradient could be shown like this:
How could I do it?

Do we need to derive the differential/gradient w.r.t. input data in the backward function (Chainer)?

I am implementing a very complex Function in my research; it uses belief propagation in this layer. I have derived the gradient w.r.t. W (the parameter) of this layer, but because it is complex, I haven't derived the gradient w.r.t. input_data (the data coming from the previous layer).
I am very confused about the details of backpropagation. I have read a lot about the BP algorithm; some notes say it is OK to differentiate only w.r.t. W (the parameter) and use the residual to get the gradient. Your example suggests we also need to calculate the gradient w.r.t. the input data (the previous layer's output). Which is correct?
A typical example: how do you derive the gradient w.r.t. the input image in a convolutional layer?
My network has two layers. Do I need to derive the gradient w.r.t. the input X of the last layer by hand? (Does backward need to return gx so that BP can propagate the gradient to the previous layer?)
If you do not need the gradient w.r.t. the input, you can omit its computation. In this case, return None as the placeholder for the omitted input gradient. Note that, in this case, the grad of the input after backprop will be incorrect. If you want to write a Function that can be used in any context (including the case that one wants the gradient w.r.t. the input), you have to compute the gradients w.r.t. all the inputs (except for the case that the Function is not differentiated w.r.t. the input). This is the reason why the built-in functions of Chainer compute gradients for all the inputs.
By the way, deriving the gradient w.r.t. the input image of a convolutional layer is simple: apply a transposed convolution (called "deconvolution" in Chainer for historical reasons) to the output gradient using the same weight.
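For concreteness, here is a minimal sketch using the old-style chainer.Function API. ScaleByW is a hypothetical element-wise layer (y = W * x) invented just to show which gradients backward() is expected to return; it is not the belief-propagation layer from the question:

import chainer

class ScaleByW(chainer.Function):
    # Hypothetical element-wise layer y = W * x, used only to illustrate
    # the backward() contract.
    def forward(self, inputs):
        x, W = inputs
        return x * W,

    def backward(self, inputs, grad_outputs):
        x, W = inputs
        gy, = grad_outputs
        gW = gy * x   # gradient w.r.t. the parameter W
        gx = gy * W   # gradient w.r.t. the input x; returning this is what
                      # lets backprop continue into the layer that produced x
        # Returning (None, gW) would omit the input gradient, but then no
        # gradient flows to earlier layers through x.
        return gx, gW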

Method for finding normals to a voxel surface

I was working on a method to approximate the normal to a surface in a 3D voxel image.
The method suggested in this article (the only algorithm I found via Google) seems to work. The paper's suggested method is to find the direction in which the surface varies the most, choose two points on the tangent plane using some procedure, and then take their cross product. Some Pascal code by the article's author, commented in Portuguese, implements this method.
However, using the gradient of f (using each partial derivative as a component of the vector) as the normal seems to work pretty well; I tested this along several circles on a voxelized sphere and got results that look correct in most spots (there are a few outliers that are off by about 30 degrees). This is very different from the method used in the paper, but it still works. What I don't understand is why the gradient of f = 1/dist calculated along the surface of an object should produce the normal.
Why does this procedure work? Is it just the fact that the sphere test was too much of a special case? Could you suggest a simpler method, or explain any of these methods?
Using the gradient of the volume as a normal for lighting is a standard technique in volume rendering.
If you interpret the value of a voxel as the opacity, the gradient will give you the direction of the greatest change in the opacity, which is similar to a surface normal.
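A minimal sketch of that gradient-as-normal estimate using central differences (NumPy; `volume` is assumed to be a 3D array of scalar densities or opacities):

import numpy as np

def voxel_normals(volume):
    # Estimate per-voxel normals as the normalized volume gradient.
    # np.gradient uses central differences along each axis.
    gx, gy, gz = np.gradient(volume.astype(float))
    n = np.stack([gx, gy, gz], axis=-1)
    length = np.linalg.norm(n, axis=-1, keepdims=True)
    # The gradient points toward increasing values; flip the sign if you
    # need outward normals for a solid (high-opacity) object.
    return n / np.maximum(length, 1e-12)   # avoid 0/0 in flat regions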

Simulating GDI+ gamma correction in Qt

I need to implement some GDI+ functionality in Qt, particularly a LinearGradientBrush. The only method I have trouble with is SetGammaCorrection. I found a topic mentioning that MSDN has a pretty thorough description of the GDI+ gamma correction algorithm, but I couldn't find it.
I tried to simulate gamma correction as follows:
1) Suppose we have a simple LinearGradientBrush with two-color interpolation. Divide the interval between these two colors into a predefined number (100) of equally spaced points.
2) Assign a value to each point. The first point gets a value of 0, the second 0.01, ..., and the last point a value of 1.
3) Calculate an interpolated color value in each point:
current_color = start_color * (1 - current_point_value) + end_color * current_point_value;
Start color and end color are the gradient boundary colors, if it wasn't clear enough.
4) Perform actual gamma correction on each calculated color value (except the two boundary colors):
gamma_corrected_color_value = pow(color_value, 1 / gamma);
The value of gamma is 2.2.
Then I take the QLinearGradient, make an array of gradient stops with calculated colors and their positions (point values), assign those stops to the gradient and finally create a QBrush with this gradient.
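For reference, a minimal sketch of steps 1-4 above in plain Python (channels as floats in [0, 1]; this mirrors my procedure, not the actual GDI+ internals):

def gamma_corrected_stops(start_color, end_color, n_segments=100, gamma=2.2):
    # Compute (position, color) gradient stops following steps 1-4.
    stops = []
    for i in range(n_segments + 1):
        t = i / n_segments                 # point values 0.0, 0.01, ..., 1.0
        # Step 3: linear interpolation between the boundary colors.
        color = tuple(s * (1 - t) + e * t
                      for s, e in zip(start_color, end_color))
        # Step 4: gamma-correct every color except the two boundary ones.
        if 0 < i < n_segments:
            color = tuple(c ** (1 / gamma) for c in color)
        stops.append((t, color))
    return stops

# Example: stops for a black-to-white gradient.
stops = gamma_corrected_stops((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))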
Now if I fill a rectangle with this brush, I get a result that is pretty close to that of the actual GDI+ LinearGradientBrush, but they are not the same. I have tried different combinations of gamma values and numbers of segments, but I didn't manage to get near-identical gradients.
Does anyone know how gamma correction is implemented in GDI+, or how to simulate it in Qt?
Thanks, Tony.
Qt gradients are linear, gamma is non-linear. Looks like you're going to have to regenerate the gradient whenever the gamma changes -- as opposed to having the gamma be a parameter of the gradient.
