I was looking through the Spatial Transformer Network paper, and I am trying to implement a custom grid_sample function (inheriting the autograd.Function class) in PyTorch for the Integer Sampling Kernel.
While defining the backward function, I have come across the following conundrum.
Given that integer sampling works as follows:

V_i^c = sum_n sum_m U_nm^c * delta(floor(x_i^s + 0.5) - m) * delta(floor(y_i^s + 0.5) - n)

I think that the gradients w.r.t. the input map and the transformed grid (x_i^s, y_i^s) should be as follows:

Gradient w.r.t. the input map:

dV_i^c / dU_nm^c = delta(floor(x_i^s + 0.5) - m) * delta(floor(y_i^s + 0.5) - n)

Gradient w.r.t. the transformed grid (x_i^s):

dV_i^c / dx_i^s = sum_n sum_m U_nm^c * d[delta(floor(x_i^s + 0.5) - m)]/dx_i^s * delta(floor(y_i^s + 0.5) - n) = 0

Gradient w.r.t. the transformed grid (y_i^s):

dV_i^c / dy_i^s = sum_n sum_m U_nm^c * delta(floor(x_i^s + 0.5) - m) * d[delta(floor(y_i^s + 0.5) - n)]/dy_i^s = 0
as the derivative of the Kronecker delta function is zero (I'm unsure about this!! - HELP)
Derivative of the Kronecker delta?
Thus I reach the conclusion that the gradient w.r.t. the input should be a tensor of the same size as the input, filled with ones at pixels that were sampled and zeros at pixels that were not, and that the gradient w.r.t. the transformed grid should be a tensor full of zeros.
However, if the gradient of the transformed grid is 0, then by the chain rule no information will be passed on to the layers before the integer sampler. Therefore I think the derivative with respect to the grid should be something else. Could anybody point out what I'm doing wrong?
Many thanks in advance!
For future reference, and for those who might have had similar questions to the one I posted.
I've emailed Dr Jaderberg (one of the authors of 'Spatial Transformer Networks') about this question, and he has confirmed that "the gradient wrt the coordinates for integer sampling is 0". So I wasn't doing anything wrong, and it was right all along!
He was very kind in his response, explained that integer sampling was mentioned in the paper mainly to introduce the bilinear sampling scheme, and gave some insight into how one might implement integer sampling if I really wanted to:
"you could think about using some numerical differentiation techniques (e.g. look at difference of x to its neighbours). This would assume smoothness in the image wrt coordinates."
So with great thanks to Dr Jaderberg, I'm happy to close this question.
I guess thinking about how I'd use numerical methods to implement the integer kernel for the sampling function is another challenge for myself, but until then I guess the bilinear sampler is my friend! :)
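For anyone who wants to see what this looks like in code, here is a minimal sketch of such a custom autograd.Function (my own illustration, not code from the paper or from Dr Jaderberg; it assumes the grid holds absolute pixel coordinates, with shapes as commented):

    import torch

    class IntegerSample(torch.autograd.Function):
        """Integer (nearest-pixel) sampling with a hand-written backward."""

        @staticmethod
        def forward(ctx, input, grid):
            # input: (N, C, H, W) feature map U
            # grid:  (N, Ho, Wo, 2) absolute pixel coordinates (x_s, y_s)
            N, C, H, W = input.shape
            x = (grid[..., 0] + 0.5).floor().long().clamp(0, W - 1)  # floor(x_s + 0.5)
            y = (grid[..., 1] + 0.5).floor().long().clamp(0, H - 1)  # floor(y_s + 0.5)
            ctx.save_for_backward(x, y, grid)
            ctx.input_shape = input.shape
            n = torch.arange(N, device=input.device).view(N, 1, 1)
            # out[n, i, j, c] = input[n, c, y[n, i, j], x[n, i, j]]
            out = input[n, :, y, x]                                  # (N, Ho, Wo, C)
            return out.permute(0, 3, 1, 2).contiguous()

        @staticmethod
        def backward(ctx, grad_out):
            x, y, grid = ctx.saved_tensors
            N, C, H, W = ctx.input_shape
            Ho, Wo = x.shape[1], x.shape[2]
            g = grad_out.permute(0, 2, 3, 1)                         # (N, Ho, Wo, C)
            n = torch.arange(N, device=g.device).view(N, 1, 1, 1).expand(N, Ho, Wo, C)
            c = torch.arange(C, device=g.device).view(1, 1, 1, C).expand(N, Ho, Wo, C)
            yi = y.unsqueeze(-1).expand(N, Ho, Wo, C)
            xi = x.unsqueeze(-1).expand(N, Ho, Wo, C)
            # Each output gradient flows back to the single pixel it sampled
            grad_input = g.new_zeros((N, C, H, W))
            grad_input.index_put_((n, c, yi, xi), g, accumulate=True)
            # Gradient w.r.t. the grid is identically zero, as confirmed above
            return grad_input, torch.zeros_like(grid)

Usage is out = IntegerSample.apply(U, grid). As expected, U receives a sparse ones-like gradient and the grid receives zeros, so nothing propagates back to the localisation network; that is exactly why the paper moves on to the bilinear kernel.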
I'm reading the PCL tutorial on plane segmentation, because I want to find 3D circles in a very large and dense point cloud I have.
I already know the approximate values for the center, radius and orientation of the circle, but so far I have found no way to inform the SACSegmentation object of this. I could also name 3 inliers from which to compute initial values, but I don't see a way to do that either.
My point cloud is extremely large (10-20M points), so pure random sampling will likely be prohibitively slow, especially since I already know roughly what the parameter values should be and only want to optimize them.
Question: How can I set the starting point of the Sample Consensus optimization procedure?
To segment and optimize the model:
Set SACSegmentation::setOptimizeCoefficients(true)
Use SACSegmentation::segment, which takes in an initial guess (or the final model to segment with, if optimize-coefficients is set to false)
You can provide your guess there. Depending on the optimization method used, this can reduce the computational load.
Could you please help me add a zooming option for a wordcloud?
Please find a reproducible example here:
http://shiny.rstudio.com/gallery/word-cloud.html
I tried to incorporate rbokeh and plotly but couldn't find a wordcloud-equivalent render function.
Additionally, I found ECharts on GitHub:
https://github.com/XD-DENG/ECharts2Shiny/tree/8ac690a8039abc2334ec06f394ba97498b518e81
But incorporating ECharts is also not convenient for real zooming.
Thanks in advance,
Abi
Normalisation is required only if the predictors are not comparable on their original scales. There's no rule that says you must normalize.
PCA is a statistical method that gives you a new linear transformation. By itself, it loses nothing; all it does is give you new principal components.
You lose information only if you choose a subset of those principal components.
Usually PCA includes centering the data as a pre-processing step.
PCA only re-expresses the data in its own axis system, whose axes are the eigenvectors of the data.
If you use all the axes you lose no information.
Yet usually we want to apply dimensionality reduction: intuitively, describing the data with fewer coordinates.
This means projecting the data onto the subspace spanned by only some of the eigenvectors of the data.
If one chooses the number of vectors wisely, one can end up with a significant reduction in the number of dimensions of the data with negligible loss of data/information.
The way to do so is to choose the eigenvectors whose eigenvalues sum to most of the data's power (variance).
PCA itself is invertible, so lossless.
But:
It is common to drop some components, which will cause a loss of information.
Numerical issues may cause a loss in precision.
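A small NumPy sketch (my own illustration) that makes both points concrete: reconstruction from all components is exact, and the error from keeping only k components is exactly the discarded eigenvalue mass:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))  # correlated toy data
    Xc = X - X.mean(axis=0)                   # centering, the usual pre-processing

    # PCA = eigendecomposition of the covariance matrix
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigvals)[::-1]         # sort descending by variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    scores = Xc @ eigvecs                     # coordinates in the eigenvector axes

    # All components kept: the transform is invertible, nothing is lost
    print(np.allclose(scores @ eigvecs.T, Xc))            # True

    # Only k components kept: the lost power equals the discarded eigenvalues
    k = 2
    X_k = scores[:, :k] @ eigvecs[:, :k].T
    err = ((Xc - X_k) ** 2).sum() / (len(Xc) - 1)
    print(np.isclose(err, eigvals[k:].sum()))             # True
    print(eigvals[:k].sum() / eigvals.sum())              # variance retained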
In this question I'd be grateful for hints and further information on whether the following is correct.
To calculate a position from range measurements to fixed anchors (as in GPS) you need to solve the trilateration problem, for example with non-linear least squares, geometrical algorithms or a particle filter, which is also able to solve the trilateration problem as such.
Due to noise/errors the result might be a jagged line, so you can use a Kalman filter to smooth it. So far: particle filter for calculation, Kalman filter for smoothing. Now:
Question 1: Is it possible to use a Kalman filter NOT to smooth an already existing result, BUT to solve the trilateration as such?
Question 2: Regarding the particle filter: how can the particle filter be used NOT to solve trilateration, BUT to smooth an already existing result (e.g. one calculated with NLLS)?
Best and thank you for any hints, papers, videos, solutions etc.!
The Kalman filter is an optimal solver for linear Gaussian problems. It is often used to solve the trilateration problem (question 1). To use it in this problem, the Jacobian (partial derivative of the range measurement with respect to the position) is linearized at the current position estimate. That process, linearization of the Jacobian, defines the Kalman filter as an Extended Kalman Filter, or EKF, in the literature.

That works well for GPS because the range to the transmitter is so great that the error in the Jacobian estimate due to position error is small enough to be negligible if the Kalman filter is crudely initialized, for example to within 100 km. It breaks down when the 'fixed anchors' are closer to the user: the closer the anchor, the more quickly the line-of-sight vector to the anchor changes with the position estimate. In these cases Unscented Kalman Filters (UKF) or Particle Filters (PF) are sometimes used instead of an EKF.
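To make the linearization concrete, here is a minimal sketch (my own, assuming 2-D positions and known anchor locations) of a single EKF measurement update for range-only trilateration; note that the rows of the Jacobian H are exactly the unit line-of-sight vectors discussed above:

    import numpy as np

    def ekf_range_update(x, P, anchors, ranges, r_var=0.1):
        # x: (2,) position estimate; P: (2, 2) covariance
        # anchors: (m, 2) fixed anchor positions; ranges: (m,) measured ranges
        diff = x - anchors                     # (m, 2)
        pred = np.linalg.norm(diff, axis=1)    # predicted ranges at the estimate
        H = diff / pred[:, None]               # Jacobian: unit line-of-sight vectors
        R = r_var * np.eye(len(ranges))        # measurement noise covariance
        S = H @ P @ H.T + R                    # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
        x = x + K @ (ranges - pred)            # corrected position
        P = (np.eye(2) - K @ H) @ P            # corrected covariance
        return x, P

Iterating this update as measurements arrive is what 'solving trilateration with a Kalman filter' amounts to; H degrades as the anchors get close, which is the breakdown described above.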
The best introduction to the KF and EKF, in my view, is Applied Optimal Estimation by Gelb. That book has been in print since 1974, and there is a reason why. A discussion of the breakdown of the EKF when the anchor is close can be found in the paper "The Scaled Unscented Transformation" by Julier.
For question 2, the answer is yes, certainly a PF could be used to smooth a solution that is created, for example, by replacing the range measurements with an epoch-by-epoch result from a least-squares solver for the position. I would not recommend the approach. The power of the PF, and the reason we pay the price of computing everything for each particle, is that it handles the non-linearities. To 'pre-linearize' the problem before handing it to the PF defeats its purpose.
Suppose I have a system of springs: not one, but, say, a three-degree-of-freedom system of springs connected to each other in some way. I can write a system of differential equations for it, but it is impossible to solve in a general way. The question is: are there any papers or methods for filtering such complex oscillations, in order to get rid of the oscillations and recover the real signal as much as possible? For example, if I connect three springs in some way and push them to start vibrating, or put some weight on them, and then record the vibrations of each spring, are there filtering methods that make it easy to determine the weight of each mass (in case some mass is put on top)? I am interested in filtering complex spring-like systems.
Three springs, six degrees of freedom? This is straightforward with finite element methods and numerical integration: it's a system of six coupled ODEs, and you can apply any form of numerical integration, such as 5th-order Runge-Kutta.
I'd recommend doing an eigenvalue analysis of the system first to find out something about its frequency characteristics and normal modes. I'd also do an FFT of the dynamic forces you apply to the system. You don't mention any damping, so if you happen to excite your system at a natural frequency that's close to a resonance you might have some interesting behavior.
Suppose the dynamic equation has this general form (sorry, I don't have LaTeX here to make it look nice):
Ma + Kx = F
where M is the mass matrix (diagonal), a is the acceleration (the 2nd derivative of the displacements w.r.t. time), K is the stiffness matrix, and F is the forcing function.
If you're saying you know the response, you'll have to pre-multiply by the transpose of the response function and try to solve for M. It's diagonal, so you have a shot at it.
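As a sketch of the eigenvalue analysis suggested above (the three-mass chain layout and all the numbers here are assumptions of mine):

    import numpy as np
    from scipy.linalg import eigh

    # Hypothetical chain: wall - k1 - m1 - k2 - m2 - k3 - m3 (free end)
    m = np.array([1.0, 2.0, 1.5])                  # masses
    k = np.array([100.0, 150.0, 120.0])            # spring stiffnesses

    M = np.diag(m)                                 # diagonal mass matrix
    K = np.array([[k[0] + k[1], -k[1],        0.0 ],
                  [-k[1],        k[1] + k[2], -k[2]],
                  [0.0,         -k[2],         k[2]]])

    # Generalized eigenproblem K v = w^2 M v: frequencies and normal modes
    w2, modes = eigh(K, M)
    print(np.sqrt(w2) / (2 * np.pi))   # natural frequencies in Hz
    print(modes)                       # columns are the normal-mode shapes

From there, any numerical integrator (e.g. scipy.integrate.solve_ivp, which defaults to a Runge-Kutta scheme) will march Ma + Kx = F forward in time.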
Are you connecting the springs in such a way that the behavior of the system is approximately linear? (e.g. at least as close to linear as musical instrument springs/strings?) Is this behavior consistent over time? (e.g. the springs don't melt or break.) If so, LTI (linear time invariant) systems theory might be applicable. Given enough measurements relative to the number of degrees of freedom in the LTI system, one might be able to estimate a pole-zero plot of the system response, and go from there. Or something like a linear predictor might be useful.
Actually it is possible to solve the resulting system of differential equations as long as you know the masses, etc.
The standard approach is to use a Laplace Transform. In particular you start with a set of linear differential equations. Add variables until you have a set of first order linear differential equations. (So if you have y'' in your equation, you'd add the equation z = y' and replace y'' with z'.) Rewrite this in the form:
v' = Av + w
where v is a vector of variables, A is a matrix, and w is a constant vector. (An example of something that winds up in w is gravity.)
Now apply a Laplace transform to get
s L(v) - v(0) = A L(v) + w/s
(since for a constant vector w the Laplace transform is L(w) = w/s)
Solve it to get
L(v) = inv(s I - A)(v(0) + w/s)
where inv inverts a matrix and I is the identity matrix. Apply the inverse Laplace transform (if you read up on Laplace transforms you can find tables of inverse of common types of functions - getting a complete list of the functions you actually encounter shouldn't be that hard), and you have your solution. (Be warned, these computations quickly get very complex.)
Now you have the ability to take a particular setup and solve for the future behavior. You also have the ability to (if you do things really carefully) figure out how the model responds to a small perturbation in parameters. But your problem is that you don't know the parameters to use. However you do have the ability to measure the positions in the system at repeated times.
If you put this together, what you can do is this. Measure your position at a number of points. First estimate all of the initial values of the parameters, and then all of the values a second later. You can adjust your parameters (using Newton's method) to come close enough to the values a second later. Take the measurements from 5 seconds later and use that initial estimate as your starting point to refine your calculations for what is happening 5 seconds later. Repeat with longer intervals to get all of your answers.
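Here is a toy sketch of that measure-and-refine loop (a single mass-spring for brevity; only the ratio k/m is identifiable from positions alone, so that is the parameter being fitted; all names and numbers are my own):

    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.optimize import least_squares

    # One mass-spring in first-order form: v = [y, y'], v' = [[0, 1], [-k/m, 0]] v
    def simulate(k_over_m, t, v0=(1.0, 0.0)):
        A = np.array([[0.0, 1.0], [-k_over_m, 0.0]])
        sol = solve_ivp(lambda _, v: A @ v, (t[0], t[-1]), v0, t_eval=t)
        return sol.y[0]                            # positions over time

    t = np.linspace(0.0, 5.0, 200)
    measured = simulate(4.0, t)                    # stand-in for real measurements

    # Newton/Gauss-Newton-style refinement from a rough initial estimate
    fit = least_squares(lambda p: simulate(p[0], t) - measured, x0=[3.0])
    print(fit.x)                                   # converges near 4.0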
Writing and debugging this should take you some time. :-) I would strongly recommend investigating how much of this Mathematica knows how to do for you already...
I'm creating a game where players can make an alloy. To make it less predictable and more interesting, I thought that the durability and hardness of an alloy should not be calculated by a simple formula, because it would be extremely easy to find the extrema where the alloy has the best statistics.
So the question is: is there any formula for a function whose extrema can be found only by checking every point? Input values will be percentages: 0.0%-100.0%. I think it should look something like half of a sound wave.
A very simple way would be a couple of sin functions; just vary the constants and the signs for each new player. Here is one example: (sin(1.1*x) + sin(x) + sin(0.9*x))^2
If you use this between 10*pi and 20*pi you get a function that increases on average but has many local minima.
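A quick sketch of that idea (the per-player seed, the spread of the constants, and the mapping from percent onto the 10*pi to 20*pi window are all my own choices; whether the result still increases on average depends on the drawn constants, so check a plot per seed):

    import numpy as np

    player_seed = 42                               # hypothetical per-player seed
    rng = np.random.default_rng(player_seed)
    a, b, c = 1.0 + 0.2 * (rng.random(3) - 0.5)    # constants near 1.0

    def alloy_quality(pct):
        # pct in 0.0-100.0, mapped onto the window discussed above
        x = 10 * np.pi + (pct / 100.0) * 10 * np.pi
        return (np.sin(a * x) + np.sin(b * x) + np.sin(c * x)) ** 2

    print(alloy_quality(25.0), alloy_quality(80.0))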
Modulating a simple linear or exponential function with trigonometric functions whose frequency and amplitude are dependent on the input should get you what you want.
You don't need a formula, I think — throw a bunch of random values around your domain, and then interpolate (linear interpolation will do) between them. Then you can even change the "formula" completely each time the game is run, or once in a while, or change it slowly with time, etc, etc.
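A sketch of that (the knot count and value range are arbitrary choices of mine):

    import numpy as np

    rng = np.random.default_rng()          # reseed per run to change the "formula"
    knots_x = np.linspace(0.0, 100.0, 12)  # a handful of points over 0-100%
    knots_y = rng.random(12)               # random values at those points

    def quality(pct):
        return np.interp(pct, knots_x, knots_y)  # linear interpolation between knots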
If you want something that is very hard to predict then I would suggest involving a random number generator with the same seed every time. You can use it as an envelope for whatever function you come up with (trig functions or what not) to make it more jagged.
An interesting formula to use would be the gamma of the Black-Scholes option pricing model. It goes as follows:
gamma = N'(d1) / (S * sigma * sqrt(T)), where d1 = (ln(S/K) + (r + sigma^2/2) * T) / (sigma * sqrt(T))
and N' is the standard normal density. You can easily replace the variables; here's a graph of how the function looks:
(graph: http://www.sqbimmer.com/aalex/gamma.png)
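For reference, a sketch of that gamma as a plain function (how you map alloy percentages onto S, K, and the other parameters is up to the game; the mapping below is just an example of mine):

    import math

    def bs_gamma(S, K, T, r, sigma):
        # Black-Scholes gamma: a smooth bump peaking near S = K
        d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
        pdf = math.exp(-0.5 * d1 * d1) / math.sqrt(2.0 * math.pi)
        return pdf / (S * sigma * math.sqrt(T))

    # e.g. map alloy percentage (0-100) onto S around a "strike" of 51
    print([bs_gamma(S=1.0 + p, K=51.0, T=1.0, r=0.05, sigma=0.4)
           for p in (10, 50, 90)])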