hyperplane equation in SVM - vector

How does the SVM algorithm find the optimum hyperplane? The positive margin hyperplane equation is w.x-b=1, the negative margin hyperplane equation is w.x-b=-1, and the middle (optimum) hyperplane equation is w.x-b=0.
I understand from this tutorial how a hyperplane equation can be obtained from a normal vector of the plane and a known point on it. Let's say the known point is x1; then for any point x on the plane, the vector (x-x1) lies in the plane. If w is the normal vector of the plane, then w.(x-x1)=0, which eventually gives the form w.x=b.
Now, to get a hyperplane, we need a normal vector and a known point. How does the algorithm create a hyperplane in the middle, where there is no data point from the training data (which I think is the known point needed in the equation)?
Maybe I misunderstand something or my logic is not correct.

You misunderstand one basic fact: the algorithm is not required to represent a hyperplane in terms of w.x-b = 0, using a given data point. The algorithm is free to change this into any form convenient to each of its functions.
The solution is obvious, as you've already found it: the algorithm does not have to use one of the points from the data set. In fact, if the partition is ideal (no data on the wrong side), there is no data point in the middle.
However, finding that hyperplane is trivial. (1) The positive and negative hyperplanes are parallel, and (2) the optimum plane bisects their separation. By (1), all three planes have the same normal vector. By (2), the reference point can be the midpoint of any segment connecting two points on opposite planes.
Briefly, pick a positive support vector and a negative support vector; these lie one on each of the margin planes. Find the midpoint between them and take its dot product with the normal vector w to get b; w.x-b=0 is then your optimum plane.
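As a rough illustration of the midpoint construction described above, here is a minimal sketch using NumPy. The function name `middle_hyperplane` and the 2D numbers are made up for the example; it assumes w is already known and that one support vector from each margin plane is available.

```python
import numpy as np

def middle_hyperplane(w, x_pos, x_neg):
    """Return (w, b) for the plane w.x - b = 0 through the midpoint
    of one positive and one negative support vector."""
    w = np.asarray(w, dtype=float)
    midpoint = (np.asarray(x_pos, dtype=float) + np.asarray(x_neg, dtype=float)) / 2.0
    b = float(np.dot(w, midpoint))  # the midpoint lies on the plane, so w.midpoint = b
    return w, b

# Hypothetical example: w = (1, 1); x_pos satisfies w.x = 4, x_neg satisfies w.x = 2
w, b = middle_hyperplane([1.0, 1.0], x_pos=[3.0, 1.0], x_neg=[0.0, 2.0])
print(b, np.dot(w, [3.0, 1.0]) - b, np.dot(w, [0.0, 2.0]) - b)
```

Note that no training point has to lie on the middle plane; the midpoint is only used as a reference point to compute b.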

Related

The reason for using bias in networks?

It may be easy to see why, but I still don't understand why we use bias in a neural network. The weights' values get changed during training, which is what lets the algorithm learn. So, why use bias in all of this?
Because of linear equations.
Bias is another learned parameter. A single neuron will compute w*x + b where w is your weight parameter and b your bias.
Perhaps this helps you: Let's assume you are dealing with a 2D Euclidean space that you'd like to classify with two labels. You can do that by computing a linear function and then classifying everything above it with one label, and everything below with another. If you did not use the bias, you could only change the slope of your function, and your function would always pass through (0, 0). Bias gives you the possibility to define where that linear function intersects the y-axis for x=0, i.e. at the point (0, b). E.g. without bias you could not separate data that is only separable by a line that does not pass through the origin.
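A tiny sketch of the same idea, reduced to one input dimension for brevity (the data and the `neuron` helper are invented for illustration). With only a weight, the decision threshold is stuck at x = 0; the bias is what moves it.

```python
import numpy as np

def neuron(x, w, b=0.0):
    """Linear unit: outputs 1 where w*x + b > 0, else 0."""
    return (w * np.asarray(x, dtype=float) + b > 0).astype(int)

x = np.array([1.0, 2.0, 3.0, 4.0])
labels = np.array([0, 0, 1, 1])          # the classes split at x = 2.5

with_bias = neuron(x, w=1.0, b=-2.5)     # threshold moved to x = 2.5
no_bias_pos = neuron(x, w=1.0)           # threshold fixed at x = 0
no_bias_neg = neuron(x, w=-1.0)          # flipping the sign does not help either
print(with_bias, no_bias_pos, no_bias_neg)
```

Since all inputs here are positive, any choice of w alone labels them all the same way, so no weight-only neuron can reproduce `labels`.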

Find the optimized rotation

I have an application where I must find a rotation from a set of 15 ordered and indexed 3D points (X1, X2, ..., X15) to another set of 15 points with the same indices (each initial point corresponds to one final point).
I've read many things about finding the rotation with Euler angles (evil, according to some people), quaternions, or by projecting the vectors onto the basis axes. But I have an additional constraint: a few points of my final set can be wrong (i.e. have wrong coordinates), so I want to single out the points that call for a rotation very far from the median rotation.
My issue is this: for every set of 3 non-collinear points and their images I can compute a quaternion (since the transformation matrix won't be a pure rotation, some additional calculations are needed, but it can be done). So I get a set of quaternions (455 at most) and I want to remove the wrong ones.
Is there a way to find which points give rotations far from the mean rotation? Do the "mean" and the "standard deviation" mean anything for quaternions, or must I compute Euler angles? And once I have the set of "good" quaternions, how can I compute the "mean" quaternion/rotation?
Cheers,
Ricola3D
In computer vision, there's a technique called RANSAC for doing something like what you propose. Instead of finding all of the possible quaternions, you would use a minimal set of point correspondences to find a single quaternion/transformation matrix. You'd then evaluate all of the points for quality of fit, discarding those that don't fit well enough. If you don't have enough good matches, perhaps you got a bad match in your original set. So you'll throw away that attempt and try again. If you do get enough good matches, you'll do a least-squares regression fit of all the inlier points to get a new transformation matrix and then iterate until you're happy with the results.
Alternatively, you could take all of your normalized quaternions and find the dot-product between them all. The dot-product should always be positive; if it's not for any given calculation, you should negate all of the components of one of the two quaternions and re-compute. You then have a distance measure between the quaternions and you can cluster or look for gaps.
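One possible way to turn the dot product into a distance, sketched with NumPy. The `quat_angle` name and the [w, x, y, z] component order are assumptions for this example; taking the absolute value of the dot product has the same effect as negating one quaternion when the dot product is negative.

```python
import numpy as np

def quat_angle(q1, q2):
    """Angle in radians of the rotation taking q1 to q2,
    treating q and -q as the same rotation."""
    q1 = np.asarray(q1, dtype=float)
    q2 = np.asarray(q2, dtype=float)
    q1 = q1 / np.linalg.norm(q1)
    q2 = q2 / np.linalg.norm(q2)
    d = abs(np.dot(q1, q2))              # same as flipping the sign of one quaternion
    return 2.0 * np.arccos(min(d, 1.0))  # clamp guards against rounding above 1

identity = [1.0, 0.0, 0.0, 0.0]
quarter_turn_z = [np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)]
print(quat_angle(identity, quarter_turn_z))
```

The resulting angles could then be fed to a clustering step or simply sorted to look for gaps.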
There are 2 problems here:
how do you compute a "best fit" for an arbitrary number of points?
how do you decide which points to accept, and which points to reject?
The general answer to the first is, "do a least squares fit". Quaternions would probably be better than Euler angles for this; try the following:
foreach point pair (a -> b), ideal rotation by unit quaternion q is:
b = q a q* -> q a - b q = 0
So, look for a least-squares fit for q:
minimize sum[over i] of |q a_i - b_i q|^2
under the constraint: |q|^2 = 1
As presented above, the least-squares problem is linear except for the constraint, which should make it easier to solve than an Euler angle formulation.
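One way the constrained least-squares problem above could be solved, offered as a sketch and not as the only approach: since q a_i - b_i q is linear in q, it can be written as M_i q for a 4x4 matrix M_i, and minimising the sum of |M_i q|^2 under |q|^2 = 1 amounts to taking the eigenvector of sum(M_i^T M_i) with the smallest eigenvalue. The helper names and the [w, x, y, z] ordering are assumptions of this example.

```python
import numpy as np

def left_matrix(p):
    """4x4 matrix L(p) such that the quaternion product p*q equals L(p) @ q."""
    w, x, y, z = p
    return np.array([[w, -x, -y, -z],
                     [x,  w, -z,  y],
                     [y,  z,  w, -x],
                     [z, -y,  x,  w]])

def right_matrix(p):
    """4x4 matrix R(p) such that the quaternion product q*p equals R(p) @ q."""
    w, x, y, z = p
    return np.array([[w, -x, -y, -z],
                     [x,  w,  z, -y],
                     [y, -z,  w,  x],
                     [z,  y, -x,  w]])

def fit_rotation(a_pts, b_pts):
    """Unit quaternion q minimising sum_i |q a_i - b_i q|^2."""
    n_mat = np.zeros((4, 4))
    for a, b in zip(a_pts, b_pts):
        # Points are embedded as pure quaternions (0, x, y, z).
        m = right_matrix([0.0, *a]) - left_matrix([0.0, *b])
        n_mat += m.T @ m
    eigvals, eigvecs = np.linalg.eigh(n_mat)  # eigenvalues in ascending order
    return eigvecs[:, 0]                      # unit vector; sign is arbitrary

def rotate(q, v):
    """Rotate the 3-vector v by the unit quaternion q, i.e. compute q v q*."""
    q = np.asarray(q, dtype=float)
    q_conj = q * np.array([1.0, -1.0, -1.0, -1.0])
    return (left_matrix(q) @ right_matrix(q_conj) @ np.array([0.0, *v]))[1:]
```

The per-pair squared residual |M_i q|^2 at the fitted q can also serve as the error measure for deciding which pairs to reject.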
For the second problem, I can see two approaches:
if your points aren't too far off, you could try running the least-squares solver with all points, then go back, throw out the "outliers" (those point pairs whose squared-error is greatest), and try again.
if wildly inconsistent points are throwing off the above procedure, you could try selecting random, small subsets of 3 or 4 pairs, and find a least-squares fit for each. If a large group of these results have similar rotations with low total error, you can use this to identify "good" pairs (and thereby eliminate bad pairs); then go back and find a least-squares fit for all good pairs.

Slerp with more than two points

The correct way to interpolate between two points on a sphere is using slerp.
How would one interpolate between more than two points on a sphere? So summing a set of points with different weights on the surface of a sphere?
Simply summing the points multiplied by their weights and then normalising the result is not accurate enough when the angles are large. We need 'true' spherical interpolation.
I asked this question on math.stackexchange.com, and someone found a paper that describes exactly this. Here it is: Spherical Averages and Applications to Spherical Splines and Interpolation
The problem I see is:
Slerp gives constant velocity. That is, a given increment in your interpolation parameter gives you the same distance on the sphere, regardless of where you are on the [0,1] range.
Unfortunately, because the sphere is curved, you can't do this for more than one interpolation parameter. Either you need to give up constant velocity, or give up interpolating with more than one parameter.
You may be able to find an interpolation function that isn't constant velocity that nonetheless satisfies your requirements. But because of the above problem, I don't think it will correspond directly and symmetrically to the 1-D slerp.
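For reference, the two-point slerp and its constant-velocity property can be sketched as follows. This is only the standard two-point formula, not the multi-point spherical average from the paper mentioned above, and it assumes the two points are not antipodal (the formula divides by sin of the angle between them).

```python
import numpy as np

def slerp(p0, p1, t):
    """Spherical linear interpolation between two points on the unit sphere."""
    p0 = np.asarray(p0, dtype=float)
    p1 = np.asarray(p1, dtype=float)
    p0 = p0 / np.linalg.norm(p0)
    p1 = p1 / np.linalg.norm(p1)
    omega = np.arccos(np.clip(np.dot(p0, p1), -1.0, 1.0))  # angle between the points
    if omega < 1e-12:
        return p0  # points coincide; nothing to interpolate
    return (np.sin((1.0 - t) * omega) * p0 + np.sin(t * omega) * p1) / np.sin(omega)

p0, p1 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
for t in (0.25, 0.5, 0.75):
    pt = slerp(p0, p1, t)
    # The angle travelled from p0 grows linearly with t.
    print(t, np.arccos(np.clip(np.dot(p0, pt), -1.0, 1.0)))
```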

align one set of 2d points with another using only translation and rotation

I'm working in OpenCV but I don't think there is a function for this. I can find a function for finding affine transformations, but affine transformations include scaling, and I only want to consider rotation + translation.
Imagine I have two sets of points in 2d - let's say each set has exactly 50 points.
E.g. set A = {x1, y1, x2, y2, ... , x50, y50}
set B = {x1', y1', x2', y2', ... , x50', y50'}
I want to find the rotation and translation combination that gets closest to mapping set A onto set B. I guess I would define "closest" as minimises the average distance between points in A and corresponding points in B. I.e., minimises the average distance between (x1, y1) and (x1', y1'), etc.
I guess I could use brute force testing all possible translations and rotations but this would be extremely inefficient. Does anyone know a simpler way?
Thanks!
This problem has a very elegant solution in terms of the singular value decomposition of the cross-covariance matrix of the two (centred) point sets. The name of this is the orthogonal Procrustes problem, after the Greek legend about a fellow who offered travellers a bed that would fit anyone.
The solution comes from finding the nearest orthogonal matrix to a given (not necessarily orthogonal) matrix.
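A minimal sketch of how this is commonly done for rotation plus translation (often called the Kabsch algorithm), using NumPy. Two caveats: the function name `fit_rigid_2d` is made up here, and this minimises the sum of squared distances between corresponding points, which is not exactly the average distance the question asks about.

```python
import numpy as np

def fit_rigid_2d(a, b):
    """Rotation r (2x2) and translation t (2,) minimising sum |r @ a_i + t - b_i|^2.
    a and b are (n, 2) arrays of corresponding points."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    ca, cb = a.mean(axis=0), b.mean(axis=0)      # centroids
    h = (a - ca).T @ (b - cb)                    # 2x2 cross-covariance matrix
    u, s, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))       # -1 would mean a reflection
    r = vt.T @ np.diag([1.0, d]) @ u.T           # nearest proper rotation
    t = cb - r @ ca
    return r, t

# Usage: map set A onto set B with mapped = a @ r.T + t
```

The determinant check is what keeps the result a pure rotation; without it the SVD solution may return a mirror image when the data are noisy.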
The way I would do it in Excel is to make a couple of columns representing the points.
Add cells representing the rotation/translation of one set (there is no need to rotate and translate both of them).
Then add columns representing those same points rotated/translated.
Then add another column for the distance between each rotated/translated point and its corresponding point in the other set.
Then add a cell holding the sum of those distances.
Finally, use Solver to optimize the rotation and translation cells.
If you fix some rotation, you can get an answer using ternary search. Run the search in x, and for every tested x run it in y to get the best value. This will give you the correct answer, since the function (the sum of corresponding distances) is convex (this can be proved by observing that the restriction of the function to any line is a one-dimensional convex function, which in turn follows from the standard fact that a sum of convex functions is convex).
Instead of brute force over the angle, I can propose a method in the spirit of ternary search. Choose some not very large step S. Compute the target function for every angle in (0, S, 2S, ...). Then, if S is small enough, we can exclude some of the segments (iS, (i + 1)S) from consideration, namely the ones with relatively large function values at the angles iS and (i + 1)S. Implemented carefully, this can give an answer, and can do it faster than brute force.
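The nested ternary search over the translation, for one fixed rotation, might look roughly like this. The search bounds (-100 to 100), the iteration count, and the function names are assumptions for the example; the points in `a` are taken to be already rotated by the angle under test.

```python
import numpy as np

def ternary_min(f, lo, hi, iters=80):
    """Approximate minimiser of a convex one-dimensional function f on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0

def best_translation(a, b, lo=-100.0, hi=100.0):
    """Translation (tx, ty) minimising the sum of distances between a_i + t and b_i."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    def cost(tx, ty):
        return np.hypot(a[:, 0] + tx - b[:, 0], a[:, 1] + ty - b[:, 1]).sum()

    def best_cost_for_x(tx):
        # Inner search: best y for this x, then report the cost achieved.
        ty = ternary_min(lambda y: cost(tx, y), lo, hi)
        return cost(tx, ty)

    tx = ternary_min(best_cost_for_x, lo, hi)
    ty = ternary_min(lambda y: cost(tx, y), lo, hi)
    return tx, ty
```

The outer search stays valid because minimising a jointly convex function over y leaves a convex function of x.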

Optimal rotation of 3D model for 2D projection

I'm looking for a way to determine the optimal X/Y/Z rotation of a set of vertices for rendering (using the X/Y coordinates, ignoring Z) on a 2D canvas.
I've had a couple of ideas, one being pure brute-force involving performing a 3-dimensional loop ranging from 0..359 (either in steps of 1 or more, depending on results/speed requirements) on the set of vertices, measuring the difference between the min/max on both X/Y axis, storing the highest results/rotation pairs and using the most effective pair.
The second idea would be to determine the two points with the greatest Euclidean distance between them, calculate the angle required to rotate the 'path' between these two points to lie along the X axis (again, we're ignoring the Z axis, so the depth within the result would not matter), and then repeat several times. The first problem I can see with this is that by repeating it we may be overriding our previous rotation with a new rotation, and that the original/subsequent rotation may not necessarily result in the greatest 2D area used. The second issue is that if we use a single iteration, the same problem occurs: the two points furthest apart may not have other points aligned along the same 'path', and as such we will probably not get an optimal rotation for a 2D projection.
Using the second idea, perhaps using the first say 3 iterations, storing the required rotation angle, and averaging across the 3 would return a more accurate result, as it is taking into account not just a single rotation but the top 3 'pairs'.
Please, rip these ideas apart and give insight of your own. I'm intrigued to see what solutions you all may have, or what algorithms unknown to me you may quote.
I would compute the principal axes of inertia, and take the axis vector v with highest corresponding moment. I would then rotate the vertices to align v with the z-axis. Let me know if you want more details about how to go about this.
Intuitively, this finds the axis about which it's hardest to rotate the points, ie, around which the vertices are the most "spread out".
Without a concrete definition of what you consider optimal, it's impossible to say how well this method performs. However, it has a few desirable properties:
If the vertices are coplanar, this method is optimal in that it will always align that plane with the x-y plane.
If the vertices are arranged into a rectangular box, the box's shortest dimension gets aligned to the z-axis.
EDIT: Here's more detailed information about how to implement this approach.
First, assign a mass to each vertex. I'll discuss options for how to do this below.
Next, compute the center of mass of your set of vertices. Then translate all of your vertices by -1 times the center of mass, so that the new center of mass is now (0,0,0).
Compute the moment of inertia tensor. This is a 3x3 matrix whose entries are given by formulas you can find on Wikipedia. The formulas depend only on the vertex positions and the masses you assigned them.
Now you need to diagonalize the inertia tensor. Since it is symmetric (and positive semi-definite), it is possible to do this by finding its eigenvectors and eigenvalues. Unfortunately, numerical algorithms for finding these tend to be complicated; the most direct approach requires finding the roots of a cubic polynomial. However, finding the eigenvalues and eigenvectors of a matrix is an extremely common problem, and any linear algebra package worth its salt will come with code that can do this for you (for example, the open-source linear algebra package Eigen has SelfAdjointEigenSolver). You might also be able to find lighter-weight code specialized to the 3x3 case on the Internet.
You now have three eigenvectors and their corresponding eigenvalues. These eigenvalues will be non-negative (and strictly positive unless all the vertices are collinear). Take the eigenvector corresponding to the largest eigenvalue; this vector points in the direction of your new z-axis.
Now, about the choice of mass. The simplest thing to do is to give all vertices a mass of 1. If all you have is a cloud of points, this is probably a good solution.
You could also set each star's mass to be its real-world mass, if you have access to that data. If you do this, the z-axis you compute will also be the axis about which the star system is (most likely) rotating.
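The steps of this approach could be sketched with NumPy as follows. It uses `numpy.linalg.eigh` in place of Eigen's solver, defaults to unit masses, and the function name is invented for the example. The returned coordinates are expressed in the principal-axis frame, so the axis with the largest moment ends up as z.

```python
import numpy as np

def align_max_inertia_axis_with_z(vertices, masses=None):
    """Return the vertices centred on their centre of mass and rotated so that
    the principal axis with the largest moment of inertia becomes the z-axis."""
    v = np.asarray(vertices, dtype=float)
    m = np.ones(len(v)) if masses is None else np.asarray(masses, dtype=float)

    com = (m[:, None] * v).sum(axis=0) / m.sum()   # centre of mass
    r = v - com                                    # centred positions

    # Inertia tensor: sum_i m_i * (|r_i|^2 * I - r_i r_i^T)
    sq = (r ** 2).sum(axis=1)
    inertia = (m * sq).sum() * np.eye(3) - (m[:, None] * r).T @ r

    eigvals, eigvecs = np.linalg.eigh(inertia)     # ascending eigenvalues
    if np.linalg.det(eigvecs) < 0:
        eigvecs[:, 0] = -eigvecs[:, 0]             # keep a proper rotation, not a reflection

    # Columns of eigvecs are the new x, y, z axes; the last has the largest moment.
    return r @ eigvecs
```

Dropping the third column of the result gives the 2D coordinates to draw.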
This answer is intended to be valid only for convex polyhedra.
In http://203.208.166.84/masudhasan/cgta_silhouette.pdf you can find
"In this paper, we study how to select view points of convex polyhedra such that the silhouette satisfies certain properties. Specifically, we give algorithms to find all projections of a convex polyhedron such that a given set of edges, faces and/or vertices appear on the silhouette."
The paper is an in-depth analysis of the properties and algorithms of polyhedra projections. But it is not easy to follow, I should admit.
With that algorithm at hand, your problem is combinatorial: enumerate all possible sets of vertices, check whether a projection exists for each set, and if it does, calculate the area of the convex hull of the silhouette.
You did not give the approximate number of vertices. But as always, a combinatorial solution is not recommended for unbounded (i.e. large) quantities.

Resources