how to reduce dimensionality of vector - math

I have a set of vectors. I'm working on ways to reduce an n-dimensional vector to a single scalar value (1-D), say
(x1, x2, ..., xn) ------> y
This single value needs to be a characteristic value of the vector: each unique vector should produce a unique output value. Which of the following methods is appropriate?
1- norm of the vector - the square root of the sum of squares, which measures the Euclidean distance from the origin
2- compute a hash of the vector, using some hashing technique that avoids collisions
3- use linear regression to compute y = w1*x1 + w2*x2 + ... + wn*xn - unlikely to be good if the output does not depend strongly on the input values
4- a feature extraction technique like PCA that assigns weights to each of x1, x2, ..., xn based on the set of input vectors

It's unclear from the question what properties you need this transform to have, so I'm guessing that you don't need it to preserve any properties other than uniqueness, and possibly invertibility.
None of the techniques you suggest can, in general, avoid collisions:
Norm - two vectors pointing in opposite directions have the same norm.
Hash - what is generally meant by a hash function has a finite image, and if the input isn't known a priori you have an infinite number of possible vectors - no good.
Linear regression - it's easy to find two vectors which give the same result for any linear regression (think about it: any linear map from n dimensions to 1 has a nontrivial kernel, so adding a kernel vector to the input leaves the output unchanged).
PCA - projection onto principal components is a specific kind of linear transformation - hence the same problem as with linear regression.
So - if you're just looking for uniqueness, you could "stringify" your vectors. One way to do it is to write them down as text strings, with the coordinates separated by a special character (an underscore, for example). Then take the binary value of this string as your representation.
If space is important and you need a more compact representation, you could think of a more efficient bit encoding: each character in the set 0, 1, ..., 9, '.', '_' can be represented by 4 bits - a hexadecimal digit (map '.' to A and '_' to B). Now encode this string as a hexadecimal number, saving half the space.
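For concreteness, a minimal Python sketch of this encoding (the helper name is hypothetical; it assumes coordinates print as plain decimals, with '-' mapped to a third extra nibble so negatives also work):

def encode(vec):
    # Unique integer from the vector's decimal string: distinct vectors
    # give distinct strings, hence distinct nibble sequences.
    # 0-9 keep their values; '.' -> 0xA, '_' (separator) -> 0xB, '-' -> 0xC.
    table = {**{str(d): d for d in range(10)}, '.': 0xA, '_': 0xB, '-': 0xC}
    s = '_'.join(str(x) for x in vec)   # e.g. (1.5, 2.25) -> "1.5_2.25"
    n = 0
    for ch in s:
        n = (n << 4) | table[ch]        # pack one hex digit per character
    return n

print(hex(encode((1.5, 2.25))))         # -> 0x1a5b2a25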

Related

How can I improve the lindep function's applicability in Pari/GP for integral approximations?

While doing certain computations involving the Rogers L-function, the following result was generated by Wolfram Alpha:
[image of Wolfram Alpha's closed form; per the expected relation below it reads: integral = 2*Pi^2/3 + 4*zeta(3) ≈ 11.3879638800...]
I wanted to verify this result in Pari/GP by means of the lindep function, so I calculated the integral to 20 digits in WA, yielding:
11.3879638800312828875
Then, I used the following code in Pari/GP:
lindep([zeta(2), zeta(3), 11.3879638800312828875])
As pi^2 = 6*zeta(2), one would expect the output to be a vector along the lines of:
[12,12,-3]
because that's the linear dependency suggested by WA's result. However, I got a very elaborate vector from Pari/GP:
[35237276454, -996904369, -4984618961]
I think the first vector should be the "right" output of the Pari code sample.
Questions:
Why is the lindep function in Pari/GP not yielding the output one would expect in this case?
What can I do to make it give the vector that would be more appropriate in this situation?
It comes down to Pari treating your rounded value as exact. Since you must round the integral's value, lindep may settle on an integer relation that fits the rounding error instead of the true relation.
You can try changing the accuracy of lindep using the second argument. The manual states that you should choose this to be smaller than the number of correct decimal digits. I believe this should solve the issue.
lindep(v, {flag = 0}) finds a small nontrivial integral linear
combination between components of v. If none can be found return an
empty vector.
If v is a vector with real/complex entries we use a floating point
(variable precision) LLL algorithm. If flag = 0 the accuracy is chosen
internally using a crude heuristic. If flag > 0 the computation is
done with an accuracy of flag decimal digits. To get meaningful
results in the latter case, the parameter flag should be smaller than
the number of correct decimal digits in the input.
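For instance, telling lindep to trust roughly 15 of the 20 digits you entered (a sketch; the exact output may vary with precision settings):

lindep([zeta(2), zeta(3), 11.3879638800312828875], 15)

One would then expect a short relation such as [4, 4, -1], i.e. 4*zeta(2) + 4*zeta(3) = 11.38796..., which is the same dependency as [12, 12, -3] up to scaling.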

What is the most efficient way to store a set of points (embeddings) such that queries for closest points are computed quickly

Given a set of embeddings, i.e. a set of [name, vector representation] pairs,
how should I store them such that queries for the closest points are computed quickly? For example, given 100 embeddings in 2-D space, if I query the data structure for the 5 closest points to (10,12), it returns { [a,(9,11.5)], [b,(12,14)], ... }
The trivial approach is to calculate all distances, sort, and return the top-k points. Alternatively, one might think of storing the points in a 2-D array of m×n blocks/units covering the range of the embedding space. I don't think this is extensible to higher dimensions, but I'm willing to be corrected.
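(For reference, a minimal numpy sketch of the trivial approach - exact, O(N·d) per query, and fine for small sets; the helper name is hypothetical:)

import numpy as np

def knn_brute_force(points, query, k):
    # Exact top-k by squared Euclidean distance.
    d2 = np.sum((points - query) ** 2, axis=1)
    idx = np.argpartition(d2, k)[:k]        # k smallest, arbitrary order
    return idx[np.argsort(d2[idx])]         # sort those k by distance

names = ['a', 'b', 'c']
pts = np.array([[9.0, 11.5], [12.0, 14.0], [40.0, 2.0]])
top = knn_brute_force(pts, np.array([10.0, 12.0]), k=2)
print([(names[i], pts[i]) for i in top])    # [('a', ...), ('b', ...)]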
There are standard approximate nearest-neighbour libraries such as Faiss, FLANN, java-lsh, etc. (which are either LSH- or product-quantization-based), which you may use.
The quickest solution (which I found useful) is to transform a vector of (say) 100 dimensions into a long variable (64 bits) by a random projection in the spirit of the Johnson–Lindenstrauss transform, keeping one sign bit per random direction. You can then use Hamming similarity (i.e. 64 minus the number of bits set in a XOR b) to compute the similarity between bit vectors a and b. You could use the POPCOUNT machine instruction to this effect (which is very fast).
In effect, if you use POPCOUNT in C, even a complete scan over the whole set of binary transformed vectors (64-bit long variables) will still be very fast.
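A minimal Python sketch of the idea (sign bits of random projections, SimHash-style, packed into one 64-bit integer per vector; the helper names are assumptions):

import numpy as np

def make_planes(dim, n_bits=64, seed=0):
    # Random directions, shared by every vector that will be compared.
    return np.random.default_rng(seed).normal(size=(n_bits, dim))

def signature(v, planes):
    # One sign bit per random projection, packed into a Python int.
    bits = (planes @ v) > 0
    weights = np.left_shift(np.uint64(1), np.arange(64, dtype=np.uint64))
    return int(bits.astype(np.uint64) @ weights)

def hamming_similarity(a, b, n_bits=64):
    # n_bits minus popcount(a XOR b); int.bit_count needs Python >= 3.10.
    return n_bits - (a ^ b).bit_count()

planes = make_planes(dim=100)
rng = np.random.default_rng(1)
a, b = rng.normal(size=100), rng.normal(size=100)
print(hamming_similarity(signature(a, planes), signature(b, planes)))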

Efficient Calculation of an N-Dimensional Cross Product?

As per the title, is the best way to calculate the n-dimensional cross product just to use the determinant definition, evaluating it via LU decomposition, or can you suggest a better one?
Thanks
Edit: for clarity, I mean http://en.wikipedia.org/wiki/Cross_product and not the Cartesian product.
Edit: it also seems that using the Leibniz formula might help - though I don't know how that compares to LU decomposition at the moment.
From your comment, it seems like you are looking for an operation which takes n − 1 vectors as input and computes a single vector as its result, which will be orthogonal to all the input vectors and perhaps have a well-defined length as well.
With defined length
You can characterize the 3-dimensional cross product v = a × b using the identity v · w = det(a, b, w). In other words, taking the cross product of the input vectors and then computing the dot product with any other vector w is the same as plugging the input vectors and that other vector into a matrix and computing its determinant.
This definition can be generalized to arbitrary dimensions. Due to the way a determinant can be computed using Laplace expansion along the last column, the resulting coordinates of that cross product will be the values of all (n − 1) × (n − 1) sub-determinants you can form from the input vectors, with alternating signs. So yes, Leibniz might be useful in theory, although it is hardly suitable for real-world computations. In practice, you'll soon have to figure out ways to avoid repeating computations while computing these n determinants. But wait for the last section of this answer…
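For concreteness, a minimal numpy sketch of that cofactor construction, taking the input vectors as rows (note it recomputes each minor from scratch, so it does not yet address the shared-work issue just mentioned):

import numpy as np

def generalized_cross(vectors):
    # Cross product of n-1 vectors in R^n: component j is the
    # (n-1)x(n-1) minor with column j removed, with alternating sign,
    # so that np.dot(v, w) == det([rows of vectors; w]) for every w.
    a = np.asarray(vectors, dtype=float)      # shape (n-1, n)
    n = a.shape[1]
    v = np.empty(n)
    for j in range(n):
        minor = np.delete(a, j, axis=1)       # drop column j
        v[j] = (-1) ** (n - 1 + j) * np.linalg.det(minor)
    return v

print(generalized_cross([[1, 0, 0], [0, 1, 0]]))   # [0. 0. 1.] in 3-D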
Just the direction
Most applications, however, can do with a weaker requirement. They don't care about the length of the resulting vector, but only about its direction. In that case, what you are asking for is the kernel of the (n − 1) × n matrix you can form by taking the input vectors as rows. Any element of that kernel will be orthogonal to the input vectors, and since computing kernels is a common task, you can build on a lot of existing implementations, e.g. LAPACK. Details might depend on the language you are using.
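If only the direction is needed, a one-liner on top of numpy's SVD (LAPACK underneath) does it - a sketch assuming the input vectors are linearly independent, so the kernel is one-dimensional:

import numpy as np

def orthogonal_direction(vectors):
    # The last right-singular vector spans the kernel of the
    # (n-1) x n matrix whose rows are the input vectors.
    _, _, vt = np.linalg.svd(np.asarray(vectors, dtype=float))
    return vt[-1]                     # unit vector, sign arbitrary

print(orthogonal_direction([[1, 0, 0], [0, 1, 0]]))   # ±[0, 0, 1]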
Combining these
You can even combine the two approaches above: compute one element of the kernel, and, for a non-zero entry of that vector, also compute the corresponding (n − 1) × (n − 1) determinant which would give you that single coordinate using the first approach. You can then simply scale the vector so that the selected coordinate reaches the computed value, and all the other coordinates will match that one.

Difference between a vector in maths and programming

Maybe this question is better suited to the math section of the site, but I guess Stack Overflow is suited too. In mathematics, a vector has a position and a direction, but in programming, a vector is usually defined as:
Vector v (3, 1, 5);
Where is the direction and magnitude? For me, this is a point, not a vector... So what gives? Probably I am not getting something so if anybody can explain this to me it would be very appreciated.
If we are working in Cartesian coordinates, and assume (0,0,0) to be the origin, then a point p = (3,1,5) can be written as
p = 3i + 1j + 5k
where i, j and k are the unit vectors in the x, y and z directions. For convenience's sake, the unit vectors are dropped from programming constructs.
The magnitude of the vector is
|p| = sqrt(3^2 + 1^2 + 5^2) = sqrt(35)
and its direction cosines are
3/sqrt(35), 1/sqrt(35), 5/sqrt(35)
respectively, all of which can be computed programmatically. You can also take dot products and cross products, which I'm sure you know about. So the usage is consistent between programming and mathematics. The difference in notation is mostly a matter of convenience.
However as Tomas pointed out, in programming, it is also common to define a vector of strings or objects, which really have no mathematical meaning. You can consider such vectors to be a one dimensional array or a list of items that can be accessed or manipulated easily by indexing.
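As a small numpy illustration of the magnitude and direction computations above:

import numpy as np

p = np.array([3.0, 1.0, 5.0])
magnitude = np.linalg.norm(p)    # sqrt(3^2 + 1^2 + 5^2) = sqrt(35) ≈ 5.92
direction = p / magnitude        # the direction cosines (a unit vector)
print(magnitude, direction)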
In mathematics, it is easy to represent a vector by a point - just say that the "base" of the vector is implied to be the origin. Thus, a mathematical point for all practical purposes is also a representation of a mathematical vector, and the vector in your example has the magnitude sqrt(3^2 + 1^2 + 5^2) = sqrt(35) ≈ 5.92 and the direction (3, 1, 5)/sqrt(35) ≈ (0.507, 0.169, 0.845) (a normalized vector from the origin).
However, a vector in programming usually has no geometrical meaning, so you really aren't interested in things like magnitude or direction. A vector in programming is rather just an ordered list of items. Importantly, the items need not be numbers - they can be anything handled by the language in question! Thus, ("Hello", "little", "world") is also a vector in programming, although it (obviously) has no vector interpretation in the mathematical sense.
Practically speaking (!):
A vector in mathematics is only a direction without a position (actually something more general, but let's stay with your terminology). In programming you often use vectors for points. You can think of your vector as the vector pointing from the origin (0,0,0) to the point (3,1,5), called the position (location) vector of the point. Consult texts on analytic and affine geometry for more insight.
A vector in computer science is a "one-dimensional" data structure (an array) - the single dimension can be thought of as its direction - with a usually dynamic size (its length/magnitude). That is why it is called a vector. But at bottom it is an array.
A vector also means a set of coordinates. This is how it is used in programming. Just as a set of numbers. You might want to represent position vectors, velocity vectors, momentum vectors, force vectors with a vector object, or you may wish to represent it any way that suits you.
Many times vector quantities may be represented by 4 coordinates instead of 3 (see homogeneous coordinates in computer graphics) so a physical vector is represented by a computer vector with 4 elements. Alternatively you can store direction and magnitude separately, or encode them with 3, 4 or more coordinates.
I guess what I am getting at is that computer languages are not designed to represent physical models directly; they provide abstract data containers that the programmer uses as tools for his or her modeling.
A vector in math is an element of an n-dimensional space over some field (e.g. of real or complex numbers, of functions, even of strings). It may have infinite dimension, e.g. the function space L^2. I don't remember infinite-dimensional vectors being used in programming (infinite vectors are not vectors of unbounded length, but vectors with an infinite number of elements).
The most rigorous statement is that a mathematical vector is a first-order tensor that transforms from one coordinate system to another according to tensor transformation rules. The physical idea to keep in mind is that vectors have both magnitude and direction.
Programming vectors are data structures that need not transform according to any rules and may or may not have a notion of a coordinate system as reference. If you happen to use a vector data structure to hold numbers, they may conform to the mathematical definition. But if you have a vector of objects, it's unlikely that they have anything to do with coordinate transformations.

How to store Sparse matrix for a matrix-vector multiply when some boundary condition values are known?

I have a sparse matrix that represents a 3D rectangular space. Along some of the boundaries, I know what the value is going to be (it's a constant). The other boundaries may be reflective, derivative-based, etc.
Should I just set the problem up as if all the boundaries were, say, derivative-based, and then go back and set the corresponding entries of the vector b to the known constants?
Thanks!
In the finite element method you treat Dirichlet (value) constraints and Neumann (derivative) constraints differently. Usually you assemble the matrix without considering boundary conditions first, then apply the boundary conditions, then do an LU decomposition to solve.
You apply boundary conditions by modifying both the assembled matrix and the RHS vector. I'd have to know more details to tell you exactly what you need to do.
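To make the "modify matrix and RHS" step concrete, here is a minimal scipy sketch for a value (Dirichlet) constraint on a toy 1-D Laplacian (the helper and the example system are assumptions, not your actual discretization):

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def apply_dirichlet(A, b, node, value):
    # Move the known value's contribution to the RHS, then replace
    # the node's equation with the trivial one: u[node] = value.
    col = A[:, node].toarray().ravel()
    b -= col * value
    A[:, node] = 0.0          # zeroing the column keeps a symmetric A symmetric
    A[node, :] = 0.0
    A[node, node] = 1.0
    b[node] = value

n = 5
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n)).tolil()  # LIL: cheap edits
b = np.zeros(n)
apply_dirichlet(A, b, node=0, value=2.0)   # known constant boundary u[0] = 2
u = spsolve(A.tocsr(), b)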
