Exact integer nullspace of integer matrix? - julia

nullspace(A) finds a basis for the null space of a matrix A. The returned vectors have floating-point coordinates. If A is an integer matrix, a basis can also be found in integer coordinates.
For example, in Mathematica,
NullSpace[RandomInteger[{-10, 10}, {3, 4}]]
always returns integer vectors.
Is there a way to compute an integer basis for an integer matrix in Julia?
Update: I get build errors with Nemo.jl (see comments to Dan Getz's answer). In the meantime, is there an alternative?

Nemo.jl is a package for algebra in Julia. It has a lot of functionality and can also compute the null space. One way to go about it:
using Nemo                 # install with Pkg.add("Nemo")
S = MatrixSpace(ZZ, 3, 4)  # the space of 3x4 matrices over the integers
mm = rand(-10:10, 3, 4)    # a random Julia integer matrix
m = S(mm)                  # convert it into a Nemo matrix
(bmat, d) = nullspace(m)
Afterwards, d is the dimension of the null space and the columns of bmat contain a basis for it.
Hope this helps (I would be happy to see alternative solutions possibly using other algebra packages).

Related

Imprecise math in R when dealing with infinite fractions

The deviations from the mean should always sum to 0.
However, when the mean has infinitely many digits, like 20/7 here, R fails to compute this exactly.
x <- c(1,2,2,3,3,4,5)
sum(x - mean(x))
[1] -4.440892e-16
I am quite a newbie and have not found any information about this so far; maybe I was not searching for the right terms.
Is it possible to calculate with infinitely long numbers in R?
I am asking this out of theoretical interest.
The problem you have described is a general one, common to all programming languages. Internally, all floats are represented according to the IEEE 754 standard. You can read more about it here.
As far as I know, there is no easy way around these small errors, except for using number representations with higher precision.
EDIT: R already uses the double-precision representation of floating-point numbers. To read more about it, have a look at the R FAQ and this SO question.
If you deal with rational numbers only, as in your example, you can use the gmp package.
You can use the Rmpfr package to deal with numbers of arbitrary (user-set) precision.
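For instance, a minimal sketch with gmp's exact rationals (assuming the gmp package is installed; as.bigq converts to exact fractions):
library(gmp)
x <- as.bigq(c(1, 2, 2, 3, 3, 4, 5))  # exact rational vector
m <- sum(x) / length(x)               # exactly 20/7, no rounding
sum(x - m)                            # exactly 0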
Another possibility is the lazyNumbers package, freshly released on CRAN:
library(lazyNumbers)
# create a vector of lazy numbers
x <- lazyvec(c(1, 2, 2, 3, 3, 4, 5))
# compute its mean
m <- sum(x) / length(x)
# sum expected to be 0
y <- sum(x - m)
# convert it to double
as.double(y)
## 0

Math: What do vertical numbers in brackets represent?

My prof introduced a concept that required use of a vector, which he represented as follows (imagine there is only one pair of brackets below, tall enough to encapsulate both terms; I don't have the rep to paste an image and don't know how to format this otherwise):
v =
[-1/2]
[1/2 ]
One of my personal weaknesses is a lack of familiarity with mathematical notation. Is there an accepted way of interpreting this kind of notation? Does it vary by discipline, or is this something generalizable that I really should know? Is there something intrinsic about this notation that would lead one to interpret it differently than if it were written v = [-1/4, 1/4]?
Thanks for the help!
A vector is a one-dimensional matrix, but it is a matrix nonetheless. Writing it out vertically rather than horizontally (or vice versa) changes its shape from n×1 to 1×n, which changes its meaning in the surrounding equations.
Very often you will "transform" a vector by multiplying it by a matrix. For instance, to rotate a vector, you multiply it by a rotation matrix. If your vectors are encoded as columns, a matrix M acts from the left, M * v, because of the way multiplication works (every row of M times the column vector v). Alternatively, if your vectors are encoded as rows (v = [-1/4, 1/4]), the matrix acts from the right: v * M, again because of the "row by column" definition of matrix multiplication.
So, it is up to you to represent vectors as rows or columns provided your convention is consistent with the way you multiply them by matrices.
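To make this concrete, here is a small R illustration of the two conventions (the rotation matrix M is a made-up example, not something from the question):
M <- matrix(c(0, 1, -1, 0), nrow = 2)  # a 90-degree rotation matrix
v <- matrix(c(-1/2, 1/2), ncol = 1)    # column convention: v is 2x1
M %*% v                                # the matrix acts from the left

w <- matrix(c(-1/4, 1/4), nrow = 1)    # row convention: w is 1x2
w %*% t(M)                             # the same rotation acting from the right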

Converting matrix multiplication and sum function from Matlab to R

I'm converting a rather complicated set of code from Matlab to R. I have zero experience in Matlab and am a functioning novice in R.
I have a segment of code which reads (in matlab):
dSii=(sum(tao.*Sik,1))'-(sum(m'))'.*Sii-beta.*Sii./N.*(Iii+sum(Iik)');
I've simplified this, and will focus on the first segment (if I can solve the first segment, I'm confident I can do the rest):
J = (sum(A.*B,1))' - ...
tao (or A) and Sik (or B) are matrices. So my assumption is I'm performing matrix multiplication here (A * B) and summing the resultant column. The '1' is what is throwing me off in that statement. In R, that 1 would likely indicate we're talking about a sum of rows as opposed to columns (indicated by 2). But I can't find any supporting documentation for that kind of Matlab statement.
I was thinking of using a statement like this (but of course, too many '1's and ',')
J<- (apply(A*B, 1), 1, sum)
Thanks for all your help. I searched for other examples here and elsewhere and couldn't find an answer. I'm willing to work for it but this is akin to me studying French (which I don't know) to translate in Spanish (which I'm moderate in) while interpreting the whole process in English. :D
Because of the different conventions in R and Matlab, the idiosyncrasies have to be learned for each (just like your language analogy!). The Matlab command sum(A.*B,1) means: multiply A and B element-wise (so they must be the same shape), then sum along dimension 1, i.e. collapse the rows to get the column sums. Dimension 1 is the default, so sum(A.*B) would do the same thing as sum(A.*B,1). Note that Matlab's .* is element-wise multiplication (not matrix multiplication), and R's * is also element-wise, so the following Matlab and R codes produce the same column of numbers in J:
Matlab:
A=[[1,2,3];[4,5,6];[7,8,9]];
B=[[10,11,12];[13,14,15];[16,17,18]];
J=sum(A.*B,1)'; % the ' transposes the row of column sums into a 3x1 matrix
R:
A<-matrix(c(1,2,3,4,5,6,7,8,9),3,byrow=T)
B<-matrix(c(10,11,12,13,14,15,16,17,18),3,byrow=T)
J<-matrix(colSums(A*B)) # no transpose needed here: nrow(J)==3

How to store a polynomial?

Integers can be used to store individual numbers, but not mathematical expressions. For example, let's say I have the expression:
6x^2 + 5x + 3
How would I store the polynomial? I could create my own object, but I don't see how I could represent the polynomial through member data. I do not want to create a function that merely evaluates a passed-in argument, because I not only need to evaluate the expression but also to manipulate it.
Is a vector my only option or is there a more apt solution?
A simple yet inefficient way would be to store it as a list of coefficients. For example, the polynomial in the question would look like this:
[6, 5, 3]
If a term is missing, place a zero in its place. For instance, the polynomial 2x^3 - 4x + 7 would be represented like this:
[2, 0, -4, 7]
The degree of the polynomial is given by the length of the list minus one. This representation has one serious disadvantage: for sparse polynomials, the list will contain a lot of zeros.
A more reasonable representation of the term list of a sparse polynomial is as a list of the nonzero terms, where each term is a list containing the order of the term and the coefficient for that order; the degree of the polynomial is given by the order of the first term. For example, the polynomial x^100+2x^2+1 would be represented by this list:
[[100, 1], [2, 2], [0, 1]]
As an example of how useful this representation is, the book SICP builds a simple but very effective symbolic algebra system using the second representation for polynomials described above.
A list is not the only option.
You can use a map (dictionary) mapping the exponent to the corresponding coefficient.
Using a map, your example would be
{2: 6, 1: 5, 0: 3}
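In R, for instance, a named numeric vector can serve as such a map; a small sketch (eval_poly is my own name, not a library function):
p <- c("2" = 6, "1" = 5, "0" = 3)  # exponent -> coefficient, i.e. 6x^2 + 5x + 3
eval_poly <- function(p, x) sum(p * x^as.numeric(names(p)))
eval_poly(p, 2)  # 6*4 + 5*2 + 3 = 37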
A list of (coefficient, exponent) pairs is quite standard. If you know your polynomial is dense, that is, all the exponent positions are small integers in the range 0 to some small maximum exponent, you can use an array, as I see Óscar Lopez just posted. :)
You can represent expressions as Expression Trees. See for example .NET Expression Trees.
This allows for much more complex expressions than simple polynomials and those expressions can also use multiple variables.
In .NET you can manipulate the expression tree as a tree AND you can evaluate it as a function.
Expression<Func<double,double>> polynomial = x => (x * x + 2 * x - 1);
double result = polynomial.Compile()(23.0);
An object-oriented approach would say that a Polynomial is a collection of Monomials, and a Monomial encapsulates a coefficient and exponent together.
This approach works well when you have a polynomial like this:
y(x) = x^1000 + 1
An approach that tied a data structure to a polynomial order would be terribly wasteful for this pathological case.
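A rough sketch of that idea in R (mono and eval_terms are hypothetical names of my own): only the two non-zero terms of x^1000 + 1 are ever stored:
mono <- function(coef, exp) list(coef = coef, exp = exp)  # one monomial
p <- list(mono(1, 1000), mono(1, 0))                      # x^1000 + 1
eval_terms <- function(p, x) sum(sapply(p, function(m) m$coef * x^m$exp))
eval_terms(p, 1)  # 1^1000 + 1 = 2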
You need to store two things:
The degree of your polynomial (e.g. "2")
A list containing each coefficient (e.g. "{3, 0, 2}")
In standard C++, "std::vector<>" and "std::list<>" can do both.
A vector/array is the obvious choice. Depending on the type of expressions, you may consider some sort of sparse vector type (custom-made, e.g. based on a dictionary, or even a linked list if your expressions have only 2-3 non-zero coefficients, like 5x^100 + x).
In either case, exposing it through a custom class/interface would be beneficial, as you can replace the implementation later. You will likely want to provide the standard operations (+, -, *, equals) if you plan to write a lot of expression-manipulation code.
Just store the coefficients in an array or vector. For example, in C++, if you are only using integer coefficients you could use std::vector<int>, or for real numbers std::vector<double>. Then you just push the coefficients in order and index them by exponent.
For example (again in C++), to store 5*x^3 + 9*x - 2 you might do:
std::vector<int> poly;
poly.push_back(-2); // x^0, accessed with poly[0]
poly.push_back(9); // x^1, accessed with poly[1]
poly.push_back(0); // x^2, etc
poly.push_back(5); // x^3, etc
If you have large, sparse polynomials, then maybe you'd want to use a map instead of a vector. If you have fixed-size polynomials, then perhaps you'd use a fixed-length array instead of a vector.
I've used C++ for examples, but this same scheme can be used in any language.
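For instance, the same dense scheme in R (remembering that R vectors are 1-indexed, so poly[k] holds the coefficient of x^(k-1)):
poly <- c(-2, 9, 0, 5)               # 5*x^3 + 9*x - 2
x <- 2
sum(poly * x^(seq_along(poly) - 1))  # 5*8 + 9*2 - 2 = 56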
You can also transform it into reverse Polish notation:
6x^2 + 5x + 3 -> x 2 ^ 6 * x 5 * + 3 +
Here x and the numbers are "pushed" onto a stack, and the operations (^, *, +) take the two topmost values from the stack and replace them with the result of the operation. At the end, the resulting value is on the stack.
In this form it's easy to calculate arbitrarily complex expressions.
This representation is also close to tree representation of expressions where non-leaf tree nodes represent operations and functions and leaf nodes are for constants and variables.
What's good about trees is that you can easily evaluate expressions, and you can also do things like symbolic differentiation on them. Both are recursive in nature.
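As a rough illustration (a sketch, not production code), here is a tiny RPN evaluator in R for the token stream above:
eval_rpn <- function(tokens, x) {
  stack <- numeric(0)
  for (tok in tokens) {
    if (tok %in% c("^", "*", "+")) {
      n <- length(stack)
      a <- stack[n - 1]; b <- stack[n]  # pop the two topmost values
      stack <- c(stack[seq_len(n - 2)],
                 switch(tok, "^" = a^b, "*" = a * b, "+" = a + b))
    } else if (tok == "x") {
      stack <- c(stack, x)                # push the variable's value
    } else {
      stack <- c(stack, as.numeric(tok))  # push a literal number
    }
  }
  stack  # one value remains: the result
}
eval_rpn(strsplit("x 2 ^ 6 * x 5 * + 3 +", " ")[[1]], x = 2)  # 6*4 + 5*2 + 3 = 37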

Why can cosine similarity between two vectors be negative?

I have 2 vectors with 11 dimensions.
a <- c(-0.012813841, -0.024518383, -0.002765056, 0.079496744, 0.063928973,
0.476156960, 0.122111977, 0.322930189, 0.400701256, 0.454048860,
0.525526219)
b <- c(0.64175768, 0.54625694, 0.40728261, 0.24819750, 0.09406221,
0.16681692, -0.04211932, -0.07130129, -0.08182200, -0.08266852,
-0.07215885)
cosine_sim <- cosine(a,b)
which returns:
-0.05397935
I used cosine() from lsa package.
For some values I am getting a negative cosine_sim, like the one given above. I am not sure how the similarity can be negative; shouldn't it be between 0 and 1?
Can anyone explain what is going on here?
The nice thing about R is that you can often dig into the functions and see for yourself what is going on. If you type cosine (without any parentheses, arguments, etc.) then R prints out the body of the function. Poking through it (which takes some practice), you can see that there is a bunch of machinery for computing the pairwise similarities of the columns of the matrix (i.e., the bit wrapped in the if (is.matrix(x) && is.null(y)) condition), but the key line of the function is
crossprod(x, y)/sqrt(crossprod(x) * crossprod(y))
Let's pull this out and apply it to your example:
> crossprod(a,b)/sqrt(crossprod(a)*crossprod(b))
[,1]
[1,] -0.05397935
> crossprod(a)
[,1]
[1,] 1
> crossprod(b)
[,1]
[1,] 1
So, you're using vectors that are already normalized to unit length, which leaves just the crossprod in the numerator to look at. In your case this is equivalent to
> sum(a*b)
[1] -0.05397935
(for real matrix operations, crossprod is much more efficient than constructing the equivalent operation by hand).
As @Jack Maney's answer says, the dot product of two vectors (which is ||a|| * ||b|| * cos(a,b), where ||.|| denotes the Euclidean norm, not R's length()) can be negative ...
For what it's worth, I suspect that the cosine function in lsa might be more easily/efficiently implemented for matrix arguments as as.dist(crossprod(x)) ...
edit: in comments on a now-deleted answer below, I suggested that the square of the cosine-distance measure might be appropriate if one wants a similarity measure on [0,1] -- this would be analogous to using the coefficient of determination (r^2) rather than the correlation coefficient (r) -- but that it might also be worth going back and thinking more carefully about the purpose/meaning of the similarity measures to be used ...
The cosine function returns
crossprod(a, b)/sqrt(crossprod(a) * crossprod(b))
In this case, both the terms in the denominator are 1, but crossprod(a, b) is -0.05.
The cosine function can take on negative values.
While the cosine of two vectors can take any value between -1 and +1, cosine similarity (in document retrieval) tends to take values in the [0,1] interval. The reason is simple: in a word-by-document matrix there are no negative values, so the maximum angle between two vectors is 90 degrees, for which the cosine is 0.
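As a quick sanity check on the geometry, the angle between a and b above is just over 90 degrees, which is exactly why the cosine comes out slightly negative:
theta <- acos(sum(a * b) / sqrt(sum(a^2) * sum(b^2)))
theta * 180 / pi  # about 93.1 degrees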
