Affinity Propagation in Julia with the Clustering pkg - julia

I would like to use the Affinity Propagation algorithm from the Clustering pkg in Julia.
I have a collection of n points with m variables. I created an mxn array, but I would like to know what the S input is in the function affinityprop(S::DenseMatrix{T}; ...).
The python sklearn implementation seems to take the mxn array as input.

Most likely it expects an affinity matrix.
But there is clearly a lack of documentation. You should file a bug report asking for proper documentation to be included with Clustering.jl.
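To make the "affinity matrix" idea concrete, here is a minimal Python sketch that turns an mxn data array (variables by points, as in the question) into an nxn similarity matrix. It assumes the common convention of using the negative squared Euclidean distance as the similarity and the median off-diagonal similarity as the diagonal "preference"; whether affinityprop expects exactly this convention should be checked against the Clustering.jl source. The function name similarity_matrix is hypothetical.

```python
import numpy as np

def similarity_matrix(X):
    """Build an n x n similarity matrix from an m x n data array.

    Assumption: similarity = negative squared Euclidean distance,
    diagonal (preference) = median of the off-diagonal similarities.
    Requires at least two points (n >= 2).
    """
    X = np.asarray(X, dtype=float)
    sq = np.sum(X**2, axis=0)                         # squared norm of each column (point)
    D2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)  # pairwise squared distances
    D2 = np.maximum(D2, 0.0)                          # clip tiny negative round-off
    S = -D2
    off_diag = S[~np.eye(S.shape[0], dtype=bool)]
    np.fill_diagonal(S, np.median(off_diag))          # shared preference for all points
    return S

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 6))   # m = 3 variables, n = 6 points
S = similarity_matrix(X)
```

The resulting S is square with one row and column per point, which is the shape an affinity-matrix input would need; the mxn data array itself is not.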

Related

Does Flux.jl have tensor objects?

Coming from Tensorflow and Pytorch, does Flux.jl contain a tensor like structure? If not, what is the common way to structure your data?
From the Flux.jl docs:
The starting point for all of our models is the Array (sometimes referred to as a Tensor in other frameworks). This is really just a list of numbers, which might be arranged into a shape like a square.
So given this, the way to represent data is just via traditional matrices (which are just arrays). You can find out more about Julia's first class array support here: https://docs.julialang.org/en/v1/manual/arrays/

Does R include an efficient implementation of sets?

Is there an efficient implementation of the set data structure in R?
In C++ I would use an std::set (which is implemented using red-black trees), in Python a set (which is implemented using hash tables), but I am not sure what I should use in R.
I have found this link, which describes some set operations, like union() and intersection(), that you can perform on vectors. So I guess that, since vectors are involved, the complexities would not be logarithmic, as they could be with the data structures mentioned above.
Fun fact: in this case the name of the language does not help. Searching "r set" returns many results concerning $\mathbb{R}$ rather than the programming language :D

Lapack Orthonormalization Function for Rectangular Matrix

I was wondering if there is a function in Lapack for orthonormalizing the columns of a very tall and skinny matrix. A similar previous question asked about this, presumably in the context of a square matrix. My setting is as follows: I have an M by N matrix A whose columns I am trying to orthonormalize.
So, my first thought was to do a QR decomposition. The functions for doing a QR decomposition in Lapack seem to be dgeqrf and dormqr. Great. However, my problem is as follows: my matrix A is so tall that I don't want to actually compute all of Q, because it is M by M. In fact, I can't afford to instantiate an M by M matrix at all during any of my computation (it would not fit in memory). I would rather compute just the matrix that Wikipedia calls Q1. However, I can't seem to find a way to make this work.
The weird thing is, that I think it is possible. Numpy, in particular, has a function numpy.linalg.qr that appears to do just this. However, even after reading their source code, I can't figure out how they are using lapack calls to get this to work.
Do folks have ideas? I would strongly prefer this to only use lapack functions because I am hoping to port this code to cuSOLVER, which has implemented several lapack functions (including dgeqrf and dormqr) for the GPU.
You want the "thin" or "economy size" version of QR. In matlab, you can do this with:
[Q,R] = qr(A,0);
I haven't used Lapack directly, but I would imagine there's a corresponding call there. It appears that you can do this in python with:
numpy.linalg.qr(a, mode='reduced')
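As a small illustration of the thin factorization, the following Python sketch checks that the reduced mode returns an M by N factor with orthonormal columns and never forms an M by M matrix. The sizes and random data are made up for the example. For the pure LAPACK route, dorgqr is documented as generating the first N columns of Q from the reflectors produced by dgeqrf, which looks like the relevant routine, though that path is not exercised here.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 1000, 5                      # tall and skinny: M >> N
A = rng.standard_normal((M, N))

# Reduced ("thin") QR: Q1 is M x N and R is N x N.
Q1, R = np.linalg.qr(A, mode="reduced")

# Columns of Q1 are orthonormal and Q1 @ R reproduces A.
orthonormal = np.allclose(Q1.T @ Q1, np.eye(N))
reconstructs = np.allclose(Q1 @ R, A)
```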

Is there an equivalent to matlab's rcond() function in Julia?

I'm porting some matlab code that uses rcond() to test for singularity, as also recommended here (for matlab singularity testing).
I see that there is a cond() function in Julia (as also in Matlab), but rcond() doesn't appear to be available by default:
ERROR: rcond not defined
I'd assume that rcond(), like the Matlab version, is more efficient than 1/cond(). Is there such a function in Julia, perhaps using an add-on module?
Julia calculates the condition number (in the default 2-norm) as the ratio of the maximum to the minimum of the singular values (got to love open source, no more MATLAB black boxes!)
Julia doesn't have an rcond function in Base, and I'm unaware of one in any package. If it did, it'd just be the ratio of the minimum to the maximum instead. I'm not sure why it's efficient in MATLAB, but it's quite possible that whatever the reason is, it doesn't carry through to Julia.
Matlab's rcond is an optimization based upon the fact that it's an estimate of the condition number for square matrices. In my testing, and given that its help mentions LAPACK's 1-norm estimator, it appears as though it uses LAPACK's dgecon.f. In fact, this is exactly what Julia does when you ask for the condition number of a square matrix with the 1- or Inf-norm.
So you can simply define
rcond(A::StridedMatrix) = 1/cond(A,1)
You can save Julia from twice-inverting LAPACK's results by manually combining cond(::StridedMatrix) and cond(::LU), but the savings here will almost certainly be immeasurable. Where there is a measurable savings, however, is that you can directly take the norm(A) instead of reconstructing a matrix similar to A through its LU factorization.
rcond(A::StridedMatrix) = LAPACK.gecon!('1', lufact(A).factors, norm(A, 1))
In my tests, this behaves identically to Matlab's rcond (2014b), and provides a decent speedup.
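For reference, the quantity being estimated is the reciprocal of the 1-norm condition number, 1 / (||A||_1 * ||inv(A)||_1). The Python sketch below computes it exactly by forming the inverse, which is not what LAPACK's estimator does; it is only meant to show the definition. The name rcond_1norm is hypothetical.

```python
import numpy as np

def rcond_1norm(A):
    """Exact reciprocal 1-norm condition number of a square matrix.

    Assumption: the explicit inverse is affordable. LAPACK's gecon
    estimates the same quantity from an LU factorization instead.
    """
    A = np.asarray(A, dtype=float)
    norm_A = np.linalg.norm(A, 1)            # maximum absolute column sum
    try:
        norm_inv = np.linalg.norm(np.linalg.inv(A), 1)
    except np.linalg.LinAlgError:
        return 0.0                           # exactly singular matrix
    return 1.0 / (norm_A * norm_inv)

well = rcond_1norm(np.eye(3))                # identity is perfectly conditioned
bad = rcond_1norm(np.diag([1.0, 1e-8]))      # nearly singular
```

A value near 1 indicates a well-conditioned matrix, and a value near machine epsilon indicates a matrix that is singular to working precision.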

Full Singular Value Decomposition in R

In most applications (esp. statistical ones) the thin SVD suffices. However, on occasion one needs the full SVD in order to obtain an orthobasis of the null space of a matrix (and its conjugate). It seems that svd() in R only returns the thin version. Is it possible to produce the full version? Are there alternatives?
library(sos)
> findFn("svd NULL space")
found 47 matches; retrieving 3 pages
This looks on point:
MSBVAR null.space Find the null space of a matrix
As does this function in MASS.
R Core uses the routines from Linpack, Lapack, ... that it needs.
If you need something different, you probably need to either get yourself other Linpack etc routines, or connect to a library providing more.
Doug Bates just wrapped the Eigen library in the RcppEigen package, which may have something for you. Eigen appears to be both powerful and fairly featureful while being highly optimised.
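To show what the full SVD buys you, here is a Python sketch that extracts an orthonormal basis of the null space from the trailing right singular vectors. The function name null_space_basis and the tolerance are assumptions made for the example. In R itself, svd() accepts nu and nv arguments, so requesting nu = nrow(A) and nv = ncol(A) may be worth trying before reaching for another package.

```python
import numpy as np

def null_space_basis(A, rtol=1e-12):
    """Orthonormal basis of the null space of A via the full SVD.

    Assumption: singular values below rtol * (largest singular value)
    are treated as zero.
    """
    A = np.asarray(A, dtype=float)
    # full_matrices=True returns the complete square Vt (n x n).
    U, s, Vt = np.linalg.svd(A, full_matrices=True)
    tol = rtol * (s[0] if s.size else 1.0)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T                       # columns span the null space

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])              # rank 1, so the null space has dimension 2
N = null_space_basis(A)
```

The thin SVD of this 2 by 3 matrix would return only two right singular vectors, which is not enough to span the two-dimensional null space alongside the row space; the full version supplies the missing one.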
