Create a sparse symmetric random matrix in Julia - julia

Is there an easy way to create a sparse symmetric random matrix in Julia?
Julia has the command
sprand(m,n,d)
which "Creates a [sparse] m-by-n random matrix (of density d) with iid non-zero elements distributed uniformly on the half-open interval [0,1)." But as far as I can tell this doesn't necessarily return a symmetric matrix.
I am looking for an equivalent command to MATLAB's
R = sprandsym(n,density)
which automatically creates a sparse symmetric random matrix. If such a command isn't implemented yet, what would be a workaround to transform the matrix returned by sprand(m,n,d) into a symmetric one?
Thank you!

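You could use Symmetric(sprand(10,10,0.4)). Symmetric wraps the matrix as a view that reads only the upper triangle (by default) and mirrors it, so the stored lower-triangle entries of the sprand result are ignored but still take up memory.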
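If an actual SparseMatrixCSC is wanted instead of a Symmetric wrapper, one possible workaround is to mirror the strict upper triangle explicitly. This is a minimal sketch, not an equivalent of MATLAB's sprandsym: the name sprandsym_sketch is made up for illustration, and the density of the result is only roughly the requested one, because the lower triangle of the random matrix is thrown away and replaced by the mirrored upper part.

```julia
using SparseArrays, LinearAlgebra

# Hypothetical helper (not part of SparseArrays): build a symmetric sparse
# matrix from the upper triangle of a sprand result.
function sprandsym_sketch(n::Integer, density::Real)
    A = sprand(n, n, density)
    U = triu(A)                        # diagonal plus upper triangle
    L = copy(transpose(triu(A, 1)))    # strict upper triangle, mirrored
    return U + L                       # sparse + sparse stays sparse
end

S = sprandsym_sketch(10, 0.4)
println(issymmetric(S))
```

Unlike Symmetric(sprand(...)), the result stores both triangles, so it can be passed to code that expects a plain SparseMatrixCSC.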

To avoid the extra-memory caveat mentioned in the comment on Michael Borregaard's answer, the following function takes a sparse matrix and drops the entries in the lower triangular part. If the SparseMatrixCSC format is unfamiliar, it also serves as a good illustration of how the format is manipulated:
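using SparseArrays   # needed on Julia 1.0 and later

function droplower(A::SparseMatrixCSC)
    m,n = size(A)
    rows = rowvals(A)      # row index of every stored entry
    vals = nonzeros(A)     # value of every stored entry
    V = Vector{eltype(A)}()
    I = Vector{Int}()
    J = Vector{Int}()
    for i=1:n                      # i is the column
        for j in nzrange(A,i)      # j indexes the stored entries of column i
            rows[j]>i && break     # rows are sorted, so the rest is below the diagonal
            push!(I,rows[j])
            push!(J,i)
            push!(V,vals[j])
        end
    end
    return sparse(I,J,V,m,n)
end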
Example usage:
julia> a = [0.5 1.0 0.0 ; 2.0 0.0 0.0 ; 0.0 0.0 0.0]
3×3 Array{Float64,2}:
0.5 1.0 0.0
2.0 0.0 0.0
0.0 0.0 0.0
julia> b = sparse(a)
3×3 SparseMatrixCSC{Float64,Int64} with 3 stored entries:
[1, 1] = 0.5
[2, 1] = 2.0
[1, 2] = 1.0
julia> c = droplower(b)
3×3 SparseMatrixCSC{Float64,Int64} with 2 stored entries:
[1, 1] = 0.5
[1, 2] = 1.0
julia> Matrix(Symmetric(c)) # note this is symmetric although c isn't (full was removed in Julia 1.0)
3×3 Array{Float64,2}:
0.5 1.0 0.0
1.0 0.0 0.0
0.0 0.0 0.0
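For comparison, the same upper-triangle extraction can be sketched with the built-in triu from LinearAlgebra, which also accepts sparse matrices. This reuses the example matrix from above and is only meant as a cross-check, not as a claim about which variant is faster:

```julia
using SparseArrays, LinearAlgebra

a = [0.5 1.0 0.0; 2.0 0.0 0.0; 0.0 0.0 0.0]
b = sparse(a)

c = triu(b)                  # keeps the diagonal and everything above it
s = Matrix(Symmetric(c))     # dense copy of the symmetric view

println(nnz(c))
println(s)
```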

Operations on the SparseMatrixCSC often need to be customized for maximum efficiency. So, to get from a sparse matrix A to a symmetric sparse matrix with the same upper part, here is a custom version (it is a bit cryptic, but working):
function symmetrize(A::SparseMatrixCSC)
    m,n = size(A)
    m == n || error("argument expected to be square matrix")
    rows = rowvals(A) ; vals = nonzeros(A)
    # a[i]: stored entries of column i on or above the diagonal
    # b[i]: strictly upper entries in row i (they get mirrored into column i)
    # c:    number of stored entries in the result
    a = zeros(Int,n) ; b = zeros(Int,n) ; c = 0
    for i=1:n
        for j in nzrange(A, i)
            if rows[j]>=i
                if rows[j]==i
                    a[i] += 1 ; c += 1
                end
                break
            end
            a[i] += 1 ; b[rows[j]] += 1 ; c += 2
        end
    end
    # nothing on or above the diagonal: return an empty matrix
    # (the original line referred to nrows and nvals before they were defined)
    c == 0 && return spzeros(eltype(A), n, n)
    ncolptr = Vector{Int}(undef, n+1)
    nrows = Vector{Int}(undef, c) ; nvals = Vector{eltype(A)}(undef, c)
    idx = 1
    for i=1:n
        ncolptr[i] = idx
        if a[i]==0
            a[i] = idx ; idx += b[i]
            continue
        end
        for j in (0:a[i]-1) .+ first(nzrange(A, i))
            nvals[idx] = vals[j] ; nrows[idx] = rows[j] ; idx += 1
            rows[j] >= i && break
            # mirror the entry into column rows[j]; a[rows[j]] is its next free slot
            nvals[a[rows[j]]] = vals[j] ; nrows[a[rows[j]]] = i
            a[rows[j]] += 1
        end
        a[i] = idx ; idx += b[i]
    end
    ncolptr[n+1] = idx
    return SparseMatrixCSC(n, n, ncolptr, nrows, nvals)
end
And a sample run:
julia> f = sprand(5,5,0.2)
5×5 SparseMatrixCSC{Float64,Int64} with 5 stored entries:
[1, 1] = 0.981579
[3, 1] = 0.330961
[5, 1] = 0.527683
[4, 5] = 0.196898
[5, 5] = 0.579006
julia> Matrix(f)
5×5 Array{Float64,2}:
0.981579 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0
0.330961 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.196898
0.527683 0.0 0.0 0.0 0.579006
julia> Matrix(symmetrize(f))
5×5 Array{Float64,2}:
0.981579 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.196898
0.0 0.0 0.0 0.196898 0.579006
This version should be faster than the others, but this still needs to be benchmarked (and some @inbounds added in the for loops).
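To illustrate what adding @inbounds to such a loop looks like, here is a small self-contained sketch that walks the CSC structure in the same way and sums the entries on or above the diagonal. The function name upper_sum is made up for this example. @inbounds skips bounds checks, so it is only safe when the indices are known to be valid, as they are when they come from nzrange.

```julia
using SparseArrays

# Hypothetical example function: sum of stored entries with row <= column.
function upper_sum(A::SparseMatrixCSC)
    rows = rowvals(A)
    vals = nonzeros(A)
    s = zero(eltype(A))
    @inbounds for col in 1:size(A, 2)
        for k in nzrange(A, col)      # positions of column col's stored entries
            rows[k] > col && break    # rows are sorted within a column
            s += vals[k]
        end
    end
    return s
end

A = sparse([0.5 1.0 0.0; 2.0 0.0 0.0; 0.0 0.0 0.0])
println(upper_sum(A))
```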

Related

Pairwise distance matrix between two vectors

x1=[1,2,3]
x2=[2,3,4]
How can I find the pairwise distance matrix between x1 and x2? (The distance matrix should be a 3 x 3 matrix.)
This is not a Euclidean distance matrix, but it is 3 X 3. Is it what you want?
julia> x1 = [1,2,3]
3-element Vector{Int64}:
1
2
3
julia> x2 = [2,3,4]
3-element Vector{Int64}:
2
3
4
julia> [(a-b)^2 for a in x1, b in x2]
3×3 Matrix{Int64}:
1 4 9
0 1 4
1 0 1
With Distances.jl:
julia> pairwise(Euclidean(), x1, x2)
3×3 Matrix{Float64}:
1.0 2.0 3.0
0.0 1.0 2.0
1.0 0.0 1.0
(Although this will not return integers, as it uses BLAS stuff internally.)
Since you ask for the Euclidean distance between all combinations of vector pairs from x1 and x2, that is, the distance between [1, 2] and [2, 3], [1, 2] and [2, 4], ..., [2, 3] and [3, 4], this can be done as follows:
Using Combinatorics.jl, construct all pairs from x1 by taking 2 elements at a time. Do the same for x2. Now you have c1 and c2, just loop over the two sequences applying the formal definition of Euclidean distance, sqrt(sum((x-y)^2)), to get the 3-by-3 matrix of pairwise distances that you want.
using Combinatorics
x1 = [1, 2, 3]
x2 = [2, 3, 4]
c1 = combinations(x1, 2)
c2 = combinations(x2, 2)
pairwise = [sqrt(sum((i.-j).^2)) for i in c1, j in c2]
3×3 Matrix{Float64}:
1.41421 2.23607 2.82843
1.0 1.41421 2.23607
0.0 1.0 1.41421
If you like higher-order index notations similar to math books, you can use Tullio.jl like this:
using Tullio
x = collect(combinations(x1, 2))
y = collect(combinations(x2, 2))
@tullio pairwise[i,j] := sqrt(sum((x[i].-y[j]).^2))
3×3 Matrix{Float64}:
1.41421 2.23607 2.82843
1.0 1.41421 2.23607
0.0 1.0 1.41421
You can try:
abs.(x1 .- x2')
#3×3 Array{Int64,2}:
# 1 2 3
# 0 1 2
# 1 0 1
Here x2' turns x2 into a row vector, and .- and abs. apply the operations element-wise, so broadcasting the column x1 against the row x2' yields the 3 x 3 matrix.
Alternatively, create the desired pairs (1,2) and (2,3), (1,2) and (2,4), ..., (2,3) and (3,4) and calculate the distances using norm.
using LinearAlgebra
#Create pairs
c1 = [(x1[i], x1[j]) for i in 1:lastindex(x1)-1 for j in i+1:lastindex(x1)]
c2 = [(x2[i], x2[j]) for i in 1:lastindex(x2)-1 for j in i+1:lastindex(x2)]
#Calc distance
[norm(i.-j) for i in c1, j in c2]
#3×3 Array{Float64,2}:
# 1.41421 2.23607 2.82843
# 1.0 1.41421 2.23607
# 0.0 1.0 1.41421
#Calc cityblock distance
[norm(i.-j, 1) for i in c1, j in c2]
#3×3 Array{Float64,2}:
# 2.0 3.0 4.0
# 1.0 2.0 3.0
# 0.0 1.0 2.0
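The broadcasting step in abs.(x1 .- x2') can be checked by looking at the shapes involved. The sketch below uses only Base Julia; the variable names row, D and D2 are introduced here for illustration.

```julia
x1 = [1, 2, 3]
x2 = [2, 3, 4]

row = x2'                  # 1×3 row vector (adjoint of x2)
D = abs.(x1 .- row)        # 3-element column against 1×3 row broadcasts to 3×3

# The same matrix written as an explicit comprehension
D2 = [abs(a - b) for a in x1, b in x2]

println(size(row))
println(D == D2)
```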

Flattening block matrix (matrix of matrices)

Consider a 2x2 matrix of 2x2 blocks, e.g.,
using LinearAlgebra
using StaticArrays
M = zeros(SMatrix{2, 2, Float64, 4}, 2, 2)
M[1,1] = SA[1 0; 0 2]
M[2,2] = SA[0 3; 4 0]
What is the best way to convert M into a 4x4 matrix of scalars?
# how to convert M into this?
M2 = [1 0 0 0;
0 2 0 0;
0 0 0 3;
0 0 4 0]
In my real problem the matrix sizes will be larger, but the general question remains: How to flatten a block matrix into one larger matrix.
I recommend using BlockArrays.jl:
julia> using BlockArrays
julia> mortar(M)
2×2-blocked 4×4 BlockMatrix{Float64, Matrix{SMatrix{2, 2, Float64, 4}}, Tuple{BlockedUnitRange{Vector{Int64}}, BlockedUnitRange{Vector{Int64}}}}:
1.0 0.0 │ 0.0 0.0
0.0 2.0 │ 0.0 0.0
──────────┼──────────
0.0 0.0 │ 0.0 3.0
0.0 0.0 │ 4.0 0.0
You can of course do Matrix(mortar(M)) to get back to a "normal" matrix. However, if you have this kind of data structure, you will probably want to stay with the BlockArray.
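If pulling in a package is not an option, the flattening can also be sketched in plain Julia by concatenating each block row horizontally and then stacking the rows vertically. The helper name flatten_blocks is hypothetical, and the example uses ordinary Matrix blocks instead of SMatrix so that it runs without StaticArrays; it assumes all blocks in a row share a height and all blocks in a column share a width.

```julia
# Hypothetical helper: flatten a matrix of equally sized blocks.
function flatten_blocks(M::AbstractMatrix{<:AbstractMatrix})
    # hcat the blocks of each block row, then vcat the resulting strips
    strips = [reduce(hcat, M[i, :]) for i in 1:size(M, 1)]
    return reduce(vcat, strips)
end

M = [zeros(2, 2) for _ in 1:2, _ in 1:2]   # 2×2 matrix of 2×2 zero blocks
M[1, 1] = [1.0 0.0; 0.0 2.0]
M[2, 2] = [0.0 3.0; 4.0 0.0]

M2 = flatten_blocks(M)
println(size(M2))
```

This copies every block, so for large block matrices the BlockArrays approach above, which keeps the block structure, is likely the better fit.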

Replace the specific element in array also change other one [duplicate]

Here is the example code. I can't understand why the first element of array B is also changed. How can I keep the elements of B unchanged?
julia> A = [0.0 0.1 0.2 0.3];
julia> B = A;
julia> A[1] = 0.1;
julia> A
1×4 Array{Float64,2}:
0.1 0.1 0.2 0.3
julia> B
1×4 Array{Float64,2}:
0.1 0.1 0.2 0.3
In Julia, B = A does not copy the array; it makes B another name for the same array, so a change made through A is visible through B. You need to create a copy:
julia> A = [0.0 0.1 0.2 0.3];
julia> B = deepcopy(A)
1×4 Array{Float64,2}:
0.0 0.1 0.2 0.3
julia> A[1] = 0.1;
julia> A, B
([0.1 0.1 0.2 0.3], [0.0 0.1 0.2 0.3])
Note that for this code copy alone would be enough, but if, for example, you have an Array of mutable objects that you then mutate, deepcopy should be used.
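The difference between plain assignment, copy and deepcopy can be seen in a short sketch. The variable names C, nested, shallow and deep are introduced here for illustration only.

```julia
A = [0.0 0.1 0.2 0.3]
B = A              # B is the same array as A
C = copy(A)        # C is an independent array with the same values
A[1] = 0.1
println(B === A)   # same object
println(C[1])      # still the old value

# With nested mutable elements, copy is shallow: the inner arrays are shared.
nested  = [[1, 2], [3, 4]]
shallow = copy(nested)
deep    = deepcopy(nested)
nested[1][1] = 99
println(shallow[1][1])   # sees the change
println(deep[1][1])      # does not
```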

Lower RAM consumption for a transformation of a transition matrix

I've written the following two functions, which take as input a transition matrix and the nodes that should become absorbing states, and transform the matrix accordingly.
The first function, set.absorbing.states(), has 3 arguments: tm is the initial transition matrix, inn is one specified initial node, and soi is the set of interest. By 'set of interest', I mean the set of nodes in the matrix that must be set as absorbing states. An example of such an initial matrix is the following:
tm <- read.table(row.names=1, header=FALSE, text="
A 0.2 0.3 0.1 0.2 0.1 0.1
B 0.3 0.1 0.1 0.2 0.2 0.1
C 0 0.2 0.4 0.1 0.2 0.1
D 0.2 0.1 0.2 0.3 0.1 0.1
E 0.2 0.2 0.1 0.2 0.1 0.2
F 0.3 0.2 0.4 0.1 0 0")
colnames(tm) <- row.names(tm)
As you can see, there are no absorbing states in that matrix. Let's say, for example, that we want to make A and E absorbing states, with B as a randomly selected initial node.
Executing the first function, tm1 <- set.absorbing.states( tm , "B", c("A","E")), returns a matrix in which the absorbing states have been set:
A B C D E F
A 1.0 0.0 0.0 0.0 0.0 0.0
B 0.3 0.1 0.1 0.2 0.2 0.1
C 0.0 0.2 0.4 0.1 0.2 0.1
D 0.2 0.1 0.2 0.3 0.1 0.1
E 0.0 0.0 0.0 0.0 1.0 0.0
F 0.3 0.2 0.4 0.1 0.0 0.0
As you can see, A and E have been changed into absorbing states.
The next step is to reorder that matrix so that all absorbing-state nodes (both rows and columns) go to the end. Running ptm <- transform.tm( tm1, c("A","E") ) returns a matrix that looks like this:
B C D F A E
B 0.1 0.1 0.2 0.1 0.3 0.2
C 0.2 0.4 0.1 0.1 0.0 0.2
D 0.1 0.2 0.3 0.1 0.2 0.1
F 0.2 0.4 0.1 0.0 0.3 0.0
A 0.0 0.0 0.0 0.0 1.0 0.0
E 0.0 0.0 0.0 0.0 0.0 1.0
You can now clearly see that the A and E nodes have moved to the end of the matrix.
Here follows the function I'm using.
set.absorbing.states <- function ( tm, inn, soi )
{
    # Absorbing states: every node in soi except the initial node.
    # (The original compared row indices with the node name inn,
    # which never excluded anything.)
    set <- which( row.names(tm) %in% setdiff(soi, inn) )
    for (i in set)
        tm[i,] <- 0
    for (i in set)
        tm[i,i] <- 1
    tm
}
transform.tm <- function ( tm, soi )
{
    end_sets <- which(row.names(tm) %in% soi)
    ptm <- rbind( cbind(tm[-end_sets, -end_sets], tm[-end_sets, end_sets]) , cbind(tm[end_sets, -end_sets], tm[end_sets, end_sets]) )
    ptm
}
With such small matrices everything works properly. But when I tried a big matrix (20,000 x 20,000), the second function needed 32 GB of RAM to execute.
Is there any way to do this in a more resource-efficient way?
Using indexing will significantly reduce the number of copies that your transformation function creates (via rbind and cbind). It is probably also a bit simpler conceptually (conditional on a solid understanding of indexing with [).
transform.tm1 <- function ( tm, soi ) {
    newOrder <- c(setdiff(row.names(tm), soi), soi)
    tm[newOrder, newOrder]
}
Here, setdiff is used to pull out the non-matching names and put them at the front of the vector. Then the matrix is simply reordered via its row/column names.
This returns
transform.tm1(tm1, c("A", "E"))
B C D F A E
B 0.1 0.1 0.2 0.1 0.3 0.2
C 0.2 0.4 0.1 0.1 0.0 0.2
D 0.1 0.2 0.3 0.1 0.2 0.1
F 0.2 0.4 0.1 0.0 0.3 0.0
A 0.0 0.0 0.0 0.0 1.0 0.0
E 0.0 0.0 0.0 0.0 0.0 1.0
Check that they return the same results:
identical(transform.tm(tm1, c("A", "E")), transform.tm1(tm1, c("A", "E")))
[1] TRUE

How do you generate a regular non-integer sequence in julia?

How are regular, non-integer sequences generated in julia?
I'm trying to get 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
In MATLAB, I would use
0.1:0.1:1
And in R
seq(0.1, 1, by = 0.1)
But I can't find anything except integer sequences in julia (e.g., 1:10). Searching for "sequence" in the docs only gives me information about how strings are sequences.
Similarly to Matlab, but with the difference that 0.1:0.1:1 defines a Range:
julia> typeof(0.1:0.1:1)
Range{Float64} (constructor with 3 methods)
and thus if an Array is needed:
julia> [0.1:0.1:1]
10-element Array{Float64,1}:
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
Unfortunately, this use of Range is only briefly mentioned at this point of the documentation.
Edit: As mentioned in the comments by @ivarne it is possible to achieve a similar result using linspace:
julia> linspace(.1,1,10)
10-element Array{Float64,1}:
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
but note that the results are not exactly the same due to rounding differences:
julia> linspace(.1,1,10)==[0.1:0.1:1]
false
The original answer is now deprecated. You should use collect() to generate a sequence.
## In Julia
> collect(0:.1:1)
11-element Array{Float64,1}:
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
## In R
> seq(0, 1, .1)
[1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
They are generated the same way as in Matlab
julia> sequence = 0:.1:1
0.0:0.1:1.0
Alternatively, you can use the range() function, which allows you to specify the length, step size, or both
julia> range(0, 1, length = 5)
0.0:0.25:1.0
julia> range(0, 1, step = .01)
0.0:0.01:1.0
julia> range(0, step = .01, length = 5)
0.0:0.01:0.04
You can still do all of the things you would normally do with a vector, e.g. indexing
julia> sequence[4]
0.3
math and stats...
julia> sum(sequence)
5.5
julia> using Statistics
julia> mean(sequence)
0.5
This will (in most cases) work the same way as a vector, but nothing is actually allocated. It can be tempting to materialize the vector, but in most cases you shouldn't (it is less performant). This works because
julia> sequence isa AbstractArray
true
If you truly need the vector, you can collect(), splat (...) or use a comprehension:
julia> v = collect(sequence)
11-element Array{Float64,1}:
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
julia> v == [sequence...] == [x for x in sequence]
true
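Applied to the exact sequence from the question (0.1 up to 1.0), the points above can be put together as the following sketch for current Julia versions. Note that the bracket form [0.1:0.1:1] from the oldest answer no longer flattens the range in Julia 1.x; it now builds a one-element vector that contains the range. The names r, v and wrapped are chosen here for illustration.

```julia
r = 0.1:0.1:1            # lazy range; the elements are computed on demand
v = collect(r)           # materialize into a Vector{Float64} only if needed

wrapped = [0.1:0.1:1]    # in Julia 1.x: a 1-element vector holding the range

println(length(r))
println(last(r))
println(length(wrapped))
```

Writing [r;] (with a semicolon) concatenates the range and gives the same vector as collect(r).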

Resources