How to normalize the columns of a matrix in Julia

Given a matrix A of dimensions m×n, how can I normalize its columns in Julia, with a function or some other process, so that every column of the new matrix has length 1?

If you want a new matrix then mapslices is probably what you want:
julia> using LinearAlgebra
julia> x = rand(5, 3)
5×3 Matrix{Float64}:
0.185911 0.368737 0.533008
0.957431 0.748933 0.479297
0.567692 0.477587 0.345943
0.743359 0.552979 0.252407
0.944899 0.185316 0.375296
julia> y = mapslices(x -> x / norm(x), x, dims=1)
5×3 Matrix{Float64}:
0.112747 0.327836 0.582234
0.580642 0.66586 0.523562
0.344282 0.424613 0.377893
0.450816 0.491642 0.275718
0.573042 0.164761 0.409956
julia> map(norm, eachcol(y))
3-element Vector{Float64}:
1.0
1.0
1.0

If by length 1 you mean norm 1, this could work:
using LinearAlgebra
# if this is your matrix
m = rand(10, 10)
# norm comes from LinearAlgebra
# or you can define it as
# norm(x) = sqrt(sum(i^2 for i in x))
g(x) = x ./ norm(x)
# this has columns that have approximately norm 1
normed_m = reduce(hcat, g.(eachcol(m)))
Though there are likely better solutions I don't know!

mapslices seems to have some issues with performance. On my computer (and v1.7.2) this is 20x faster:
x ./ norm.(eachcol(x))'
This is an in-place version (because eachcol creates views), which is faster still (but still allocates a bit):
normalize!.(eachcol(x))
And, finally, some loop versions that are 40-70x faster than mapslices for the 5x3 matrix:
# works in-place:
function normcol!(x)
    for col in eachcol(x)
        col ./= norm(col)
    end
    return x
end
# creates new array:
normcol(x) = normcol!(copy(x))
Edit: Added a one-liner with zero allocations:
foreach(normalize!, eachcol(x))
The reason this does not allocate anything, unlike normalize!., is that foreach doesn't return anything, which makes it useful in cases where output is not needed.
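As a rough sanity check, the broadcasting and in-place variants above can be compared on a small matrix. This is only a sketch: the helper names normcols and normcols! are made up for illustration, and it assumes no column is all zeros (a zero column would produce NaN entries).

```julia
using LinearAlgebra

# Out-of-place: divide every column by its 2-norm.
# norm.(eachcol(x)) is a vector of column norms; the adjoint turns it
# into a 1×n row so that broadcasting lines it up with the columns.
normcols(x) = x ./ norm.(eachcol(x))'

# In-place: eachcol yields views, so normalize! mutates x itself.
function normcols!(x)
    foreach(normalize!, eachcol(x))
    return x
end

x = [3.0 0.0 1.0;
     4.0 2.0 1.0]

y = normcols(x)          # new matrix, x is untouched
z = normcols!(copy(x))   # copy first so x is kept for comparison

println(y ≈ z)                           # both variants agree
println(all(norm.(eachcol(y)) .≈ 1.0))   # every column has unit norm
```

The first column [3, 4] has norm 5, so it becomes [0.6, 0.8].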

Related

How to insert an element at a specific position of an empty vector?

In R we can create an empty vector where it is possible to insert an element in any position of this vector.
Example:
> x <- c()
> x[1] = 10
> x[4] = 20
The final result is:
> x
[1] 10 NA NA 20
I would like to do something similar using Julia, but couldn't find a way to do this.
The “append” function does not do anything like that.
Could anyone help?
You need to do this in two steps:
First resize the vector or create a vector with an appropriate size.
Next set the elements accordingly.
Since you are coming from R I assume you want the vector to be initially filled with missing values. Here is the way to do this.
In my example I assume you want to store integers in the vector. For both options, first load the Missings.jl package:
using Missings
Option 1. Start with an empty vector
julia> x = missings(Int, 0)
Union{Missing, Int64}[]
julia> resize!(x, 4)
4-element Vector{Union{Missing, Int64}}:
missing
missing
missing
missing
julia> x[1] = 10
10
julia> x[4] = 40
40
julia> x
4-element Vector{Union{Missing, Int64}}:
10
missing
missing
40
Option 2. Preallocate a vector
julia> x = missings(Int, 4)
4-element Vector{Union{Missing, Int64}}:
missing
missing
missing
missing
julia> x[1] = 10
10
julia> x[4] = 40
40
The reason why Julia does not resize the vectors automatically is for safety. Sometimes it would be useful, but most of the time if x is an empty vector and you write x[4] = 40 it is a bug in the code and Julia catches such cases.
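A quick way to see this check in action (a minimal sketch; the variable names are arbitrary):

```julia
x = Int[]   # empty vector

# Assigning past the end does not grow the vector; it throws instead.
err = try
    x[4] = 40
catch e
    e
end

println(err isa BoundsError)   # the out-of-bounds write is rejected
println(length(x))             # x is still empty
```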
EDIT
What you can do is:
function setvalue(vec::Vector, idx, val)
    @assert idx > 0
    if idx > length(vec)
        resize!(vec, idx)
    end
    vec[idx] = val
    return vec
end
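One caveat worth sketching: the documentation of resize! says that new elements are not guaranteed to be initialized, so a variant that fills the gap explicitly may be safer. The version below is a hypothetical helper (the name setvalue! is not from any package) and assumes the vector's element type accepts missing, e.g. Union{Missing, Int}.

```julia
# Hypothetical helper: grow `vec` if needed, fill the gap with `missing`,
# then store `val` at position `idx`.
# Assumes eltype(vec) allows `missing`; otherwise the fill step throws.
function setvalue!(vec::Vector, idx::Integer, val)
    idx > 0 || throw(ArgumentError("idx must be positive"))
    oldlen = length(vec)
    if idx > oldlen
        resize!(vec, idx)
        vec[oldlen+1:idx] .= missing   # do not rely on resize! to initialize
    end
    vec[idx] = val
    return vec
end

x = Union{Missing,Int}[]
setvalue!(x, 1, 10)
setvalue!(x, 4, 20)
println(x)
```

This mirrors the R example from the question, with missing playing the role of NA.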

Broadcasting vector multiplication with a 3-vector and an n-vector in Julia

I have a 3-vector c = [0.7, 0.5, 0.2] and I want to multiply it with everything in an n-vector x = rand((-1,1),n) such that I get a resulting n+2-vector y where y[i] == x[i]*c[3] + x[i-1]*c[2] + x[i-2]*c[1]
How should I do this in Julia? I feel like there should be a way to broadcast the smaller 3-vector over all the values in the n-vector. For the edge cases, if i-1 or i-2 is out of bounds, I just want zero for those components.
If I understand your question correctly you want a convolution, with a twist that in a standard convolution the vector c would be reversed. You can use e.g. DSP.jl for this.
Is this what you want?
julia> using DSP
julia> c = [0.7, 0.5, 0.2]
3-element Array{Float64,1}:
0.7
0.5
0.2
julia> conv([10, 100, 1000, 10000], reverse(c))
6-element Array{Float64,1}:
1.9999999999996967
25.0
257.0000000000003
2569.9999999999995
5700.0
6999.999999999998
You can also manually implement it using dot from the LinearAlgebra module like this:
julia> using LinearAlgebra
julia> x = [10, 100, 1000, 10000]
4-element Array{Int64,1}:
10
100
1000
10000
julia> y = [0;0;x;0;0]
8-element Array{Int64,1}:
0
0
10
100
1000
10000
0
0
julia> [dot(@view(y[i:i+2]), c) for i in 1:length(x)+2]
6-element Array{Float64,1}:
2.0
25.0
257.0
2570.0
5700.0
7000.0
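The same padded computation can also be written as a plain loop with no packages at all. This is a sketch under a few assumptions: lagged_filter is a made-up name, both inputs use ordinary 1-based indexing, and out-of-range terms are simply skipped (which is equivalent to padding x with zeros).

```julia
# Hypothetical helper: y[i] = sum over j of x[i + j - k] * c[j], k = length(c),
# where terms whose index falls outside x are treated as zero.
function lagged_filter(x::AbstractVector, c::AbstractVector)
    n, k = length(x), length(c)
    T = float(promote_type(eltype(x), eltype(c)))
    y = zeros(T, n + k - 1)
    for i in 1:(n + k - 1), j in 1:k
        idx = i + j - k            # position in x that pairs with c[j]
        if 1 <= idx <= n
            y[i] += x[idx] * c[j]
        end
    end
    return y
end

c = [0.7, 0.5, 0.2]
x = [10, 100, 1000, 10000]
println(lagged_filter(x, c))
```

For length(c) == 3 this reduces to y[i] = x[i]*c[3] + x[i-1]*c[2] + x[i-2]*c[1], the formula from the question, and the result has length(x) + 2 entries.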
Here's one approach that uses ShiftedArrays.jl.
using ShiftedArrays
c = [0.7, 0.5, 0.2]
Create lagged versions of x, with initial zeros:
x = 1:5
xminus1 = lag(x, 1, default=0)
xminus2 = lag(x, 2, default=0)
Horizontally concatenate the vectors and use matrix multiplication with c:
X = [xminus2 xminus1 x]
X * c
Here's what X and X * c look like at the REPL:
julia> X = [xminus2 xminus1 x]
5×3 Array{Int64,2}:
0 0 1
0 1 2
1 2 3
2 3 4
3 4 5
julia> X * c
5-element Array{Float64,1}:
0.2
0.9
2.3
3.7
5.1
Note that this produces an output vector of length length(x), not length(x) + 2. I'm not sure how it would make sense for the output to be of length length(x) + 2, as you requested in the question.
I have a package for doing such things. The simplest use is like this:
julia> c = [0.7, 0.5, 0.2]; # from question
julia> x = [10, 100, 1000, 10_000]; # from another answer
julia> using Tullio, OffsetArrays
julia> @tullio y[i] := x[i]*c[3] + x[i-1]*c[2] + x[i-2]*c[1]
2-element OffsetArray(::Vector{Float64}, 3:4) with eltype Float64 with indices 3:4:
257.0
2570.0
julia> @tullio y[i] := x[i+k-3] * c[k] # sum over all k, range of i that's safe
2-element OffsetArray(::Array{Float64,1}, 3:4) with eltype Float64 with indices 3:4:
257.0
2570.0
Since eachindex(c) == 1:3, that's the range of k-values which this sums over, and the range of i is as big as it can be so that i+k-3 stays inside eachindex(x) == 1:4.
To extend the range of i by padding x with two zeros in each direction, write pad(i+k-3, 2). And to compute the shift of i needed to produce an ordinary 1-based Array, write i+_ on the left (and then the -3 makes no difference). Then:
julia> @tullio y[i+_] := x[pad(i+k, 2)] * c[k]
6-element Array{Float64,1}:
2.0
25.0
257.0
2570.0
5700.0
7000.0
On larger arrays, this won't be very fast (at the moment) as it must check at every step whether it is inside x or out in the padding. It's very likely that DSP.conv is a bit smarter about this. (Edit -- DSP.jl seems never to be faster for this c; with a kernel of length 1000 it's faster with 10^6 elements in x.)

Is there a lazy and iterative `map` in Julia?

The map function seems to be eager, e.g.
map(x->x+1, 1:3) returns [2,3,4].
Is there a lazy, iterative version of map that does not generate all the values at once, so that I can pull values one by one from its result?
You can use Base.Generator for this, e.g. in your case:
julia> g = (x + 1 for x in 1:3)
Base.Generator{UnitRange{Int64},getfield(Main, Symbol("##5#6"))}(getfield(Main, Symbol("##5#6"))(), 1:3)
julia> collect(g)
3-element Array{Int64,1}:
2
3
4
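To see that the generator really is lazy, one can count how often the mapped function runs. The sketch below uses a hypothetical counter (calls) and helper (f) purely for illustration; it also shows Iterators.map, which I believe is available from Julia 1.6 onward and behaves like a generator.

```julia
# Count how many times the mapped function is actually evaluated.
calls = Ref(0)
f(x) = (calls[] += 1; x + 1)

# Nothing is computed when the generator is created.
g = (f(x) for x in 1:1_000_000)

# Pull only the first three values, then stop.
out = Int[]
for v in g
    push!(out, v)
    length(out) == 3 && break
end
calls_after_loop = calls[]

println(out)                # [2, 3, 4]
println(calls_after_loop)   # 3, not 1_000_000

# Iterators.map (assumed Julia >= 1.6) is a lazy counterpart of map.
h = Iterators.map(f, 1:3)
println(first(h))           # 2
```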

Eigenvalues of a matrix, assuming symmetry

I am trying to find the eigenvalues of the following 2 X 2 matrix (equal to a) in Julia:
2×2 Array{Float64,2}:
0.120066 0.956959
0.408367 0.422321
I have the same array in R, and running the following R command I get this output:
eigen(a, symmetric=T, only.values=T)
$values
[1] 0.706626 -0.164245
In Julia, however, when I run this command I get this output:
eigvals(LowerTriangular(a))
2-element Array{Float64,1}:
0.120066
0.422321
Is there a way to replicate the R eigen() function for symmetric matrices in Julia because my way with the LowerTriangular function is not working?
Use the Symmetric wrapper like this:
julia> eigvals(Symmetric(a, :L))
2-element Array{Float64,1}:
-0.164241
0.706628
Since Julia 0.7, you first need to load the LinearAlgebra standard library (using LinearAlgebra) to access these functions.
> x
[,1] [,2]
[1,] 0.120066 0.956959
[2,] 0.408367 0.422321
In Julia, eigvals(LowerTriangular(a)) computes the eigenvalues of the lower triangular part of the matrix (that is, the entries of the strict upper triangular part are treated as 0):
> xx <- x
> xx[1,2] <- 0
> eigen(xx, only.values = TRUE)
$values
[1] 0.422321 0.120066 # same as Julia
While in R, eigen(x, symmetric=TRUE) assumes x is symmetric and takes the lower triangular part to derive the other entries:
> xx <- x
> xx[1,2] <- x[2,1]
> eigen(xx, only.values = TRUE)
$values
[1] 0.7066279 -0.1642409
> eigen(x, only.values = TRUE, symmetric = TRUE)
$values
[1] 0.7066279 -0.1642409
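The same comparison can be sketched on the Julia side. The variable names s and manual are arbitrary, and the expected values are the ones reported by R above.

```julia
using LinearAlgebra

a = [0.120066 0.956959;
     0.408367 0.422321]

# Symmetric(a, :L) mirrors the lower triangle; a[1, 2] is ignored.
s = Symmetric(a, :L)

# The same matrix built by hand, for comparison.
manual = [a[1, 1] a[2, 1];
          a[2, 1] a[2, 2]]

println(Matrix(s) == manual)
println(eigvals(s))                    # ascending order, unlike R's descending
println(eigvals(LowerTriangular(a)))   # just the diagonal of a
```

Note that Julia returns the eigenvalues of a Symmetric matrix in ascending order, so they appear reversed relative to the R output.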

Is there an elegant, built-in way to do modulo indexing in R?

Currently, I have
extract_modulo = function(x, n, fn=`[`) fn(x, (n-1L) %% length(x) + 1L)
`%[mod%` = function (x, n) extract_modulo(x, n)
And then:
seq(12) %[mod% 14
#[1] 2
Is this already built into R somewhere? I would think so, because R has several functions that recycle values (e.g., paste). However, I'm not finding anything with help('[['), ??index, or ??mod. I would think an R notation for this would be something like seq(12)[/14/] or as.list(seq(12))[[/14/]], for example.
rep_len() is a fast .Internal function, and it is appropriate for this use or for recycling arguments in your own function. For this particular case, where you are looking for the value at an index position beyond the length of a vector, rep_len(x, n)[n] will always do what you want, for any nonnegative whole number n and any non-NULL x.
rep_len(seq(12), 14)[14]
# [1] 2
rep_len(letters, 125)[125]
# [1] "u"
And if it turns out you don't need to recycle x, it works just as well with an n value that is less than length(x):
rep_len(seq(12), 5)[5]
# [1] 5
rep_len(seq(12), 0)[0]
# integer(0)
# as would be expected, there is nothing there
You could of course create a wrapper if you'd like:
recycle_index <- function(x, n) rep_len(x, n)[n]
recycle_index(seq(12), 14)
# [1] 2
