Cross Entropy Loss in Flux.jl

I want to setup my model to use cross entropy loss in Flux.jl. How can I do this and where would I pass the loss function itself?

Flux.jl provides a built-in module with many common loss functions, Flux.Losses, which you can access with using Flux.Losses. It includes a cross-entropy loss, which can be used as follows:
julia> y_label = Flux.onehotbatch([0, 1, 2, 1, 0], 0:2)
3×5 Flux.OneHotArray{3,2,Vector{UInt32}}:
1 0 0 0 1
0 1 0 1 0
0 0 1 0 0
julia> y_model = softmax(reshape(-7:7, 3, 5) .* 1f0)
3×5 Matrix{Float32}:
0.0900306 0.0900306 0.0900306 0.0900306 0.0900306
0.244728 0.244728 0.244728 0.244728 0.244728
0.665241 0.665241 0.665241 0.665241 0.665241
julia> sum(y_model; dims=1)
1×5 Matrix{Float32}:
1.0 1.0 1.0 1.0 1.0
julia> Flux.crossentropy(y_model, y_label)
1.6076053f0
You can find the complete Flux.crossentropy documentation here: https://fluxml.ai/Flux.jl/stable/models/losses/#Flux.Losses.crossentropy
After you define the loss function, you can pass it to the built-in training function, Flux.train!(loss, params(model), data, opt), or use it in your custom training loop.
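For intuition, the value above can be reproduced by hand from the definition of cross entropy (the mean over observations of -Σ y .* log.(ŷ)). The sketch below uses only Base Julia; softmax_cols is my hand-rolled stand-in for Flux's softmax, not Flux's actual implementation:

```julia
# Column-wise softmax, a stand-in for Flux's softmax (assumption, not Flux code).
softmax_cols(x) = exp.(x) ./ sum(exp.(x); dims=1)

y_model = softmax_cols(Float32.(reshape(-7:7, 3, 5)))
labels  = [0, 1, 2, 1, 0]
# One-hot encode the labels over classes 0:2, like Flux.onehotbatch does:
y_label = [Float32(r - 1 == labels[c]) for r in 1:3, c in 1:5]

# crossentropy(ŷ, y): mean over observations (columns) of -Σ y .* log.(ŷ)
loss = -sum(y_label .* log.(y_model)) / size(y_label, 2)
# loss ≈ 1.6076053f0, matching Flux.crossentropy(y_model, y_label) above
```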

Related

Fast Fourier Transform for tensor product spaces in Julia

Right now I use the package FFTW in order to get some Fourier transforms I am interested in. However, I'm wondering if there is already an FFT package that can do the transformation in a vector space of the form kron(C2, Rn), where C2 means a 2x2 system and Rn represents the "spatial" subspace in which one wants the Fourier transform. In other words, is there a routine that implements:
kron(Id2x2, FFT)[kron(C2, Rn)] = kron(C2, FFT(Rn))
Of course the real problem I am interested in is the "two particle case", where the vector space (Hilbert space) is kron(kron(C2, Rn), kron(C2, Rn)); in this case the routine would need an operator like kron(kron(Id2x2, FFT), kron(Id2x2, FFT)).
Note 1: I haven't tried to do the problem taking partial traces, but in my case this option simply may not work because the states are sparse, i.e. it might be inefficient.
Note 2: Note that (unless I'm mistaken) for kron(C2, Rn) one could do the fft "twice" (once in each sector of C2). However, this might also be inefficient for large vector spaces.
Here's an example of what I think you are asking. res is computed by FFT from mat = kron(C2, Rn), and this is (as you say) a wasteful way of doing kron(C2, fft(Rn)), since the FFT along the k dimension is re-done for each of the 2×2 other dimensions. But the point, presumably, is to do this for "entangled" states in the product space -- a generic likemat = rand(8,2) cannot be decomposed into factors kron(likeC2, likeRn).
(If instead you are really only interested in "un-entangled" product states, then you should probably just work with their components. Combining with kron will then always be wasteful. The package Kronecker.jl may help for some things, but I don't think it knows about fft.)
This uses my package to handle kron-like operations; you could just write out the necessary reshapes yourself, too.
julia> C2 = [1 2; 3 4]; Rn = [1,10,0,0];
julia> mat = kron(C2,Rn)
8×2 Matrix{Int64}:
1 2
10 20
0 0
0 0
3 4
30 40
0 0
0 0
julia> using TensorCast, FFTW
# notation: kron is a reshape of a tensor product, to combine i & k
julia> kron(C2,Rn) == @cast out[(k,i),j] := C2[i,j] * Rn[k]
true
# reshape mat to put the index from Rn in its own dimension:
julia> @cast tri[k,i,j] := mat[(k,i),j] (i in 1:2);
julia> summary(tri)
"4×2×2 Array{Int64, 3}"
# then fft(tri, 1) is the FFT along only that, reshape back:
julia> @cast res[(ktil,i),j] := fft(tri, 1)[ktil,i,j]
8×2 Matrix{ComplexF64}:
11.0+0.0im 22.0+0.0im
1.0-10.0im 2.0-20.0im
-9.0+0.0im -18.0+0.0im
1.0+10.0im 2.0+20.0im
33.0+0.0im 44.0+0.0im
3.0-30.0im 4.0-40.0im
-27.0+0.0im -36.0+0.0im
3.0+30.0im 4.0+40.0im
julia> res ≈ kron(C2, fft(Rn))
true
julia> res ≈ fft(mat, 1)
false
julia> fft(Rn)
4-element Vector{ComplexF64}:
11.0 + 0.0im
1.0 - 10.0im
-9.0 + 0.0im
1.0 + 10.0im
# if fft() understood the dims keyword, it could be tidier:
julia> _fft(x; dims) = fft(x, dims);
julia> @cast _res[(k,i),j] := _fft(k) mat[(k,i),j] (i in 1:2);
julia> _res ≈ res
true
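The reshapes that TensorCast performs can also be written out by hand with plain reshape. The sketch below does exactly that, using a naive O(N²) DFT helper as a stand-in for FFTW.fft (the dft function is my assumption, chosen to match FFTW's sign convention) so that no packages beyond the LinearAlgebra stdlib are needed:

```julia
using LinearAlgebra  # for kron

# Naive DFT of a vector: X[k] = Σₙ x[n] e^{-2πi kn/N}.
# A stand-in for FFTW's fft along dimension 1 (assumption: same convention).
dft(v) = [sum(v[n+1] * cispi(-2k * n / length(v)) for n in 0:length(v)-1)
          for k in 0:length(v)-1]

C2 = [1 2; 3 4]; Rn = [1, 10, 0, 0]
mat = kron(C2, Rn)                                 # 8×2 matrix

tri = reshape(mat, 4, 2, 2)                        # tri[k,i,j] == C2[i,j] * Rn[k]
res = reshape(mapslices(dft, tri; dims=1), 8, 2)   # transform only the k index

res ≈ kron(C2, dft(Rn))                            # acts as kron(Id2x2, FFT)
```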

Julia DSP: Convolution of discrete signals

Here is the problem. I want to write a convolution for two simple signals x[n]=0.2^n*u[n] and h[n]=u[n+2] for some values of n. This is how I implement it:
using Plots, DSP
x(n) = if n<0 0 else 0.2^n end
h(n) = if n<-2 0 else 1 end
n = -10:10
conv(h.(n),x.(n))
It doesn't work. Here is the error:
`float` not defined on abstractly-typed arrays; please convert to a more specific type
Any idea how may I fix it?
It ran fine for me in a fresh REPL session (note that I used 2^n, so every element stays an Int):
julia> using Plots, DSP
[ Info: Precompiling Plots [91a5bcdd-55d7-5caf-9e0b-520d859cae80]
[ Info: Precompiling DSP [717857b8-e6f2-59f4-9121-6e50c889abd2]
julia> x(n) = if n<0 0 else 2^n end
x (generic function with 1 method)
julia> h(n) = if n<-2 0 else 1 end
h (generic function with 1 method)
julia> n = -10:10
-10:10
julia> conv(h.(n),x.(n))
41-element Array{Int64,1}:
0
0
(etc)
1984
1920
1792
1536
1024
julia> plot(conv(h.(n),x.(n)))
(plots ok)
If you change the 2 in 2^n to the float 0.2, the two branches return a mix of Int and Float64, so x.(n) becomes an abstractly-typed Vector{Real}; that is what triggers the error, and you need to convert to Float64:
julia> x(n) = if n<0 0 else 0.2^n end
x (generic function with 1 method)
julia> conv(h.(n),Float64.(x.(n)))
41-element Array{Float64,1}:
0.0
8.458842092382145e-17
2.5376526277146434e-16
4.229421046191072e-17
2.1147105230955362e-16
(etc)
7.997440000004915e-5
1.597440000003685e-5
3.1744000002024485e-6
6.144000000924524e-7
1.0240000015600833e-7
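An alternative to converting after the fact is to make both branches of each function return the same type, so the broadcast result is a concrete Vector{Float64} from the start. A minimal type-stability sketch (checking only eltype here, since it needs no packages):

```julia
# Return Float64 from both branches so x.(n) is concretely typed and
# DSP.conv no longer hits the `float` error on an abstract element type.
x(n) = n < 0 ? 0.0 : 0.2^n     # was: if n<0 0 else 0.2^n end  (Int 0 mixed with Float64)
h(n) = n < -2 ? 0.0 : 1.0

n = -10:10
eltype(x.(n))                   # Float64, not the abstract Real
```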

Sample from Vector/Array with Probabilities

I have a Bool vector, simply [true, false]. I can draw 10 samples from that vector with
rand([true,false], 10)
but how can I achieve that true is drawn with a 80%-probability and false is drawn with a 20%-probability?
Use the sample function from StatsBase.jl with a Weights argument:
julia> using StatsBase
julia> sample([true, false], Weights([0.8, 0.2]), 10)
10-element Array{Bool,1}:
1
0
1
1
1
1
1
1
1
1
And to make sure you get what you wanted you can write:
julia> countmap(sample([true, false], Weights([0.8, 0.2]), 10^8))
Dict{Bool,Int64} with 2 entries:
false => 20003766
true => 79996234
(of course your exact numbers will differ)
Also, if you specifically need binary sampling, you can use the Bernoulli distribution from Distributions.jl:
julia> using Distributions
julia> rand(Bernoulli(0.8), 10)
10-element Array{Bool,1}:
0
1
1
0
1
1
1
1
1
1
julia> countmap(rand(Bernoulli(0.8), 10^8))
Dict{Bool,Int64} with 2 entries:
false => 20005900
true => 79994100
(you can expect this method to be faster)
Finally, if you do not want to use any packages and need a binary result, you can just write rand(10) .< 0.8, and again you get what you wanted:
julia> countmap(rand(10^8) .< 0.8)
Dict{Bool,Int64} with 2 entries:
false => 20003950
true => 79996050
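For completeness, the Weights-based sampling can be mimicked without any packages via the inverse-CDF trick; rand(10) .< 0.8 is just the two-outcome special case of it. A sketch (wsample_naive is a hypothetical helper name, not part of StatsBase):

```julia
# Package-free weighted sampling: cumulative weights + searchsortedfirst.
function wsample_naive(items, w, n)
    c = cumsum(w ./ sum(w))    # cumulative probabilities, ending at 1.0
    # min(...) guards against rand() landing just past c[end] due to rounding
    [items[min(searchsortedfirst(c, rand()), length(items))] for _ in 1:n]
end

s = wsample_naive([true, false], [0.8, 0.2], 10^6)
count(s) / length(s)           # ≈ 0.8
```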

Julia: creating SparseArrays

I want to create a sparse array where I define a rule to combine duplicates. The documentation says that I can do this with sparse(i,j,v,[m,n,combine]). I've tried in the examples below but am unsuccessful. Can you please advise?
i = [1,2,3,3];
j = [1,2,3,2];
v = [10.,11.,12.,13.];
full(sparse([i;j],[j;i],[v;v], [3,3,combine(a,b)=mean([a,b])]))
full(sparse([i;j],[j;i],[v;v], [3,3,mean]))
full(sparse([i;j],[j;i],[v;v], [3,3,-(a,b)]))
full(sparse([i;j],[j;i],[v;v], [3,3,-]))
The square brackets in the docstring mean that those arguments are optional; they are not typed literally. The way to write it is:
julia> full(sparse([i;j],[j;i],[v;v], 3,3,-))
3×3 Array{Float64,2}:
 0.0   0.0   0.0
 0.0   0.0  13.0
 0.0  13.0   0.0
You can omit the last argument, in which case combine defaults to +:
julia> full(sparse([i;j],[j;i],[v;v], 3,3))
3×3 Array{Float64,2}:
 20.0   0.0   0.0
  0.0  22.0  13.0
  0.0  13.0  24.0
You can check which argument sets the function accepts using methods(sparse). Additionally, if you e.g. write @edit sparse([i;j],[j;i],[v;v]) you will jump to the source code of sparse and can learn exactly what is accepted.
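On Julia 1.x the same call still works, modulo two renames: full became Matrix, and sparse now lives in the SparseArrays stdlib. A sketch with a mean-style combine (the anonymous (a, b) -> (a + b) / 2 is my stand-in for the mean the questioner wanted):

```julia
using SparseArrays

i = [1, 2, 3, 3]; j = [1, 2, 3, 2]; v = [10., 11., 12., 13.]

# m, n, combine are passed positionally, without brackets.
# Note: combine is applied pairwise to duplicates, so (a+b)/2 is a true
# mean only when each entry is duplicated at most once, as here.
S = sparse([i; j], [j; i], [v; v], 3, 3, (a, b) -> (a + b) / 2)
Matrix(S)    # `full` was renamed to `Matrix` in Julia 0.7
```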

Bad results in sparse matrix assignment with logical indexing

In Matlab/Octave, I can use logical indexing to assign a value to matrix B in every location that meets a certain requirement in matrix A.
octave:1> A = [.1;.2;.3;.4;.11;.13;.14;.01;.04;.09];
octave:2> C = A < .12
C =
1
0
0
0
1
0
0
1
1
1
octave:3> B = spalloc(10,1);
octave:4> B(C) = 1
B =
Compressed Column Sparse (rows = 10, cols = 1, nnz = 5 [50%])
(1, 1) -> 1
(5, 1) -> 1
(8, 1) -> 1
(9, 1) -> 1
(10, 1) -> 1
However, if I attempt essentially the same code in Julia, the results are incorrect:
julia> A = [.1;.2;.3;.4;.11;.13;.14;.01;.04;.09];
julia> B = spzeros(10,1)
10x1 sparse matrix with 0 Float64 entries:
julia> C = A .< .12
10-element BitArray{1}:
true
false
false
false
true
false
false
true
true
true
julia> B[C] = 1
1
julia> B
10x1 sparse matrix with 5 Float64 entries:
[0 , 1] = 1.0
[0 , 1] = 1.0
[1 , 1] = 1.0
[1 , 1] = 1.0
[1 , 1] = 1.0
Have I made a mistake in the syntax somewhere, am I misunderstanding something, or is this a bug? Note, I get the correct results if I use full matrices in Julia, but since the matrix in my application is really sparse (essential boundary conditions in a finite element simulation), I would much prefer to use sparse matrices.
It looks as if sparse has some problems with BitArrays.
julia> VERSION
v"0.3.5"
julia> A = [.1;.2;.3;.4;.11;.13;.14;.01;.04;.09]
julia> B = spzeros(10,1)
julia> C = A .< .12
julia> B[C] = 1
julia> B
10x1 sparse matrix with 5 Float64 entries:
[0 , 1] = 1.0
[0 , 1] = 1.0
[1 , 1] = 1.0
[1 , 1] = 1.0
[1 , 1] = 1.0
So I get the same thing as the questioner. However when I do things "my way"
julia> B = sparse(C)
ERROR: `sparse` has no method matching sparse(::BitArray{1})
julia> B = sparse(float(C))
10x1 sparse matrix with 5 Float64 entries:
[1 , 1] = 1.0
[5 , 1] = 1.0
[8 , 1] = 1.0
[9 , 1] = 1.0
[10, 1] = 1.0
So this works if you convert the BitArray to Float. I imagine that this workaround will get you going, but it does seem that sparse should work with BitArray.
Some Additional Thoughts (Edit)
As I thought further about this, it occurs to me that one reason why there is no BitArray method for sparse() is that it is not terribly useful to implement sparse storage for an already highly compact type. Considering B and C from above:
julia> sizeof(C)
8
julia> sizeof(B)
40
So for these data, the sparse version is much larger than the original. It's actually worse than this simple (perhaps simplistic) check shows at first glance. sizeof(::BitArray{1}) appears to be the size of the entire array, but sizeof(::SparseMatrixCSC{}) shows the size of each element stored. So the real size disparity is something like 8 versus 200 bytes.
Of course if the data is sparse enough (somewhat less than 1% true), sparse storage begins to win out, despite its high overhead.
julia> C = rand(10^6) .< 0.01
julia> B = sparse(float(C))
julia> sizeof(C)
125000
julia> sum(C)*sizeof(B)
394520
julia> C = rand(10^6) .< 0.001
julia> B = sparse(float(C))
julia> sizeof(C)
125000
julia> sum(C)*sizeof(B)
40280
So perhaps it is not an oversight that sparse() has no BitArray method. Cases where it would represent a significant space saving may be less common than one might think at first glance.
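For reference, both the logical assignment and sparse(::BitArray) work on modern Julia (1.x, with the SparseArrays stdlib), so the workaround above is only needed on 0.3-era versions. A sketch using a sparse vector rather than a 10×1 matrix, for simplicity:

```julia
using SparseArrays

A = [.1, .2, .3, .4, .11, .13, .14, .01, .04, .09]
C = A .< .12                  # BitVector
B = spzeros(10)               # SparseVector{Float64}
B[C] .= 1.0                   # logical indexing now assigns to the right rows

findall(!iszero, B)           # [1, 5, 8, 9, 10]
sparse(C) == C                # sparse now accepts BitArrays directly, too
```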
