Union of Integer, AbstractVector{Integer}, and String - julia

I have a function in which the user passes an argument to select which columns of a matrix should be processed as in the minimalistic example below:
function foo{P<:Real, T<:Integer}(;x::AbstractMatrix{P}=zeros(3, 10), colN::Union(T, AbstractVector{T})=1)
x[:,colN] = x[:,colN]+1
return x
end
I want a way for the user to specify that all columns should be processed. In the end I changed the function so that this would be the default behavior:
function foo{P<:Real, T<:Integer}(;x::AbstractMatrix{P}=zeros(3, 10), colN::Union(T, AbstractVector{T})=[1:size(x)[2]])
x[:,colN] = x[:,colN]+1
return x
end
Originally, however, I wanted to allow the colN argument to take a String, so that the user could pass it the value "all" to mean that all columns should be processed, but the following doesn't work as I expected:
function foo{P<:Real, T<:Integer}(;x::AbstractMatrix{P}=zeros(3, 10), colN::Union(T, AbstractVector{T}, String)="all")
if colN == "all"
colN = [1:size(x)[2]]
end
x[:,colN] = x[:,colN]+1
return x
end
Calling this last version of the function gives:
foo(colN="all")
ERROR: `__foo#8__` has no method matching __foo#8__(::Array{Float64,2}, ::ASCIIString)
Why doesn't such a union of an Integer, a vector of integers, and a string work?

The problem is that Julia can't deduce the type T when you pass a string, and is thus not able to resolve which method to call.
Consider the following two functions, f and g:
function f{T<:Integer}(x::Union(T,String))
x
end
function g{T<:Integer}(y::T, x::Union(T,String))
x
end
In this case, you will observe the following behavior:
f(1) is 1 because the value of T can be deduced
f("hello") gives an error because the value of T is unknown
g(1, "hello") is "hello" because the value of T can be deduced
That being said, I think that it would be more idiomatic Julia to use multiple dispatch instead of Union types to achieve what you want to do.
Update. Seeing as your colN is either a string or a list of indexes, I believe you would be fine with T = Int (or Int64 if you want to address a lot of memory). Compare the following function h to f and g above:
function h(x::Union(Int,String))
x
end
In this case, both h(1) and h("hello") work as expected (and e.g. h(1.0) raises an error).
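For readers on Julia 1.x, where `Union(...)` is written `Union{...}` and the `foo{T}(...)` form has been replaced by `where` clauses and `<:` shorthands, the same idea can be sketched as below. This is a minimal sketch rather than the asker's exact function: fixing the integer type to `Int` and rejecting any string other than "all" are assumptions made for illustration.

```julia
# Sketch: the integer type is fixed to Int, so no type parameter has to be deduced.
function foo(; x::AbstractMatrix{<:Real} = zeros(3, 10),
               colN::Union{Int, AbstractVector{Int}, AbstractString} = "all")
    if colN isa AbstractString
        # Only the sentinel "all" is accepted (an assumption of this sketch).
        colN == "all" || throw(ArgumentError("colN must be \"all\", an Int, or a vector of Int"))
        cols = 1:size(x, 2)
    else
        cols = colN
    end
    x[:, cols] .+= 1   # add 1 in place to the selected columns
    return x
end

foo()                 # all columns
foo(colN = 2)         # a single column
foo(colN = [1, 4])    # several columns
```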

There is no need to pass a string to this function to indicate that all columns should be handled.
Also, it looks like the matrix should be passed in as a positional argument instead of a keyword argument, so that Julia can specialize on it.
There are many ways to handle this more efficiently in Julia; one is to define two methods:
function foo{P<:Real, T<:Integer}(x::AbstractMatrix{P}, colN::Union(T, AbstractVector{T}))
x[:,colN] += 1
return x
end
function foo{P<:Real}(x::AbstractMatrix{P})
x[:,[1:size(x)[2]]] += 1
return x
end
so foo(zeros(3,10)) will give you:
3x10 Array{Float64,2}:
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
and foo(zeros(3,10),5) will give you:
3x10 Array{Float64,2}:
0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0
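The two methods above use pre-1.0 syntax. A rough equivalent for current Julia might look like the sketch below. The name `foo!` and the extra `Colon` option (so that `:` can also mean "all columns") are assumptions added here, not part of the original answer.

```julia
# Modern-syntax sketch; `foo!` is a hypothetical name (the `!` signals that
# the matrix is modified in place).
function foo!(x::AbstractMatrix{<:Real},
              cols::Union{Integer, AbstractVector{<:Integer}, Colon})
    x[:, cols] .+= 1   # in-place broadcast over the selected columns
    return x
end

# No column argument: process every column.
foo!(x::AbstractMatrix{<:Real}) = foo!(x, :)

foo!(zeros(3, 10))        # every column incremented
foo!(zeros(3, 10), 5)     # only column 5 incremented
foo!(zeros(3, 10), 2:4)   # columns 2 to 4 incremented
```

Using `:` as the "all columns" marker avoids the string sentinel entirely, since it is already what indexing expects.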

Related

TypeError in Julia

I create a new struct called HousingData, and also define function such as iterate and length. However, when I use the function collect for my HousingData object, I run into the following error.
TypeError: in typeassert, expected Integer, got a value of type Float64
import Base: length, size, iterate
struct HousingData
x
y
batchsize::Int
shuffle::Bool
num_instances::Int
function HousingData(
x, y; batchsize::Int=100, shuffle::Bool=false, dtype::Type=Array{Float64})
new(convert(dtype,x),convert(dtype,y),batchsize,shuffle,size(y)[end])
end
end
function length(d::HousingData)
return ceil(d.num_instances/d.batchsize)
end
function iterate(d::HousingData, state=ifelse(
d.shuffle, randperm(d.num_instances), collect(1:d.num_instances)))
if(length(state)==0)
return nothing
end
return ((d.x[:,state[1]],d.y[:,state[1]]),state[2:end])
end
x1 = randn(5, 100); y1 = rand(1, 100);
obj = HousingData(x1,y1; batchsize=20)
collect(obj)
There are multiple problems in your code. The first one is related to length not returning an integer, but rather a float. This is explained by the behavior of ceil:
julia> ceil(3.8)
4.0 # Notice: 4.0 (Float64) and not 4 (Int)
You can easily fix this:
function length(d::HousingData)
return Int(ceil(d.num_instances/d.batchsize))
end
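As a small aside, Julia also offers ways to get an integer ceiling without an explicit conversion afterwards. The sketch below compares them, using the numbers from the question's example (100 instances, batch size 20); the variable names are only illustrative.

```julia
num_instances, batchsize = 100, 20

a = ceil(num_instances / batchsize)        # Float64, the cause of the TypeError
b = ceil(Int, num_instances / batchsize)   # Int: rounds and converts in one call
c = cld(num_instances, batchsize)          # Int: integer "ceiling division"
```

`cld` stays in integer arithmetic throughout, which avoids the floating-point division altogether.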
Another problem lies in the logic of your iteration function, which is not consistent with the advertised length. To take a smaller example than yours:
julia> x1 = [i+j/10 for i in 1:2, j in 1:6]
2×6 Array{Float64,2}:
1.1 1.2 1.3 1.4 1.5 1.6
2.1 2.2 2.3 2.4 2.5 2.6
# As an aside, unless you really want to work with 1xN matrices
# it is more idiomatic in Julia to use 1D Vectors in such situations
julia> y1 = [Float64(j) for i in 1:1, j in 1:6]
1×6 Array{Float64,2}:
1.0 2.0 3.0 4.0 5.0 6.0
julia> obj = HousingData(x1,y1; batchsize=3)
HousingData([1.1 1.2 … 1.5 1.6; 2.1 2.2 … 2.5 2.6], [1.0 2.0 … 5.0 6.0], 3, false, 6)
julia> length(obj)
2
julia> for (i, e) in enumerate(obj)
println("$i -> $e")
end
1 -> ([1.1, 2.1], [1.0])
2 -> ([1.2, 2.2], [2.0])
3 -> ([1.3, 2.3], [3.0])
4 -> ([1.4, 2.4], [4.0])
5 -> ([1.5, 2.5], [5.0])
6 -> ([1.6, 2.6], [6.0])
The iterator produces 6 elements, whereas the length of this object is only 2. This explains why collect errors out:
julia> collect(obj)
ERROR: ArgumentError: destination has fewer elements than required
Knowing your code, you're probably the best person to fix its logic.
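One possible way to make the iteration agree with `length` is sketched below, under the assumption that each iteration step is meant to yield one whole batch of columns. The type name `BatchedData` is hypothetical, and shuffling is left out to keep the sketch short.

```julia
# Hypothetical, simplified stand-in for HousingData (no shuffling).
struct BatchedData
    x::Matrix{Float64}
    y::Matrix{Float64}
    batchsize::Int
end

# Number of batches, the last one possibly smaller than batchsize.
Base.length(d::BatchedData) = cld(size(d.y, 2), d.batchsize)

# The state is the index of the first column of the next batch.
function Base.iterate(d::BatchedData, start::Int = 1)
    n = size(d.y, 2)
    start > n && return nothing
    stop = min(start + d.batchsize - 1, n)
    return ((d.x[:, start:stop], d.y[:, start:stop]), stop + 1)
end

x1 = [i + j / 10 for i in 1:2, j in 1:6]
y1 = [Float64(j) for i in 1:1, j in 1:6]
batches = collect(BatchedData(x1, y1, 3))
```

Here the iterator produces exactly `length(d)` items, so `collect` has a destination of the right size.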

Julia - absolute value of an array

I want to get the absolute value of the following array:
x = [1.1 -22.3 3.01 -1]
i.e.: I want an output of the type: x2 = [1.1 22.3 3.01 1]
However when I type:
abs(x)
I get an error:
ERROR: MethodError: no method matching abs(::Array{Float64,2})
Closest candidates are:
abs(::Pkg.Resolve.MaxSum.FieldValues.FieldValue) at /Users/vagrant/worker/juliapro-release-osx1011-0_6/build/tmp_julia/Julia-1.0.app/Contents/Resources/julia/share/julia/stdlib/v1.0/Pkg/src/resolve/FieldValues.jl:67
abs(::Pkg.Resolve.VersionWeights.VersionWeight) at /Users/vagrant/worker/juliapro-release-osx1011-0_6/build/tmp_julia/Julia-1.0.app/Contents/Resources/julia/share/julia/stdlib/v1.0/Pkg/src/resolve/VersionWeights.jl:40
abs(::Missing) at missing.jl:79
...
Stacktrace:
[1] top-level scope at none:0
Julia does not automatically apply scalar functions, like abs, to elements of an array. You should instead tell Julia this is what you want, and broadcast the scalar function abs over your array, see https://docs.julialang.org/en/v1/manual/arrays/#Broadcasting-1. This can be done as
julia> x = [1.1, -22.3, 3.01, -1];
julia> broadcast(abs, x)
4-element Array{Float64,1}:
1.1
22.3
3.01
1.0
or you can use "dot-notation", which is more idiomatic:
julia> abs.(x)
4-element Array{Float64,1}:
1.1
22.3
3.01
1.0
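The same approach applies to the 1×4 matrix implied by the error message in the question (a row written without commas). The sketch below also shows `map` and an in-place variant; the variable names `y` and `z` are only illustrative.

```julia
x = [1.1 -22.3 3.01 -1]   # 1×4 matrix: elements separated by spaces only

y = abs.(x)               # new 1×4 matrix of absolute values
z = map(abs, x)           # same result via map
x .= abs.(x)              # overwrite x in place, without allocating a new array
```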

Using linspace in Julia 0.7

I am confused about using linspace in Julia 0.7. Here is what I entered in the REPL and the result:
julia> a = linspace(0.1,1.1,6)
┌ Warning: `linspace(start, stop, length::Integer)` is deprecated, use `range(start, stop=stop, length=length)` instead.
│ caller = top-level scope
└ # Core :0
0.1:0.2:1.1
My question is about the deprecated warning and the suggested use of range. The range statement doesn't do the same thing as the linspace command.
If you enter the a = linspace(0.1,1.1,6) and collect(a), you get the following:
julia> collect(a)
6-element Array{Float64,1}:
0.1
0.3
0.5
0.7
0.9
1.1
If you enter b = range(0.1,1.1,6) and collect(b), you get:
julia> collect(b)
6-element Array{Float64,1}:
0.1
1.2
2.3
3.4
4.5
5.6
This is obviously not the same.
Why is linspace deprecated (perhaps a different question) and a non-equivalent range command suggested?
My actual question is: Is it safe to keep using linspace for the desired results it provides, and, if not, what should I be using instead?
You should use LinRange, as documented here.
A range with len linearly spaced elements between its start and stop. The size of the spacing is controlled by len, which must be an Int.
julia> LinRange(1.5, 5.5, 9)
9-element LinRange{Float64}:
1.5,2.0,2.5,3.0,3.5,4.0,4.5,5.0,5.5
Edit 2021: As of version 1.7 you can use the range function for this:
julia> range(1.5, 5.5, 9)
1.5:0.5:5.5
For version 1.6 you have to write: range(1.5, 5.5, length=9).
Following the deprecations, it is now:
julia> range(0.1, stop = 1.1, length = 6) |> collect
6-element Array{Float64,1}:
0.1
0.3
0.5
0.7
0.9
1.1
In your example, the second argument is a step, not the stop. Notice that this method is also deprecated; you have to use keyword arguments now:
julia> @which range(0.1, 1.1, 6)
range(start, step, length) in Base at deprecated.jl:53
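To make the difference concrete, here is a short sketch that works with keyword arguments on Julia 1.0 and later. The variable names are illustrative; the third line reproduces what the deprecated positional call in the question actually computed.

```julia
a = range(0.1, stop = 1.1, length = 6)   # replacement for linspace(0.1, 1.1, 6)
b = LinRange(0.1, 1.1, 6)                # same endpoints and element count
c = range(0.1, step = 1.1, length = 6)   # start, step, length: the old positional meaning

collect(a)   # 0.1, 0.3, ..., 1.1
collect(c)   # 0.1, 1.2, ..., 5.6
```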

Diagonalizing sparse unitary matrix

I have to gather the eigenvalues of a sparse unitary matrix.
Basically there is just an element different from zero in each
row and column (it's the transfer matrix of some Markovian process).
My question here is how to proceed, what would be the best choice
among all the suite of functions. I have seen that eigs could help,
but I also saw that one has to choose the initial vector.
The following code eventually defines pdeig which returns the eigenvalues of a matrix which is a pdmatrix i.e. a product of a permutation and diagonal matrix, or in other words a matrix like the question describes. Calculating the eigenvectors quickly is also possible (they have an explicit formula):
issquare(m) = all(x->x==size(m,1),size(m))
isunique(v) = v == unique(v)
permmatrix(sigma) =
[i==sigma[j] ? 1.0 : 0.0 for i=1:length(sigma),j=1:length(sigma)]
mat2perm(m) = [findfirst(m[:,i]) for i=1:size(m,1)]
function ispdmatrix(m) # used to verify input matrix form
(r,c,v) = findnz(m)
return issquare(m) && isunique(r) && isunique(c)
end
function pdfact(m::Matrix) # factor into permutation/dilation
ispdmatrix(m) || error("input matrix must be a PD matrix")
n = size(m,1)
p = mat2perm(m)
d = [p[i]>0 ? m[p[i],i] : zero(eltype(m)) for i=1:n]
return (p,d)
end
# return eigenvalues from factored pdmatrix
function pdeig(p::Vector{Int},d::Vector)
n = length(p)
active = trues(n)
eigv = Vector{Complex{eltype(d)}}(0)
for i=1:n
if !active[i]
continue
end
if p[i]>0
j=1
cump = d[i]
k=p[i]
active[i]=false
while active[k] > 0
j+=1
cump *= d[k]
active[k] = false
k=p[k]
end
append!(eigv,[cump^(1.0/j)*exp(2*im*π*m/j) for m=1:j])
else
push!(eigv,0.0 + 0.0im)
end
end
return eigv
end
pdeig(m::Matrix) = pdeig(pdfact(m)...)
n = 4 # testing vector to matrix transformation of permutations
σ=randperm(n)
@assert mat2perm(permmatrix(σ))==σ
For example, the following:
m = [ 0.0 1.0 0.0 ; 2.0 0.0 0.0 ; 0.0 0.0 0.0 ]
pdeig(m)
Outputs:
3-element Array{Complex{Float64},1}:
-1.41421+1.73191e-16im
1.41421-3.46382e-16im
0.0+0.0im
Since these matrices are diagonalizable, the eigenvalues should provide the diagonal matrix (just use diagm on them).
These matrices are very structured, and a proper Julia treatment would define a type for these matrices and then define the various linear algebra functions to dispatch on this type.
In case of errors, just add a comment, and I will try to fix them (or if I happen to see a nice refactoring then I'll edit).
BTW, the calculations introduce small numerical errors. These should not be a problem and can be eliminated with proper rounding (so there is no need to be scared of -1.0 showing up as -1.0+1.234234e-16im).
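The code above targets an older Julia release. For the case in the question (exactly one nonzero per row and column), the cycle idea can be sketched in Julia 1.x syntax as follows. This is an illustrative rewrite, not the answer's original code: the function name `cycle_eigvals`, its permutation-plus-weights interface, and the test data are all assumptions. Each cycle of length `len` with weight product `w` contributes the `len` complex roots of `w`.

```julia
using LinearAlgebra

# Eigenvalues of the n×n matrix M with M[p[i], i] = d[i] and zeros elsewhere,
# where p is a permutation of 1:n (exactly one nonzero per row and column).
function cycle_eigvals(p::AbstractVector{<:Integer}, d::AbstractVector{<:Number})
    isperm(p) || throw(ArgumentError("p must be a permutation of 1:n"))
    length(d) == length(p) || throw(DimensionMismatch("p and d must have equal length"))
    visited = falses(length(p))
    vals = ComplexF64[]
    for i in eachindex(p)
        visited[i] && continue
        # Walk the cycle containing i, tracking its length and weight product.
        len = 0
        w = complex(1.0)
        k = i
        while !visited[k]
            visited[k] = true
            len += 1
            w *= d[k]
            k = p[k]
        end
        root = w^(1 / len)   # principal len-th root of the product
        append!(vals, [root * exp(2im * π * m / len) for m in 1:len])
    end
    return vals
end

# Hypothetical test data: one 3-cycle and one 2-cycle.
p = [2, 3, 1, 5, 4]
d = ComplexF64[1, im, -1, 2, 0.5]
M = zeros(ComplexF64, 5, 5)
for i in eachindex(p)
    M[p[i], i] = d[i]
end
λ = cycle_eigvals(p, d)
reference = eigvals(M)   # dense solver, used only for comparison
```

The two results should agree up to ordering and rounding, while the cycle-based version never has to form or factor the full matrix.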

Parse a multi-line string into an array

I'm trying to get my head around Julia, coming from Python. Currently working through some Project Euler problems I've solved using Python in Julia to get a better feeling for the language. One thing that I do a lot (in Project Euler and in real life) is to parse a big multiline data object into an array. For example, if I have the data
data = """1 2 3 4
5 6 7 8
9 0 1 2"""
In python I might do
def parse(input):
output = []
for line in input.splitlines():
output.append(map(int,line.split()))
return np.array(output)
Here's what I have so far in Julia:
function parse(input)
nrow = ncol = 0
# Count first
for row=split(input,'\n')
nrow += 1
ncol = max(ncol,length(split(row)))
end
output = zeros(Int64,(nrow,ncol))
for (i,row) in enumerate(split(input,'\n'))
for (j,word) in enumerate(split(row))
output[i,j] = int(word)
end
end
return output
end
What's the Julia version of "pythonic" called? Whatever it is, I don't think I'm doing it. I'm pretty sure there's a way to (1) not have to pass through the data twice, (2) not have to be so specific about allocating the array. I've tried hcat/vcat a little, without luck.
I'd welcome suggestions for solving this. I'd also be interested in references to proper Julia style (julia-onic?), and general language usage practices. Thanks!
readdlm is really useful here. See the docs for all the options, but here's an example.
julia> data="1 2 3 4
5 6 7 8
9 0 1 2"
"1 2 3 4\n5 6 7 8\n9 0 1 2"
julia> readdlm(IOBuffer(data))
3x4 Array{Float64,2}:
1.0 2.0 3.0 4.0
5.0 6.0 7.0 8.0
9.0 0.0 1.0 2.0
julia> readdlm(IOBuffer(data),Int)
3x4 Array{Int32,2}:
1 2 3 4
5 6 7 8
9 0 1 2
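On Julia 0.7 and later, `readdlm` lives in the `DelimitedFiles` standard library, so it has to be loaded first. The sketch below shows that, together with a Base-only alternative that parses each line and stacks the rows; the variable names `A`, `rows` and `B` are only illustrative.

```julia
using DelimitedFiles   # standard library that provides readdlm

data = """1 2 3 4
5 6 7 8
9 0 1 2"""

# Let readdlm do the splitting and parsing.
A = readdlm(IOBuffer(data), Int)

# Base-only alternative: parse each line, then stack the rows.
rows = [parse.(Int, split(line)) for line in eachline(IOBuffer(data))]
B = permutedims(reduce(hcat, rows))   # hcat gives columns, so transpose to rows
```

Both versions read the data in a single pass and do not need the array dimensions in advance.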
