Julia - creating matrices of Union{Nothing,String} vs Union{Nothing,Bool} - julia

In a program I have, I want to initialise a bunch of matrices with nothing and then, if some condition is met, change individual elements to a value of type Bool or String.
This works fine when I initialise with
Array{Union{Nothing,Bool},2}(undef,5,5)
which yields something looking like
5×5 Matrix{Union{Nothing, Bool}}:
nothing nothing nothing nothing nothing
nothing nothing nothing nothing nothing
nothing nothing nothing nothing nothing
nothing nothing nothing nothing nothing
nothing nothing nothing nothing nothing
But not when I initialise with
Array{Union{Nothing,String},2}(undef,5,5)
which gives me
5×5 Matrix{Union{Nothing, String}}:
#undef #undef #undef #undef #undef
#undef #undef #undef #undef #undef
#undef #undef #undef #undef #undef
#undef #undef #undef #undef #undef
#undef #undef #undef #undef #undef
Now I can change values in that second array to Strings so that I get
5×5 Matrix{Union{Nothing, String}}:
#undef #undef #undef #undef #undef
#undef #undef #undef #undef #undef
#undef #undef #undef #undef "Look"
#undef #undef #undef #undef #undef
#undef #undef #undef #undef #undef
But when I have a second large array constructed as
Array{Union{Nothing, Bool, String}}(undef,10,5)
which looks like
10×5 Matrix{Union{Nothing, Bool, String}}:
#undef #undef #undef #undef #undef
#undef #undef #undef #undef #undef
#undef #undef #undef #undef #undef
#undef #undef #undef #undef #undef
#undef #undef #undef #undef #undef
#undef #undef #undef #undef #undef
#undef #undef #undef #undef #undef
#undef #undef #undef #undef #undef
#undef #undef #undef #undef #undef
#undef #undef #undef #undef #undef
I can assign the first 5 rows quite happily to the first array constructed out of Nothing and Bool, but I can't assign to the second matrix constructed out of #undef/nothing and String. Instead, I get this error
UndefRefError: access to undefined reference
Initially I thought it was something to do with converting from type Nothing or #undef to String but I seem to be able to do this when I assign individual strings up above.
Any ideas?

First let me start with a recommendation, what to do.
It is best to create your matrices in general in the following way:
julia> Array{Union{Nothing,String},2}(nothing,5,5)
5×5 Matrix{Union{Nothing, String}}:
nothing nothing nothing nothing nothing
nothing nothing nothing nothing nothing
nothing nothing nothing nothing nothing
nothing nothing nothing nothing nothing
nothing nothing nothing nothing nothing
In this way you ensure that they are properly initialized.
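As a rough sketch of how this plays out for the two matrices from the question (the names small and big are made up for illustration), initialising both with nothing makes the block assignment safe, because every cell can be read:

```julia
# Both matrices start out filled with `nothing`, so no cell is #undef.
small = Array{Union{Nothing,String},2}(nothing, 5, 5)
big   = Array{Union{Nothing,Bool,String},2}(nothing, 10, 5)

# Writing a single element works as before.
small[3, 5] = "Look"

# Copying a block has to read every cell of `small`;
# this is only safe because all of them hold a value.
big[1:5, :] = small

println(big[3, 5])            # prints Look
println(isnothing(big[1, 1])) # prints true
```

With undef in place of nothing, the block assignment is the step that would hit the unassigned cells and throw UndefRefError.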
Now to explain what you observe. #undef as opposed to nothing is not a value. It means that the given cell in a matrix is not connected to any value. You cannot read from such a cell. You must first write to it before you can read it:
julia> x = Vector{String}(undef, 3)
3-element Vector{String}:
#undef
#undef
#undef
julia> x[1]
ERROR: UndefRefError: access to undefined reference
Stacktrace:
[1] getindex(A::Vector{String}, i1::Int64)
@ Base .\array.jl:801
[2] top-level scope
@ REPL[6]:1
julia> x[1] = "a"
"a"
julia> x
3-element Vector{String}:
"a"
#undef
#undef
julia> x[1]
"a"
You might ask why for the case of Union{Bool, Nothing} element type you get a value in an array, while in the case of Union{String, Nothing} you do get #undef, i.e. no value that you can read.
The answer is that in Julia there are two kinds of types:
bits types, for which the isbitstype function returns true; such data are immutable and do not contain references to other values; an example is Bool
non-bits types, which are either mutable or contain references; an example is String (technically a string is represented as a reference to the location in memory where its contents are stored)
As you can see here:
julia> isbitstype(Bool)
true
julia> isbitstype(String)
false
Now - if your array is to store a bits type (or their union) then it stores it directly, so there is always some value (there is no guarantee what value it would be but you know you will get a value), e.g.:
julia> Matrix{Int}(undef, 5, 5)
5×5 Matrix{Int64}:
260255120 260384864 260235344 260254240 0
260254240 260235344 260235344 260255120 0
261849744 260235344 260235344 260235344 0
260465792 260465440 260464224 260235344 0
260235344 260235344 260235344 260235344 0
and as you can see we have some values stored in it, but it is undefined what the values would be.
On the other hand if your array is to store a non-bits type it actually stores references to the values. Which means that when you create an array of such values without initializing it you get #undef - which means that in this cell there is no reference to a valid value.
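A small sketch of that difference, applied to the two element types from the question (the variable names are only illustrative). It uses isassigned from Base to ask whether each cell holds something readable:

```julia
# Union of two bits types: elements are stored directly in the array.
bools = Array{Union{Nothing,Bool},2}(undef, 2, 2)

# Union involving String (a non-bits type): the array stores references.
strings = Array{Union{Nothing,String},2}(undef, 2, 2)

println(isbitstype(Bool))     # prints true
println(isbitstype(Nothing))  # prints true
println(isbitstype(String))   # prints false

# Every cell of `bools` can be read right after construction ...
bools_readable = all(i -> isassigned(bools, i), eachindex(bools))

# ... while no cell of `strings` can be read until it is written.
strings_readable = any(i -> isassigned(strings, i), eachindex(strings))

println(bools_readable)    # prints true
println(strings_readable)  # prints false
```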
Just to show you that it has nothing to do directly with strings, let me show you e.g. the String7 type (it is exported e.g. by CSV.jl), which is a fixed width string (maximum width of 7 bytes) that is a bits type. Observe the difference:
julia> using CSV
julia> isbitstype(String7)
true
julia> Matrix{String7}(undef, 5, 5)
5×5 Matrix{String7}:
"\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0" "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0" "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0" … "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"
"\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0" "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0" "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0" "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"
"\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0" "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0" "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0" "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"
"\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0" "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0" "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0" "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"
"\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0" "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0" "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0" "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"
julia> Matrix{String}(undef, 5, 5)
5×5 Matrix{String}:
#undef #undef #undef #undef #undef
#undef #undef #undef #undef #undef
#undef #undef #undef #undef #undef
#undef #undef #undef #undef #undef
#undef #undef #undef #undef #undef
In the first case you got a matrix of String7 values that is initialized, but the values it is initialized to are undefined (it is some garbage).
In the second case you got a matrix of String values, and since they are non-bits the matrix is uninitialized - it does not hold any values yet. You first have to assign some values before you will be able to read them.
Finally, there is an isassigned function that lets you check whether a container has a value associated with a given index. It returns false both when the cell is #undef and when the index is out of bounds. Here is an example:
julia> x = Vector{String}(undef, 3)
3-element Vector{String}:
#undef
#undef
#undef
julia> x[1] = "a"
"a"
julia> isassigned(x, 1) # we have a value here
true
julia> isassigned(x, 2) # no value
false
julia> isassigned(x, 3) # no value
false
julia> isassigned(x, 4) # out of bounds
false
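If you already have a matrix that was created with undef, one possible way to repair it is to walk over it and write nothing into every cell that isassigned reports as empty. This is only a sketch; the matrix m and the value "keep" are made up for the example:

```julia
# A matrix created with `undef`: every cell starts out as #undef.
m = Array{Union{Nothing,String},2}(undef, 3, 3)
m[2, 2] = "keep"   # one cell gets a real value

# Fill only the cells that have never been written to.
for i in eachindex(m)
    isassigned(m, i) || (m[i] = nothing)
end

println(m[2, 2])              # prints keep
println(count(isnothing, m))  # prints 8
```

After the loop every cell can be read, so the matrix can be copied or assigned elsewhere without an UndefRefError.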
If anything is unclear please comment and I can expand the answer.

Related

Create a Vector of Integers and missing Values

What a hassle...
I'm trying to create a vector of integers and missing values. This works fine:
b = [4, missing, missing, 3]
But I would actually like the vector to be longer with more missing values and therefore use repeat(), but this doesn't work
append!([1,2,3], repeat([missing], 1000))
and this also doesn't work
[1,2,3, repeat([missing], 1000)]
Please, help me out, here.
It is also worth noting that if you do not need an in-place operation with append!, it is much easier in such cases to do vertical concatenation:
julia> [[1, 2, 3]; repeat([missing], 2); 4; 5] # note ; that denotes vcat
7-element Array{Union{Missing, Int64},1}:
1
2
3
missing
missing
4
5
julia> vcat([1,2,3], repeat([missing], 2), 4, 5) # this is the same but using a different syntax
7-element Array{Union{Missing, Int64},1}:
1
2
3
missing
missing
4
5
The benefit of vcat is that it automatically does the type promotion (as opposed to append! in which case you have to correctly specify the eltype of the target container before the operation).
Note that because vcat does automatic type promotion in corner cases you might get a different eltype of the result of the operation:
julia> x = [1, 2, 3]
3-element Array{Int64,1}:
1
2
3
julia> append!(x, [1.0, 2.0]) # conversion from Float64 to Int happens here
5-element Array{Int64,1}:
1
2
3
1
2
julia> [[1, 2, 3]; [1.0, 2.0]] # promotion of Int to Float64 happens in this case
5-element Array{Float64,1}:
1.0
2.0
3.0
1.0
2.0
See also https://docs.julialang.org/en/v1/manual/arrays/#man-array-literals.
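To illustrate the other side of that corner case, here is a small sketch of what happens when the appended value cannot be converted exactly to the target eltype. The value 1.5 is just an example, and the result of the failed append! is deliberately not inspected:

```julia
# append! must convert each new element to the eltype of the target (Int here).
# 1.5 has no exact Int representation, so the conversion throws.
threw_inexact = try
    append!([1, 2, 3], [1.5])
    false
catch err
    err isa InexactError
end

# vcat promotes instead of converting, so the same data is accepted.
promoted = vcat([1, 2, 3], [1.5])

println(threw_inexact)     # prints true
println(eltype(promoted))  # prints Float64
```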
This will work:
append!(Union{Int,Missing}[1,2,3], repeat([missing], 1000))
[1,2,3] creates just a Vector{Int}, and since Julia is strongly typed a Vector{Int} can only store values that are Ints or can be converted to Int; missing cannot be. Hence, when you define a container that you plan to hold several types, you need to state this explicitly. Here we have defined a Vector{Union{Int,Missing}}.
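If the Vector{Int} already exists, one option (a sketch, with v and w as made-up names) is to convert it to the wider element type first and append to the converted copy:

```julia
v = [1, 2, 3]                            # a plain Vector{Int}

# Make a copy whose eltype also allows `missing`.
w = convert(Vector{Union{Int,Missing}}, v)

# fill(missing, 1000) builds the same vector as repeat([missing], 1000).
append!(w, fill(missing, 1000))

println(length(w))            # prints 1003
println(count(ismissing, w))  # prints 1000
```

The original v is left untouched and keeps its eltype Int.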

Julia Array with different sized vectors

When creating an array of vectors of various sizes (i.e. arrays), I get an error message.
julia> A = [[1,2] [1,2,3] [1,4] [1] [1,5,6,7]]
ERROR: DimensionMismatch("vectors must have same lengths")
Stacktrace:
[1] hcat(::Array{Int64,1}, ::Array{Int64,1}, ::Array{Int64,1}, ::Vararg{Array{Int64,1},N} where N) at .\array.jl:1524
[2] top-level scope at none:0
However, if I initialize an array and then assign the vectors, it's okay...
julia> A = Array{Any}(undef,5)
5-element Array{Any,1}:
#undef
#undef
#undef
#undef
#undef
julia> A[1] = [1,2] # and likewise for the other elements
2-element Array{Int64,1}:
1
2
julia> A
5-element Array{Any,1}:
[1, 2]
[1, 2, 3]
[1]
[1, 5]
[1, 2, 6, 4, 5]
Is there a way to initialize the array with the variously sized arrays, or is Julia configured this way to prevent errors?
The space-separated syntax you're using for the outermost array is specifically for horizontal concatenation of matrices, so your code is trying to concatenate all of these vectors into a matrix, which doesn't work since they have different lengths. Use commas in the outer array like the inner one to get an array of arrays:
julia> A = [[1,2], [1,2,3], [1,4], [1], [1,5,6,7]]
5-element Array{Array{Int64,1},1}:
[1, 2]
[1, 2, 3]
[1, 4]
[1]
[1, 5, 6, 7]
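When the inner vectors are computed rather than written out, a comprehension or a typed, preallocated outer vector gives the same kind of array of arrays. The sketch below uses a made-up lengths vector to stand in for whatever determines the sizes:

```julia
lengths = [2, 3, 2, 1, 4]   # hypothetical sizes of the inner vectors

# Option 1: a comprehension infers the element type Vector{Int}.
A = [collect(1:n) for n in lengths]

# Option 2: preallocate with a concrete element type instead of Any,
# then assign every slot before reading it.
B = Vector{Vector{Int}}(undef, length(lengths))
for (i, n) in enumerate(lengths)
    B[i] = collect(1:n)
end

println(A == B)      # prints true
println(eltype(A))   # prints Vector{Int64} on a 64-bit machine
```

Compared with Array{Any}(undef, 5), the concrete element type lets Julia know that every entry is a vector of integers.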

How can I have a cell array in Julia?

Does a cell array exist in Julia? I want an array whose elements are vectors or matrices,
for example A={1,[2 3],[5 6;7 8];"salam", [1 2 3 4],magic(5)}.
Any help would be appreciated.
An Array{Any} is the equivalent of a MATLAB cell array: you can put anything in it, e.g. ["hi",:bye,10]. a = Array{Any}(undef,5) builds an uninitialized one; you can then use a[1] = ... to set values, push!(a,...) to increase its length, etc.
A cell array is a data type with indexed data containers called cells, where each cell can contain any type of data.
In Julia, arrays can contain values of homogeneous ([1, 2, 3]) or heterogeneous types ([1, 2.5, "3"]). By default Julia tries to promote the values to a common concrete type. If it cannot, the resulting array has the abstract element type Any.
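As a sketch of how the 2×3 cell array from the question could be written, the cells can be listed in a comma-separated Any[...] vector (commas do not concatenate) and then reshaped. Julia's Base has no magic function, so ones(Int, 5, 5) is used here purely as a stand-in for magic(5):

```julia
# Cells listed column by column, because reshape fills in column-major order.
cells = Any[
    1,            "salam",       # column 1
    [2 3],        [1 2 3 4],     # column 2
    [5 6; 7 8],   ones(Int, 5, 5),  # column 3 (placeholder for magic(5))
]
A = reshape(cells, 2, 3)

println(size(A))       # prints (2, 3)
println(A[2, 1])       # prints salam
println(A[1, 3][2, 1]) # prints 7
```

Writing Any[1 [2 3]; ...] with spaces would instead try to concatenate the inner arrays into one matrix, which is why the comma form is used.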
Example ported from Access Data in Cell Array, using Julia 1.0.3:
julia> C = ["one" "two" "three"; # Matrix literal
1 2 3 ]
2×3 Array{Any,2}:
"one" "two" "three"
1 2 3
julia> upperLeft = C[1:2,1:2] # slicing
2×2 Array{Any,2}:
"one" "two"
1 2
julia> C[1,1:3] = ["first","second","third"] # slice assignment
3-element Array{String,1}:
"first"
"second"
"third"
julia> C
2×3 Array{Any,2}:
"first" "second" "third"
1 2 3
julia> numericCells = C[2,1:3]
3-element Array{Any,1}:
1
2
3
julia> last = C[2,3] # indexing
3
julia> C[2,3] = 300 # indexing assignment
300
julia> C
2×3 Array{Any,2}:
"first" "second" "third"
1 2 300
julia> r1c1, r2c1, r1c2, r2c2 = C[1:2,1:2] # destructuring
2×2 Array{Any,2}:
"first" "second"
1 2
julia> r1c1
"first"
julia> r2c1
1
julia> r1c2
"second"
julia> r2c2
2
julia> nums = C[2,:]
3-element Array{Any,1}:
1
2
300
Example ported from Combining Cell Arrays with Non-Cell Arrays:
Notice the use of the splat operator (...) to incorporate the values of the inner array into the outer one, and the use of the Any[] syntax to prevent Julia from promoting the UInt8 to an Int.
julia> A = [100, Any[UInt8(200), 300]..., "Julia"]
4-element Array{Any,1}:
100
0xc8
300
"Julia"
The .( broadcast syntax applies the function typeof element-wise.
julia> typeof.(A)
4-element Array{DataType,1}:
Int64
UInt8
Int64
String
So, in summary, Julia doesn't need cell arrays; it uses parametric n-dimensional arrays instead. Also, Julia uses only brackets for both slicing and indexing (A[n], A[i, j], A[a:b, x:y]); parentheses after a variable symbol are reserved for function calls (foo(), foo(args...), foo(bar = "baz")).

Vectorized splatting

I'd like to be able to splat an array of tuples into a function in a vectorized fashion. For example, if I have the following function,
function foo(x, y)
x + y
end
and the following array of tuples,
args_array = [(1, 2), (3, 4), (5, 6)]
then I could use a list comprehension to obtain the desired result:
julia> [foo(args...) for args in args_array]
3-element Array{Int64,1}:
3
7
11
However, I would like to be able to use the dot vectorization notation for this operation:
julia> foo.(args_array...)
ERROR: MethodError: no method matching foo(::Int64, ::Int64, ::Int64)
But as you can see, that particular syntax doesn't work. Is there a vectorized way to do this?
foo.(args_array...) doesn't work because it's doing:
foo.((1, 2), (3, 4), (5, 6))
# which is roughly equivalent to
[foo(1,3,5), foo(2,4,6)]
In other words, it's taking each element of args_array as a separate argument and then broadcasting foo over those arguments. You want to broadcast foo over the elements directly. The trouble is that running:
foo.(args_array)
# is roughly equivalent to:
[foo((1,2)), foo((3,4)), foo((5,6))]
In other words, the broadcast syntax is just passing each tuple as a single argument to foo. We can fix that with a simple intermediate function:
julia> bar(args) = foo(args...);
julia> bar.(args_array)
3-element Array{Int64,1}:
3
7
11
Now that's doing what you want! You don't even need to define the named intermediate function if you don't want to. This is exactly equivalent:
julia> (args->foo(args...)).(args_array)
3-element Array{Int64,1}:
3
7
11
And in fact you can generalize this quite easily:
julia> splat(f) = args -> f(args...);
julia> (splat(foo)).(args_array)
3-element Array{Int64,1}:
3
7
11
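A hedged aside: as far as I know, recent Julia versions ship a helper of this kind as Base.splat, so the wrapper may not need to be defined by hand. If your version lacks it, the one-line definition above does the same job. The sketch also shows a variant that avoids splatting altogether by broadcasting over the first and last elements of each tuple:

```julia
foo(x, y) = x + y
args_array = [(1, 2), (3, 4), (5, 6)]

# Assumes Base.splat is available in the running Julia version.
via_splat = Base.splat(foo).(args_array)

# Alternative without any splatting: broadcast over the tuple components.
via_components = foo.(first.(args_array), last.(args_array))

println(via_splat)       # prints [3, 7, 11]
println(via_components)  # prints [3, 7, 11]
```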
You could zip the args_array, which effectively transposes the array of tuples:
julia> collect(zip(args_array...))
2-element Array{Tuple{Int64,Int64,Int64},1}:
(1, 3, 5)
(2, 4, 6)
Then you can broadcast foo over the transposed array (actually an iterator) of tuples:
julia> foo.(zip(args_array...)...)
(3, 7, 11)
However, this returns a tuple instead of an array. If you need the return value to be an array, you could use any of the following somewhat cryptic solutions:
julia> foo.(collect.(zip(args_array...))...)
3-element Array{Int64,1}:
3
7
11
julia> collect(foo.(zip(args_array...)...))
3-element Array{Int64,1}:
3
7
11
julia> [foo.(zip(args_array...)...)...]
3-element Array{Int64,1}:
3
7
11
How about
[foo(x,y) for (x,y) in args_array]

Circular permutations

Given a vector z = [1, 2, 3], I want to create a vector of vectors with all circular permutations of z (i.e. zp = [[1,2,3], [3,1,2], [2,3,1]]).
I can print all elements of zp with
for i in 1:length(z)
    push!(z, popfirst!(z)) |> println
end
How can I store the resulting permutations? Note that
zp = Any[]
for i in 1:length(z)
    push!(z, popfirst!(z))
    push!(zp, z)
end
doesn't work as it stores the same vector z 3 times in zp.
One way would just be to copy the vector before pushing it:
z = [1, 2, 3];
zp = Any[];
for i in 1:length(z)
    push!(z, popfirst!(z))   # rotate z left by one, in place
    push!(zp, copy(z))       # store a snapshot, not z itself
end
gives me
julia> zp
3-element Array{Any,1}:
[2,3,1]
[3,1,2]
[1,2,3]
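To make the difference visible, here is a small sketch (aliased and copied are made-up names) that runs both variants side by side. Without copy, every stored entry is the very same object as z, so all entries show whatever z currently holds:

```julia
z = [1, 2, 3]
aliased = Any[]   # will hold references to z itself
copied  = Any[]   # will hold independent snapshots

for _ in 1:length(z)
    push!(z, popfirst!(z))   # rotate z left by one, in place
    push!(aliased, z)        # stores the same object each time
    push!(copied, copy(z))   # stores the current contents
end

println(all(v -> v === z, aliased))  # prints true
println(aliased)                     # prints Any[[1, 2, 3], [1, 2, 3], [1, 2, 3]]
println(copied)                      # prints Any[[2, 3, 1], [3, 1, 2], [1, 2, 3]]
```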
But I tend to prefer avoiding mutating operations when I can. So I'd instead write this as
julia> zp = [circshift(z, i) for i=1:length(z)]
3-element Array{Array{Int64,1},1}:
[3,1,2]
[2,3,1]
[1,2,3]
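If the order from the question is wanted, with the unshifted vector first, a small variation is to start the shifts at 0. This is only a sketch of that idea:

```julia
z = [1, 2, 3]

# Shifts 0, 1, ..., length(z) - 1; circshift returns a new vector each time.
zp = [circshift(z, i) for i in 0:length(z)-1]

println(zp)  # prints [[1, 2, 3], [3, 1, 2], [2, 3, 1]]
println(z)   # prints [1, 2, 3]
```

Because circshift does not mutate its argument, z is unchanged afterwards and no copy is needed.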
This seems to execute pretty quick on my machine (faster than a comprehension):
julia> z=[1,2,3]
3-element Array{Int64,1}:
1
2
3
julia> zp=Vector{typeof(z)}(undef, length(z))
3-element Array{Array{Int64,1},1}:
#undef
#undef
#undef
julia> for i=1:length(z)
zp[i]=circshift(z,i-1)
end
julia> zp
3-element Array{Array{Int64,1},1}:
[1,2,3]
[3,1,2]
[2,3,1]
