I am trying to translate the first answer in Remove NaN row from X array and also the corresponding row in Y from Python to Julia 0.5.0 without importing numpy. I can replicate the "removing the NaNs" part with:
x1 = x[!isnan(x)]
but only using that reduces the 2D array down to 1D, and I don't want that. What would be the Julia equivalent of numpy.any in this case? Or if there isn't an equivalent, how can I keep my array 2D and delete entire rows that contain NaNs?
You can find rows that contain a NaN entry with any:
julia> A = rand(5, 4)
A[rand(1:end, 4)] = NaN
A
5×4 Array{Float64,2}:
0.951717 0.0248771 0.903009 0.529702
0.702505 NaN 0.730396 0.785191
NaN 0.390453 0.838332 NaN
0.213665 NaN 0.178303 0.0100249
0.124465 0.363872 0.434887 0.305722
julia> nanrows = any(isnan(A), 2) # 2 means that we reduce over the second dimension
5×1 Array{Bool,2}:
false
true
true
true
false
Then you can use the returned logical array as a mask into the first dimension, but we need to make it one-dimensional first:
julia> A[!vec(nanrows), :]
2×4 Array{Float64,2}:
0.951717 0.0248771 0.903009 0.529702
0.124465 0.363872 0.434887 0.305722
Related
Coming from R I am used to do something like this to get the first element of vector a:
a <- c(1:3, 5)
a[1]
[1] 1
H can I get the 1 in Julia? The first element of a is now a range.
a = [1:3, 5]
a[1]
1-element Array{UnitRange{Int64},1}:
1:3
The core problem here is that c(1:3, 5) in R and [1:3, 5] in Julia do not do the same thing. The R code concatenates a vector with an integer producing a vector of four integers:
> c(1:3, 5)
[1] 1 2 3 5
The Julia code constructs a two-element vector whose elements are the range 1:3 and the integer 5:
julia> [1:3, 5]
2-element Vector{Any}:
1:3
5
julia> map(typeof, ans)
2-element Vector{DataType}:
UnitRange{Int64}
Int64
This vector has element type Any because there's no smaller useful common supertype of a range and an integer. If you want to concatenate 1:3 and 5 together into a vector you can use ; inside of the brackets instead of ,:
julia> a = [1:3; 5]
4-element Vector{Int64}:
1
2
3
5
Once you've defined a correctly, you can get its first element with a[1] just like in R. In general inside of square brackets in Julia:
Comma (,) is only for constructing vectors of the exact elements given, much like in Python, Ruby, Perl or JavaScript.
If you want block concatenation like in R or Matlab, then you need to use semicolons/newlines (; or \n) for vertical concatenation and spaces for horizontal concatenation.
The given example of [1:3; 5] is a very simple instance of block concatenation, but there are significantly more complex ones possible. Here's a fancy example of constructing a block matrix:
julia> using LinearAlgebra
julia> A = rand(2, 3)
2×3 Matrix{Float64}:
0.895017 0.442896 0.0488714
0.750572 0.797464 0.765322
julia> [A' I
0I A]
5×5 Matrix{Float64}:
0.895017 0.750572 1.0 0.0 0.0
0.442896 0.797464 0.0 1.0 0.0
0.0488714 0.765322 0.0 0.0 1.0
0.0 0.0 0.895017 0.442896 0.0488714
0.0 0.0 0.750572 0.797464 0.765322
Apologies for StackOverflow's lousy syntax highlighting here: it seems to get confused by the postfix ', interpreting it as a neverending character literal. To explain this example a bit:
A is a 2×3 random matrix of Float64 elements
A' is the adjoint (conjugate transpose) of A
I is a variable size unit diagonal operator
0I is similar but the diagonal scalar is zero
These are concatenated together to form a single 5×5 matrix of Float64 elements where the upper left and lower right parts are filled from A' and A, respectively, while the lower left is filled with zeros and the upper left is filled with the 3×3 identity matrix (i.e. zeros with diagonal ones).
In this case, your a[1] is a UnitRange collection. If you want to access an individual element of it, you can use collect
For example for the first element,
collect(a[1])[1]
Is there a difference between Array and Vector?
typeof(Array([1,2,3]))
Vector{Int64}
typeof(Vector([1,2,3]))
Vector{Int64}
Both seem to create the same thing, but they are not the same:
Array == Vector
false
Array === Vector
false
So, what is actually the difference?
The difference is that Vector is a 1-dimensional Array, so when you write e.g. Vector{Int} it is a shorthand to Array{Int, 1}:
julia> Vector{Int}
Array{Int64,1}
When you call constructors Array([1,2,3]) and Vector([1,2,3]) they internally get translated to the same call Array{Int,1}([1,2,3]) as you passed a vector to them.
You would see the difference if you wanted to pass an array that is not 1-dimensional:
julia> Array(ones(2,2))
2×2 Array{Float64,2}:
1.0 1.0
1.0 1.0
julia> Vector(ones(2,2))
ERROR: MethodError: no method matching Array{T,1} where T(::Array{Float64,2})
Also note the effect of:
julia> x=[1,2,3]
3-element Array{Int64,1}:
1
2
3
julia> Vector(x)
3-element Array{Int64,1}:
1
2
3
julia> Vector(x) === x
false
So essentially the call Vector(x) makes a copy of x. Usually in the code you would probably simply write copy(x).
A general rule is that Array is a parametric type that has two parameters given in curly braces:
the first one is element type (you can access it using eltype)
the second one is the dimension of the array (you can access it using ndims)
See https://docs.julialang.org/en/v1/manual/arrays/ for details.
Vector is an alias for a one-dimensional Array. You can see that in the Julia REPL:
julia> Vector
Array{T, 1} where T
julia> Vector{Int32}
Array{Int32, 1}
Similarly, a Matrix is a 2-dimensional Array:
julia> Matrix
Array{T,2} where T
If I define A = [1] I get that A is not equal to A' since they are of different types:
julia> A=[1]
1-element Array{Int64,1}:
1
julia> A'
1×1 LinearAlgebra.Adjoint{Int64,Array{Int64,1}}:
1
julia> A == A'
false
If I define another vector B = [1, 2, 3] and I try to do the products with A'and A I obtain the following output:
B=[1,2,3]
3-element Array{Int64,1}:
1
2
3
julia> B*A'
3×1 Array{Int64,2}:
1
2
3
julia> B*A
ERROR: MethodError: no method matching *(::Array{Int64,1}, ::Array{Int64,1})
...
...
That seems a problem of * operator signature that seems not to accept two Array{Int64,1} as operands while defining another vector C = [4 5] we get:
julia> C=[4 5]
1×2 Array{Int64,2}:
4 5
julia> B*C
3×2 Array{Int64,2}:
4 5
8 10
12 15
So * is defined for operands of types Array{Int64,1} and Array{Int64,2} respectively. Why I cannot multiply a column vector by a singleton vector A but I can using A'?
The answer to this depends on how well you understand linear algebra. Julia follows the conventions of linear algebra for it's array multiplication, if you need to brush up, wikipedia's page is a good source.
It boils down to the fact that your A is a column vector whereas A' is a row vector (like C). Matrix multiplication is defined between (n, k) and (k, m) matrices to produce a (n, m) matrix. Column vectors can sometimes be thought of as (n, 1) matrices, so there's no well defined notion of multiplication between two column vectors.
If you want the dot product, use the dot function (you'll need to do using LinearAlgebra first). If you want an element-wise product, you can use the broadcasting notation, u .* v.
In Julia Vectors are one dimensional Arrays, while the transposition works on two dimensional matrices (Array{T,2} equivalent to Matrix{T})
julia> A=[1]
1-element Array{Int64,1}:
1
julia> collect(A')
1×1 Array{Int64,2}:
1
Since transposition in Julia does not materialize the data and rather holds reference to the original I needed to use collect to actually see what is going on.
When using multiplication on 2-dimensional arrays you are actually using linear algebra operations.
If you want to multiply element-wise use the dot . operator instead:
julia> A .== A'
1×1 BitArray{2}:
1
Note it return an Array rather than a single value.
If you want to multiply element-wise (rather than using linear algebra matrix multiplication) you need to vectorize again:
julia> B.*A
3-element Array{Int64,1}:
1
2
3
I have an array of type Array{Float64,2} but it's an array of 1 column, and I'm unable to pass this into a function that's expecting a single-column array of type Array{Float64,1}. I don't really understand what the 2 means or how to fix my problem, and I haven't been able to figure it out by searching through any documentation.
In Array{Float64,2}, the 2 is the number of dimensions in the array. Since you say it's "it's an array of 1 column", you probably have something which is 2-dimensional with either one row or one column, i.e. one of
julia> c = rand(1,3)
1x3 Array{Float64,2}:
0.190944 0.928697 0.251519
julia> d = rand(3,1)
3x1 Array{Float64,2}:
0.0818493
0.0342291
0.58341
To turn this into a 1-dimensional array, you could slice the array manually or use squeeze, as you prefer:
julia> c[1,:]
3-element Array{Float64,1}:
0.190944
0.928697
0.251519
julia> squeeze(d,2)
3-element Array{Float64,1}:
0.0818493
0.0342291
0.58341
Either approach should give you something of type Array{Float64,1}.
As noted in the comments, another approach is to use reshape, e.g. (using a different random c):
julia> reshape(c, length(c))
3-element Array{Float64,1}:
0.680653
0.0573147
0.607054
This has the advantage -- and disadvantage -- of not caring whether you have an array of shape 1xN or Nx1.
I was delighted to learn that Julia allows a beautifully succinct way to form inner products:
julia> x = [1;0]; y = [0;1];
julia> x'y
1-element Array{Int64,1}:
0
This alternative to dot(x,y) is nice, but it can lead to surprises:
julia> #printf "Inner product = %f\n" x'y
Inner product = ERROR: type: non-boolean (Array{Bool,1}) used in boolean context
julia> #printf "Inner product = %f\n" dot(x,y)
Inner product = 0.000000
So while i'd like to write x'y, it seems best to avoid it, since otherwise I need to be conscious of pitfalls related to scalars versus 1-by-1 matrices.
But I'm new to Julia, and probably I'm not thinking in the right way. Do others use this succinct alternative to dot, and if so, when is it safe to do so?
There is a conceptual problem here. When you do
julia> x = [1;0]; y = [0;1];
julia> x'y
0
That is actually turned into a matrix * vector product with dimensions of 2x1 and 1 respectively, resulting in a 1x1 matrix. Other languages, such as MATLAB, don't distinguish between a 1x1 matrix and a scalar quantity, but Julia does for a variety of reasons. It is thus never safe to use it as alternative to the "true" inner product function dot, which is defined to return a scalar output.
Now, if you aren't a fan of the dots, you can consider sum(x.*y) of sum(x'y). Also keep in mind that column and row vectors are different: in fact, there is no such thing as a row vector in Julia, more that there is a 1xN matrix. So you get things like
julia> x = [ 1 2 3 ]
1x3 Array{Int64,2}:
1 2 3
julia> y = [ 3 2 1]
1x3 Array{Int64,2}:
3 2 1
julia> dot(x,y)
ERROR: `dot` has no method matching dot(::Array{Int64,2}, ::Array{Int64,2})
You might have used a 2d row vector where a 1d column vector was required.
Note the difference between 1d column vector [1,2,3] and 2d row vector [1 2 3].
You can convert to a column vector with the vec() function.
The error message suggestion is dot(vec(x),vec(y), but sum(x.*y) also works in this case and is shorter.
julia> sum(x.*y)
10
julia> dot(vec(x),vec(y))
10
Now, you can write x⋅y instead of dot(x,y).
To write the ⋅ symbol, type \cdot followed by the TAB key.
If the first argument is complex, it is conjugated.
Now, dot() and ⋅ also work for matrices.
Since version 1.0, you need
using LinearAlgebra
before you use the dot product function or operator.