How to broadcast set operation on array of sets in Julia? - julia

I'm trying to perform set operations between a given set y and all items in some array of sets X as follows:
X=Array{Set}([Set([1,2,1]), Set([4,6,8 ]), Set([4,5])])
y=Set{Int16}([2,8,4])
z=broadcast(intersect, y, X)
println(z)
Which gives me empty sets, instead of sets with the singletons in y, for my example.

You have to protect y from being iterated over. Normally you would get an error, but unfortunately y happens to have three elements, just like the vector X, so broadcast pairs them up element by element instead of failing. Let us create a bigger vector to see the problem:
julia> X=Array{Set}([Set([1,2,1]), Set([4,6,8 ]), Set([4,5]), Set([7])])
4-element Array{Set,1}:
Set([2, 1])
Set([4, 8, 6])
Set([4, 5])
Set([7])
julia> y=Set{Int16}([2,8,4])
Set{Int16} with 3 elements:
4
2
8
julia> z=broadcast(intersect, y, X)
ERROR: DimensionMismatch("arrays could not be broadcast to a common size; got a dimension with lengths 3 and 4")
To solve it, wrap y in a 0-dimensional container with Ref(y), like this:
julia> z=broadcast(intersect, Ref(y), X)
4-element Array{Set{Int16},1}:
Set([2])
Set([4, 8])
Set([4])
Set()
or equivalently just write:
julia> z=intersect.(Ref(y), X)
4-element Array{Set{Int16},1}:
Set([2])
Set([4, 8])
Set([4])
Set()
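As a minimal sketch of the same idea (the variable names via_ref, via_tuple and via_comp are made up for illustration), the following compares three ways of treating y as a single value: Ref, a 1-tuple, and a plain comprehension that avoids broadcasting altogether. All three should give the same result.

```julia
# Intersect one set with every set in a vector, three equivalent ways.
X = [Set([1, 2]), Set([4, 6, 8]), Set([4, 5]), Set([7])]
y = Set([2, 8, 4])

via_ref   = intersect.(Ref(y), X)         # Ref makes y behave like a scalar
via_tuple = intersect.((y,), X)           # a 1-tuple is broadcast the same way
via_comp  = [intersect(y, s) for s in X]  # no broadcasting at all

println(via_ref)
```

The comprehension is arguably the easiest to read when only one argument varies; Ref is handy when the call is part of a longer dotted expression.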

Related

Outputting variable name and value in a loop

I want to loop over a list of variables and output the variable name and value. E.g., say I have x=1 and y=2, then I want an output
x is 1
y is 2
I suspect I need to use Symbols for this. Here is my approach, but it isn't working:
function t(x,y)
for i in [x,y]
println("$(Symbol(i)) is $(eval(i))") # outputs "1 is 1" and "2 is 2"
end
end
t(1, 2)
Is there a way to achieve this? I guess a Dictionary would work, but would be interested to see if Symbols can also be used here.
One option is to use a NamedTuple:
julia> x = 1; y = 2
2
julia> vals = (; x, y)
(x = 1, y = 2)
julia> for (n, v) ∈ pairs(vals)
println("$n is $v")
end
x is 1
y is 2
Note the semicolon in (; x, y), which turns the x and y into kwargs so that the whole expression becomes shorthand for (x = x, y = y).
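The same shorthand also works for local variables, so it can be used inside the function from the question. This is only a sketch: it returns the lines instead of printing them so the result is easy to check, and it assumes Julia 1.5 or later, which as far as I know is where the (; x, y) shorthand was introduced.

```julia
function t(x, y)
    vals = (; x, y)   # same as (x = x, y = y)
    # pairs(vals) yields name => value pairs; names are Symbols
    return ["$name is $value" for (name, value) in pairs(vals)]
end

foreach(println, t(1, 2))
```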
I will also add that your question looks like you are trying to dynamically work with variable names in global scope, which is generally discouraged and an indication that you should probably be considering a data structure that holds values alongside labels, such as the dictionary proposed in the other answer or a NamedTuple. You can google around if you want to read more on this; here's a related SO question:
Is it a good idea to dynamically create variables?
You can do this by passing the variable names:
x = 1
y = 2
function t(a, b)
for i in [a, b]
println("$(i) is $(eval(i))")
end
end
t(:x, :y)
x is 1
y is 2
At the start of the function, there's no record of the "x"-ness of x, or the "y"-ness of y. The function only sees 1 and 2. It's a bit confusing that you also called your two local variables x and y, so I renamed them to show what's happening more clearly. Note that eval looks the names up in global scope, so this only works for global variables.
A solution with dictionaries would be nicer:
dict = Dict()
dict[:x] = 1
dict[:y] = 2
function t(d)
for k in keys(d)
println("$(k) is $(d[k])")
end
end
t(dict)
y is 2
x is 1
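Since a function only ever receives values, capturing the names of local variables has to happen at parse time, which is what macros are for. The sketch below is one possible approach, not something from the answers above: @named and report are hypothetical names, and the macro simply expands to a vector of name => value pairs.

```julia
# Hypothetical helper macro: records each variable's name at parse time
# and pairs it with the variable's current value.
macro named(vars...)
    entries = [:($(QuoteNode(v)) => $(esc(v))) for v in vars]
    return Expr(:vect, entries...)   # expands to [:x => x, :y => y]
end

function report(x, y)
    for (name, value) in @named(x, y)
        println("$name is $value")
    end
end

report(1, 2)
```

Unlike the eval version, this works with local variables, because the macro expands in place and never looks anything up in global scope.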
If you would rather see programmatically which variables are present, you could use varinfo or names:
julia> x=5; y=7;
julia> varinfo()
name size summary
–––––––––––––––– ––––––––––– –––––––
Base Module
Core Module
InteractiveUtils 316.128 KiB Module
Main Module
ans 8 bytes Int64
x 8 bytes Int64
y 8 bytes Int64
julia> names(Main)
7-element Vector{Symbol}:
:Base
:Core
:InteractiveUtils
:Main
:ans
:x
:y
With any given name, its value can be obtained via getfield:
julia> getfield(Main, :x)
5
If you are inside a function, use the @locals macro instead:
julia> function f(a)
b=5
c=8
@show Base.@locals
end;
julia> f(1)
#= REPL[13]:4 =# Base.@locals() = Dict{Symbol, Any}(:a => 1, :b => 5, :c => 8)
Dict{Symbol, Any} with 3 entries:
:a => 1
:b => 5
:c => 8
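To tie this back to the original question, here is a small sketch that prints every local by name. The function name describe is made up for illustration; the keys are sorted because a Dict has no guaranteed order.

```julia
function describe(a)
    b = a + 1
    return Base.@locals   # Dict of the locals assigned so far
end

vars = describe(1)
for name in sort(collect(keys(vars)))
    println("$name is $(vars[name])")
end
```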

Julia - Equivalent of python `pop`. Remove elements from array using boolean array and return them

Is there an equivalent to Python's pop? I have an array x and a boolean array flag of the same length. I would like to extract x[flag] and be able to store it in a variable x_flagged while at the same time remove them in place from x.
x = rand(1:5, 100)
flag = x .> 2
x_flagged = some_function!(x, flag) # Now x would be equal to x[x .<= 2]
Try this one using deleteat!
julia> function pop_r!(list, y) t = list[y]; deleteat!( list, y ); t end
julia> x = rand(1:5, 100)
100-element Vector{Int64}
julia> flag = x .> 2
100-element BitVector
julia> pop_r!( x, flag )
60-element Vector{Int64}
julia> x
40-element Vector{Int64}
You can use splice! with a bit of help from findall:
julia> x_flagged = splice!(x, findall(flag))
59-element Vector{Int64}:
...
julia> size(x)
(41,)
splice!(a::Vector, indices, [replacement]) -> items
Remove items at specified indices, and return a collection containing the removed items.
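Both answers can be checked on a small, non-random input. This is just a sketch with a made-up function name (pop_where!); it relies on deleteat! accepting a Bool mask of the same length as the vector, as in the first answer, and on splice! accepting a vector of indices, as in the second.

```julia
# Remove the flagged elements from x in place and return them.
function pop_where!(x::AbstractVector, flag::AbstractVector{Bool})
    removed = x[flag]     # copy out the flagged elements first
    deleteat!(x, flag)    # then delete them in place
    return removed
end

x = [1, 4, 2, 5, 3]
x_flagged = pop_where!(x, x .> 2)

# Same result with splice! and findall.
x2 = [1, 4, 2, 5, 3]
x2_flagged = splice!(x2, findall(x2 .> 2))

println((x_flagged, x))
```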

Combining vectors of unequal length

x = [1, 2, 3, 4]
y = [1, 2]
If I want to be able to operate on the two vectors with a default value filling in, what are the strategies?
E.g. would like to do the following and implicitly fill in with 0 or missing
x + y # would like [2, 4, 3, 4]
Ideally would like to do this in a generic way so that I could do arbitrary operations with the two.
Disregarding whether Julia has something built-in to do this, remember that Julia is fast. This means that you can write code to support this kind of need.
extend!(x, y::Vector, default=0) = extend!(x, length(y), default)
extend!(x, n::Int, default=0) = begin
while length(x) < n
push!(x, default)
end
x
end
Then when you have code such as you describe, you can symmetrically extend x and y:
x = [1, 2, 3, 4]
y = [1, 2]
extend!(x, y)
extend!(y, x)
x + y
==> [2, 4, 3, 4]
Note that this mutates y. In many cases, the desired length would come from outside the code and would be applied to both x and y. I can also imagine that 0 is a bad default in general (even though it is completely appropriate in your context of addition).
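If mutating the inputs is not acceptable, a non-mutating variant is also easy to write. The sketch below is an alternative to extend!, not part of the answer above: padded_op and its default keyword are hypothetical names, and it assumes ordinary 1-based vectors.

```julia
# Apply op element-wise, substituting `default` where one vector is shorter.
function padded_op(op, x::AbstractVector, y::AbstractVector; default = 0)
    n = max(length(x), length(y))
    at(v, i) = i <= length(v) ? v[i] : default   # value or fallback
    return [op(at(x, i), at(y, i)) for i in 1:n]
end

x = [1, 2, 3, 4]
y = [1, 2]
println(padded_op(+, x, y))
println(padded_op(*, x, y; default = 1))
```

Passing the operation as an argument covers the "arbitrary operations" part of the question, and the default can be chosen per operation (0 for addition, 1 for multiplication, or missing).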
A comment below makes the worthy point that you should consider using append! instead of looping over push!. In fact, it is best to measure differences like that if you care about very small differences. I went ahead and tested:
julia> using BenchmarkTools
julia> extend1(x, n) = begin
while length(x) < n
push!(x, 0)
end
x
end
julia> @btime begin
x = rand(10)
sum(x)
end
59.815 ns (1 allocation: 160 bytes)
5.037723569560573
julia> @btime begin
x = rand(10)
extend1(x, 1000)
sum(x)
end
7.281 μs (8 allocations: 20.33 KiB)
6.079832879992913
julia> x = rand(10)
julia> @btime begin
x = rand(10)
append!(x, zeros(990))
sum(x)
end
1.290 μs (3 allocations: 15.91 KiB)
3.688526541987817
julia>
Pushing primitives in a loop is damned fast; allocating a vector of zeros so we can use append! is faster still (1.290 μs versus 7.281 μs in the runs above).
But the real lesson here is seen in the fact that the loop version takes microseconds to append nearly 1000 values (reallocating the array several times). Appending 10 values one by one takes just over 150ns (and append! is slightly faster). This is blindingly fast. Literally doing nothing in R or Python can take longer than this.
This difference would matter in some situations and would be undetectable in many others. If it matters, measure. If it doesn't, do the simplest thing that comes to mind because Julia has your back (performance-wise).
FURTHER UPDATE
Taking a hint from another of Colin's comments, here are results where we use append! but we don't allocate a list. Instead, we use a generator ... that is, a data structure that invents data when asked for it with an interface much like a list. The results are much better than what I showed above.
julia> @btime begin
x = rand(10)
append!(x, (0 for i in 1:990))
sum(x)
end
565.814 ns (2 allocations: 8.03 KiB)
Note the round brackets around the 0 for i in 1:990.
In the end, Colin was right. Using append! is much faster if we can avoid related overheads. Surprisingly, the base function Iterators.repeated(0, 990) is much slower.
But, no matter what, all of these options are pretty blazingly fast and all of them would probably be so fast that none of these subtle differences would matter.
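Putting the generator idea back into the helper from the start of this answer could look like the sketch below. The name extend_append! is made up, and no timing claims are attached to it; measure it yourself if the difference matters.

```julia
# Grow x to length n by appending `default`, using a generator so that
# no temporary vector of fill values is allocated.
function extend_append!(x::Vector, n::Integer, default = 0)
    shortfall = n - length(x)
    shortfall > 0 && append!(x, (default for _ in 1:shortfall))
    return x
end

println(extend_append!([1, 2], 4))
```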
Julia is fun!
Note that if you want to fill with missing or some other type different from the element type in your original vector, then you will need to change the type of your vectors to allow those new elements. The function below will handle any case.
function fillvectors(x, y, fillvalue=missing)
xl = length(x)
yl = length(y)
if xl < yl
x::Vector{Union{eltype(x), typeof(fillvalue)}} = x
for i in xl+1:yl
push!(x, fillvalue)
end
end
if yl < xl
y::Vector{Union{eltype(y), typeof(fillvalue)}} = y
for i in yl+1:xl
push!(y, fillvalue)
end
end
return x, y
end
x = [1, 2, 3, 4]
y = [1, 2]
julia> (x, y) = fillvectors(x, y)
([1, 2, 3, 4], Union{Missing, Int64}[1, 2, missing, missing])
julia> y
4-element Vector{Union{Missing, Int64}}:
1
2
missing
missing
julia> (x, y) = fillvectors(x, y, 0)
([1, 2, 3, 4], [1, 2, 0, 0])
julia> y
4-element Vector{Int64}:
1
2
0
0
julia> (x, y) = fillvectors(x, y, 1.001)
([1, 2, 3, 4], Union{Float64, Int64}[1, 2, 1.001, 1.001])
julia> y
4-element Vector{Union{Float64, Int64}}:
1
2
1.001
1.001
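A shorter, non-mutating way to get the widened element type is to let vcat do the promotion. This is a sketch under that assumption, with a hypothetical name (pad_to); it returns a new vector rather than modifying its argument.

```julia
# Return a copy of v padded to length n; vcat promotes the element type.
function pad_to(v::AbstractVector, n::Integer, fillvalue = missing)
    n <= length(v) && return copy(v)   # nothing to add, keep the type
    return vcat(v, fill(fillvalue, n - length(v)))
end

y = [1, 2]
println(pad_to(y, 4))
println(pad_to(y, 4, 0))
```

One difference from fillvectors is worth knowing: padding an integer vector with a Float64 value promotes everything to Float64 here, rather than producing a Union{Float64, Int64} vector.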

How can I slice a high-order multidimensional array (or tensor) on a specific axis in Julia?

I am using Julia 1.6.
Here, X is a D-order multidimensional array.
How can I slice from i to j on the d-th axis of X?
Here is an example for the case D=6 and d=4.
X = rand(3,5,6,6,5,6)
Y = X[:,:,:,i:j,:,:]
i and j are given values which are smaller than 6 in the above example.
You can use the built-in function selectdim
help?> selectdim
search: selectdim
selectdim(A, d::Integer, i)
Return a view of all the data of A where the index for dimension d equals i.
Equivalent to view(A,:,:,...,i,:,:,...) where i is in position d.
Examples
≡≡≡≡≡≡≡≡≡≡
julia> A = [1 2 3 4; 5 6 7 8]
2×4 Matrix{Int64}:
1 2 3 4
5 6 7 8
julia> selectdim(A, 2, 3)
2-element view(::Matrix{Int64}, :, 3) with eltype Int64:
3
7
Which would be used something like:
julia> a = rand(10,10,10,10,10);
julia> selectedaxis = 5
5
julia> indices = 1:2
1:2
julia> selectdim(a,selectedaxis,indices)
Notice that in the documentation example, i is an integer, but you can use ranges of the form i:j as well.
If you need to just slice on a single axis, use the built in selectdim(A, dim, index), e.g., selectdim(X, 4, i:j).
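Applied to the array from the question, a quick check might look like this. The values of i and j are arbitrary choices for the sketch. Keep in mind that selectdim returns a view, so call copy on it if an independent array is needed.

```julia
X = rand(3, 5, 6, 6, 5, 6)
d = 4          # axis to slice
i, j = 2, 4    # arbitrary bounds for this sketch

Y = selectdim(X, d, i:j)   # view equivalent to X[:, :, :, i:j, :, :]
Ycopy = copy(Y)            # independent Array with the same contents

println(size(Y))
```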
If you need to slice more than one axis at a time, you can build the array that indexes the array by first creating an array of all Colons and then filling in the specified dimensions with the specified indices.
function selectdims(A, dims, indices)
indexer = repeat(Any[:], ndims(A))
for (dim, index) in zip(dims, indices)
indexer[dim] = index
end
return A[indexer...]
end
Alternatively, build the index tuple directly with ntuple:
idx = ntuple(l -> l == d ? (i:j) : (:), D)
Y = X[idx...]
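The ntuple idea also extends to several axes at once. The following is a sketch with a made-up name (slice_dims): every axis listed in dims gets the matching entry of indices, and all other axes get a Colon.

```julia
# Index A with indices[p] on axis dims[p] and with `:` everywhere else.
function slice_dims(A::AbstractArray, dims, indices)
    idx = ntuple(ndims(A)) do k
        p = findfirst(==(k), dims)            # is axis k one of the chosen axes?
        p === nothing ? Colon() : indices[p]
    end
    return A[idx...]
end

A = reshape(1:120, 2, 3, 4, 5)
B = slice_dims(A, (2, 4), (1:2, 3))   # same as A[:, 1:2, :, 3]
println(size(B))
```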

Using result from a function to access a julia array

I'm new to julia and have a problem. I have an array, y.
y = rand (5, 10)
y [1. 1] = 0
Running this gives me an error
for j=1:d
x_filt [j, 1] = y [j, findfirst (y [j, :])]
end
ERROR: syntax: missing separator in array expression
But this doesn't
for j=1:d # fix to 1st obs if 1st tick is missing
temp = findfirst (y [j, :])
x_filt [j, 1] = y [j, temp];
end
Can someone explain how to make the first version work? Or at least explain why it doesn't?
Thanks!
First, I guess you meant y[1, 1] = 0? I get an error if I use y [1. 1] = 0.
Julia has space-sensitive syntax in some contexts, notably inside brackets [].
Some examples:
julia> max(1, 2)
2
julia> max (1, 2)
2
julia> [max(1, 2)]
1-element Array{Int64,1}:
2
julia> [max (1, 2)]
1x2 Array{Any,2}:
max (1,2)
julia> [1 + 2]
1-element Array{Int64,1}:
3
julia> [1 +2]
1x2 Array{Int64,2}:
1 2
In your first example, the call to findfirst in
x_filt [j, 1] = y [j, findfirst (y [j, :])]
is interpreted as two space-separated items, findfirst and (y [j, :])]. Julia then complains that they are separated by a space and not a comma.
In your second example, you were able to circumvent this since the call to findfirst in
temp = findfirst (y [j, :])
is no longer in a space sensitive context.
I would recommend that when writing Julia code, you should never put a space between the function name and parenthesis ( in a function call or the variable and bracket [ in indexing, because the code will be treated differently in space sensitive contexts. E.g., your first example without the extra spaces
for j=1:d
x_filt[j, 1] = y[j, findfirst(y[j, :])]
end
works fine (provided that you define d and x_filt appropriately first).
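One way to see the difference without running the code in question is to ask the parser what it makes of each spelling. In this sketch (variable names are illustrative), Meta.parse returns the expression, and its head tells you whether the brackets were read as an ordinary vector literal or as a space-separated row.

```julia
tight  = Meta.parse("[1 + 2]")    # one element: the sum 1 + 2
spaced = Meta.parse("[1 +2]")     # two elements: 1 and +2
index  = Meta.parse("[v[1]]")     # one element: v indexed at 1
split  = Meta.parse("[v [1]]")    # two elements: v and the vector [1]

for ex in (tight, spaced, index, split)
    println(ex.head, " with ", length(ex.args), " element(s)")
end
```

A head of :vect means a plain vector literal, while :hcat means the contents were treated as columns of a row.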
