Check for equivalence of string and symbol - julia

I have a vector of symbols a_sym of length N_sym and a vector of strings a_str of length N_str.
a_sym contains a symbol at each index, such as :H₅B₂O₆⁻.
a_str contains a string at each index, such as "H₅B₂O₆⁻".
I would like to compare a_sym and a_str to find the indices at which equal entries occur in each vector.
I have tried to implement a loop to check these two vectors:
E = zeros(Int64, N_sym)
for i in 1:N_str
    for ii in 1:N_sym
        if a_sym[ii] == a_str[i]
            E[ii] = i
        end
    end
end
Here E is my attempt to map equivalent indices, but my loop never detects a match. How could this be remedied (and perhaps simplified)?
For example:
a_sym = [:H₃BO₃ ,:H₄BO₄⁻ ,:Li⁺ ,:H₅B₂O₆⁻, :H₄B₃O₇⁻]
where N_sym would be 5, and:
a_str = ["O⁻","H₃BO₃","H₄BO₄⁻","H₅B₂O₆⁻","H₄B₃O₇⁻","H₃B₃O₆"]
where N_str would be 6. I need the loop to check both vectors and map the indices where there is equivalence; for instance, the index of H₃BO₃ in a_sym is 1, and its index in a_str is 2.
I expect a vector E = [2, 3, 0, 4, 5], filled with the indices into a_str, with 0 where a_str does not contain a match for the entry of a_sym.

Symbols and strings are never equivalent:
julia> :a == "a"
false
So you have to convert either to the other first. I would write your function as follows using the builtin findfirst:
julia> E = [findfirst(==(String(b)), a_str) for b in a_sym]
5-element Array{Union{Nothing, Int64},1}:
2
3
nothing
4
5
(Although, as Przemislaw notes, converting the strings to symbols would likely be more efficient.)
nothing is what findfirst returns if it does not find anything. You can convert this to a default by broadcasting something:
julia> something.(E, 0)
5-element Array{Int64,1}:
2
3
0
4
5
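The remark about converting the strings to symbols can be sketched with Base's indexin, which returns, for each element of its first argument, the first matching index in its second argument (or nothing). This is only a minimal sketch on the vectors from the question; no speed comparison is claimed:

```julia
a_sym = [:H₃BO₃, :H₄BO₄⁻, :Li⁺, :H₅B₂O₆⁻, :H₄B₃O₇⁻]
a_str = ["O⁻", "H₃BO₃", "H₄BO₄⁻", "H₅B₂O₆⁻", "H₄B₃O₇⁻", "H₃B₃O₆"]

# Convert every string to a Symbol once, then look up each symbol's first index.
matches = indexin(a_sym, Symbol.(a_str))   # entries are Int or nothing

# Replace `nothing` (no match) by 0, as the question asks.
E = something.(matches, 0)
```

Here E comes out as [2, 3, 0, 4, 5], the same mapping as the findfirst version.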

Related

How to delete an element from a list in Julia?

v = range(1e10, -1e10, step=-1e8) # velocities [cm/s]
deleteat!(v, findall(x->x==0,v))
I want to delete the value 0 from v. Following this tutorial, I tried deleteat! but I get the error
MethodError: no method matching deleteat!(::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, ::Vector{Int64})
What am I missing here?
Notice the type that is returned by the function range.
typeof(range(1e10, -1e10, step=-1e8))
The above yields
StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}
Now look at the help for deleteat!:
? deleteat!
deleteat!(a::Vector, inds)
Remove the items at the indices given by inds, and return the modified a. Subsequent items are shifted to fill the resulting gap.
inds can be either an iterator or a collection of sorted and unique integer indices, or a boolean vector of the same length as a with true indicating entries to delete.
deleteat! is defined for Vector, while range returns an immutable range object, which is why no method matches. We can convert the range to a Vector using collect. Try the following code.
v = collect(range(1e10, -1e10, step=-1e8))
deleteat!(v,findall(x->x==0,v))
Notice that we can shorten x->x==0 to iszero, which yields
v = collect(range(1e10, -1e10, step=-1e8))
deleteat!(v,findall(iszero,v))
Use filter! or filter:
julia> filter!(!=(0), [1,0,2,0,4])
3-element Vector{Int64}:
1
2
4
In case of a range you can collect it or use:
julia> filter(!=(0), range(2, -2, step=-1))
4-element Vector{Int64}:
2
1
-1
-2
However, for big ranges you might not want to materialize them, to keep the memory footprint small. In that case you could use a generator:
(x for x in range(2, -2, step=-1) if x !== 0)
To see what is being generated you need to collect it:
julia> collect(x for x in range(2, -2, step=-1) if x !== 0)
4-element Vector{Int64}:
2
1
-1
-2
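One caveat when applying the generator above to the floating-point range from the question: `!==` tests identity, and 0.0 and 0 are different values in that sense, so a Float64 zero is not filtered out. The sketch below uses a small Float64 range (chosen here for illustration) to show the difference from `!=` and `!iszero`:

```julia
r = range(2.0, -2.0, step=-1.0)          # a Float64 range that contains 0.0

# `!==` compares identity: 0.0 (Float64) and 0 (Int) differ, so nothing is dropped.
kept_by_identity = collect(x for x in r if x !== 0)

# `!=` compares numerically, and `!iszero` expresses the same test.
kept_by_value = collect(x for x in r if x != 0)
kept_by_iszero = collect(Iterators.filter(!iszero, r))
```

For mixed or floating-point data, `!=(0)` or `!iszero` is therefore the safer predicate.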

Compare array values with a value in Julia

I have an array a = [1, 2, 3, 4];. I want to compare each element of a with a number and return a new array of true/false elements, in as few steps as possible in Julia. I tried result = a < 2 and expected result = [True, False, False, False], but it does not work. I would appreciate any help.
You need to vectorize (broadcast) the comparison operator so it operates on Vectors.
You can do this by adding a dot . to your code.
julia> a = [1, 2, 3, 4]
4-element Vector{Int64}:
1
2
3
4
julia> a .<= 2
4-element BitVector:
1
1
0
0
Read more about broadcasting in the Julia documentation.
Note that Python's numpy will do this for you automatically, but there are cases where an operation might be ambiguous: do you want it to be element-wise or a matrix multiplication? Julia resolves this by requiring you to broadcast explicitly with the dot syntax.
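The question asked for a strict comparison with 2, whereas the example above uses `.<=`. A minimal sketch of the strict version follows; the `map` line is an alternative shown only for comparison, in case a plain Vector{Bool} is preferred over a BitVector:

```julia
a = [1, 2, 3, 4]

# Broadcast the strict comparison over every element.
result = a .< 2                    # BitVector, displayed as 1/0

# Same values as a plain Vector{Bool}.
as_bools = map(x -> x < 2, a)
```

Both hold the values true, false, false, false; a BitVector behaves like any other boolean array when used for indexing or in further logical operations.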

How to convert a multidimensional array to/from vector of vector of ... vector in julia

Is there a method in Julia to convert a multidimensional array to a vector of vectors (and so on), and vice versa? It is OK to define a method for a fixed number of dimensions, but how about a method for arbitrary dimensions?
julia> s = (1,2,3)
julia> a = reshape(1:prod(s), s)
1×2×3 Base.ReshapedArray{Int64,3,UnitRange{Int64},Tuple{}}:
[:, :, 1] =
1 2
[:, :, 2] =
3 4
[:, :, 3] =
5 6
julia> b = [[[a[i,j,k] for i=1:s[1]] for j=1:s[2]] for k=1:s[3]]
3-element Array{Array{Array{Int64,1},1},1}:
Array{Int64,1}[[1], [2]]
Array{Int64,1}[[3], [4]]
Array{Int64,1}[[5], [6]]
julia> unstack(a) == b
ERROR: UndefVarError: unstack not defined
RecursiveArrayTools.jl can help with this kind of work.
recs = [rand(8) for i in 1:10]
A = VectorOfArray(recs)
A[i] # Returns the ith array in the vector of arrays
A[j,i] # Returns the jth component in the ith array
A[j1,...,jN,i] # Returns the (j1,...,jN) component of the ith array
So it acts like the matrix without ever building the matrix, which is a good way to save allocations if you tend to act on the columns (which are the separate arrays). It also has a fast conversion to a contiguous array via the indexing fallback (honestly, I tried to create a faster one but the fallback worked better than I could make it):
arr = convert(Array,A)
Converting back would require allocating, of course:
VA = VectorOfArray([arr[:,i] for i in 1:size(arr,2)])
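For the arbitrary-dimension case asked about in the question, a recursive approach in plain Base Julia is also possible. The following is a minimal sketch, not part of RecursiveArrayTools.jl: `unstack` borrows the name used in the question, `restack` is a hypothetical name for the inverse, and both assume the nesting follows the question's layout (outermost vector indexes the last dimension):

```julia
# Hypothetical helper: peel off the last dimension until only vectors remain.
unstack(a::AbstractVector) = collect(a)
unstack(a::AbstractArray) =
    [unstack(selectdim(a, ndims(a), k)) for k in axes(a, ndims(a))]

# Hypothetical inverse: rebuild the inner arrays, then join them along a new last dimension.
restack(v::AbstractVector{<:Number}) = collect(v)
function restack(v::AbstractVector)
    parts = map(restack, v)
    return cat(parts...; dims = ndims(first(parts)) + 1)
end

# The example from the question.
s = (1, 2, 3)
a = reshape(1:prod(s), s)
b = [[[a[i, j, k] for i in 1:s[1]] for j in 1:s[2]] for k in 1:s[3]]
```

With these definitions, unstack(a) == b holds and restack(b) reproduces a. Both directions allocate new arrays; selectdim itself only creates views.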

Transform nested array into new dimension

Given an array as follows:
A = Array{Array{Int}}(undef, 2, 2)
A[1,1] = [1,2]
A[1,2] = [3,4]
A[2,1] = [5,6]
A[2,2] = [7,8]
We then have that A is a 2x2 array with elements of type Array{Int}:
2×2 Array{Array{Int64,N} where N,2}:
[1, 2] [3, 4]
[5, 6] [7, 8]
It is possible to access the entries with e.g. A[1,2] but A[1,2,2] would not work since the third dimension is not present in A. However, A[1,2][2] works, since A[1,2] returns an array of length 2.
The question is then: what is a nice way to convert A into a 3-dimensional array B, so that B[i,j,k] refers to the k-th element of the (i,j)-th array? E.g. B[2,1,2] = 6.
There is a straightforward way to do this using 3 nested loops and reconstructing the array, element-by-element, but I'm hoping there is a nicer construction. (Some application of cat perhaps?)
You can construct a 3-d array from A using an array comprehension
julia> B = [ A[i,j][k] for i=1:2, j=1:2, k=1:2 ]
2×2×2 Array{Int64,3}:
[:, :, 1] =
1 3
5 7
[:, :, 2] =
2 4
6 8
julia> B[2,1,2]
6
However, a more general solution would be to overload the getindex function for arrays of the same type as A. This is more efficient since there is no need to copy the original data.
julia> import Base.getindex
julia> getindex(A::Array{Array{Int}}, i::Int, j::Int, k::Int) = A[i,j][k]
getindex (generic function with 179 methods)
julia> A[2,1,2]
6
With thanks to Dan Getz's comments, I think the following works well and is succinct:
cat((getindex.(A,i) for i=1:2)...; dims=3)
where 2 is the length of each nested array (on Julia versions before 0.7 this call is written cat(3, ...)). It would also work for higher dimensions.
permutedims(reshape(collect(Base.Iterators.flatten(A)), (2,2,2)), (2,3,1))
also does the job and appears to be faster than the accepted cat() answer for me.
EDIT: I'm sorry, I just saw that this has already been suggested in the comments.
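As a quick cross-check, the comprehension, the cat approach and the permutedims/reshape approach can be run side by side. This sketch assumes Julia 1.x syntax (undef constructor, dims keyword) and uses the A from the question; the names B1, B2 and B3 are introduced here only for the comparison:

```julia
A = Array{Vector{Int}}(undef, 2, 2)
A[1, 1] = [1, 2]
A[1, 2] = [3, 4]
A[2, 1] = [5, 6]
A[2, 2] = [7, 8]

# Array comprehension: B1[i, j, k] is the k-th element of A[i, j].
B1 = [A[i, j][k] for i in 1:2, j in 1:2, k in 1:2]

# Take the k-th element of every nested array, then stack the slices along dimension 3.
B2 = cat((getindex.(A, k) for k in 1:2)...; dims = 3)

# Flatten in column-major order, reshape to (k, i, j), then reorder to (i, j, k).
B3 = permutedims(reshape(collect(Iterators.flatten(A)), (2, 2, 2)), (2, 3, 1))
```

All three produce the same 2×2×2 array, with B1[2, 1, 2] equal to 6 as required.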

Re. partitions()

Why is
julia> collect(partitions(1,2))
0-element Array{Any,1}
returned instead of
2-element Array{Any,1}:
[0,1]
[1,0]
and do I really have to
x = collect(partitions(n,m));
y = Array{Int64}(undef, length(x), length(x[1]));
for i in 1:length(x)
    for j in 1:length(x[1])
        y[i,j] = x[i][j];
    end
end
to convert the result to a two-dimensional array?
From Wikipedia:
In number theory and combinatorics, a partition of a positive integer n, also called an integer partition, is a way of writing n as a sum of positive integers.
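To make the definition concrete, here is a small self-contained sketch that enumerates the partitions of n into exactly m positive parts. The function name positive_partitions and its maxpart argument are hypothetical and written only for this illustration; it is not how the library's partitions is implemented:

```julia
# Partitions of n into exactly m positive parts, listed in non-increasing order.
# `maxpart` bounds the next part so that each partition is produced only once.
function positive_partitions(n::Int, m::Int, maxpart::Int = n)
    if m == 0
        return n == 0 ? [Int[]] : Vector{Int}[]
    end
    result = Vector{Int}[]
    for p in min(n, maxpart):-1:1
        for rest in positive_partitions(n - p, m - 1, p)
            push!(result, [p; rest])
        end
    end
    return result
end
```

positive_partitions(5, 3) gives [3, 1, 1] and [2, 2, 1], while positive_partitions(1, 2) is empty: 1 cannot be written as a sum of two positive integers, so [0, 1] and [1, 0] are not partitions in this sense.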
For array conversion, try:
julia> x = collect(partitions(5,3))
2-element Array{Any,1}:
[3,1,1]
[2,2,1]
or
julia> x = partitions(5,3)
Base.FixedPartitions(5,3)
then
julia> hcat(x...)
3x2 Array{Int64,2}:
3 2
1 2
1 1
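For long lists, splatting with hcat(x...) passes every element as a separate argument. A common alternative is reduce(hcat, x); the sketch below uses a literal vector of vectors standing in for the collected partitions, so it runs without any package:

```julia
x = [[3, 1, 1], [2, 2, 1]]      # stands in for collect(partitions(5, 3))

cols = reduce(hcat, x)          # 3×2 matrix, one partition per column
rows = permutedims(cols)        # 2×3 matrix, one partition per row
```

The rows layout matches what the nested loop in the question builds.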
Here's another approach to your problem that I think is a little simpler, using the Combinatorics.jl library:
multisets(n, k) = map(A -> [sum(A .== i) for i in 1:n],
                      with_replacement_combinations(1:n, k))
This allocates a bunch of memory, but I think your current approach does too. Maybe it would be useful to make a first-class version and add it to Combinatorics.jl.
Examples:
julia> multisets(2, 1)
2-element Array{Array{Int64,1},1}:
[1,0]
[0,1]
julia> multisets(3, 5)
21-element Array{Array{Int64,1},1}:
[5,0,0]
[4,1,0]
[4,0,1]
[3,2,0]
[3,1,1]
[3,0,2]
[2,3,0]
[2,2,1]
[2,1,2]
[2,0,3]
⋮
[1,2,2]
[1,1,3]
[1,0,4]
[0,5,0]
[0,4,1]
[0,3,2]
[0,2,3]
[0,1,4]
[0,0,5]
The argument order is backwards from yours to match mathematical convention. If you prefer the other way, that can easily be changed.
One robust solution uses the lexicographic permutation generation algorithm, originally by Donald Knuth, together with the classic partitions(n).
Here is the lexicographic permutation generator:
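function lpremutations(a::T) where T
    b = Vector{T}()
    sort!(a)
    n = length(a)
    while true
        push!(b, copy(a))
        # find the largest j with a[j] < a[j+1]; finish when there is none
        j = n - 1
        while a[j] >= a[j+1]
            j -= 1
            j == 0 && return b
        end
        # find the largest l with a[j] < a[l] and swap the two entries
        l = n
        while a[j] >= a[l]
            l -= 1
        end
        a[j], a[l] = a[l], a[j]
        # reverse the tail a[j+1:end]
        k = j + 1
        l = n
        while k < l
            a[k], a[l] = a[l], a[k]
            k += 1
            l -= 1
        end
    end
end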
The above algorithm generates all unique permutations of an array's elements, including arrays with repeated elements:
julia> lpremutations([2,2,0])
3-element Array{Array{Int64,1},1}:
[0,2,2]
[2,0,2]
[2,2,0]
Then we generate all integer arrays that sum to n using partitions(n) (ignoring the desired length m for now), and pad them with zeros to length m using resize_!:
function resize_!(x,m)
    [x;zeros(Int,m-length(x))]
end
And the main function looks like:
function lpartitions(n,m)
    result=[]
    for i in partitions(n)
        append!(result,lpremutations(resize_!(i, m)))
    end
    result
end
Check it:
julia> lpartitions(3,4)
20-element Array{Any,1}:
[0,0,0,3]
[0,0,3,0]
[0,3,0,0]
[3,0,0,0]
[0,0,1,2]
[0,0,2,1]
[0,1,0,2]
[0,1,2,0]
[0,2,0,1]
[0,2,1,0]
[1,0,0,2]
[1,0,2,0]
[1,2,0,0]
[2,0,0,1]
[2,0,1,0]
[2,1,0,0]
[0,1,1,1]
[1,0,1,1]
[1,1,0,1]
[1,1,1,0]
The MATLAB script from http://www.mathworks.com/matlabcentral/fileexchange/28340-nsumk actually behaves the way I need, and is what I thought partitions() would do from the description given. The Julia version is
# k - sum, n - number of non-negative integers
# (on Julia 0.5 and later, combinations comes from Combinatorics.jl)
function nsumk(k,n)
    m = binomial(k+n-1,n-1);
    d1 = zeros(Int16,m,1);
    d2 = collect(combinations(collect((1:(k+n-1))),n-1));
    d2 = convert(Array{Int16,2},hcat(d2...)');
    d3 = ones(Int16,m,1)*(k+n);
    dividers = [d1 d2 d3];
    return diff(dividers,dims=2) .- 1;
end
julia> nsumk(3,2)
4x2 Array{Int16,2}:
0 3
1 2
2 1
3 0
using daycaster's lovely hcat(x...) tidbit :)
I still wish there were a more compact way of doing this.
The first mention of this approach seems to be https://au.mathworks.com/matlabcentral/newsreader/view_thread/52610, and as far as I can understand it is based on the "stars and bars" method: https://en.wikipedia.org/wiki/Stars_and_bars_(combinatorics)
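For comparison, the same objects (all length-m vectors of non-negative integers that sum to n) can also be generated by plain recursion, without any package. This is a minimal sketch under the assumption m ≥ 1; the name weak_compositions is hypothetical, and it is not a port of the dividers-based nsumk script:

```julia
# All vectors of m non-negative integers that sum to n, in lexicographic order.
function weak_compositions(n::Int, m::Int)
    m == 1 && return [[n]]
    result = Vector{Vector{Int}}()
    for head in 0:n
        # Fix the first entry, then distribute the remainder over m - 1 slots.
        for tail in weak_compositions(n - head, m - 1)
            push!(result, [head; tail])
        end
    end
    return result
end
```

weak_compositions(3, 2) returns [0, 3], [1, 2], [2, 1], [3, 0], and the number of results equals binomial(n + m - 1, m - 1), in line with the stars and bars count.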

Resources