TensorOperations.jl Julia package for computing contractions of tensors with vectors

I am trying to compute contractions of tensors with vector objects, and construct tensor objects from vectors using TensorOperations.jl. For example, I would like to compute the outer product of two simple vectors
using TensorOperations
first_vec = [1 1 1]
second_vec = [2 2 2]
@tensor combination[a, b] := first_vec[a] * second_vec[b]
Throws the following error:
TensorOperations.IndexError{String}("invalid permutation of length 2: (1,)")
Stacktrace:
[1] contract!(α::Bool, A::Matrix{Int64}, CA::Symbol, B::Matrix{Int64}, CB::Symbol, β::Bool, C::Matrix{Int64}, oindA::Tuple{Int64}, cindA::Tuple{}, oindB::Tuple{Int64}, cindB::Tuple{}, indCinoAB::Tuple{Int64, Int64}, syms::Tuple{Symbol, Symbol, Symbol})
@ TensorOperations ~/.julia/packages/TensorOperations/LDxfx/src/implementation/stridedarray.jl:247
[2] contract!(α::Bool, A::Matrix{Int64}, CA::Symbol, B::Matrix{Int64}, CB::Symbol, β::Bool, C::Matrix{Int64}, oindA::Tuple{Int64}, cindA::Tuple{}, oindB::Tuple{Int64}, cindB::Tuple{}, indleft::Tuple{Int64, Int64}, indright::Tuple{}, syms::Tuple{Symbol, Symbol, Symbol})
@ TensorOperations ~/.julia/packages/TensorOperations/LDxfx/src/implementation/stridedarray.jl:89
[3] top-level scope
@ In[329]:4
[4] eval
@ ./boot.jl:368 [inlined]
[5] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
@ Base ./loading.jl:1428

The problem is that:
first_vec = [1 1 1]
second_vec = [2 2 2]
define matrices and not vectors.
first_vec = [1,1,1]
second_vec = [2,2,2]
@tensor combination[a, b] := first_vec[a] * second_vec[b]
works fine.
Note the commas replacing the spaces inside the square brackets. Space-separated elements are concatenated horizontally (hcat), which gives a 1×3 row matrix, whereas commas build a Vector. Vectors in Julia are usually thought of as columns.
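The difference can be checked in plain Julia, without TensorOperations.jl. The sketch below uses only Base; the names row and col are illustrative, and vec is shown as one possible way to turn an existing row matrix into a Vector before passing it on.

```julia
row = [1 1 1]      # spaces: 1×3 Matrix{Int64}
col = [1, 1, 1]    # commas: 3-element Vector{Int64}

size(row)          # (1, 3)
size(col)          # (3,)

# vec reshapes the row matrix into a Vector
first_vec = vec(row)
second_vec = [2, 2, 2]

# Base equivalent of combination[a, b] := first_vec[a] * second_vec[b]
combination = first_vec * second_vec'
```

Here combination is a 3×3 matrix, which is the outer product the question asks for.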

Related

A function for boxplot a column of dataframe in Julia

In Julia, one can draw a boxplot using StatsPlots.jl. Assuming there is a DataFrame named df, we can draw a boxplot for one of its columns named a like this:
julia> @df df boxplot(["a"], :a, fillalpha=0.75, linewidth=2)
I want to put the same structure in a function:
julia> function BoxPlotColumn(col::Union{Symbol, String}, df::DataFrame)
           if isa(col, String)
               @df df boxplot([col], Symbol(col), fillalpha=0.75, linewidth=2)
           else
               @df df boxplot([String(col)], col, fillalpha=0.75, linewidth=2)
           end
       end
BoxPlotColumn (generic function with 1 method)
Then, if I say BoxPlotColumn("a", df), Julia throws an error:
ERROR: Cannot convert Symbol to series data for plotting
Stacktrace:
[1] error(s::String)
@ Base .\error.jl:35
[2] _prepare_series_data(x::Symbol)
@ RecipesPipeline C:\Users\Shayan\.julia\packages\RecipesPipeline\OXGmH\src\series.jl:8
[3] _series_data_vector(x::Symbol, plotattributes::Dict{Symbol, Any})
@ RecipesPipeline C:\Users\Shayan\.julia\packages\RecipesPipeline\OXGmH\src\series.jl:35
[4] macro expansion
@ C:\Users\Shayan\.julia\packages\RecipesPipeline\OXGmH\src\series.jl:135 [inlined]
[5] apply_recipe(plotattributes::AbstractDict{Symbol, Any}, #unused#::Type{RecipesPipeline.SliceIt}, x::Any, y::Any, z::Any)
@ RecipesPipeline C:\Users\Shayan\.julia\packages\RecipesBase\qpxEX\src\RecipesBase.jl:289
[6] _process_userrecipes!(plt::Any, plotattributes::Any, args::Any)
@ RecipesPipeline C:\Users\Shayan\.julia\packages\RecipesPipeline\OXGmH\src\user_recipe.jl:36
[7] recipe_pipeline!(plt::Any, plotattributes::Any, args::Any)
@ RecipesPipeline C:\Users\Shayan\.julia\packages\RecipesPipeline\OXGmH\src\RecipesPipeline.jl:70
[8] _plot!(plt::Plots.Plot, plotattributes::Any, args::Any)
@ Plots C:\Users\Shayan\.julia\packages\Plots\lW9ll\src\plot.jl:209
[9] #plot#145
@ C:\Users\Shayan\.julia\packages\Plots\lW9ll\src\plot.jl:91 [inlined]
[10] boxplot(::Any, ::Vararg{Any}; kw::Base.Pairs{Symbol, V, Tuple{Vararg{Symbol, N}}, NamedTuple{names, T}} where {V, N, names, T<:Tuple{Vararg{Any, N}}})
@ Plots C:\Users\Shayan\.julia\packages\RecipesBase\qpxEX\src\RecipesBase.jl:410
[11] add_label(::Vector{String}, ::typeof(boxplot), ::Vector{String}, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Real, Tuple{Symbol, Symbol}, NamedTuple{(:fillalpha, :linewidth), Tuple{Float64, Int64}}})
@ StatsPlots C:\Users\Shayan\.julia\packages\StatsPlots\faFN5\src\df.jl:153
[12] (::var"#33#34"{String})(349::DataFrame)
@ Main .\none:0
[13] BoxPlotColumn(col::String, df::DataFrame)
@ Main c:\Users\Shayan\Documents\Python Scripts\test2.jl:15
[14] top-level scope
@ c:\Users\Shayan\Documents\Python Scripts\test2.jl:22
This is caused by the line: @df df boxplot([col], Symbol(col), fillalpha=0.75, linewidth=2)
How can I fix this? Why does this happen? I wrote the same thing just in a function.
Regarding "I wrote the same thing just in a function": you have not written the same thing. In your original code you use String and Symbol literals, whereas in the function you pass a variable. This is the key difference.
To fix this, I recommend using @with from DataFramesMeta.jl:
BoxPlotColumn(col::Union{Symbol, String}, df::DataFrame) =
    @with df boxplot([string(col)], $col, fillalpha=0.75, linewidth=2)
which does what you want, as @with supports working with column names programmatically via $.
EDIT
Why doesn't Julia work when we say boxplot(..., col, ...)?
It does not work because both @df and @with are macros. A macro transforms code into other code, and only the transformed code is executed later. These macros are designed so that when they see a symbol literal, e.g. :a, they treat it specially and consider it to be a column of the data frame. When they see a variable col, they cannot know that it holds a symbol, because the macro runs before the code is evaluated. See https://docs.julialang.org/en/v1/manual/metaprogramming/#man-macros
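What a macro actually receives can be seen with a tiny macro of our own. This is a minimal sketch in plain Julia; @argkind is a hypothetical helper written for illustration and is not part of StatsPlots.jl or DataFramesMeta.jl.

```julia
# Hypothetical helper macro: reports what the macro receives at expansion time.
macro argkind(ex)
    return string(typeof(ex))
end

col = :a

@argkind(:a)    # "QuoteNode": a symbol literal, visible to the macro
@argkind("a")   # "String": a string literal, also visible
@argkind(col)   # "Symbol": only the name `col`, not the value :a it holds
```

In the last call the macro sees the identifier col itself; the value :a is only known once the expanded code runs, which is why a macro cannot treat col as a column name on its own.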
MethodError: no method matching isfinite(::String15)
Most likely one of the columns holds strings rather than numbers. Instead, write e.g. names(df, Real) to get only the columns that store real numbers (without missing values). If you want to allow missing values as well, write names(df, Union{Missing,Real}).

MethodError: no method matching isless(::Matrix{Float64}, ::Matrix{Float64})

Code - Quasi Newton Problem
This is the code for Quasi Newton Problem. For this, I am getting an error
MethodError: no method matching isless(::Matrix{Float64}, ::Matrix{Float64})
Closest candidates are:
isless(::Any, ::Missing) at missing.jl:88
isless(::Missing, ::Any) at missing.jl:87
Stacktrace:
[1] <(x::Matrix{Float64}, y::Matrix{Float64})
@ Base .\operators.jl:279
[2] >(x::Matrix{Float64}, y::Matrix{Float64})
@ Base .\operators.jl:305
[3] bracket_minimum(f::var"#45#46"{typeof(k), Matrix{Float64}, Matrix{Float64}}, x::Int64; s::Float64, k::Float64)
@ Main .\In[122]:12
[4] bracket_minimum(f::Function, x::Int64) (repeats 2 times)
@ Main .\In[122]:10
[5] line_search(f::typeof(k), x::Matrix{Float64}, d::Matrix{Float64})
@ Main .\In[122]:35
[6] step!(M::BFGS, f::Function, ∇f::typeof(l), x::Matrix{Float64})
@ Main .\In[122]:48
[7] quasi_method(f::typeof(k), g::Function, x0::Matrix{Float64})
@ Main .\In[122]:67
[8] top-level scope
@ In[128]:3
[9] eval
@ .\boot.jl:360 [inlined]
[10] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
@ Base .\loading.jl:1116
I still can't figure out what the issue is with lines 10 and 12. Could you help me out? Maybe it's due to the matrices we are using or some other issue that I couldn't debug.
You forgot to vectorize your code.
See the example below for reference:
julia> a=[1. 2.; 3. 4.]; b=[1. 3.; 5. 7.];
julia> a < b
ERROR: MethodError: no method matching isless(::Matrix{Float64}, ::Matrix{Float64})
Now let us add a dot to apply the comparison elementwise:
julia> a .< b
2×2 BitMatrix:
0 1
1 1
If you want to check whether all elements are lower, use all:
julia> all(a .< b)
false
In your code bracket_minimum does the comparison yc > yb and those values are matrices while your code expects scalars.
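The question's code is not shown, so the following is only an assumption about how matrices may have ended up in the comparison: if the point x is written as a row matrix, an objective such as x * x' returns a 1×1 matrix rather than a number. The function names f_matrix and f_scalar are hypothetical.

```julia
x = [1.0 2.0]                # 1×2 Matrix{Float64}, matching x::Matrix{Float64} in the stack trace

f_matrix(x) = x * x'         # returns a 1×1 Matrix{Float64}
f_scalar(x) = sum(abs2, x)   # returns a Float64

ya = f_matrix(x)
yb = f_scalar(x)

only(ya) == yb               # only extracts the single element of the 1×1 matrix
yb > 1.0                     # a scalar comparison works without broadcasting
```

If the objective is meant to be scalar-valued, returning a number (or unwrapping with only) keeps yc > yb in bracket_minimum a plain scalar comparison.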

Evaluate simple RNN in Julia Flux

I'm trying to learn Recurrent Neural Networks (RNN) with Flux.jl in Julia by following along some tutorials, like Char RNN from the FluxML/model-zoo.
I managed to build and train a model containing some RNN cells, but am failing to evaluate the model after training.
Can someone point out what I'm missing for this code to evaluate a simple (untrained) RNN?
julia> using Flux
julia> simple_rnn = Flux.RNN(1, 1, (x -> x))
julia> simple_rnn.([1, 2, 3])
ERROR: MethodError: no method matching (::Flux.RNNCell{var"#1#2", Matrix{Float32}, Vector{Float32}, Matrix{Float32}})(::Matrix{Float32}, ::Int64)
Closest candidates are:
(::Flux.RNNCell{F, A, V, var"#s263"} where var"#s263"<:AbstractMatrix{T})(::Any, ::Union{AbstractMatrix{T}, AbstractVector{T}, Flux.OneHotArray}) where {F, A, V, T} at C:\Users\UserName\.julia\packages\Flux\6o4DQ\src\layers\recurrent.jl:83
Stacktrace:
[1] (::Flux.Recur{Flux.RNNCell{var"#1#2", Matrix{Float32}, Vector{Float32}, Matrix{Float32}}, Matrix{Float32}})(x::Int64)
@ Flux C:\Users\UserName\.julia\packages\Flux\6o4DQ\src\layers\recurrent.jl:34
[2] _broadcast_getindex_evalf
@ .\broadcast.jl:648 [inlined]
[3] _broadcast_getindex
@ .\broadcast.jl:621 [inlined]
[4] getindex
@ .\broadcast.jl:575 [inlined]
[5] copy
@ .\broadcast.jl:922 [inlined]
[6] materialize(bc::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, Flux.Recur{Flux.RNNCell{var"#1#2", Matrix{Float32}, Vector{Float32}, Matrix{Float32}}, Matrix{Float32}}, Tuple{Vector{Int64}}})
@ Base.Broadcast .\broadcast.jl:883
[7] top-level scope
@ REPL[3]:1
[8] top-level scope
@ C:\Users\UserName\.julia\packages\CUDA\LTbUr\src\initialization.jl:81
I'm using Julia 1.6.1 on Windows 10.
Turns out it's just a problem with the input type.
Doing something like this will work:
julia> v = Vector{Vector{Float32}}([[1], [2], [3]])
julia> simple_rnn.(v)
3-element Vector{Vector{Float32}}:
[9.731078]
[16.657223]
[28.398548]
I tried a lot of combinations until I found the working one. There is probably a way to automatically convert the input with some evaluation function.
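One way to do that conversion automatically, sketched with Base only (it does not call Flux itself; raw is an illustrative name), is a comprehension that wraps each number in a one-element Float32 vector:

```julia
raw = [1, 2, 3]                  # Vector{Int64}, the input type rejected above
v = [Float32[x] for x in raw]    # Vector{Vector{Float32}}, one feature vector per time step
```

The resulting v has the same type as the hand-written Vector{Vector{Float32}} above and can be passed to simple_rnn.(v) in the same way.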

How to use `hasmethod` (or any other way) to find if `hasmethod(fn, Tuple{Type1, Type2, Any Type})`

I want to use the hasmethod function to find out whether an object t::T supports the t[!, something] syntax.
The key point is that something can be of many types and I don't want to check them all. I just want a way to express hasmethod(getindex, Tuple{T, typeof(!), S}) regardless of what S is.
How do I do that?
I think the way to get the list of methods is:
methods(getindex, Tuple{Any, typeof(!), Any})
The only problem is that if the second argument allowed by getindex is a supertype of typeof(!), that method will also be listed. I do not think this can be avoided though, as you cannot rule out that such a getindex definition actually allows ! to be passed as the first index.
For instance if no packages are loaded the following is the result of the call above:
julia> methods(getindex, Tuple{Any, typeof(!), Any})
# 10 methods for generic function "getindex":
[1] getindex(md::Markdown.MD, args...) in Markdown at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Markdown\src\parse\parse.jl:24
[2] getindex(r::Distributed.Future, args...) in Distributed at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Distributed\src\remotecall.jl:624
[3] getindex(::Type{Any}, vals...) in Base at array.jl:357
[4] getindex(::Type{T}, x, y) where T in Base at array.jl:353
[5] getindex(::Type{T}, vals...) where T in Base at array.jl:344
[6] getindex(A::SparseArrays.SparseMatrixCSC, i, ::Colon) in SparseArrays at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\SparseArrays\src\sparsematrix.jl:1879
[7] getindex(A::AbstractArray, I...) in Base at abstractarray.jl:979
[8] getindex(t::AbstractDict, k1, k2, ks...) in Base at abstractdict.jl:476
[9] getindex(itr::Base.SkipMissing, I...) in Base at missing.jl:232
[10] getindex(r::Distributed.RemoteChannel, args...) in Distributed at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Distributed\src\remotecall.jl:626
All of the cases are the ones where getindex does not put restrictions on the second argument so you cannot rule out that they could possibly allow ! as a valid value.
But if e.g. you load DataFrames.jl and restrict the first argument to AbstractDataFrame you get:
julia> methods(getindex, Tuple{AbstractDataFrame, typeof(!), Any})
# 5 methods for generic function "getindex":
[1] getindex(df::DataFrame, ::typeof(!), col_ind::Symbol) in DataFrames at D:\AppData\.julia\packages\DataFrames\yH0f6\src\dataframe\dataframe.jl:367
[2] getindex(df::DataFrame, ::typeof(!), col_ind::Union{Signed, Unsigned}) in DataFrames at D:\AppData\.julia\packages\DataFrames\yH0f6\src\dataframe\dataframe.jl:358
[3] getindex(df::DataFrame, row_ind::typeof(!), col_inds::Union{Colon, Regex, AbstractArray{T,1} where T, All, Between, InvertedIndex}) in DataFrames at D:\AppData\.julia\packages\DataFrames\yH0f6\src\dataframe\dataframe.jl:405
[4] getindex(sdf::SubDataFrame, ::typeof(!), colind::Union{Signed, Symbol, Unsigned}) in DataFrames at D:\AppData\.julia\packages\DataFrames\yH0f6\src\subdataframe\subdataframe.jl:127
[5] getindex(df::SubDataFrame, row_ind::typeof(!), col_inds::Union{Colon, Regex, AbstractArray{T,1} where T, All, Between, InvertedIndex}) in DataFrames at D:\AppData\.julia\packages\DataFrames\yH0f6\src\subdataframe\subdataframe.jl:137
which is now more informative, because in DataFrames.jl we try to be careful not to leave indexing arguments free (i.e. allow them to be Any and only internally check what is valid).
Additionally you can use methodswith to check which methods accept ! explicitly and filter only getindex instances. Here is the result after loading DataFrames.jl:
julia> filter(x -> x.name == :getindex, methodswith(typeof(!)))
[1] getindex(df::DataFrame, ::typeof(!), col_ind::Symbol) in DataFrames at D:\AppData\.julia\packages\DataFrames\yH0f6\src\dataframe\dataframe.jl:367
[2] getindex(df::DataFrame, ::typeof(!), col_ind::Union{Signed, Unsigned}) in DataFrames at D:\AppData\.julia\packages\DataFrames\yH0f6\src\dataframe\dataframe.jl:358
[3] getindex(df::DataFrame, row_ind::typeof(!), col_inds::Union{Colon, Regex, AbstractArray{T,1} where T, All, Between, InvertedIndex}) in DataFrames at D:\AppData\.julia\packages\DataFrames\yH0f6\src\dataframe\dataframe.jl:405
[4] getindex(sdf::SubDataFrame, ::typeof(!), colind::Union{Signed, Symbol, Unsigned}) in DataFrames at D:\AppData\.julia\packages\DataFrames\yH0f6\src\subdataframe\subdataframe.jl:127
[5] getindex(df::SubDataFrame, row_ind::typeof(!), col_inds::Union{Colon, Regex, AbstractArray{T,1} where T, All, Between, InvertedIndex}) in DataFrames at D:\AppData\.julia\packages\DataFrames\yH0f6\src\subdataframe\subdataframe.jl:137
Finally, note that the x[!, y] syntax can also mean a view (if preceded by e.g. @view) or a setindex! operation (if it appears on the left-hand side of an assignment), so you might also want to check whether these functions accept ! (and in DataFrames.jl they actually do).
You can do that in theory, if you get the syntax right:
julia> hasmethod(getindex, Tuple{Vector{Int}, typeof(!), Any})
true
This should work since Tuples are covariant.
But its returning true is obviously nonsense:
julia> getindex([1], !, 1)
ERROR: ArgumentError: invalid index: ! of type typeof(!)
Stacktrace:
[1] to_index(::Function) at ./indices.jl:270
[2] to_index(::Array{Int64,1}, ::Function) at ./indices.jl:247
[3] to_indices(::Array{Int64,1}, ::Tuple{Base.OneTo{Int64}}, ::Tuple{typeof(!),Int64}) at ./indices.jl:298
[4] to_indices at ./indices.jl:294 [inlined]
[5] getindex(::Array{Int64,1}, ::Function, ::Int64) at ./abstractarray.jl:981
[6] top-level scope at REPL[26]:1
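The reason is that the matching method is the generic getindex(A::AbstractArray, I...) fallback, which accepts any index types and only validates them internally (in to_indices, as the stack trace shows).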
I think it's better to constrain T to some abstract type(s) for which this kind of indexing is known to work. Or, as a last resort, use try.
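The try-based last resort could look like the sketch below. Both supports_bang_index and the Toy type are hypothetical names introduced for illustration; note that this approach actually performs the indexing, so it costs whatever the indexing costs.

```julia
# Hypothetical helper: true if x[!, col] can actually be evaluated.
function supports_bang_index(x, col)
    try
        x[!, col]
        return true
    catch
        return false
    end
end

# Toy type (an assumption for illustration) that really implements x[!, col].
struct Toy
    data::Dict{Symbol,Vector{Int}}
end
Base.getindex(t::Toy, ::typeof(!), col::Symbol) = t.data[col]

hasmethod(getindex, Tuple{Vector{Int}, typeof(!), Any})   # true, but misleading
supports_bang_index([1], 1)                               # false: the call throws
supports_bang_index(Toy(Dict(:a => [1, 2])), :a)          # true
```

Unlike hasmethod, the helper reports false for a plain Vector, because it observes the ArgumentError raised inside the generic fallback.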

How to concatenate (merge) AAStringSets by name?

In bioinformatics/microbial ecology literature a fairly common practice is to concatenate multiple sequence alignments of multiple genes prior to building phylogenetic trees. In R terminology it may be clearer to say 'merge' these sequences by the organism they came from, but I'm sure examples are better.
Say these are two multiple sequence alignments.
library(Biostrings)
set1<-AAStringSet(c("IVR", "RDG", "LKS"))
names(set1)<-paste("org", 1:3, sep="_")
set2<-AAStringSet(c("VRT", "RKG", "AST"))
names(set2)<-paste("org", 2:4, sep="_")
set1
A AAStringSet instance of length 3
width seq names
[1] 3 IVR org_1
[2] 3 RDG org_2
[3] 3 LKS org_3
set2
A AAStringSet instance of length 3
width seq names
[1] 3 VRT org_2
[2] 3 RKG org_3
[3] 3 AST org_4
The correct concatenation of these sequences would be
A AAStringSet instance of length 4
width seq names
[1] 6 IVR--- org_1
[2] 6 RDGVRT org_2
[3] 6 LKSRKG org_3
[4] 6 ---AST org_4
The "-" notes a 'gap' (lack of amino acid) in that position, or in this case a lack of a gene to concatenate.
I thought there would be a function to do this in Biostrings, MSA, DECIPHER, or other related packages, but have been unable to find one.
I found the following Q&As, but none of them provides the desired output described above.
1: https://support.bioconductor.org/p/38955/
output
A AAStringSet instance of length 6
width seq names
[1] 3 IVR org_1
[2] 3 RDG org_2
[3] 3 LKS org_3
[4] 3 VRT org_2
[5] 3 RKG org_3
[6] 3 AST org_4
May be better described as 'appending' the sequences (joins the two sets vertically).
2: https://support.bioconductor.org/p/39878/
output
A AAStringSet instance of length 2
width seq
[1] 9 IVRRDGLKS
[2] 9 VRTRKGAST
Concatenates sequences in each set, a complete chimera of each set (certainly not desired).
3: How to concatenate two DNAStringSet sequences per sample in R?
output
A AAStringSet instance of length 3
width seq
[1] 6 IVRVRT
[2] 6 RDGRKG
[3] 6 LKSAST
Creates chimeras of sequences by the order they are in. It is even worse with different numbers of sequences (the shorter set is looped and concatenated...)
4: https://www.biostars.org/p/115192/
Output
A AAStringSet instance of length 2
width seq
[1] 3 IVR
[2] 3 VRT
Only appends the first sequence from each set, not sure why anyone wants this...
I would normally think these kinds of processes would be done with some combination of bash and Python, but I'm using the DECIPHER multiple sequence aligner in R, so it makes sense to do the rest of the processing in R. In the process of writing up this question I came up with an answer that I will post, but I'm kind of expecting someone to point me to the manual I missed that describes the function that does this. Thanks!
I am a somewhat fanatical user of data.table in R; among many other things, it is great for merging datasets by name. I found that Biostrings AAStringSet objects can be converted to matrices using as.matrix, and these can be converted to data.table objects and merged.
library(data.table)
set1.dt<-data.table(as.matrix(set1), keep.rownames = TRUE)
set2.dt<-data.table(as.matrix(set2), keep.rownames = TRUE)
set12.dt<-merge(set1.dt, set2.dt, by="rn", all=TRUE)
set12.dt
rn V1.x V2.x V3.x V1.y V2.y V3.y
1: org_1 I V R <NA> <NA> <NA>
2: org_2 R D G V R T
3: org_3 L K S R K G
4: org_4 <NA> <NA> <NA> A S T
This is the correct merge, but needs more work to get the final result.
The NA values need to be replaced with "-". I always need to look up the following question to remember the best way to do this with a data.table:
Fastest way to replace NAs in a large data.table
#slightly modified from original, added arg "x"
f_dowle = function(dt, x) { # see EDIT later for more elegant solution
na.replace = function(v,value=x) { v[is.na(v)] = value; v }
for (i in names(dt))
eval(parse(text=paste("dt[,",i,":=na.replace(",i,")]")))
}
f_dowle(set12.dt, "-")
Concatenate the sequences (excluding the names column with !"rn")
set12<-apply(set12.dt[ ,!"rn"], 1, paste, collapse="")
Convert back to AAStringSet and add back names
set12<-AAStringSet(set12)
names(set12)<-set12.dt$rn
Desired output
set12
A AAStringSet instance of length 4
width seq names
[1] 6 IVR--- org_1
[2] 6 RDGVRT org_2
[3] 6 LKSRKG org_3
[4] 6 ---AST org_4
This works, but it seems quite cumbersome, especially the conversion between data formats. It can obviously be wrapped in a function for easier use, but again it seems like this should already exist as a function in some Bioconductor package...
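The merge logic itself does not depend on data.table. As a language-neutral sketch of the same steps (written in Julia with Base only, like the other examples on this page; concat_by_name is a hypothetical helper, not a Bioconductor function), each alignment is treated as a mapping from organism name to sequence, and missing organisms are padded with gaps. It assumes all sequences within one alignment have the same width.

```julia
# Hypothetical helper: concatenate two alignments by organism name.
# Each alignment is a Dict of name => aligned sequence; all sequences
# within one alignment are assumed to have the same width.
function concat_by_name(set1::Dict{String,String}, set2::Dict{String,String}; gap::String="-")
    w1 = isempty(set1) ? 0 : length(first(values(set1)))
    w2 = isempty(set2) ? 0 : length(first(values(set2)))
    # Sorted union of the organism names from both alignments
    orgs = sort(unique(vcat(collect(keys(set1)), collect(keys(set2)))))
    # Missing genes are replaced by a run of gap characters of the right width
    return [o => get(set1, o, repeat(gap, w1)) * get(set2, o, repeat(gap, w2)) for o in orgs]
end

set1 = Dict("org_1" => "IVR", "org_2" => "RDG", "org_3" => "LKS")
set2 = Dict("org_2" => "VRT", "org_3" => "RKG", "org_4" => "AST")
set12 = concat_by_name(set1, set2)
```

With the example data from the question, set12 holds the four name and sequence pairs of the desired output, in order from org_1 to org_4.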
