My question is: instead of F = svd(A), can one first allocate the appropriate memory for an SVD structure and then do F .= svd(A)?
What I had in mind is something like the following:
function main()
F = Vector{SVD}(undef,10)
# how to preallocate F?
test(F)
end
function test(F::Vector{SVD})
for i in 1:10
F .= svd(rand(3,3))
end
end
Your code almost works. But what you probably wanted was this:
using LinearAlgebra
function main()
F = Vector{SVD}(undef, 10)
test(F)
end
function test(F::Vector{SVD})
for i in 1:10
F[i] = svd(rand(3, 3))
end
return F
end
The line that you had in the for loop was this:
F .= svd(rand(3,3))
which performs the same operation on every iteration, since you were not indexing into F. In particular, it tries to broadcast a single SVD object into all the elements of F on each iteration of the loop. (And that broadcast fails because, by default, broadcasting treats a struct as an iterable collection and asks for its length, but SVD does not define a length method.)
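If the goal really were to store the same factorization in every slot, one way around the iteration problem would be to wrap the value in Ref, which makes broadcasting treat it as a scalar. A minimal sketch (the names s and F here are only for illustration):

```julia
using LinearAlgebra

s = svd(rand(3, 3))
F = Vector{SVD}(undef, 4)

# Ref(s) tells broadcasting to treat `s` as a single value rather than
# as a collection to iterate over, so every slot receives the same object.
F .= Ref(s)

all(f -> f === s, F)   # true: all four elements are the one SVD object
```

This is rarely what you want for independent factorizations, which is why indexing with F[i] is the right fix for the loop.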
However, I would recommend against pre-allocating a vector in this situation. First, let's look at the type of F:
julia> typeof(Vector{SVD}(undef, 10))
Array{SVD,1}
The problem with this vector is that its element type is not concrete. There is a section in the Performance Tips chapter of the manual that advises against this. SVD on its own is not a concrete type because its type parameters have not been specified. To make it concrete, you need to specify the types of the parameters, like this:
julia> SVD{Float64,Float64,Array{Float64,2}}
SVD{Float64,Float64,Array{Float64,2}}
julia> Vector{SVD{Float64,Float64,Array{Float64,2}}}(undef, 2)
2-element Array{SVD{Float64,Float64,Array{Float64,2}},1}:
#undef
#undef
As you can see, it is difficult to correctly specify the concrete type when you are working with complicated types like SVD. Additionally, if you do so, your code will not be as generic as it could be.
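A quick way to check the point above without typing the parameters by hand is to ask Julia directly. This is only a sketch; ConcreteSVD and Fs are illustrative names, and the exact parameter list printed for the concrete type depends on the Julia version:

```julia
using LinearAlgebra

isconcretetype(SVD)                       # false: parameters are unspecified

ConcreteSVD = typeof(svd(rand(3, 3)))     # the fully parameterized type
isconcretetype(ConcreteSVD)               # true

# A comprehension picks up the concrete element type on its own.
Fs = [svd(rand(3, 3)) for _ in 1:2]
eltype(Fs) == ConcreteSVD                 # true
```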
A better approach for a problem like this is to use mapping, broadcasting, or a list comprehension. Then the correct output type will automatically be inferred. Here are some examples:
List comprehension
julia> [svd(rand(3, 3)) for _ in 1:2]
2-element Array{SVD{Float64,Float64,Array{Float64,2}},1}:
SVD{Float64,Float64,Array{Float64,2}}([-0.6357040496635746 -0.2941425771794837 -0.7136949667270628; -0.45459999623274916 -0.6045700314848496 0.654090147040599; -0.6238743500629883 0.7402534845042064 0.2506104028424691], [1.4535849689665463, 0.7212190827260345, 0.05010669163393896], [-0.5975505057447164 -0.588792736048385 -0.5442945039782142; 0.7619724725128861 -0.6283345569895092 -0.15682358121595258; -0.2496624605679292 -0.5084474392397449 0.8241054891903787])
SVD{Float64,Float64,Array{Float64,2}}([-0.5593632049776268 0.654338345992878 -0.5088753618327984; -0.6687620652652163 -0.7189576326033171 -0.18936003428293915; -0.4897653570633183 0.23439550227070827 0.8397551092645418], [1.8461274187259178, 0.21226179692488983, 0.14194607536315287], [-0.29089551972856004 -0.7086270946133293 -0.6428276887173754; -0.9203610429640889 0.023709029028269546 0.390350397126212; 0.2613720474647311 -0.7051847436823973 0.6590896221923739])
Map
julia> map(_ -> svd(rand(3, 3)), 1:2)
2-element Array{SVD{Float64,Float64,Array{Float64,2}},1}:
SVD{Float64,Float64,Array{Float64,2}}([-0.5807809149601634 0.5635242755434755 0.5874809951745127; -0.6884131975465821 0.0451903888051729 -0.7239095925620322; -0.43448912329507794 -0.8248625459025509 0.3616918330643316], [1.488618654040125, 0.4122166626927311, 0.004235624485479941], [-0.6721098925787947 -0.2684664121709399 -0.6900681689759235; -0.7384292974335966 0.31185073633575333 0.5978890289498324; -0.05468514413847799 -0.9114136842196914 0.4078414290231468])
SVD{Float64,Float64,Array{Float64,2}}([-0.3677873424759118 0.8090638526628051 -0.4584191892023337; -0.43071684640222546 -0.5851169278783189 -0.6871107472129654; -0.8241452960126802 -0.055261768200600137 0.5636760310989947], [1.6862363968739773, 0.5899255050748418, 0.24246688716190598], [-0.3751742784957875 -0.7172409091515735 -0.5872050229643736; 0.8600668700980193 -0.505618838823938 0.06807766730822862; -0.3457300098559026 -0.4794945964927631 0.8065703268899])
Broadcasting
julia> g = (rand(3, 3) for _ in 1:2)
Base.Generator{UnitRange{Int64},var"#17#18"}(var"#17#18"(), 1:2)
julia> svd.(g)
2-element Array{SVD{Float64,Float64,Array{Float64,2}},1}:
SVD{Float64,Float64,Array{Float64,2}}([-0.7988295268840152 0.5443221484534134 -0.256095266807727; -0.5436890668169485 -0.8354777569473182 -0.0798693700362902; -0.257436566171119 0.07543418554831638 0.963346302244777], [1.8188722412547844, 0.3934389096422389, 0.2020398396772306], [-0.7147404794808727 -0.37763644211761316 -0.5886737335538281; -0.6944558966482991 0.4830041206449164 0.5333273169925189; -0.08292800854873916 -0.7899985677359054 0.607474450798845])
SVD{Float64,Float64,Array{Float64,2}}([-0.5910620103531503 0.3599866268397522 0.7218416228050514; -0.7367495542691711 0.12340124384185132 -0.664809918173956; -0.3283988340440176 -0.9247603805931685 0.1922821996018057], [1.826019614357666, 0.5333148215847028, 0.11639139812894106], [-0.6415954756495915 -0.6888196183142843 -0.33746522643279503; -0.5845558664639438 0.7239484700883465 -0.3663236978948133; -0.4966383841474222 0.037764349353666515 0.8671356118331964])
Furthermore, mapping, broadcasting, and list comprehensions should be just as efficient as pre-allocating the vector. If you're doing a simple mapping, then it's usually easier and more readable to use mapping, broadcasting, or list comprehensions. Pre-allocating vectors is a tool I reserve for writing custom algorithms from scratch.
A final note. In most cases, type parameters are considered an implementation detail and are not a part of the public API for a type. As such, it's best to use generic programming approaches that do not rely on fixing the types for type parameters. Of course there are some exceptions to this rule of thumb, like Array{T,N} and Dict{K,V}.
There's a different way of preallocating: you can reuse the input array by always overwriting it, both in the rand call and for svd's internal needs:
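using LinearAlgebra, Random  # svd! and rand! come from these standard libraries
function test!(F::Vector{SVD})
    A = Matrix{Float64}(undef, 3, 3)  # single buffer, reused on every iteration
    for i in 1:10
        rand!(A)        # refill the buffer in place
        F[i] = svd!(A)  # svd! is allowed to overwrite A while factorizing
    end
    return F
end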
Cameron's advice still holds. I'd probably use something like
function test()
A = Matrix{Float64}(undef, 3, 3)
return map(1:10) do i
svd!(rand!(A))
end
end
given that the number of loops seems not to be the critical part.
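One caveat with the in-place variants: svd! is allowed to destroy its input, so if the matrix is still needed afterwards it should be handed a copy. A small sketch of that pattern (M and F are illustrative names):

```julia
using LinearAlgebra

M = rand(3, 3)
F = svd!(copy(M))          # factorize a scratch copy; M itself is untouched

# The factors reproduce the original matrix up to rounding error.
F.U * Diagonal(F.S) * F.Vt ≈ M
```

In the loops above this does not matter, because the buffer is refilled by rand! before every factorization.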
Consider an existing function in Base, which takes in a variable number of arguments of some abstract type T. I have defined a subtype S<:T and would like to write a method which dispatches if any of the arguments is my subtype S.
As an example, consider function Base.cat, with T being an AbstractArray and S being some MyCustomArray <: AbstractArray.
Desired behaviour:
julia> v = [1, 2, 3];
julia> cat(v, v, v, dims=2)
3×3 Array{Int64,2}:
1 1 1
2 2 2
3 3 3
julia> w = MyCustomArray([1,2,3])
julia> cat(v, v, w, dims=2)
"do something fancy"
Attempt:
function Base.cat(w::MyCustomArray, a::AbstractArray...; dims)
println("do something fancy")
end
But this only works if the first argument is MyCustomArray.
What is an elegant way of achieving this?
I would say that it is not possible to do it cleanly without type piracy (but if it is possible I would also like to learn how).
For example consider cat that you asked about. It has one very general signature in Base (actually not requiring A to be AbstractArray as you write):
julia> methods(cat)
# 1 method for generic function "cat":
[1] cat(A...; dims) in Base at abstractarray.jl:1654
You could write a specific method:
Base.cat(A::AbstractArray...; dims) = ...
and check if any of elements of A is your special array, but this would be type piracy.
Now the problem is that you cannot even write Union{S, T}: since S <: T, the union is simplified to just T.
This would mean that you would have to use S explicitly in the signature, but then even:
f(::S, ::T) = ...
f(::T, ::S) = ...
is problematic: the two definitions are ambiguous for a call with two S arguments, and Julia will suggest defining f(::S, ::S) to resolve it. So, even if you wanted to limit the number of varargs to some maximum number, you would have to annotate types for every split of A into S and non-S positions to avoid dispatch ambiguity (which is doable using macros, but the number of required methods grows exponentially).
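Both effects can be reproduced in a few lines. The types below are hypothetical stand-ins: AbstractThing plays the role of T, Special the role of S, and Plain is any other subtype.

```julia
abstract type AbstractThing end          # plays the role of T
struct Special <: AbstractThing end      # plays the role of S
struct Plain <: AbstractThing end        # some other subtype of T

# The union collapses to the supertype, so it cannot single out Special.
Union{Special, AbstractThing} === AbstractThing   # true

f(::Special, ::AbstractThing) = "special first"
f(::AbstractThing, ::Special) = "special second"

f(Special(), Plain())    # "special first"
f(Plain(), Special())    # "special second"

# Both methods match (Special, Special) and neither is more specific,
# so this call throws a MethodError reporting the ambiguity.
is_ambiguous = try
    f(Special(), Special())
    false
catch err
    err isa MethodError
end
```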
For general usage, I concur with Bogumił, but let me make an additional comment. If you have control over how cat is called, you can at least write some kind of trait-dispatch code:
struct MyCustomArray{T, N} <: AbstractArray{T, N}
x::Array{T, N}
end
HasCustom() = Val(false)
HasCustom(::MyCustomArray, rest...) = Val(true)
HasCustom(::AbstractArray, rest...) = HasCustom(rest...)
# `IsCustom` or something would be more elegant, but `Val` is quicker for now
Base.cat(::Val{true}, args...; dims) = println("something fancy")
Base.cat(::Val{false}, args...; dims) = cat(args...; dims=dims)
And the compiler is cool enough to optimize that away:
julia> args = (v, v, w);
julia> @code_warntype cat(HasCustom(args...), args...; dims=2);
Variables
#self#::Core.Compiler.Const(cat, false)
#unused#::Core.Compiler.Const(Val{true}(), false)
args::Tuple{Array{Int64,1},Array{Int64,1},MyCustomArray{Int64,1}}
Body::Nothing
1 ─ %1 = Main.println("something fancy")::Core.Compiler.Const(nothing, false)
└── return %1
If you don't have control over calls to cat, the only resort I can think of to make the above technique work is to overdub the methods containing such calls, replacing the matching calls with the custom implementation. In that case you don't even need to overload cat, but can directly replace it with some mycat doing your fancy stuff.
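For the case where you do control the call sites, a standalone wrapper avoids adding methods to Base.cat altogether. The following is a sketch of that idea, not a definitive implementation: mycat, _mycat and has_custom are hypothetical names, and the size/getindex methods are added only so that the custom array is a usable AbstractArray.

```julia
struct MyCustomArray{T,N} <: AbstractArray{T,N}
    x::Array{T,N}
end
# Minimal AbstractArray interface so the wrapper can be displayed and indexed.
Base.size(a::MyCustomArray) = size(a.x)
Base.getindex(a::MyCustomArray{T,N}, i::Vararg{Int,N}) where {T,N} = a.x[i...]

# Walk the arguments recursively; the answer is encoded in the type of a Val.
has_custom() = Val(false)
has_custom(::MyCustomArray, rest...) = Val(true)
has_custom(::AbstractArray, rest...) = has_custom(rest...)

# Entry point: compute the trait, then dispatch on it.
mycat(args...; dims) = _mycat(has_custom(args...), args...; dims=dims)
_mycat(::Val{true}, args...; dims) = "something fancy"
_mycat(::Val{false}, args...; dims) = cat(args...; dims=dims)

v = [1, 2, 3]
w = MyCustomArray([1, 2, 3])

mycat(v, v, v; dims=2)   # ordinary 3×3 concatenation
mycat(v, w, v; dims=2)   # "something fancy", wherever w appears
```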
What I am trying to do is
i = occursin("ENTITIES\n", lines)
i != 0 || error("ENTITIES section not found")
The error information is
ERROR: LoadError: LoadError: MethodError: no method matching occursin(::String, ::Array{String,1})
Closest candidates are:
occursin(::Union{AbstractChar, AbstractString}, ::AbstractString) at strings/search.jl:452
This is a piece of Julia v0.6 code. I am using v1.1 now. I am new to Julia and don't know what the proper substitute function for this is. Please help.
You can broadcast occursin like this (add a . after the function name):
julia> x = "abc"
"abc"
julia> y = ["abc", "xyz"]
2-element Array{String,1}:
"abc"
"xyz"
julia> b = occursin.(x, y)
2-element BitArray{1}:
true
false
julia> findall(b)
1-element Array{Int64,1}:
1
julia> findfirst(b)
1
Note that although String can be iterated over it is treated by broadcast as a scalar.
It is also worth remembering that occursin returns a Bool, so you can use it directly in logical tests, e.g. i || error("ENTITIES section not found") in the code from your question.
To locate the indices at which true occurs in the result of the broadcasted occursin, use findall or findfirst (there is also findlast). The difference is that findall returns a vector of all indices where true is encountered, while findfirst returns only the first such index. Also note the difference when you pass all falses: findall will return an empty vector and findfirst will return nothing.
If you do not want to retain the vector b in the code above, you can get the indices directly (this should be faster) by passing a predicate as a first argument to findall/findfirst:
julia> findall(t -> occursin(x, t), y)
1-element Array{Int64,1}:
1
julia> findfirst(t -> occursin(x, t), y)
1
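Applied to the code from the question, the check could look like the following sketch. The contents of lines are made up for illustration; only the "ENTITIES\n" needle comes from the question.

```julia
# Hypothetical file contents, one string per line.
lines = ["SECTION\n", "ENTITIES\n", "ENDSEC\n"]

# Index of the first line containing the marker, or `nothing` if absent.
i = findfirst(line -> occursin("ENTITIES\n", line), lines)
i === nothing && error("ENTITIES section not found")

i   # 2
```

Testing against nothing replaces the old i != 0 check, since findfirst no longer returns 0 when there is no match.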
Is there a way to check if a function has keywords arguments in Julia? I am looking for something like has_kwargs(fun::Function) that would return true if fun has a method with keyword arguments.
The high level idea is to build a function:
function master_fun(foo::Any, fun::Function, ar::Tuple, kw::Tuple)
if has_kwargs(fun)
fun(ar... ; kw...)
else
fun(ar...)
end
end
Basically, @Michael K. Borregaard's suggestion to use try-catch is correct and officially works.
Looking into the unofficial implementation details, I came up with the following:
haskw(f,tup) = isdefined(typeof(f).name.mt,:kwsorter) &&
length(methods(typeof(f).name.mt.kwsorter,(Vector{Any},typeof(f),tup...)))>0
This function first looks if there is any keyword processing on any method of the generic function, and if so, looks at the specific tuple of types.
For example:
julia> f(x::Int) = 1
f (generic function with 1 method)
julia> f(x::String ; y="value") = 2
f (generic function with 2 methods)
julia> haskw(f,(Int,))
false
julia> haskw(f,(String,))
true
This should be tested for the specific application, as it probably doesn't work when non-leaf types are involved. As Michael commented, in the question's context the statement would be:
if haskw(fun, typeof.(ar))
...
I don't think you can guarantee that a given function has keyword arguments. Check
f(;x = 3) = println(x)
f(x) = println(2x)
f(3)
#6
f(x = 3)
#3
f(3, x = 3)
#ERROR: MethodError: no method matching f(::Int64; x=3)
#Closest candidates are:
# f(::Any) at REPL[2]:1 got unsupported keyword argument "x"
# f(; x) at REPL[1]:1
So, does the f function have keywords? You can only check for a given method. Note that, in your example above, you'd normally just do
function master_fun(foo, fun::Function, ar::Tuple; kw...)
fun(ar... ; kw...)
end
which should work, and if keywords are passed to a function that does not take them you'd just leave the error reporting to fun. If that is not acceptable you could try to wrap the fun(ar...; kw...) in a try-catch block.
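A minimal sketch of the try-catch variant, assuming the keywords arrive as a NamedTuple; call_with_optional_kwargs, g and h are hypothetical names used only for this example:

```julia
# Hypothetical helper: try the call with keywords, fall back to positional only.
function call_with_optional_kwargs(fun, ar::Tuple, kw::NamedTuple)
    try
        return fun(ar...; kw...)
    catch err
        # Only swallow "no matching method"; anything else is a real failure.
        err isa MethodError || rethrow()
        return fun(ar...)
    end
end

g(x::Int) = x + 1            # takes no keyword arguments
h(x::Int; y = 0) = x + y     # accepts the keyword y

call_with_optional_kwargs(g, (1,), (y = 5,))   # 2, the keyword is dropped
call_with_optional_kwargs(h, (1,), (y = 5,))   # 6
```

One limitation of this approach is that a MethodError raised from inside fun itself is indistinguishable from an unsupported keyword and would also trigger the fallback.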
Suppose I have a Dict defined as follows:
x = Dict{AbstractString,Array{Integer,1}}("A" => [1,2,3], "B" => [4,5,6])
I want to convert this to a DataFrame object (from the DataFrames module). Constructing a DataFrame has a similar syntax to constructing a dictionary. For example, the above dictionary could be manually constructed as a data frame as follows:
DataFrame(A = [1,2,3], B = [4,5,6])
I haven't found a direct way to get from a dictionary to a data frame but I figured one could exploit the syntactic similarity and write a macro to do this. The following doesn't work at all but it illustrates the approach I had in mind:
macro dict_to_df(x)
typeof(eval(x)) <: Dict || throw(ArgumentError("Expected Dict"))
return quote
DataFrame(
for k in keys(eval(x))
@eval ($k) = $(eval(x)[$k])
end
)
end
end
I also tried writing this as a function, which does work when all dictionary values have the same length:
function dict_to_df(x::Dict)
s = "DataFrame("
for k in keys(x)
v = x[k]
if typeof(v) <: AbstractString
v = string('"', v, '"')
end
s *= "$(k) = $(v),"
end
s = chop(s) * ")"
return eval(parse(s))
end
Is there a better, faster, or more idiomatic approach to this?
Another method could be
DataFrame(Any[values(x)...],Symbol[map(Symbol,keys(x))...])
It was a bit tricky to get the types in order to access the right constructor. To get a list of the constructors for DataFrames I used methods(DataFrame).
The DataFrame(a=[1,2,3]) way of creating a DataFrame uses keyword arguments. To use splatting (...) for keyword arguments the keys need to be symbols. In the example x has strings, but these can be converted to symbols. In code, this is:
DataFrame(;[Symbol(k)=>v for (k,v) in x]...)
Finally, things would be cleaner if x had been keyed by symbols in the first place. Then the code would be:
x = Dict{Symbol,Array{Integer,1}}(:A => [1,2,3], :B => [4,5,6])
df = DataFrame(;x...)
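The keyword-splatting step can be tried without DataFrames installed by collecting the same pairs into a NamedTuple, which shows what a keyword-accepting constructor would receive. This is only an illustration of the mechanism; cols is a made-up name.

```julia
x = Dict("A" => [1, 2, 3], "B" => [4, 5, 6])

# Convert the string keys to symbols, then splat the pairs as keywords.
# Collecting them into a NamedTuple makes the resulting names visible.
cols = (; [Symbol(k) => v for (k, v) in x]...)

cols.A   # [1, 2, 3]
cols.B   # [4, 5, 6]
```

Note that a Dict does not guarantee iteration order, so the column order produced this way may differ from the order in which the pairs were written.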