How to speed up multiple broadcasts in Julia

This Julia function seems to be quite inefficient (an order of magnitude slower than the equivalent Pythran / C++ code, even after the Julia warmup)...
function my_multi_broadcast(a)
10 * (2*a.^2 + 4*a.^3) + 2 ./ a
end
arr = ones(1000, 1000)
my_multi_broadcast(arr)
I guess it is only that I don't write it correctly... How can one speed up such "multi broadcasts" in Julia? I guess/hope I don't need to expand the loops...
Edit after the first answer
Thank you! With my setup, the Pythran solutions (in place and out of place) are still 1.5 to 2 times faster (without OpenMP). Is there a way to activate SIMD instructions in Julia? Or another way to speed up such CPU computations?
The Python code:
from transonic import jit
@jit
def broadcast(a):
return 10 * (2*a**2 + 4*a**3) + 2 / a
@jit
def broadcast_inplace(a):
a[:] = 10 * (2*a**2 + 4*a**3) + 2 / a
Edit after the @simd suggestion
It seems that @simd does not work out of the box, i.e. just by adding it at the beginning of the line.
ERROR: LoadError: LoadError: Base.SimdLoop.SimdError("for loop expected")
Stacktrace:
[1] compile(::Expr, ::Bool) at ./simdloop.jl:54
[2] @simd(::LineNumberNode, ::Module, ::Any) at ./simdloop.jl:126
[3] include at ./boot.jl:317 [inlined]
[4] include_relative(::Module, ::String) at ./loading.jl:1044
[5] include(::Module, ::String) at ./sysimg.jl:29
[6] exec_options(::Base.JLOptions) at ./client.jl:231
[7] _start() at ./client.jl:425
I guess that one would have to expand the for loops, but then the code (i) becomes much less readable and (ii) is no longer independent of the dimension.
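(For illustration, one way to expand it that stays dimension-independent would be an explicit loop over eachindex, roughly like the sketch below; this is an untested guess at what such a loop could look like, not code from the answers, and @inbounds / @simd are only hints to the compiler.)
# A sketch (untested): explicit loop, generic over array shape thanks to eachindex
function my_multi_broadcast_loop!(out, a)
    @inbounds @simd for i in eachindex(out, a)
        x = a[i]
        out[i] = 10 * (2 * x^2 + 4 * x^3) + 2 / x
    end
    return out
end

my_multi_broadcast_loop!(similar(arr), arr)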
It seems that we have a case for which simple Python/Numpy code can get accelerated with Pythran faster than what we get with Julia (except if there is a way to accelerate this in Julia? and a future Julia version may solve this). Interesting...

Broadcast all operations like this:
julia> function my_multi_broadcast2(a)
@. 10 * (2*a^2 + 4*a^3) + 2 / a
end
my_multi_broadcast2 (generic function with 1 method)
The difference is that in 10 * (2*a.^2 + 4*a.^3) + 2 ./ a you do not actually take advantage of broadcast fusion, as * and the two + operations are not broadcasted.
Writing @. 10 * (2*a^2 + 4*a^3) + 2 / a is equivalent to 10 .* (2 .* a.^2 .+ 4 .* a.^3) .+ 2 ./ a.
And here is the performance comparison:
julia> @btime my_multi_broadcast($arr);
58.146 ms (18 allocations: 61.04 MiB)
julia> @btime my_multi_broadcast2($arr);
5.982 ms (4 allocations: 7.63 MiB)
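Note that @btime comes from the BenchmarkTools.jl package (not Base), so these timings assume a setup along these lines:
using BenchmarkTools   # provides the @btime macro; $arr interpolation avoids benchmarking the global-variable lookup
arr = ones(1000, 1000)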
How does it compare to Pythran / C++, as we get roughly 10x speedup?
Finally, note that you could mutate arr in place by writing:
julia> function my_multi_broadcast3(a)
@. a = 10 * (2*a^2 + 4*a^3) + 2 / a
end
my_multi_broadcast3 (generic function with 1 method)
julia> @btime my_multi_broadcast3($arr);
1.840 ms (0 allocations: 0 bytes)
which is faster still and does zero allocations (I do not know if you want to modify arr in place or create a new array, so I show both approaches).

Related

Is it possible to implement effect handlers using Julia's coroutines?

I am new to each of: Julia, coroutines and effect handlers, so what I am going to ask might be misguided, but is it possible to implement effect handlers using coroutines? I think that Scheme's coroutines would allow you to grab the rest of the computation block for later resumption which would allow implementing effect handlers, but Julia's coroutines seem to not have that functionality. Is that wrong, or is the only choice to do the CPS transform like the library I linked to and base the EH implementation on that?
There's a lot I don't know about your question, but in Julia the "low level" way to implement Task control-flow is via a Channel. There's a nice page on this in the manual. When constructed with size 0, putting an item into the channel blocks until another task takes it. Here's an example:
julia> c = Channel{Int}() # specify the type (here `Int`) for better performance, if the type is always the same
Channel{Int64}(0) (empty)
julia> sender(x) = put!(c, x)
sender (generic function with 1 method)
julia> receiver() = println("got ", take!(c))
receiver (generic function with 1 method)
julia> receiver() # this blocked until I hit Ctrl-C
^CERROR: InterruptException:
Stacktrace:
[1] poptask(W::Base.InvasiveLinkedListSynchronized{Task})
@ Base ./task.jl:760
[2] wait()
@ Base ./task.jl:769
[3] wait(c::Base.GenericCondition{ReentrantLock})
@ Base ./condition.jl:106
[4] take_unbuffered(c::Channel{Int64})
@ Base ./channels.jl:405
[5] take!(c::Channel{Int64})
@ Base ./channels.jl:383
[6] receiver()
@ Main ./REPL[3]:1
[7] top-level scope
@ REPL[4]:1
julia> t = @async receiver()
Task (runnable) @0x00007f0b288f4d30
julia> sender(5)
got 5
5
julia> sender(-8) # this blocks because the `receiver` task finished
^CERROR: InterruptException:
Stacktrace:
[1] poptask(W::Base.InvasiveLinkedListSynchronized{Task})
@ Base ./task.jl:760
[2] wait()
@ Base ./task.jl:769
[3] wait(c::Base.GenericCondition{ReentrantLock})
@ Base ./condition.jl:106
[4] put_unbuffered(c::Channel{Int64}, v::Int64)
@ Base ./channels.jl:341
[5] put!(c::Channel{Int64}, v::Int64)
@ Base ./channels.jl:316
[6] sender(x::Int64)
@ Main ./REPL[2]:1
[7] top-level scope
@ REPL[7]:1
julia> t = @async while true receiver() end # run the receiver "forever"
Task (runnable) @0x00007f0b28a4da50
julia> sender(-8)
got -8
-8
julia> sender(11)
got 11
11
In a real application, you should ensure that c isn't a non-const global; see the performance tips in the manual.
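A minimal sketch of what that could look like, assuming you keep the same put!/take! pattern (c2 and receiver2 are illustrative names): either make the channel a const global, or pass it as an argument.
const c2 = Channel{Int}()   # const global: its type is known to the compiler

receiver2(ch::Channel{Int}) = println("got ", take!(ch))   # or simply pass the channel explicitly

t = @async receiver2(c2)
put!(c2, 42)   # the waiting task prints "got 42"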

Julia Metaprogramming: Function for Mathematical Series

I'm trying to build a function that will output an expression to be assigned to a new in-memory function. I might be misinterpreting the capability of metaprogramming, but I'm trying to build a function that generates a math series and assigns it to a function, such as:
main.jl
function series(iter)
S = ""
for i in 1:iter
a = "x^$i + "
S = S*a
end
return chop(S, tail=3)
end
So, this will build the pattern and I'm temporarily working with it in the REPL:
julia> a = Meta.parse(series(4))
:(x ^ 1 + x ^ 2 + x ^ 3 + x ^ 4)
julia> f =eval(Meta.parse(series(4)))
120
julia> f(x) =eval(Meta.parse(series(4)))
ERROR: cannot define function f; it already has a value
Obviously eval isn't what I'm looking for in this case but, is there another function I can use? Or, is this just not a viable way to accomplish the task in Julia?
The actual error you get has nothing to do with metaprogramming, but with the fact that you are reassigning f, which was assigned a value before:
julia> f = 10
10
julia> f(x) = x + 1
ERROR: cannot define function f; it already has a value
Stacktrace:
[1] top-level scope at none:0
[2] top-level scope at REPL[2]:1
It just doesn't like that. Name one of them differently.
Now to the conceptual problem. First, what you do here is not "proper" metaprogramming in Julia: why deal with strings and parsing at all? You can work directly on expressions:
julia> function series(N)
S = Expr(:call, :+)
for i in 1:N
push!(S.args, :(x ^ $i))
end
return S
end
series (generic function with 1 method)
julia> series(3)
:(x ^ 1 + x ^ 2 + x ^ 3)
This makes use of the fact that + belongs to the class of expressions that are automatically collected in repeated applications.
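For instance, a + b + c parses to a single n-ary call, which you can check with dump (the output below is roughly what a recent Julia prints):
julia> dump(:(a + b + c))
Expr
  head: Symbol call
  args: Array{Any}((4,))
    1: Symbol +
    2: Symbol a
    3: Symbol b
    4: Symbol c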
Second, you don't call eval at the appropriate place. I assume you meant to say "give me the function of x, with the body being what series(4) returns". Now, while the following works:
julia> f3(x) = eval(series(4))
f3 (generic function with 1 method)
julia> f3(2)
30
it is not ideal, as the body expression is rebuilt and evaluated anew every time the function is called. If you do something like that, it is preferable to splice the generated code into the body once, at function definition time:
julia> @eval f2(x) = $(series(4))
f2 (generic function with 1 method)
julia> f2(2)
30
You just need to be careful with hygiene here. Everything depends on the fact that you know the generated body is formulated in terms of x and that the function argument matches it. In my opinion, the most Julian way of implementing your idea is through a macro:
julia> macro series(N::Int, x)
S = Expr(:call, :+)
for i in 1:N
push!(S.args, :($x ^ $i))
end
return S
end
@series (macro with 1 method)
julia> @macroexpand @series(4, 2)
:(2 ^ 1 + 2 ^ 2 + 2 ^ 3 + 2 ^ 4)
julia> @series(4, 2)
30
No free variables remaining in the output.
Finally, as has been noted in the comments, there's a function (and corresponding macro) evalpoly in Base which generalizes your use case. Note that from the caller's side this involves no code generation -- internally it uses a well-designed generated function, which in combination with the compiler's optimizations usually results in code equal to the macro-generated version.
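For this particular series, one possible use of evalpoly (a sketch; series_evalpoly is my own name): x^1 + x^2 + ... + x^N equals x times the polynomial 1 + x + ... + x^(N-1), so for N = 4:
julia> series_evalpoly(x) = x * evalpoly(x, (1, 1, 1, 1))   # x^1 + x^2 + x^3 + x^4
series_evalpoly (generic function with 1 method)

julia> series_evalpoly(2)
30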
Another elegant option would be to use the multiple-dispatch mechanism of Julia and dispatch the generated code on type rather than value.
@generated function series2(p::Val{N}, x) where N
S = Expr(:call, :+)
for i in 1:N
push!(S.args, :(x ^ $i))
end
return S
end
Usage
julia> series2(Val(20), 150.5)
3.5778761722367333e43
julia> series2(Val{20}(), 150.5)
3.5778761722367333e43
This task can be accomplished with comprehensions. I need to RTFM...
https://docs.julialang.org/en/v1/manual/arrays/#Generator-Expressions
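Presumably something along these lines, summing a generator expression (series_gen is an illustrative name):
julia> series_gen(N, x) = sum(x^i for i in 1:N)
series_gen (generic function with 1 method)

julia> series_gen(4, 2)
30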

Getting the whole AST of the file / complex code

Julia manual states:
Every Julia program starts life as a string:
julia> prog = "1 + 1"
"1 + 1"
I can easily get the AST of the simple expression, or even a function with the help of quote / code_*, or using Meta.parse / Meta.show_sexpr if I have the expression in a string.
The question: Is there any way to get the whole AST of a piece of code, possibly including several atomic expressions? Like, read the source file and convert it to an AST?
If you want to do this from Julia instead of FemtoLisp, you can do
function parse_file(path::AbstractString)
code = read(path, String)
Meta.parse("begin $code end")
end
This takes in a file path, reads it and parses it to a big expression that can be evaluated.
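Hypothetical usage ("myscript.jl" is just a placeholder path):
ex = parse_file("myscript.jl")   # one big :block Expr holding every top-level expression of the file
eval(ex)                         # evaluate the whole file (but see the caveat about modules below)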
This comes from @NHDaly's answer, here:
https://stackoverflow.com/a/54317201/751061
If you already have your file as a string and don’t want to have to read it again, you can instead do
parse_all(code::AbstractString) = Meta.parse("begin $code end")
It was pointed out on Slack by Nathan Daly and Taine Zhao that this code won't work for modules:
julia> eval(parse_all("module M x = 1 end"))
ERROR: syntax: "module" expression not at top level
Stacktrace:
[1] top-level scope at REPL[50]:1
[2] eval at ./boot.jl:331 [inlined]
[3] eval(::Expr) at ./client.jl:449
[4] |>(::Expr, ::typeof(eval)) at ./operators.jl:823
[5] top-level scope at REPL[50]:1
This can be fixed as follows:
julia> eval_all(ex::Expr) = ex.head == :block ? eval.(ex.args) : eval(ex);
julia> eval_all(parse_all("module M x = 1 end"));
julia> M.x
1
Since the question asker is not convinced that the above code produces a tree, here is a graph representation of the output of parse_all, clearly showing a tree structure.
In case you're curious, those leaves labelled #= none:1 =# are line number nodes, indicating the line on which each following expression takes place.
As suggested in the comments, one can also apply Meta.show_sexpr to an Expr object to get a more "lispy" representation of the AST without all the pretty printing julia does by default:
julia> (Meta.show_sexpr ∘ Meta.parse)("begin x = 1\n y = 2\n z = √(x^2 + y^2)\n end")
(:block,
:(#= none:1 =#),
(:(=), :x, 1),
:(#= none:2 =#),
(:(=), :y, 2),
:(#= none:3 =#),
(:(=), :z, (:call, :√, (:call, :+, (:call, :^, :x, 2), (:call, :^, :y, 2))))
)
There's jl-parse-file in the FemtoLisp implementation of the Julia parser. You can call it from the Lisp REPL (julia --lisp), and it returns an S-expression for the whole file. Since Julia's Expr is not much different from Lisp S-expressions, that might be enough for your purposes.
I still wonder how one would access the result of this from within Julia. If I understand correctly, the Lisp functions are not exported from libjulia, so there's no direct way to just use a ccall. But maybe a variant of jl_parse_eval_all can be implemented.

Change Julia prompt to include evaluation numbers

When debugging or running julia code in the REPL, I usually see error messages showing ... at ./REPL[161]:12 [inlined].... The number 161 means the 161st evaluation in the REPL, I guess. So my question is: could we show this number in julia's prompt, i.e. julia [161]> instead of julia>?
One of the advantages of Julia is its ultra flexibility. This is very easy in Julia 0.7 (nightly version).
julia> repl = Base.active_repl.interface.modes[1]
"Prompt(\"julia> \",...)"
julia> repl.prompt = () -> "julia[$(length(repl.hist.history) - repl.hist.start_idx + 1)] >"
#1 (generic function with 1 method)
julia[3] >
julia[3] >2
2
julia[4] >f = () -> error("e")
#3 (generic function with 1 method)
julia[5] >f()
ERROR: e
Stacktrace:
[1] error at .\error.jl:33 [inlined]
[2] (::getfield(Main, Symbol("##3#4")))() at .\REPL[4]:1
[3] top-level scope
You just need to put the first 2 lines into your ~/.juliarc.jl and enjoy~
Since the REPL changed in several ways in Julia 0.7, this code does not work in older versions.
EDIT: Well, actually it takes a little more effort to make it work in .juliarc.jl. Try this code:
atreplinit() do repl
repl.interface = Base.REPL.setup_interface(repl)
repl = Base.active_repl.interface.modes[1]
repl.prompt = () -> "julia[$(length(repl.hist.history) - repl.hist.start_idx + 1)] >"
end
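On more recent Julia versions the same idea should go into ~/.julia/config/startup.jl and use the REPL standard library instead of Base.REPL; a sketch under that assumption (untested):
import REPL

atreplinit() do repl
    repl.interface = REPL.setup_interface(repl)
    mode = repl.interface.modes[1]
    mode.prompt = () -> "julia[$(length(mode.hist.history) - mode.hist.start_idx + 1)]> "
end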

Julia: invoke a function by a given string

Does Julia support reflection like Java?
What I need is something like this:
str = ARGS[1] # str is a string
# invoke the function str()
The Good Way
The recommended way to do this is to convert the function name to a symbol and then look up that symbol in the appropriate namespace:
julia> fn = "time"
"time"
julia> Symbol(fn)
:time
julia> getfield(Main, Symbol(fn))
time (generic function with 2 methods)
julia> getfield(Main, Symbol(fn))()
1.448981716732318e9
You can change Main here to any module to only look at functions in that module. This lets you constrain the set of functions available to only those available in that module. You can use a "bare module" to create a namespace that has only the functions you populate it with, without importing all names from Base by default.
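For example, a minimal sketch of such a restricted namespace (the module name Allowed and the whitelisted functions are my own choices):
baremodule Allowed
import Base: time, rand   # expose only the functions you whitelist
end

fn = "time"
getfield(Allowed, Symbol(fn))()      # works: calls Base.time
getfield(Allowed, Symbol("run"))     # throws UndefVarError, since run was never whitelisted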
The Bad Way
A different approach that is not recommended but which many people seem to reach for first is to construct a string for code that calls the function and then parse that string and evaluate it. For example:
julia> eval(parse("$fn()")) # NOT RECOMMENDED
1.464877410113412e9
While this is temptingly simple, it's not recommended since it is slow, brittle and dangerous. Parsing and evaling code is inherently much more complicated and thus slower than doing a name lookup in a module – name lookup is essentially just a hash table lookup. In Julia, where code is just-in-time compiled rather than interpreted, eval is much slower and more expensive since it doesn't just involve parsing, but also generating LLVM code, running optimization passes, emitting machine code, and then finally calling a function. Parsing and evaling a string is also brittle since all intended meaning is discarded when code is turned into text. Suppose, for example, someone accidentally provides an empty function name – then the fact that this code is intended to call a function is completely lost by accidental similarity of syntaxes:
julia> fn = ""
""
julia> eval(parse("$fn()"))
()
Oops. That's not what we wanted at all. In this case the behavior is fairly harmless but it could easily be much worse:
julia> fn = "println(\"rm -rf /important/directory\"); time"
"println(\"rm -rf /important/directory\"); time"
julia> eval(parse("$fn()"))
rm -rf /important/directory
1.448981974309033e9
If the user's input is untrusted, this is a massive security hole. Even if you trust the user, it is still possible for them to accidentally provide input that will do something unexpected and bad. The name lookup approach avoids these issues:
julia> getfield(Main, Symbol(fn))()
ERROR: UndefVarError: println("rm -rf /important/directory"); time not defined
in eval(::Module, ::Any) at ./boot.jl:225
in macro expansion at ./REPL.jl:92 [inlined]
in (::Base.REPL.##1#2{Base.REPL.REPLBackend})() at ./event.jl:46
The intent of looking up a name and then calling it as a function is explicit, instead of implicit in the generated string syntax, so at worst one gets an error about a strange name being undefined.
Performance
If you're going to call a dynamically specified function in an inner loop or as part of some recursive computation, you will want to avoid doing a getfield lookup every time you call the function. In this case all you need to do is make a const binding to the dynamically specified function before defining the iterative/recursive procedure that calls it. For example:
fn = "deg2rad" # converts angles in degrees to radians
const f = getfield(Main, Symbol(fn))
function fast(n)
t = 0.0
for i = 1:n
t += f(i)
end
return t
end
julia> @time fast(10^6) # once for JIT compilation
0.010055 seconds (2.97 k allocations: 142.459 KB)
8.72665498661791e9
julia> @time fast(10^6) # now it's fast
0.003055 seconds (6 allocations: 192 bytes)
8.72665498661791e9
julia> @time fast(10^6) # see?
0.002952 seconds (6 allocations: 192 bytes)
8.72665498661791e9
The binding f must be constant for optimal performance, since otherwise the compiler can't know that you won't change f to point at another function at any time (or even something that's not a function), so it has to emit code that looks f up dynamically on every loop iteration – effectively the same thing as if you manually call getfield in the loop. Here, since f is const, the compiler knows f can't change so it can emit fast code that just calls the right function directly. But the compiler can sometimes do even better than that – in this case it actually inlines the implementation of the deg2rad function, which is just a multiplication by pi/180:
julia> @code_llvm fast(100000)
define double @julia_fast_51089(i64) #0 {
top:
%1 = icmp slt i64 %0, 1
br i1 %1, label %L2, label %if.preheader
if.preheader: ; preds = %top
br label %if
L2.loopexit: ; preds = %if
br label %L2
L2: ; preds = %L2.loopexit, %top
%t.0.lcssa = phi double [ 0.000000e+00, %top ], [ %5, %L2.loopexit ]
ret double %t.0.lcssa
if: ; preds = %if.preheader, %if
%t.04 = phi double [ %5, %if ], [ 0.000000e+00, %if.preheader ]
%"#temp#.03" = phi i64 [ %2, %if ], [ 1, %if.preheader ]
%2 = add i64 %"#temp#.03", 1
%3 = sitofp i64 %"#temp#.03" to double
%4 = fmul double %3, 0x3F91DF46A2529D39 ; deg2rad(x) = x*(pi/180)
%5 = fadd double %t.04, %4
%6 = icmp eq i64 %"#temp#.03", %0
br i1 %6, label %L2.loopexit, label %if
}
If you need to do this with many different dynamically specified functions, then you can even pass the function to be called in as an argument:
function fast(f,n)
t = 0.0
for i = 1:n
t += f(i)
end
return t
end
julia> @time fast(getfield(Main, Symbol(fn)), 10^6)
0.007483 seconds (1.70 k allocations: 76.670 KB)
8.72665498661791e9
julia> @time fast(getfield(Main, Symbol(fn)), 10^6)
0.002908 seconds (6 allocations: 192 bytes)
8.72665498661791e9
This generates the same fast code as single-argument fast above, but will generate a new version for every different function f that you call it with.
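For example (hypothetical calls, outputs omitted): each distinct function argument triggers one extra compilation, after which calls run at full speed.
t1 = fast(deg2rad, 10^6)   # first call with deg2rad: compiles a specialization for it
t2 = fast(sin, 10^6)       # first call with sin: compiles a separate specialization
t3 = fast(sin, 10^6)       # already compiled for sin, so no further compilation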
