Consider the following functions whose definitions contain re-definitions of functions.
function foo1()
x = 1
if x != 1
error("Wrong")
end
x = 2
end
function foo()
function p(t) return t + 1 end
if p(1) != 2
error("Wrong")
end
function p(t) return 1 end
end
foo1() runs without error, but foo() gives the error Wrong. I thought it may have something to do with Julia not supporting redefining functions in general, but I'm not sure. Why is this happening?
I would say that this falls into a more general known problem https://github.com/JuliaLang/julia/issues/15602.
In your case consider a simpler function:
function f()
p() = "before"
println(p())
p() = "after"
nothing
end
Calling f() will print "after".
In your case you can inspect what is going on with foo in the following way:
julia> #code_typed foo()
CodeInfo(
4 1 ─ invoke Main.error("Wrong"::String)::Union{} │
│ $(Expr(:unreachable))::Union{} │
└── $(Expr(:unreachable))::Union{} │
) => Union{}
and you see that Julia optimizes out all internal logic and just calls error.
If you inspect it one step earlier you can see:
julia> #code_lowered foo()
CodeInfo(
2 1 ─ p = %new(Main.:(#p#7)) │
3 │ %2 = (p)(1) │
│ %3 = %2 != 2 │
└── goto #3 if not %3 │
4 2 ─ (Main.error)("Wrong") │
6 3 ─ return p │
)
any you see that p in top line is assigned only once. Actually the second definition is used (which is not visible here, but could be seen above).
To solve your problem use anonymous functions like this:
function foo2()
p = t -> t + 1
if p(1) != 2
error("Wrong")
end
p = t -> 1
end
and all will work as expected. The limitation of this approach is that you do not get multiple dispatch on name p (it is bound to a concrete anonymous function, but I guess you do not need multiple dispatch in your example).
Related
I want to have a curried version of a function. So, I write the code as follows:
f(x::Int64, y::Int64) = x + y
f(x::Int64) = (y::Int64) -> f(x, y)
But I am not sure if Julia considers this an example of a type-unstable definition. On the face of it, one of the methods returns an anonymous function, while another returns an Int64. Yet, when the curried version is applied, the final result is also an Int64.
So, my questions are:
Is this code type-stable?
If not, is there a way to have a curried version of a function without writing type-unstable code?
Thanks in advance.
Yes, it is.
According to the official doc, you can investigate it by using the #code_warntype macro:
julia> #code_warntype f(1, 5)
MethodInstance for f(::Int64, ::Int64)
from f(x::Int64, y::Int64) in Main at REPL[2]:1
Arguments
#self#::Core.Const(f)
x::Int64
y::Int64
Body::Int64
1 ─ %1 = (x + y)::Int64
└── return %1
The arguments of this function have the exact type Int64, and as we can see in the Body::Int64, the inferred return type function is Int64.
Furthermore, we have f(x) which is based on the type-stable function f(x, y):
julia> #code_warntype f(1)
MethodInstance for f(::Int64)
from f(x::Int64) in Main at REPL[15]:1
Arguments
#self#::Core.Const(f)
x::Int64
Locals
#3::var"#3#4"{Int64}
Body::var"#3#4"{Int64}
1 ─ %1 = Main.:(var"#3#4")::Core.Const(var"#3#4")
│ %2 = Core.typeof(x)::Core.Const(Int64)
│ %3 = Core.apply_type(%1, %2)::Core.Const(var"#3#4"{Int64})
│ (#3 = %new(%3, x))
└── return #3
Here as well, there's not any unstable defined parameter type.
Look at the following as an example of an unstable-typed function:
julia> unstF(X) = x*5
unstF (generic function with 1 method)
julia> #code_warntype unstF(1)
MethodInstance for unstF(::Int64)
from unstF(X) in Main at REPL[17]:1
Arguments
#self#::Core.Const(unstF)
X::Int64
Body::Any
1 ─ %1 = (Main.x * 5)::Any
└── return %1
If you try this in the REPL, you'll see the Any appears with a red color. Since we have the Body::Any (Any with the red color), we can conclude that the returned object by this function is a non-concrete type object. Because the compiler doesn't know what is the x (Note that the input is X). So the result can be Anything! So this function is type-unstable for (here, for integer inputs. Note that you should investigate the type-stability of your function by your desired input(s). E.g., #code_warntype f(5) and #code_warntype f(5.) should be observed if I can pass it Float64 or Int64 either).
Consider
function foo(x)
x isa Bar || throw(ArgumentError("x is not a Int64..."))
dosomething(x)
end
as opposed to a traditional
function foo(x)
if !(x isa Bar)
throw(ArgumentError("x is not a Bar..."))
end
dosomething(x)
end
(functionally equivalent is !(x isa Int64) && ...) Poking around a few packages, it seems this sort of conditional evaluation is not popular -- at face value, it seems convenient, and I prefer it a bit stylistically.
Is there a consensus on this stylistically? I understand that using the Unicode \in is discouraged over the word "in" by at least the Blue style guide for the sake of readability/ compatibility -- is this something similar?
Is this less performant? At face value, it seems like it would take about the same number of operations. I came across this post as well, but the answers aren't very satisfying, even ignoring that it's now likely outdated.
When asking yourself performance question like this it is usually good idea to peek into compiled code via #code_lowered, #code_typed, #code_llvm, #code_native. Each of those macros explains one step further in the compilation process.
Consider
function x5a(x)
x < 5 && (x=5)
x
end
function x5b(x)
if x < 5
x=5
end
x
end
Let's try #code_lowered
julia> #code_lowered x5a(3)
CodeInfo(
1 ─ x#_3 = x#_2
│ %2 = x#_3 < 5
└── goto #3 if not %2
2 ─ x#_3 = 5
└── goto #3
3 ┄ return x#_3
)
julia> #code_lowered x5b(3)
CodeInfo(
1 ─ x#_3 = x#_2
│ %2 = x#_3 < 5
└── goto #3 if not %2
2 ─ x#_3 = 5
3 ┄ return x#_3
)
Almost identical - will it get simplified further in the compilation process? Let's see!
julia> #code_typed x5a(3)
CodeInfo(
1 ─ %1 = Base.slt_int(x#_2, 5)::Bool
└── goto #3 if not %1
2 ─ nothing::Nothing
3 ┄ %4 = φ (#2 => 5, #1 => x#_2)::Int64
└── return %4
) => Int64
julia> #code_typed x5b(3)
CodeInfo(
1 ─ %1 = Base.slt_int(x#_2, 5)::Bool
└── goto #3 if not %1
2 ─ nothing::Nothing
3 ┄ %4 = φ (#2 => 5, #1 => x#_2)::Int64
└── return %4
) => Int64
Both functions have identical lowered codes which means that they will result in an identical assembly code and hence execute identical sets of CPU instructions.
Regarding the style - it is not documented in official style guidelines so code readability is the criterion and like Bogumil said this style is quite popular.
From my experience:
There should be no performance difference between if and short-circuting evaluation. The choice should be purely stylistic.
the style cond || ... and cond && ... is quite popular. Just if the expression following the || or && is long so that it would not fit a single line then if is used.
i am new in Julia. i read a doc about julia static-analysis. it gives a function.
function foo(x,y)
z = x + y
return 2 * z
end
and use the julia introspection function code_typed get output:
code_typed(foo,(Int64,Int64))
1-element Array{Any,1}:
:($(Expr(:lambda, {:x,:y}, {{:z},{{:x,Int64,0},{:y,Int64,0},{:z,Int64,18}},{}},
:(begin # none, line 2:
z = (top(box))(Int64,(top(add_int))(x::Int64,y::Int64))::Int64 # line 3:
return (top(box))(Int64,(top(mul_int))(2,z::Int64))::Int64
end::Int64))))
it has an Expr . but when i call code_typed, the output is :
code_typed(foo, (Int64,Int64))
1-element Vector{Any}:
CodeInfo(
1 ─ %1 = Base.add_int(x, y)::Int64
│ %2 = Base.mul_int(2, %1)::Int64
└── return %2
) => Int64
it has a CodeInfo.it is different with the output in doc.
does julia have some change? and how can i get the Exprs according to my function and argtypes?
That code snippet appears to be taken from https://www.aosabook.org/en/500L/static-analysis.html, which was published in 2016 (about two years before the release of Julia 1.0) and references Julia version 0.3 from 2015.
Julia 1.0 gives
julia> code_typed(foo,(Int64,Int64))
1-element Array{Any,1}:
CodeInfo(
2 1 ─ %1 = (Base.add_int)(x, y)::Int64 │╻ +
3 │ %2 = (Base.mul_int)(2, %1)::Int64 │╻ *
└── return %2 │
) => Int64
while more current versions such as 1.6 and 1.7 give
julia> code_typed(foo,(Int64,Int64))
1-element Vector{Any}:
CodeInfo(
1 ─ %1 = Base.add_int(x, y)::Int64
│ %2 = Base.mul_int(2, %1)::Int64
└── return %2
) => Int64
(a much more minor change, but nonetheless)
If for any reason, you want this result in the form of an array or vector of Exprs, you seem to be able to get this using (e.g.)
julia> t = code_typed(foo,(Int64,Int64));
julia> t[1].first.code
3-element Vector{Any}:
:(Base.add_int(_2, _3))
:(Base.mul_int(2, %1))
:(return %2)
though this is likely considered an implementation detail and liable to change between minor versions of Julia.
I have a number of very large CSV files which I would like to parse into custom data structures for subsequent processing. My current approach involves CSV.File and then converting each CSV.Row into the custom data structure. It works well for small test cases but gets really inefficient for the large files (GC very high). The problem is in the second step and I suspect is due to type instability. I'm providing a mock example below.
(I'm new to Julia, so apologies if I misunderstood something)
Define data structure and conversion logic:
using CSV
struct Foo
a::Int32
b::Float32
end
Foo(csv_row::CSV.Row) = Foo(csv_row.a, csv_row.b)
Using the default constructor causes 0 allocations:
julia> #allocated foo1 = Foo(1, 2.5)
0
However, when creating the object from CSV.Row all of a sudden 80 bytes are allocated:
julia> data = CSV.File(Vector{UInt8}("a,b\n1,2.5"); threaded = false)
1-element CSV.File{false}:
CSV.Row: (a = 1, b = 2.5f0)
julia> #allocated foo2 = Foo(data[1])
80
In the first case all types are stable:
julia> #code_warntype Foo(1, 2)
Variables
#self#::Core.Compiler.Const(Foo, false)
a::Int64
b::Int64
Body::Foo
1 ─ %1 = Main.Foo::Core.Compiler.Const(Foo, false)
│ %2 = Core.fieldtype(%1, 1)::Core.Compiler.Const(Int32, false)
│ %3 = Base.convert(%2, a)::Int32
│ %4 = Core.fieldtype(%1, 2)::Core.Compiler.Const(Float32, false)
│ %5 = Base.convert(%4, b)::Float32
│ %6 = %new(%1, %3, %5)::Foo
└── return %6
Whereas in the second case they are not:
julia> #code_warntype Foo(data[1])
Variables
#self#::Core.Compiler.Const(Foo, false)
csv_row::CSV.Row
Body::Foo
1 ─ %1 = Base.getproperty(csv_row, :a)::Any
│ %2 = Base.getproperty(csv_row, :b)::Any
│ %3 = Main.Foo(%1, %2)::Foo
└── return %3
So I guess my question is: How can I make the second case type-stable and avoid the allocations?
Providing the types explicitly in CSV.File does not make a difference by the way.
While this does not focus on type stability, I would expect the highest performance combined with flexibility from the following code:
d = DataFrame!(CSV.File(Vector{UInt8}("a,b\n1,2.5\n3,4.0"); threaded = false))
The above efficiently transforms a CSV.File into a type stable structure, additionally avoiding data copying in this process. This should work for your case of huge CSV files.
And now:
julia> Foo.(d.a, d.b)
2-element Array{Foo,1}:
Foo(1, 2.5f0)
Foo(3, 4.0f0)
I would like to see devectorized code of some expression say here
obj.mask = exp.(1.0im*lambda*obj.dt/2.);
How one could print a generic expression in devectorized form in Julia?
I don't think that what you ask for exists (please proof me wrong if I'm mistaken!).
The best you can do is use #code_lowered, #code_typed, #code_llvm, #code_native macros (in particular #code_lowered) to see what happens to your Julia code snippet. However, as Julia isn't translating all dots to explicit for loops internally, non of these snippets will show you a for-loop version of your code.
Example:
julia> a,b = rand(3), rand(3);
julia> f(a,b) = a.*b
f (generic function with 1 method)
julia> #code_lowered f(a,b)
CodeInfo(
1 1 ─ %1 = Base.Broadcast.materialize │
│ %2 = Base.Broadcast.broadcasted │
│ %3 = (%2)(Main.:*, a, b) │
│ %4 = (%1)(%3) │
└── return %4 │
)
So, Julia translates the .* into a Base.Broadcast.broadcasted call. Of course, we can go further and do
julia> #which Base.Broadcast.broadcasted(Main.:*, a, b)
broadcasted(f, arg1, arg2, args...) in Base.Broadcast at broadcast.jl:1139
and check broadcast.jl in line 1139 and so on to trace the actual broadcasted method that will be called (maybe Tim Holy's Rebugger is useful here :D). But as I said before, there won't be a for-loop in it. Instead you will find something like this:
broadcasted(::DefaultArrayStyle{1}, ::typeof(*), r::AbstractRange, x::Number) = range(first(r)*x, step=step(r)*x, length=length(r))
broadcasted(::DefaultArrayStyle{1}, ::typeof(*), r::StepRangeLen{T}, x::Number) where {T} =
StepRangeLen{typeof(T(r.ref)*x)}(r.ref*x, r.step*x, length(r), r.offset)
broadcasted(::DefaultArrayStyle{1}, ::typeof(*), r::LinRange, x::Number) = LinRange(r.start * x, r.stop * x, r.len)
Update
Ok, eventually, I found for-loops in copyto! in broadcast.jl. But this is probably to deep into the rabbit hole.