I wanted to make a macro that generates some code for me. For example:
I have a vector x = [9,8,7] and I want to use a macro to generate this piece of code vcat(x[1], x[2], x[3]) and run it. And I want it to work for arbitrary length vectors.
I have made the macro as below
macro some_macro(a)
quote
astr = $(string(a))
s = mapreduce(aa -> string(astr,"[",aa,"],"), string, 1:length($(a)))
eval(parse(string("vcat(", s[1:(end-1)],")")))
end
end
x = [7,8,9]
@some_macro x
The above works. But when I try to wrap it inside a function
function some_fn(y)
@some_macro y
end
some_fn([4,5,6])
it doesn't work and gives the error
UndefVarError: y not defined
and it highlights the below as the culprit
s = mapreduce(aa -> string(astr,"[",aa,"],"), string, 1:length($(a)))
Edit: see the question "julia: efficient ways to vcat n arrays" for a more advanced example of why I want to do this instead of using the splat operator.
You don't really need macros or generated functions for this. Just use vcat(x...). The three dots are the "splat" operator — it unpacks all the elements of x and passes each as a separate argument to vcat.
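As a minimal sketch of the splat approach (the sample data below is illustrative and not from the question), each element of the collection becomes its own argument to vcat. For long collections, reduce(vcat, ...) is a common alternative that avoids building a very long argument list:

```julia
# Illustrative data: a vector of numbers and a vector of vectors.
x = [9, 8, 7]
nested = [[1, 2], [3], [4, 5]]

# vcat(x...) is the same call as vcat(x[1], x[2], x[3]).
flat = vcat(x...)

# Each inner vector becomes one argument to vcat.
joined = vcat(nested...)

# reduce(vcat, ...) gives the same result without splatting.
joined_reduce = reduce(vcat, nested)
```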
Edit: to more directly answer the question as asked: this cannot be done in a macro. Macros are expanded at parse time, but this transformation requires you to know the length of the array. At global scope and in simple tests it may appear that it's working, but it's only working because the argument is defined at parse time. In a function or in any real use-cases, however, that's not the case. Using eval inside a macro is a major red flag and really shouldn't be done.
Here's a demo. You can create a macro that vcats three arguments safely and easily. Note that you should not construct strings of "code" at all here, you can just construct an array of expressions with the :( ) expression quoting syntax:
julia> macro vcat_three(x)
args = [:($(esc(x))[$i]) for i in 1:3]
return :(vcat($(args...)))
end
@vcat_three (macro with 1 method)
julia> @macroexpand @vcat_three y
:((Main.vcat)(y[1], y[2], y[3]))
julia> f(z) = @vcat_three z
julia> f([[1 2], [3 4], [5 6], [7 8]])
3×2 Array{Int64,2}:
1 2
3 4
5 6
So that works just fine; we esc(x) to get hygiene right and splat the array of expressions directly into the vcat call to generate that argument list at parse time. It's efficient and fast. But now let's try to extend it to support length(x) arguments. Should be simple enough. We'll just need to change 1:3 to 1:n, where n is the length of the array.
julia> macro vcat_n(x)
args = [:($(esc(x))[$i]) for i in 1:length(x)]
return :(vcat($(args...)))
end
@vcat_n (macro with 1 method)
julia> @macroexpand @vcat_n y
ERROR: LoadError: MethodError: no method matching length(::Symbol)
But that doesn't work — x is just a symbol to the macro, and of course length(::Symbol) doesn't mean what we want. It turns out that there's absolutely nothing you can put there that works, simply because Julia cannot know how large x is at compile time.
Your attempt is failing because your macro returns an expression that constructs and evals a string at run-time, and eval does not work in local scopes. Even if that could work, it'd be devastatingly slow… much slower than splatting.
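The scoping problem can be reproduced without any macro. The sketch below assumes a recent Julia 1.x; the function name sees_local and its argument name are hypothetical, chosen so that no global of the same name exists. Because eval always evaluates in the module's global scope, it cannot see the function's local argument:

```julia
# eval looks names up in the module's global scope, never in the
# local scope of the function that calls it.
function sees_local(local_only_arg)
    try
        eval(:local_only_arg)      # searches for a *global* with this name
        return true
    catch err
        err isa UndefVarError || rethrow()
        return false               # the local argument was invisible to eval
    end
end

found = sees_local([4, 5, 6])
```

Here found should be false, which is the same UndefVarError the question ran into, just caught instead of printed.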
If you want to do this with a more complicated expression, you can splat a generator: vcat((elt[:foo] for elt in x)...).
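For instance, assuming x is a vector of dictionaries that each hold a vector under the key :foo (made-up data, purely for illustration), the generator is splatted just like an array would be:

```julia
# Illustrative data: each element carries a vector under the key :foo.
x = [Dict(:foo => [1, 2]), Dict(:foo => [3]), Dict(:foo => [4, 5])]

# The generator yields one vector per element; the splat passes each
# of them to vcat as a separate argument.
combined = vcat((elt[:foo] for elt in x)...)
```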
FWIW, here is the @generated version I mentioned in the comment:
@generated function vcat_something(x, ::Type{Val{N}}) where N
ex = Expr(:call, vcat)
for i = 1:N
push!(ex.args, :(x[$i]))
end
ex
end
julia> vcat_something(x, Val{length(x)})
5-element Array{Float64,1}:
0.670889
0.600377
0.218401
0.0171423
0.0409389
You could also remove the @generated prefix to see what Expr it returns:
julia> vcat_something(x, Val{length(x)})
:((vcat)(x[1], x[2], x[3], x[4], x[5]))
Take a look at the benchmark results below:
julia> using BenchmarkTools
julia> x = rand(100)
julia> @btime some_fn($x)
190.693 ms (11940 allocations: 5.98 MiB)
julia> @btime vcat_something($x, Val{length(x)})
960.385 ns (101 allocations: 2.44 KiB)
The huge performance gap arises mainly because the body of a @generated function is executed only once, at compile time (after the type inference stage), for each N that you pass to it. On later calls with a vector x of the same length N, the for loop is not run again; the specialized compiled code is executed directly:
julia> x = rand(77); # x with a different length
julia> @time some_fn(x);
0.150887 seconds (7.36 k allocations: 2.811 MiB)
julia> @time some_fn(x);
0.149494 seconds (7.36 k allocations: 2.811 MiB)
julia> @time vcat_something(x, Val{length(x)});
0.061618 seconds (6.25 k allocations: 359.003 KiB)
julia> @time vcat_something(x, Val{length(x)});
0.000023 seconds (82 allocations: 2.078 KiB)
Note that we need to pass the length of x as a value type (Val), since Julia can't get that information at compile time: unlike NTuple, Vector only has one type parameter (the element type), so the length is not part of the type.
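The difference between tuples and vectors can be seen in a small sketch, assuming Julia 1.x. The helper names tuple_length and val_length are hypothetical, and Val(n) is the instance form that current Julia favors over passing the type Val{n}:

```julia
# For tuples the length N is part of the type, so dispatch can recover it.
tuple_length(::NTuple{N,Any}) where {N} = N

# For vectors the type records the element type but not the length.
vector_type = typeof([1.0, 2.0, 3.0])

# Val(n) lifts a runtime integer into the type domain so that a method
# (or a generated function) can specialize on it.
val_length(::Val{N}) where {N} = N

n_from_tuple = tuple_length((1, 2.0, "three"))
n_from_val = val_length(Val(length([1.0, 2.0, 3.0])))
```

Both n_from_tuple and n_from_val come out as 3, but only the tuple carried that number in its type without help from the caller.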
EDIT:
See Matt's answer for the right and simplest way to solve the problem. I'm going to leave this post here since it's relevant and might be helpful when dealing with the splatting penalty.
Is there a Collection type in Julia from which both Set and Array derive?
I have both:
julia> supertype(Array)
DenseArray{T,N} where N where T
julia> supertype(DenseArray)
AbstractArray{T,N} where N where T
julia> supertype(AbstractArray)
Any
And:
julia> supertype(Set)
AbstractSet{T} where T
julia> supertype(AbstractSet)
Any
What I am trying to achieve is to write a function that can take either an Array or a Set as its argument, because the type of collection doesn't matter as long as I can iterate over it.
function(Collection{SomeOtherType} myCollection)
for elem in myCollection
doSomeStuff(elem)
end
end
No, there is no Collection type, nor is there an Iterable one.
In theory, what you ask can be accomplished through traits, which you can read about elsewhere. However, I would argue that you should not use traits here, and instead simply refrain from restricting the type of your argument to the function. That is, instead of doing
foo(x::Container) = bar(x)
, do
foo(x) = bar(x)
There will be no performance difference.
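A minimal sketch of the unrestricted version, with a hypothetical function total_digits standing in for the question's doSomeStuff loop: the same method accepts an Array, a Set, a tuple, or anything else that can be iterated.

```julia
# No type annotation on the argument: anything iterable is accepted,
# and Julia still compiles a specialized version per concrete type.
function total_digits(collection)
    n = 0
    for elem in collection
        n += length(string(elem))   # stand-in for doSomeStuff(elem)
    end
    return n
end

from_array = total_digits([10, 200])
from_set = total_digits(Set([10, 200]))
from_tuple = total_digits((10, 200))
```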
If you want to restrict your argument types you could create a type union:
julia> ty = Union{AbstractArray,AbstractSet}
Union{AbstractSet, AbstractArray}
julia> f(aarg :: ty) = 5
f (generic function with 1 method)
This will work on both sets and arrays
julia> f(1:10)
5
julia> f(rand(10))
5
julia> f(Set([1,2,5]))
5
But not on numbers, for example
julia> f(5)
ERROR: MethodError: no method matching f(::Int64)
Closest candidates are:
f(::Union{AbstractSet, AbstractArray}) at REPL[2]:1
This snippet is from the implementation of Rational Numbers in Julia:
# Rational.jl
# ...
Rational{T<:Integer}(n::T, d::T) = Rational{T}(n,d)
Rational(n::Integer, d::Integer) = Rational(promote(n,d)...)
Rational(n::Integer) = Rational(n,one(n))
//(x::Rational, y::Integer) = x.num // (x.den*y) <--- HERE!
# ...
See how the // function is implemented and then used with infix notation? How does this actually return a value?
When I saw this code I interpreted it like this:
1. The // function is called with a Rational and an Integer.
2. But then it makes a recursive call with no other arguments.
#2 is the one that really confuses me. Where does the recursion within data structure end? How does // return a value if it is constantly evaluating nothing?
Please help me understand this.
This works because of one of the most fundamental features of Julia: multiple dispatch. In Julia, functions can have many methods which apply to various combinations of argument types, and when you call a function, Julia invokes the most specific method which matches the type of all the arguments that you called it with. The // call in the method definition you posted defines rational-integer // in terms of integer-integer // – so it isn't actually recursive because the method doesn't call itself, it calls a different method that is part of the same "generic function".
To understand how multiple dispatch works in this case, let's consider the evaluation of the expression (3//4)//6. We'll use the @which macro to see which method each function call invokes.
julia> @which (3//4)//6
//(x::Rational{T<:Integer}, y::Integer) at rational.jl:25
Since 3//4 is a Rational{Int} <: Rational and 6 is an Int <: Integer, and no other more specific methods apply, this method is called:
//(x::Rational, y::Integer) = x.num // (x.den*y)
The current version of the method is actually slightly more complicated than what you posted because it's been modified to check for integer overflow – but it's essentially the same, and it's easier to understand the older, simpler version, so I'll use that. Let's assign x and y to the arguments and see what method the definition calls:
julia> x, y = (3//4), 6
(3//4,6)
julia> x.num
3
julia> x.den*y
24
julia> x.num // (x.den*y)
1//8
julia> @which x.num // (x.den*y)
//(n::Integer, d::Integer) at rational.jl:22
As you can see, this expression doesn't call the same method, it calls a different method:
//(n::Integer, d::Integer) = Rational(n,d)
This method simply calls the Rational constructor which puts the ratio of n and d into lowest terms and creates a Rational number object.
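The same structure can be reproduced with a toy function. In this sketch the name describe_ratio is hypothetical; its second method mirrors the rational-integer // definition by delegating to the integer-integer method rather than to itself:

```julia
# Method 1: the "base case", taking two integers.
describe_ratio(n::Integer, d::Integer) = "integer/integer: $n/$d"

# Method 2: takes a Rational and an Integer, then calls method 1,
# because x.num and x.den * y are both integers.
describe_ratio(x::Rational, y::Integer) = describe_ratio(x.num, x.den * y)

result = describe_ratio(3//4, 6)
builtin = (3//4) // 6
```

result is the string produced by the integer-integer method, which shows that the call inside method 2 dispatched to method 1. builtin is the value the real // computes for the same inputs.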
It is quite common to define one method of a function in terms of another method of the same function, in Julia. This is how argument defaults work, for example. Consider this definition:
julia> f(x, y=1) = 2x^y
f (generic function with 2 methods)
julia> methods(f)
# 2 methods for generic function "f":
f(x) at none:1
f(x, y) at none:1
julia> f(1)
2
julia> f(2)
4
julia> f(2,2)
8
The default argument syntax simply generates a second method with only one argument, which calls the two-argument form with the default value. So f(x, y=1) = 2x^y is exactly equivalent to defining two methods, where the unary method just calls the binary method, supplying a default value for the second argument:
julia> f(x, y) = 2x^y
f (generic function with 1 method)
julia> f(x) = f(x, 1)
f (generic function with 2 methods)
According to http://julia.readthedocs.org/en/latest/manual/integers-and-floating-point-numbers/, one should be able to do this:
julia> Float32(-1.5)
-1.5f0
Instead, I get:
julia> Float32(-1.5)
ERROR: type cannot be constructed
This happens for all other attempts to use this syntax, e.g. x = Int16(1)
I'm on 0.3.10.
You are on 0.3.10 but reading the manual for 0.5. The manual for 0.3 (http://julia.readthedocs.org/en/release-0.3/manual/integers-and-floating-point-numbers/) says instead:
Values can be converted to Float32 easily:
julia> float32(-1.5)
-1.5f0
julia> typeof(ans)
Float32
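For comparison, here is a short sketch of the equivalent conversions on current Julia releases (assuming Julia 1.x, where the lowercase float32 function no longer exists and the type name itself is used as a constructor):

```julia
# Constructor syntax, as shown in the newer manual.
a = Float32(-1.5)

# Explicit conversion and the literal suffix give the same value.
b = convert(Float32, -1.5)
c = -1.5f0

# The same constructor pattern applies to integer types.
d = Int16(1)
```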
In MATLAB/Octave you can issue the command "format long g" to have default numerical output in the REPL formatted like the following:
octave> 95000/0.05
ans = 1900000
Is it possible to get a similar behavior in Julia? Currently with julia
Version 0.3.0-prerelease+3930 (2014-06-28 17:54 UTC)
Commit bdbab62* (6 days old master)
x86_64-redhat-linux
I get the following number format.
julia> 95000/0.05
1.9e6
You can use the @printf macro to format the output. It behaves like C's printf, except that the argument type need not match the format specifier; the value is converted as necessary. For example:
julia> using Printf
julia> @printf("Integer Format: %d",95000/0.05);
Integer Format: 1900000
julia> @printf("As a String: %s",95000/0.05);
As a String: 1.9e6
julia> @printf("As a float with column sized larger than needed:%11.2f",95000/0.05);
As a float with column sized larger than needed: 1900000.00
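When the formatted text is needed as a value rather than printed, the companion macro @sprintf from the same Printf standard library returns a String. A small sketch, assuming Julia 1.x:

```julia
using Printf

x = 95000 / 0.05

# Two digits after the decimal point, no exponent notation.
fixed = @sprintf("%.2f", x)

# Same, right-aligned in a field of width 11.
padded = @sprintf("%11.2f", x)
```

Here fixed is "1900000.00" and padded has one leading space, since the number itself takes ten characters.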
It is possible to use @printf as the default mechanism in the REPL because the REPL is implemented in Julia in Base.REPL, and in particular the following function:
function display(d::REPLDisplay, ::MIME"text/plain", x)
io = outstream(d.repl)
write(io, answer_color(d.repl))
writemime(io, MIME("text/plain"), x)
println(io)
end
To modify the way Float64 is displayed, you merely need to redefine writemime for Float64.
julia> 95000/0.05
1.9e6
julia> Base.Multimedia.writemime(stream,::MIME"text/plain",x::Float64)=@printf(stream,"%1.2f",x)
writemime (generic function with 13 methods)
julia> 95000/0.05
1900000.00
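On later Julia versions writemime no longer exists; its role is played by the three-argument show(io, mime, x). The following is only a sketch under that assumption (Julia 1.x). It redefines display for a Base type, which is type piracy and affects every Float64 shown in the session, so it is better suited to interactive experiments than to library code:

```julia
using Printf

# Assumption: Julia 1.x, where show(io, mime, x) is what the REPL calls.
# Caution: this changes how every Float64 is displayed as text/plain.
Base.show(io::IO, ::MIME"text/plain", x::Float64) = @printf(io, "%.2f", x)

# repr with a MIME type goes through the same show method.
shown = repr(MIME("text/plain"), 95000 / 0.05)
```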
Apologies if this is rather general, albeit still a coding question.
With a bit of time on my hands I've been trying to learn a bit of Julia. I thought a good start would be to copy the R microbenchmark function - so I could seamlessly compare R and Julia functions.
e.g. this is microbenchmark output for 2 R functions that I am trying to emulate:
Unit: seconds
expr min lq median uq max neval
vectorised(x, y) 0.2058464 0.2165744 0.2610062 0.2612965 0.2805144 5
devectorised(x, y) 9.7923054 9.8095265 9.8097871 9.8606076 10.0144012 5
So thus far in Julia I am trying to write idiomatic and hopefully understandable/terse code. Therefore I replaced a double loop with a list comprehension to create an array of timings, like so:
function timer(fs::Vector{Function}, reps::Integer)
# funs=length(fs)
# times = Array(Float64, reps, funs)
# for funsitr in 1:funs
# for repsitr in 1:reps
# times[repsitr, funsitr] = @elapsed fs[funsitr]()
# end
# end
times= [@elapsed fs[funs]() for x=1:reps, funs=1:length(fs)]
return times
end
This gives an array of timings for each of 2 functions:
julia> test=timer([vec, devec], 10)
10x2 Array{Float64,2}:
0.231621 0.173984
0.237173 0.210059
0.26722 0.174007
0.265869 0.208332
0.266447 0.174051
0.266637 0.208457
0.267824 0.174044
0.26576 0.208687
0.267089 0.174014
0.266926 0.208741
My question (finally) is how do I idiomatically apply a function such as min, max, median across columns (or rows) of an array without using a loop?
I can of course do it easily for this simple case with a loop (similar to the one I commented out above), but I can't find anything in the docs that is equivalent to, say, apply(array, 1, fun) or even colMeans.
The closest generic sort of function I can think of is
julia> [mean(test[:,col]) for col=1:size(test)[2]]
2-element Array{Any,1}:
0.231621
0.237173
.. but the syntax really really doesn't appeal. Is there a more natural way to apply functions across columns or rows of a multidimensional array in Julia?
The function you want is mapslices.
Anonymous functions are currently slow in Julia, so I would not use them for benchmarking unless you are benchmarking anonymous functions themselves. Otherwise you will get misleading performance predictions for code that does not use anonymous functions in its performance-critical parts.
I think you want the two argument version of the reduction functions, like sum(arr, 1) to sum over the first dimension. If there isn't a library function available, you might use reducedim
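As a sketch of both suggestions on current Julia (assuming Julia 1.1 or later, where the dimension is passed as the dims keyword rather than as a positional argument, and where eachcol is available), using a small made-up timing matrix:

```julia
using Statistics

# Illustrative timings: rows are repetitions, columns are functions.
times = [0.23 0.17;
         0.26 0.21;
         0.27 0.17]

# mapslices applies a function to every slice along the given dimension.
col_medians = mapslices(median, times; dims=1)   # 1x2 matrix, one value per column
row_maxima = mapslices(maximum, times; dims=2)   # 3x1 matrix, one value per row

# Many reductions accept dims directly, which avoids mapslices entirely.
col_means = mean(times; dims=1)
col_minima = minimum(times; dims=1)

# eachcol iterates over columns, so a comprehension also works.
per_column = [quantile(col, 0.5) for col in eachcol(times)]
```

The dims-based reductions keep the reduced dimension (giving a 1x2 matrix), while the comprehension over eachcol returns a plain vector.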
I think @ivarne has the right answer (and I have ticked it), but I'll just add that I made an apply-like function:
function aaply(fun::Function, dim::Integer, ar::Array)
if !(1 <= dim <= 2)
error("rows is 1, columns is 2")
end
if(dim==1)
res= [fun(ar[row, :]) for row=1:size(ar)[dim]]
end
if(dim==2)
res= [fun(ar[:,col]) for col=1:size(ar)[dim]]
end
return res
end
this then gets what I want like so:
julia> aaply(quantile, 2, test)
2-element Array{Any,1}:
[0.231621,0.265787,0.266542,0.267048,0.267824]
[0.173984,0.174021,0.191191,0.20863,0.210059]
where quantile is a built-in that gives min, lq, median, uq, and max.. just like microbenchmark.
EDIT: Following the advice here, I tested the built-in function mapslices, which works much like R's apply, and benchmarked it against the function above. Note that mapslices with dim=1 slices by column, whereas test[:,1] is the first column... so the opposite of R, even though the indexing is the same?
# nonsense test data big columns
julia> ar=ones(Int64,1000000,4)
1000000x4 Array{Int64,2}:
# built in function
julia> ms()=mapslices(quantile,ar,1)
ms (generic function with 1 method)
# my apply function
julia> aa()=aaply(quantile, 2, ar)
aa (generic function with 1 method)
# compare both functions
julia> aaply(quantile, 2, timer1([ms, aa], 40))
2-element Array{Any,1}:
[0.23566,0.236108,0.236348,0.236735,0.243008]
[0.235401,0.236058,0.236257,0.236686,0.238958]
So the two functions are about as quick as each other. From reading bits of the Julia mailing list, there seems to be a plan to rework this part of the language so that slices are taken by reference rather than by making a new copy of each slice (column, row, etc.)...