I have a program for doing Fourier series and I wanted to switch to CuArrays to make it faster. The code is as follows (extract):
#Arrays I want to use
coord = CuArray{ComplexF64,1}(complex.(a[:,1],a[:,2]))
t=CuArray{Float64,1}(-L:(2L/(N-1)):L)
#Array of indexes in the form [0,1,-1,2,-2,...]
n=[((-1)^i)div(i,2) for i in 1:grado]
#Array of functions I need for calculations
base= [x -> exp(π * im * i * x / L) / L for i in n]
base[i](1.) #This line is OK
base[i](-1:.1:1) #This line is OK
base[i].(t) #This line gives error!
base[i].(CuArray{Float64,1}(t)) #This line gives error!
And the error is:
GPU broadcast resulted in non-concrete element type Any.
This probably means that the function you are broadcasting contains an error or type instability.
If I change it like this
base= [(x::Float64) -> (exp(π * im * i * x / L) / L)::ComplexF64 for i in n]
the same lines still give error, but the error now is:
UndefVarError: parameters not defined
Any idea how I could fix this?
Thank you in advance!
Package information:
(@v1.6) pkg> st CUDA
Status `C:\Users\marce\.julia\environments\v1.6\Project.toml`
[052768ef] CUDA v2.6.2
P.S.: This other function has the same problem:
function integra(inizio, fine, arr)
N=size(arr,1)
h=(fine-inizio)/N
integrale=sum(arr)
integrale -= (first(arr)+last(arr))/2
integrale *= h
end
L=2
integra(-L,L,coord)
The first and easier problem is that you should take care to declare global variables to be constant so that the compiler can assume a constant type: const L = 2. A mere L = 2 allows you to do something like L = SomeOtherType(), and if that type can be Anything, so must the return type of your functions. On the CPU that's only a performance hit, but it's a no-no for the GPU. If you actually want L to vary in value, pass it in as an argument so the compiler can still infer types within a function.
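A minimal sketch of the "pass it in as an argument" idea, run on the CPU only. The helper names make_basis and build_basis are hypothetical and not from the original post, and I have not checked this against CUDA.jl; the point is only that the closure captures values of known type, so broadcasting it yields a concrete element type.

```julia
# Hypothetical helper (not from the original post): L is an argument,
# so the closure captures k and L with known, concrete types.
make_basis(k::Integer, L::Real) = x -> exp(π * im * k * x / L) / L

function build_basis(grado::Integer, L::Real)
    # Indices in the order [0, 1, -1, 2, -2, ...]
    n = [(-1)^i * div(i, 2) for i in 1:grado]
    return [make_basis(k, L) for k in n]
end

base = build_basis(5, 2.0)
t = range(-2.0, 2.0; length = 11)

# CPU broadcast; the element type is inferred as ComplexF64 rather than Any.
vals = base[2].(t)
```

If L never changes, declaring const L = 2.0 at top level should have the same effect on inference as passing it in.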
Your ::ComplexF64 assertion did actually force a concrete return type, though the middle of the function is still type unstable (check with @code_warntype). The second problem you ran into after that patch was probably caused by this recently patched conflict between ExprTools.jl and LLVM.jl. Seems like you just need to update the packages or maybe reinstall them.
I am defining a variable phi_old at the start of my function, and assigning it a value of phi_start.
I then have an iterative loop that:
uses phi_old
solves a PDE to get a solution T
uses the solution T to calculate phi_new
checks the difference between phi_new and phi_old, if it is larger than error the iterative loop begins again, with phi_old using the values of phi_new.
if the difference is less than the error, the iteration stops.
But I am getting this error:
Warning: Assignment to `T` in soft scope is ambiguous because a global variable by the same name exists:
ERROR: LoadError: UndefVarError: phi_old not defined
Here is an example of my code's structure:
phi_old = createCellVariable(m,phi_start)
for i=1:i_end
phi_new = myfunction(T)
error = sum(phi_new.value[2,1:end]-phi_old.value[2,1:end])
if error > 1E-03
phi_old = phi_new
else
i=i+1
break
end
end
Are there any techniques I could use to better initialize/assign my arrays and fix this error? The arrays are quite large, so I would like to preallocate if possible, and only copy arrays if necessary.
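One common way to avoid the soft-scope warning is to put the loop inside a function, so that phi_old and phi_new are ordinary locals. Below is a minimal sketch under stated assumptions: update! is a hypothetical stand-in for "solve the PDE and compute phi_new", plain arrays replace the cell variables, and the convergence test uses the maximum absolute difference rather than the signed sum from the question.

```julia
# Hypothetical stand-in for "solve the PDE, then compute phi_new";
# it writes its result into `out` instead of returning a new array.
update!(out, phi) = (out .= 0.5 .* phi .+ 1.0)

function iterate_phi(phi_start; tol = 1e-3, max_iter = 100)
    phi_old = copy(phi_start)      # one copy up front
    phi_new = similar(phi_old)     # preallocated work buffer
    for i in 1:max_iter
        update!(phi_new, phi_old)
        # abs so that positive and negative differences cannot cancel
        err = maximum(abs.(phi_new .- phi_old))
        err <= tol && return phi_new, i
        # swap the two buffers instead of copying the data
        phi_old, phi_new = phi_new, phi_old
    end
    return phi_old, max_iter       # not converged: latest iterate
end

phi, iters = iterate_phi(zeros(4))
```

Because everything is local to iterate_phi, no global declarations are needed, and the two buffers are reused on every iteration.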
I have GDL v0.9.9 installed on my machine running Ubuntu 20.04. I am trying to write a function that calculates a Julian Date.
list_1 file:
FUNCTION JulianDate, L, M, N
L1 = L + 4716 - FLOOR((14 - M) / 12)
M1 = (M + 9) MOD 12
G = FLOOR(3/4*FLOOR((L1 + 184)/100)) - 38
RETURN, FLOOR(365.25 * L1) + FLOOR(30.6 * M1 + 0.4) + N - G - 1402
END
JD = JulianDate(1990, 4, 30)
Unfortunately, after $ gdl list_1 I get the error(s):
% Programs can't be compiled from single statement mode.
% Variable is undefined: L
% Execution halted at: $MAIN$
% Variable is undefined: M
% Execution halted at: $MAIN$
% Variable is undefined: L1
% Execution halted at: $MAIN$
% Return statement in procedures cannot have values.
% Parser syntax error: unexpected token: END
% Ambiguous: Variable is undefined: JULIANDATE or: Function not found: JULIANDATE
% Execution halted at: $MAIN$
I compared my code with the IDL documentation (as the two should be "mutually intelligible") and it looked good to me. Is declaring/defining/calling a function in GDL different than in IDL?
Edit 1: I managed to run my code on the machine with IDL 8.5. The functions must be declared in the beginning of the file before any variables. However it does not change the behaviour of GDL.
idl and idlde (GUI) behave differently: while idl requires functions to be stored in separate .pro files, idlde handles functions in the same file as the run script, as long as they are declared before any other commands.
gdl behaves just like idl, so moving the function declarations into separate files resolved the issue.
Following the approach of this answer I am trying to understand what happens exactly and how expressions and generated functions work in Julia within the concept of metaprogramming.
The goal is to optimize a recursive function using expressions and generated functions (for a concrete example you can have a look at the question answered in the link provided above).
Consider the following modified fibonacci function, in which I want to compute the fibonacci series up to n and multiply it by a number p.
The straightforward, recursive implementation would be
function fib(n::Integer, p::Real)
if n <= 1
return 1 * p
else
return n * fib(n-1, p)
end
end
As a first step, I could define a function which returns an expression instead of the computed value
function fib_expr(n::Integer, p::Symbol)
if n <= 1
return :(1 * $p)
else
return :($n * $(fib_expr(n-1, p)))
end
end
which, e.g. returns something like
julia> ex = fib_expr(3, :myp)
:(3 * (2 * (1myp)))
In this way I get an expression which is fully expanded and depends on the value assigned to the symbol myp. In this way I do not see the recursion anymore, basically I am metaprogramming: I created a function that creates another "function" (in this case we call it expression though).
I can now set myp = 0.5 and call eval(ex) to compute the result.
However, this is slower than the first approach.
What I can do though, is to generate a parametric function in the following way
@generated function fib_gen{n}(::Type{Val{n}}, p::Real)
return fib_expr(n, :p)
end
And magically, calling fib_gen(Val{3}, 0.5) gets things done, and is incredibly fast.
So, what is going on?
To my understanding, in the first call to fib_gen(Val{3}, 0.5), the parametric function fib_gen{Val{3}}(...) gets compiled and its content is the fully expanded expression obtained through fib_expr(3, :p), i.e. 3*2*1*p with p substituted with the input value.
The reason why it is so fast then, is because fib_gen is basically just a series of multiplications, whereas the original fib has to allocate on the stack every single recursive call making it slower, am I correct?
To give some numbers, here is my short benchmark using BenchmarkTools.
julia> @benchmark fib(10, 0.5)
...
mean time: 26.373 ns
...
julia> p = 0.5
0.5
julia> @benchmark eval(fib_expr(10, :p))
...
mean time: 177.906 μs
...
julia> @benchmark fib_gen(Val{10}, 0.5)
...
mean time: 2.046 ns
...
I have many questions:
Why is the second case so slow?
What exactly is ::Type{Val{n}}, and what does it mean? (I copied that from the answer linked above)
Because of the JIT compiler, sometimes I am lost in what happens at compile-time and at run-time, as it is the case here...
Furthermore, I tried to combine fib_expr and fib_gen in a single function according to
@generated function fib_tot{n}(::Type{Val{n}}, p::Real)
if n <= 1
return :(1 * p)
else
return :(n * fib_tot(Val{n-1}, p))
end
end
which however is slow
julia> @benchmark fib_tot(Val{10}, 0.5)
...
mean time: 4.601 μs
...
What am I doing wrong here? Is it even possible to combine fib_expr and fib_gen in a single function?
I realize this is more of a monograph than a question; however, even though I have read the metaprogramming section a few times, I am having a hard time grasping everything, in particular with an applied example such as this one.
A monograph in response:
Metaprogramming basics
It will be easier to start with "normal" macros first. I'll relax the definition you used a bit:
function fib_expr(n::Integer, p)
if n <= 1
return :(1 * $p)
else
return :($n * $(fib_expr(n-1, p)))
end
end
That allows passing in more than just symbols for p, such as integer literals or whole expressions. Given this, we can define a macro for the same functionality:
macro fib_macro(n::Integer, p)
fib_expr(n, p)
end
Now, if @fib_macro 45 1 is used anywhere in the code, at compile time it will first be replaced by a long nested expression
:(45 * (44 * ... * (1 * 1)) ... )
and then compiled normally -- to a constant.
That's all there is to macros, really: replacing syntax at compile time. By recursion, this can be an arbitrarily long alternation between compiling and evaluating functions on expressions. For things that are essentially constant but tedious to write otherwise, it is very useful: a good example is Base.Math.@evalpoly.
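To see the replacement happen, here is a small self-contained sketch that repeats the two definitions above and uses a smaller n. The variable names expanded and value are just for illustration.

```julia
function fib_expr(n::Integer, p)
    if n <= 1
        return :(1 * $p)
    else
        return :($n * $(fib_expr(n - 1, p)))
    end
end

macro fib_macro(n::Integer, p)
    fib_expr(n, p)
end

# The nested product expression the macro call is replaced with.
expanded = @macroexpand @fib_macro 4 2

# The macro call itself: 4 * (3 * (2 * (1 * 2)))
value = @fib_macro 4 2
```

Both arguments are literals here, which is exactly the situation where a plain macro is enough.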
Evaluation at runtime?
But it has the problem that you cannot inspect values which are only known at runtime: you can't implement fib(n) = @fib_macro n 1, since at compile time, n is a symbol representing the parameter, and not a number you can dispatch on.
The next best solution to this would be to use
fib_eval(n::Integer) = eval(fib_expr(n, 1))
which works, but will repeat the compilation process every time it is called -- and that is much more overhead than the original function, since now at runtime, we perform the whole recursion on the expression tree and then call the compiler on the result. Not good.
Method dispatch & compilation
So we need a way to intermingle runtime and compile time. Enter @generated functions. These will at runtime dispatch on a type, and then work like a macro defining the function body.
First about type dispatch. If we have
f(x) = x + 1
and have a function call f(1), about the following will happen:
The type of the argument is determined (Int)
The method table of the function is consulted to find the best matching method
The method body is compiled for the specific Int argument type, if that hasn't been done before
The compiled method is evaluated on the concrete argument
If we then enter f(1.0), the same will happen again, with a new, different specialized method being compiled for Float64, based on the same function body.
Value types & singleton types
Now, Julia has the peculiar feature that you can use numbers as types. That means that the dispatch process outlined above will also work on the following function:
g(::Type{Val{N}}) where N = N + 1
That's a bit tricky. Remember that types are themselves values in Julia: Int isa Type.
Here, Val{N} is for every N a so-called singleton type having exactly one instance, namely Val{N}() -- just like Int is a type having many instances 0, -1, 1, -2, ....
Type{T} is also a singleton type, having as its single instance the type T. Int is a Type{Int}, and Val{3} is a Type{Val{3}} -- in fact, both are the only values of their type.
So, for each N, there is a type Val{N}, being the single instance of Type{Val{N}}. Thus, g will be dispatched and compiled for each single N. This is how we can dispatch on numbers as types. This already allows for optimization:
julia> @code_llvm g(Val{1})
define i64 @julia_g_61158(i8**) #0 !dbg !5 {
top:
ret i64 2
}
julia> @code_llvm f(1)
define i64 @julia_f_61076(i64) #0 !dbg !5 {
top:
%1 = shl i64 %0, 2
%2 = or i64 %1, 3
%3 = mul i64 %2, %0
%4 = add i64 %3, 2
ret i64 %4
}
But remember that it requires compilation for each new N at the first call.
(And fkt(::T) is just short for fkt(x::T) if you don't use x in the body.)
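The relationships between Val{N}, its single instance, and Type{Val{N}} can be checked directly. A small sketch follows; the function h and the variable names are only for illustration, with h showing the instance-based form that takes Val(1) instead of Val{1}.

```julia
# Dispatch on the type Val{N} itself, as in the text above.
g(::Type{Val{N}}) where {N} = N + 1

# Illustrative variant: dispatch on the single instance Val{N}().
h(::Val{N}) where {N} = N + 1

from_type     = g(Val{1})        # argument is the type Val{1}
from_instance = h(Val(1))        # argument is the instance Val{1}()

type_is_singleton_member = Val{3} isa Type{Val{3}}
instance_of_val          = Val{3}() isa Val{3}
int_is_type              = Int isa Type{Int}
```

Each distinct N gets its own compiled specialization of g and h on first call.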
Integrating generating functions and value types
Finally to generated functions. They work as a slight modification of the above dispatch pattern:
The type of the argument is determined (Int)
The method table of the function is consulted to find the best matching method
The method body is treated as a macro and called with the Int argument type as a parameter, if that hasn't been done before. The resulting expression is compiled into a method.
The compiled method is evaluated on the concrete argument
This pattern allows changing the implementation for each type on which the function is dispatched.
For our concrete setting, we want to dispatch on the Val types representing the arguments of the Fibonacci sequence:
@generated function fib_gen{n}(::Type{Val{n}}, p::Real)
return fib_expr(n, :p)
end
You now see that your explanation was exactly right:
in the first call to fib_gen(Val{3}, 0.5), the parametric function
fib_gen{Val{3}}(...) gets compiled and its content is the fully
expanded expression obtained through fib_expr(3, :p), i.e. 3*2*1*p
with p substituted with the input value.
I hope that the whole story has also answered all three of your listed questions:
The implementation using eval replicates the recursion every time, plus the overhead of compilation
Val is a trick to lift numbers to types, and Type{T} the singleton type containing only T -- but I hope the examples were helpful enough
Compile time is not a single phase before execution, because of JIT: it happens each time a method is compiled for the first time, which is when it is first called.
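For reference, a self-contained sketch of the whole chain, written with the where {n} method syntax used by current Julia versions. fib_plain is just a renamed copy of the recursive function from the question, included for comparison.

```julia
function fib_expr(n::Integer, p)
    if n <= 1
        return :(1 * $p)
    else
        return :($n * $(fib_expr(n - 1, p)))
    end
end

# The body runs once per distinct n and returns the expression to compile.
@generated function fib_gen(::Type{Val{n}}, p::Real) where {n}
    return fib_expr(n, :p)
end

# Plain recursive version for comparison.
fib_plain(n::Integer, p::Real) = n <= 1 ? 1 * p : n * fib_plain(n - 1, p)

result = fib_gen(Val{10}, 0.5)
```

Here result agrees with fib_plain(10, 0.5), i.e. the 1.8144e6 that appears in the benchmark output.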
First of all, I agree with the comments: your question is very well written and constructive.
I have reproduced your results using Julia 0.7-beta.
Difference between @generated fib_tot (one piece of code) and fib_gen (that calls fib_expr)
With my Julia version the results are identical:
julia> @btime fib_tot(Val{10},0.5)
0.042 ns (0 allocations: 0 bytes)
1.8144e6
julia> @btime fib_gen(Val{10},0.5)
0.042 ns (0 allocations: 0 bytes)
1.8144e6
Breaking a function into multiple parts can sometimes be useful (see the performance tips in the official documentation); however, in your particular case I do not see why it would help. At compile time Julia has everything it needs to optimize fib_tot. There is a branch if n<=1; however, n is known at "compile time" thanks to the Type{Val{n}} trick, so this branch should be removed without any problem in the generated (specialized) code.
The Type{Val{n}} trick
To specialize functions, Julia inference is performed according to argument type and not according to argument value.
For instance a compiled version of foo(n::Int) = ... is not generated for each n value. You must define a type that depends on n value to reach this goal. This is precisely how Type{Val{n}} works: Val{n} is simply a parametrized empty structure:
struct Val{T} end
Hence, each of Val{1}, Val{2}, ... Val{100}, ... is a different type. Consequently, if foo is defined as:
foo(::Type{Val{n}}) where {n} = ...
Each foo(Val{1}), foo(Val{2}), ... foo(Val{100}) will trigger a specialized foo version (because argument type is different).
The eval(fib_expr(n, 1)) case
This
julia> @btime eval(fib_expr(10, :p))
401.651 μs (99 allocations: 6.45 KiB)
1.8144e6
is slow because your expression is (re-)compiled every time. The problem can be avoided if you use a macro instead (see phg's answer).
The fib version
julia> @btime fib(10,0.5)
30.778 ns (0 allocations: 0 bytes)
1.8144e6
There is only one compiled version of this fib function. Consequently, it must contain all the runtime branch tests and recursive calls, which explains why it is slower.
Just a remark about:
foo{n}(::Type{Val{n}}) deprecated syntax
The foo{n}(::Type{Val{n}}) syntax is deprecated, the new one is foo(::Type{Val{n}}) where {n}. You can read Julia doc, parametric methods for further details.
My Julia version:
julia> versioninfo()
Julia Version 0.7.0-beta.0
Commit f41b1ecaec (2018-06-24 01:32 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: Intel(R) Xeon(R) CPU E5-2603 v3 @ 1.60GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-6.0.0 (ORCJIT, haswell)
I have some code that runs with no problems without parallelization. However, the same code generates exceptions if I try to run it using PSeq instead of Seq. The messages I get look a bit random, they are hard to replicate exactly.
Here is the code. When the exception happens the three lines starting with let tmp2 are highlighted.
let frameToRMatrix (df: Frame<'R,string>) =
let foo k df : float list =
df
|> Frame.getCol k
|> Series.values
|> List.ofSeq
let folder acc k = (k, foo k df |> box) :: acc
let tmp =
List.fold folder [] (df.ColumnKeys |> List.ofSeq)
|> namedParams
let sd = df |> Frame.getCol "Vol0" |> Series.lastValue
let sd = sd * 1000.0 |> int
printfn "%s" "I was here"
let rand = System.Random(sd)
let rms = rand.Next(500)
System.Threading.Thread.Sleep rms
let tmp2 =
tmp
|> R.cbind // This line prints something on the console the first time it is executed
printfn "%s" "And here too"
tmp2
The code above includes random number generation and calls to System.Threading.Thread.Sleep. If I do not include this code, which is not needed under sequential execution, I get a message:
System.ArgumentException: 'An item with the same key has already been added.'
and the following on the console:
I was here
I was here
[1] 4095
So execution never gets to the And here too lines.
When I include the random number generator and the call to sleep I get different results, which seem to depend on the build options.
Below are four examples, with the build options, error message and what I see on the console. Notice that in all examples there are four instances of I was here but only three instances of And here too.
---------------------------------------
Any CPU with Prefer 32-bit checked
System.Runtime.InteropServices.SEHException: 'External component has thrown an exception.'
I was here
I was here
[1] 4095
And here too
I was here
And here too
I was here
And here too
Warning: stack imbalance in 'lazyLoadDBfetch', 66 then 65
Error in value[[3L]](cond) : unprotect_ptr: pointer not found
---------------------------------------
Any CPU with Prefer 32-bit unchecked
System.Runtime.InteropServices.SEHException: 'External component has thrown an exception.'
I was here
I was here
[1] 1.759219e+13
And here too
I was here
And here too
I was here
And here too
Error: cons memory exhausted (limit reached?)
Error: cons memory exhausted (limit reached?)
---------------------------------------
x86
System.AccessViolationException: 'Attempted to read or write protected memory. This is often an indication that other memory is corrupt.'
I was here
I was here
[1] 4095
And here too
I was here
And here too
I was here
And here too
---------------------------------------
x64
Exception thrown: 'System.AccessViolationException' in Unknown Module. Attempted to read or write protected memory.
$$$ - MachineLearning.signal: Calculating signal for ticker AAPL
$$$ - MachineLearning.signal: Calculating signal for ticker AAPL
I was here
I was here
[1] 1.759219e+13
And here too
I was here
And here too
I was here
And here too
Error in loadNamespace(name) :
no function to return from, jumping to top level
Based on my experience with debugging subtle issues with threading in the R type provider, I think the answer is no - sadly, the R native interop layer is not thread-safe and so you cannot call it from multiple threads in your F# application.
I think that the standard way of running R in parallel is to spawn multiple R.exe processes doing the work. I don't think you can easily initialise multiple independent R processes from F#, so your best bet is probably to create multiple .NET processes that each controls one R engine.