Nested async and sync in Julia

I want to run multiple (A, B) task pairs that are independent of each other. Each Task A consists of multiple subtasks a. Within one (A, B) pair, Task A has to finish before Task B starts. So it's like:
@async for loop do multi (A,B)s
    @sync do one (A,B)
        @async for loop do Task A
            do Task a
        do Task B
How can I achieve this?
I have tried:
b = Vector{String}(undef, 6)
ta = @async for i in range(1, length(b))
    a = Vector{String}(undef, 6)
    @async for j in range(1, length(a)) # task A
        a[j] = "hello "
        sleep(1) # task a
    end
    # task B
    b[i] = prod(a) * "world!"
    sleep(1)
end
@time wait(ta)
Fail: Task B does not wait for Task A in the same pair.
b = Vector{String}(undef, 6)
ta = @async for i in range(1, length(b))
    a = Vector{String}(undef, 6)
    for j in range(1, length(a)) # task A
        @async begin
            a[j] = "hello "
            sleep(1) # task a
        end
    end
    # task B
    b[i] = prod(a) * "world!"
    sleep(1)
end
@time wait(ta)
Fail: Task B does not wait for Task A in the same pair.
b = Vector{String}(undef, 6)
ta = @async for i in range(1, length(b))
    a = Vector{String}(undef, 6)
    taA = @async for j in range(1, length(a)) # task A
        a[j] = "hello "
        sleep(1) # task a
    end
    wait(taA)
    # task B
    b[i] = prod(a) * "world!"
    sleep(1)
end
@time wait(ta)
Fail: the (A,B) pairs do not run asynchronously to each other.
b = Vector{String}(undef, 6)
ta = @async for i in range(1, length(b))
    a = Vector{String}(undef, 6)
    taA = @async for j in range(1, length(a)) # task A
        a[j] = "hello "
        sleep(1) # task a
    end
    while !istaskdone(taA)
        sleep(0.5)
    end
    b[i] = prod(a) * "world!"
    sleep(1)
end
@time wait(ta)
Fail: the (A,B) pairs do not run asynchronously to each other.

If I understand your question correctly, you want to have:
@sync for (A, B) in ABs
    @async begin
        @sync for a in A
            @async do_task_A(a)
        end
        do_task_B(B)
    end
end
For an example, consider the function:
function do_task(t, x)
    sleep(1/x)
    println("Done $t : $x")
    flush(stdout)
end
Let's run it!
julia> @time @sync for (A, B) in [([1,2], 100), ([3,4], 101),([5,6], 102)]
           @async begin
               @sync for a in A
                   @async do_task(:A, a)
               end
               do_task(:B, B)
           end
       end
Done A : 6
Done A : 5
Done B : 102
Done A : 4
Done A : 3
Done B : 101
Done A : 2
Done A : 1
Done B : 100
1.079950 seconds (20.25 k allocations: 1020.699 KiB, 3.52% compilation time)
You can see that tasks got executed asynchronously exactly in the order you requested.
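The same pattern can be applied to the vectors from the question. The following is only a minimal sketch: the function name run_pairs, its parameters, and the shortened delay are assumptions made for illustration, and join is used in place of prod to concatenate the strings.

```julia
# Sketch: every (A, B) pair runs concurrently; inside a pair, B waits for all subtasks a.
# `run_pairs`, `npairs`, `nsub`, and `delay` are hypothetical names, not from the question.
function run_pairs(npairs, nsub; delay=0.1)
    b = Vector{String}(undef, npairs)
    @sync for i in 1:npairs               # outer @sync: wait for all pairs
        @async begin                      # each pair is its own task
            a = Vector{String}(undef, nsub)
            @sync for j in 1:nsub         # inner @sync: Task A = all subtasks a
                @async begin
                    sleep(delay)          # subtask a
                    a[j] = "hello "
                end
            end
            sleep(delay)                  # Task B starts only after Task A is done
            b[i] = join(a) * "world!"
        end
    end
    return b
end

b = run_pairs(3, 4; delay=0.1)
println(b[1])
```

Because the subtasks within a pair and the pairs themselves overlap, the whole call should take roughly two delays rather than the sum of all sleeps.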

All credit for the answer goes to @Przemyslaw Szufel; I just sum it up as a note:
@sync for loop do task A
    @async do task(a)
do task(B)
Then Task B will wait until Task A finishes, and the multiple tasks a inside Task A will run asynchronously to each other.
This is the building block for running multiple tasks a and then Task B.
If you want to nest it, then just:
@sync for loop do Task (A,B)
    @async begin
        @sync for loop do task A
            @async do task(a)
        do task(B)
    end
do task C
Use the @sync macro before the for loop and the @async macro to wrap the body of the for loop. The small tasks in the for loop will then run asynchronously to each other, but the code after the for loop will wait until the for loop finishes.
Example code from @Przemyslaw Szufel:
function do_task(t, x)
    println("start $t : $x")
    flush(stdout)
    sleep(1 / x)
    println("Done $t : $x")
    flush(stdout)
end
@time @sync for (A, B) in [([1, 2], 100), ([3, 4], 101), ([5, 6], 102)]
    @async begin
        @sync for a in A
            @async do_task(:A, a)
        end
        do_task(:B, B)
    end
end

Related

Why do two ways of executing Julia programs give different results?

If I run a program written in Julia as
sachin@localhost:$ julia mettis.jl
then it runs successfully without printing anything, even though there is a print statement in it.
Secondly, if I run it from within Julia:
sachin@localhost:$ julia
julia> include("mettis.jl")
main (generic function with 1 method)
julia> main()
then it gives an error.
I am puzzled why the two ways of executing it give different results.
Here is my code:
using ITensors
using Printf
function ITensors.op(::OpName"expτSS", ::SiteType"S=1/2", s1::Index, s2::Index; τ)
    h =
        1 / 2 * op("S+", s1) * op("S-", s2) +
        1 / 2 * op("S-", s1) * op("S+", s2) +
        op("Sz", s1) * op("Sz", s2)
    return exp(τ * h)
end
function main(; N=10, cutoff=1E-8, δτ=0.1, beta_max=2.0)
    # Make an array of 'site' indices
    s = siteinds("S=1/2", N; conserve_qns=true)
    # @show s
    # Make gates (1,2),(2,3),(3,4),...
    gates = ops([("expτSS", (n, n + 1), (τ=-δτ / 2,)) for n in 1:(N - 1)], s)
    # Include gates in reverse order to complete Trotter formula
    append!(gates, reverse(gates))
    # Initial state is infinite-temperature mixed state
    rho = MPO(s, "Id") ./ √2
    @show inner(rho, H)
    # Make H for measuring the energy
    terms = OpSum()
    for j in 1:(N - 1)
        terms += 1 / 2, "S+", j, "S-", j + 1
        terms += 1 / 2, "S-", j, "S+", j + 1
        terms += "Sz", j, "Sz", j + 1
    end
    H = MPO(terms, s)
    # Do the time evolution by applying the gates
    # for Nsteps steps
    for β in 0:δτ:beta_max
        energy = inner(rho, H)
        @printf("β = %.2f energy = %.8f\n", β, energy)
        rho = apply(gates, rho; cutoff)main
        rho = rho / tr(rho)
    end
    # @show energy
    return nothing
end
There is nothing special about a function called main in Julia, and defining a function is different from calling it. Consequently, a file mettis.jl with the following code:
function main()
    println("Hello, World!")
end
will not "do" anything when run (julia mettis.jl). However, if you actually call the function at the end:
function main()
    println("Hello, World!")
end
main()
you get the expected result:
$ julia mettis.jl
Hello, World!
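If the file should call main() when run as a script but stay silent when loaded with include, one common idiom compares PROGRAM_FILE with @__FILE__. This is a sketch of that idiom, not something taken from the question's code:

```julia
function main()
    println("Hello, World!")
    return nothing
end

# Run main() only when this file is the script passed to `julia`,
# not when it is loaded via include("mettis.jl") from the REPL.
if abspath(PROGRAM_FILE) == @__FILE__
    main()
end
```

With this guard, `julia mettis.jl` prints the greeting, while `include("mettis.jl")` only defines main and leaves calling it to you.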

loop with array access is slow in Julia

I did a comparison between a loop with and without array access as below and found that the performance difference between the two was huge: 1.463677[sec] vs 0.086808[sec].
Could you explain how to improve my code with array access and why this happens?
@inline dist2(p, q) = sqrt((p[1]-q[1])^2+(p[2]-q[2])^2)
function rand_gen()
    r2set = Array[]
    for i=1:10000
        r2_add = rand(2, 1)
        push!(r2set, r2_add)
    end
    return r2set
end
function test()
    N = 10000
    r2set = rand_gen()
    a = [1 1]
    b = [2 2]
    @time for i=1:N, j=1:N
        dist2(r2set[i], r2set[j])
    end
    @time for i=1:N, j=1:N
        dist2(a, b)
    end
end
test()
Make r2set have a concrete type like this (see also https://docs.julialang.org/en/latest/manual/performance-tips/#Avoid-containers-with-abstract-type-parameters-1):
@inline dist2(p, q) = sqrt((p[1]-q[1])^2+(p[2]-q[2])^2)
function rand_gen()
    r2set = Matrix{Float64}[]
    for i=1:10000
        r2_add = rand(2, 1)
        push!(r2set, r2_add)
    end
    return r2set
end
function test()
    N = 10000
    r2set = rand_gen()
    a = [1 1]
    b = [2 2]
    @time for i=1:N, j=1:N
        dist2(r2set[i], r2set[j])
    end
    @time for i=1:N, j=1:N
        dist2(a, b)
    end
end
test()
And now the tests are:
julia> test()
0.347000 seconds
0.147696 seconds
which is already better.
Now, if you really want speed, use an immutable type, e.g. a Tuple instead of an array, like this:
@inline dist2(p, q) = sqrt((p[1]-q[1])^2+(p[2]-q[2])^2)
function rand_gen()
    r2set = Tuple{Float64,Float64}[]
    for i=1:10000
        r2_add = (rand(), rand())
        push!(r2set, r2_add)
    end
    return r2set
end
function test()
    N = 10000
    r2set = rand_gen()
    a = (1,1)
    b = (2,2)
    s = 0.0
    @time for i=1:N, j=1:N
        @inbounds s += dist2(r2set[i], r2set[j])
    end
    @time for i=1:N, j=1:N
        s += dist2(a, b)
    end
end
test()
And you will get comparable speed for both:
julia> test()
0.038901 seconds
0.039666 seconds
julia> test()
0.041379 seconds
0.039910 seconds
Note that I have added an addition of s because without it Julia optimized out the loop by noticing that it does not do any work.
The key is that if you store arrays in an array then the outer array holds pointers to inner arrays while with immutable types the data is stored directly.
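The difference between the three containers can be checked directly with isconcretetype and isbitstype. This is a small sketch; the variable names are chosen here for illustration only.

```julia
# Element types of the three containers used in the versions above.
abstract_set = Array[]                   # element type Array is abstract
concrete_set = Matrix{Float64}[]         # concrete, but each element is a separate heap object
tuple_set    = Tuple{Float64,Float64}[]  # immutable plain-data elements

for set in (abstract_set, concrete_set, tuple_set)
    T = eltype(set)
    println(T, ": concrete = ", isconcretetype(T), ", isbits = ", isbitstype(T))
end
# concrete is false, true, true and isbits is false, false, true (in that order)
```

Only the tuple element type is both concrete and isbits, which is what lets the outer vector hold the coordinates themselves instead of references to inner arrays.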

Outer constructor that has the same number of arguments as the field values

How can I define an outer constructor that has same number of arguments as the field values? What I want to do is something like this:
struct data
x
y
end
function data(x, y)
return data(x-y, x*y)
end
But it obviously causes stackoverflow.
Based on the various helpful comments, thanks to all, I changed my answer. Here is an example in Julia 1.0.0 of what you may be after. I am learning Julia myself, so maybe further comments can improve this example code.
# File test_code2.jl
struct Data
x
y
Data(x, y) = new(x - y, x * y)
end
test_data = Data(105, 5)
println("Constructor example: test_data = Data(105, 5)")
println("test_data now is...: ", test_data)
#= Output
julia> include("test_code2.jl")
Constructor example: test_data = Data(105, 5)
test_data now is...: Data(100, 525)
=#
This works for me
julia> struct datatype
x
y
end
julia> function datatype_create(a,b)
datatype(a - b, a * b)
end
datatype_create (generic function with 1 method)
julia> methods(datatype_create)
# 1 method for generic function "datatype_create":
[1] datatype_create(a, b) in Main at none:2
julia> methods(datatype)
# 1 method for generic function "(::Type)":
[1] datatype(x, y) in Main at none:2
julia> a = datatype_create(105,5)
datatype(100, 525)
julia> b = datatype_create(1+2im,3-4im)
datatype(-2 + 6im, 11 + 2im)
julia> c = datatype_create([1 2;3 4],[4 5;6 7])
datatype([-3 -3; -3 -3], [16 19; 36 43])
julia> d = datatype_create(1.5,0.2)
datatype(1.3, 0.30000000000000004)
If you are absolutely Ideologically Hell Bent on using an outer constructor, then you can do something like this
julia> datatype(a,b,dummy) = datatype(a - b,a * b)
datatype
julia> e = datatype(105,5,"dummy")
datatype(100, 525)
Antman's solution using the power of MACRO
julia> macro datatype(a,b)
           return :( datatype($a - $b , $a * $b) )
       end
@datatype (macro with 1 method)
julia> f = @datatype( 105 , 5 )
datatype(100, 525)

Julia loops are as slow as R loops

The code below in Julia and R is to show that the estimator of the population variance is a biased estimator, that is it depends on the sample size and no matter how many times we average over different observations, for small number of data points it is not equal to the variance of the population.
Julia takes ~10 seconds to finish the two loops, and R does it in ~7 seconds.
If I leave the code inside the loops commented out, then the loops in R and Julia take the same time, and if I only sum the iterators by s = s + i + j, Julia finishes in ~0.15s and R in ~0.5s.
Is it that Julia loops are slow or R became fast?
How can I improve the speed of the code below for Julia?
Can the R code become faster?
Julia:
using Plots
trials = 100000
sample_size = 10;
sd = Array{Float64}(trials,sample_size-1)
tic()
for i = 2:sample_size
for j = 1:trials
res = randn(i)
sd[j,i-1] = (1/(i))*(sum(res.^2))-(1/((i)*i))*(sum(res)*sum(res))
end
end
toc()
sd2 = mean(sd,1)
plot(sd2[1:end])
R:
trials = 100000
sample_size = 10
sd = matrix(, nrow = trials, ncol = sample_size-1)
start_time = Sys.time()
for(i in 2:sample_size){
for(j in 1:trials){
res <- rnorm(n = i, mean = 0, sd = 1)
sd[j,i-1] = (1/(i))*(sum(res*res))-(1/((i)*i))*(sum(res)*sum(res))
}
}
end_time = Sys.time()
end_time - start_time
sd2 = apply(sd,2,mean)
plot(sqrt(sd2))
The plot in case anybody is curious!:
One way I could achieve much higher speed is to use a parallel loop, which is very easy to implement in Julia:
using Plots
trials = 100000
sample_size = 10;
sd = SharedArray{Float64}(trials,sample_size-1)
tic()
@parallel for i = 2:sample_size
for j = 1:trials
res = randn(i)
sd[j,i-1] = (1/(i))*(sum(res.^2))-(1/((i)*i))*(sum(res)*sum(res))
end
end
toc()
sd2 = mean(sd,1)
plot(sd2[1:end])
Using global variables in Julia in general is slow and should give you speed comparable to R. You should wrap your code in a function to make it fast.
Here is a timing from my laptop (I cut out only the relevant part):
julia> function test()
trials = 100000
sample_size = 10;
sd = Array{Float64}(trials,sample_size-1)
tic()
for i = 2:sample_size
for j = 1:trials
res = randn(i)
sd[j,i-1] = (1/(i))*(sum(res.^2))-(1/((i)*i))*(sum(res)*sum(res))
end
end
toc()
end
test (generic function with 1 method)
julia> test()
elapsed time: 0.243233887 seconds
0.243233887
Additionally in Julia if you use randn! instead of randn you can speed it up even more as you avoid reallocation of res vector (I am not doing other optimizations to the code as this optimization is distinct to Julia in comparison to R; all other possible speedups in this code would help Julia and R in a similar way):
julia> function test2()
trials = 100000
sample_size = 10;
sd = Array{Float64}(trials,sample_size-1)
tic()
for i = 2:sample_size
res = zeros(i)
for j = 1:trials
randn!(res)
sd[j,i-1] = (1/(i))*(sum(res.^2))-(1/((i)*i))*(sum(res)*sum(res))
end
end
toc()
end
test2 (generic function with 1 method)
julia> test2()
elapsed time: 0.154881137 seconds
0.154881137
Finally, it is better to use the BenchmarkTools package to measure execution time in Julia. First, the tic and toc functions will be removed in Julia 0.7. Second, you mix compilation and execution time if you use them (when running the test function twice you will see that the time is reduced on the second run, as Julia does not spend time compiling functions again).
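For readers on Julia 1.x, where tic and toc no longer exist and uninitialized arrays need the undef argument, the test2 idea could be sketched as follows. The function name biased_variance is hypothetical, and @elapsed from Base is used here so the example has no package dependency; BenchmarkTools' @btime would give a more robust measurement.

```julia
using Random      # randn!
using Statistics  # mean

# Sketch of test2 for Julia 1.x; `biased_variance` is a name chosen for illustration.
function biased_variance(trials, sample_size)
    sd = Array{Float64}(undef, trials, sample_size - 1)
    for i in 2:sample_size
        res = zeros(i)                  # allocated once per sample size
        for j in 1:trials
            randn!(res)                 # refill in place instead of allocating
            sd[j, i-1] = sum(abs2, res) / i - sum(res)^2 / i^2
        end
    end
    return sd
end

biased_variance(10, 3)                  # warm-up call so compilation is not timed
t = @elapsed sd = biased_variance(100_000, 10)
sd2 = mean(sd; dims=1)                  # column means, one per sample size
println("elapsed: ", t, " s")
```

The column means should land near (n-1)/n for sample size n, which is the bias the question set out to show.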
EDIT:
You can keep trials, sample_size and sd as global variables but then you should prefix them with const. Then it is enough to wrap a loop in a function like this:
const trials = 100000;
const sample_size = 10;
const sd = Array{Float64}(trials,sample_size-1);
function f()
for i = 2:sample_size
for j = 1:trials
res = randn(i)
sd[j,i-1] = (1/(i))*(sum(res.^2))-(1/((i)*i))*(sum(res)*sum(res))
end
end
end
tic()
f()
toc()
Now for @parallel:
First, you should use @sync before @parallel to make sure everything works correctly (i.e. that all workers have finished before you move to the next instruction). To see why this is needed, run the following code on a system with more than one worker:
sd = SharedArray{Float64}(10^6);
@parallel for i = 1:2
if i < 2
sd[i] = 1
else
for j in 2:10^6
sd[j] = 1
end
end
end
minimum(sd) # most probably prints 0.0
sleep(1)
minimum(sd) # most probably prints 1.0
while this
sd = SharedArray{Float64}(10^6);
@sync @parallel for i = 1:2
if i < 2
sd[i] = 1
else
for j in 2:10^6
sd[j] = 1
end
end
end
minimum(sd) # always prints 1.0
Second, the speed improvement is due to the @parallel macro, not SharedArray. If you try your code on Julia with one worker it is also faster. The reason, in short, is that @parallel internally wraps your code inside a function. You can check it by using @macroexpand:
julia> @macroexpand @sync @parallel for i = 2:sample_size
for j = 1:trials
res = randn(i)
sd[j,i-1] = (1/(i))*(sum(res.^2))-(1/((i)*i))*(sum(res)*sum(res))
end
end
quote # task.jl, line 301:
(Base.sync_begin)() # task.jl, line 302:
#19#v = (Base.Distributed.pfor)(begin # distributed\macros.jl, line 172:
function (#20#R, #21#lo::Base.Distributed.Int, #22#hi::Base.Distributed.Int) # distributed\macros.jl, line 173:
for i = #20#R[#21#lo:#22#hi] # distributed\macros.jl, line 174:
begin # REPL[22], line 2:
for j = 1:trials # REPL[22], line 3:
res = randn(i) # REPL[22], line 4:
sd[j, i - 1] = (1 / i) * sum(res .^ 2) - (1 / (i * i)) * (sum(res) * sum(res))
end
end
end
end
end, 2:sample_size) # task.jl, line 303:
(Base.sync_end)() # task.jl, line 304:
#19#v
end

Julia - Continue outer loop

I am currently porting an algorithm from Java to Julia and now I have come across a part where I have to continue an outer loop from an inner loop when some condition is met:
loopC: for(int x : Y){
for(int i: I){
if(some_condition(i)){
continue loopC;
}
}
}
I have found some issues on GitHub on this topic but there seems to be only a discussion about it and no solution yet. Does anybody know a way how to accomplish this in Julia?
As in some other languages julia uses break for this:
for i in 1:4
for j in 1:4
if j == 2
break
end
end
end
breaks out of the inner loop whenever j is 2
However, if you ever need to exit the outer loop, you can use @goto and @label like so (inside a function or other block, since @goto cannot jump between separate top-level statements):
for i in 1:4
    for j in 1:4
        if (j-i) == 2
            @goto label1
        end
        if j == 2
            @goto label2
        end
        # do stuff
    end
    @label label2
end
@label label1
Straight from the julia docs http://docs.julialang.org/en/release-0.5/manual/control-flow/
It is sometimes convenient to terminate the repetition of a while
before the test condition is falsified or stop iterating in a for loop
before the end of the iterable object is reached. This can be
accomplished with the break keyword
As mentioned by @isebarn, break can be used to exit the inner loop:
for i in 1:3
    for j in 1:3
        if j == 2
            break # continues with next i
        end
        @show (i,j)
    end # next j
end # next i
(i, j) = (1, 1)
(i, j) = (2, 1)
(i, j) = (3, 1)
However, some caution is required because the behaviour of break depends on how the nested loops are specified:
for i in 1:3, j in 1:3
    if j == 2
        break # exits both loops
    end
    @show (i,j)
end # next i,j
(i, j) = (1, 1)
See https://en.wikibooks.org/wiki/Introducing_Julia/Controlling_the_flow#Nested_loops
It is also possible, albeit cumbersome, to return from a nested function that contains the inner loop:
for i in 1:3
    (i -> for j in 1:3
        if j == 2
            return
        end
        @show (i,j)
    end)(i)
end
(i, j) = (1, 1)
(i, j) = (2, 1)
(i, j) = (3, 1)
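When the inner loop only decides whether to skip the current outer item, as in the Java snippet from the question, the labeled continue can often be written without labels by combining any with continue. This is a sketch under that assumption; some_condition and keep_unmatched are hypothetical stand-ins, and the condition is given the outer value as an extra argument for the sake of a concrete example.

```julia
# Hypothetical stand-in for the question's some_condition.
some_condition(x, i) = x == i

function keep_unmatched(Y, I)
    kept = eltype(Y)[]
    for x in Y
        # Plays the role of `continue loopC`: skip x as soon as any i matches.
        any(i -> some_condition(x, i), I) && continue
        push!(kept, x)   # reached only if no i matched
    end
    return kept
end

result = keep_unmatched(1:5, [2, 4])
println(result)  # prints [1, 3, 5]
```

any stops at the first match, so the inner scan ends early just as the Java version does.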
