How does the recursion work in heap-sort? - recursion

Let's say I have an array A = <1,6,2,7,3,8,4,9,5>
Pseudocode for heapsort:
BUILD-MAX-HEAP(A)
    n = A.heap-size = A.length
    for i = floor(n/2) down to 1
        MAX-HEAPIFY(A,i)
MAX-HEAPIFY(A,i)
    n = A.heap-size
    l = LEFT(i)
    r = RIGHT(i)
    if l <= n and A[l] > A[i]
        largest = l
    else largest = i
    if r <= n and A[r] > A[largest]
        largest = r
    if largest != i
        exchange A[i] <-> A[largest]
        MAX-HEAPIFY(A,largest)
I know BUILD-MAX-HEAP will first call MAX-HEAPIFY(A,4), which will exchange 7 and 9, then after MAX-HEAPIFY (A,3) it will switch 8 and 2. Then it will call MAX-HEAPIFY(A,2), and this is where I get confused. This is how the heap looks when MAX-HEAPIFY(A,2) is called
The first thing that will happen is that 6 and 7 will be exchanged, then it will call MAX-HEAPIFY(A,4) (because 4 is now largest), and exchange 6 and 9, then it will call MAX-HEAPIFY(A,8) but nothing will happen because you've reached a leaf, so then it returns to the function that called it.
MAX-HEAPIFY(A,8) was called by MAX-HEAPIFY(A,4) so it returns to it
MAX-HEAPIFY(A,4) was called by MAX-HEAPIFY(A,2) so it returns to it
but now A[2] < A[4] (because 7 < 9), and it is at this point that I wonder how it knows to call MAX-HEAPIFY(A,2) again to exchange 7 and 9. When a recursive function (or subroutine) returns to the one that called it, there is no more code to be executed (since MAX-HEAPIFY only calls MAX-HEAPIFY at the end of the function), so it will return back up the recursion stack and in my mind it feels like 7 will still be the parent of 9
Sorry if this is confusing, but can somebody walk me through this to help me understand how exactly this is recursively max-heapifying itself?

The following is the series of steps I get when following your algorithm (note the levels of indent when we recurse at the end of each). Every time we exit the function, we just return to the main program (calling max_heapify with the numbers 4 down to 1). I'm not positive where your interpretation is off, but I'm hoping the following makes it clearer.
for i in (4,3,2,1):
    MAX-HEAPIFY(A,i)
MAX-HEAPIFY(A,4):
    largest=4 # initialized
    l=8
    r=9
    largest=8 # calculated
    swap A[4] and A[8]:
        A = <1,6,2,9,3,8,4,7,5>
    MAX-HEAPIFY(A, 8):
        largest=8 # initialized
        l=16
        r=17
        ...return
    ...return
MAX-HEAPIFY(A,3):
    largest=3 # initialized
    l=6
    r=7
    largest=6 # calculated
    swap A[3] and A[6]:
        A = <1,6,8,9,3,2,4,7,5>
    MAX-HEAPIFY(A, 6):
        largest=6
        l=12
        r=13
        ...return
    ...return
MAX-HEAPIFY(A,2):
    largest=2 # initialized
    l=4
    r=5
    largest=4 # calculated
    swap A[2] and A[4]:
        A = <1,9,8,6,3,2,4,7,5>
    MAX-HEAPIFY(A, 4):
        largest=4 # initialized
        l=8
        r=9
        largest=8
        swap A[4] and A[8]:
            A = <1,9,8,7,3,2,4,6,5>
        MAX-HEAPIFY(A, 8):
            largest=8 # initialized
            l=16
            r=17
            ...return
        ...return
    ...return
MAX-HEAPIFY(A,1):
    largest=1 # initialized
    l=2
    r=3
    largest=2 # calculated
    swap A[1] and A[2]:
        A = <9,1,8,7,3,2,4,6,5>
    MAX-HEAPIFY(A, 2):
        largest=2 # initialized
        l=4
        r=5
        largest=4 # calculated
        swap A[2] and A[4]:
            A = <9,7,8,1,3,2,4,6,5>
        MAX-HEAPIFY(A, 4):
            largest=4 # initialized
            l=8
            r=9
            largest=8 # calculated
            swap A[4] and A[8]:
                A = <9,7,8,6,3,2,4,1,5>
            MAX-HEAPIFY(A, 8):
                largest=8 # initialized
                l=16
                r=17
                ...return
            ...return
        ...return
    ...return
Done!
A = <9,7,8,6,3,2,4,1,5>
I then went so far as to translate your algorithm (almost directly) into Python (note I had to make a few tweaks for Python's 0-based indexing):
def build_max_heap(A):
    for i in range(len(A)//2, 0, -1):
        max_heapify(A, i)
def left(x):
    return 2 * x
def right(x):
    return 2 * x + 1
def max_heapify(A, i):
    n = len(A)
    largest = i
    l = left(i)
    r = right(i)
    # i, l and r are 1-based heap positions; subtract 1 to index the list
    if l <= n and A[l-1] > A[i-1]:
        largest = l
    if r <= n and A[r-1] > A[largest-1]:
        largest = r
    if largest != i:
        A[i-1], A[largest-1] = A[largest-1], A[i-1]
        max_heapify(A, largest)
if __name__ == '__main__':
    A = [1,6,2,7,3,8,4,9,5]
    build_max_heap(A) # modifies in-place
    print(A)
This prints:
[9, 7, 8, 6, 3, 2, 4, 1, 5]
(which agrees with our manual iterations)
...and for one more check, using python's heapq module with its private method _heapify_max:
import heapq
A = [1,6,2,7,3,8,4,9,5]
heapq._heapify_max(A)
print(A)
...prints the same:
[9, 7, 8, 6, 3, 2, 4, 1, 5]
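To make the shape of the recursion visible, here is a minimal sketch that records every call together with its recursion depth. The `depth` and `log` parameters are additions for illustration only and are not part of the pseudocode in the question. The point it tries to show is that each call only ever recurses downward into the subtree it just disturbed; nothing needs to be re-checked on the way back up, because BUILD-MAX-HEAP has already made both subtrees of node i into heaps before MAX-HEAPIFY(A,i) runs.

```python
def max_heapify(A, i, n, depth=0, log=None):
    """Sift A[i] down (i is a 1-based heap position) within the first n items."""
    if log is not None:
        log.append((depth, i))  # record who was called and how deep we are
    largest = i
    l, r = 2 * i, 2 * i + 1
    if l <= n and A[l - 1] > A[largest - 1]:
        largest = l
    if r <= n and A[r - 1] > A[largest - 1]:
        largest = r
    if largest != i:
        A[i - 1], A[largest - 1] = A[largest - 1], A[i - 1]
        # Only the subtree rooted at `largest` can now violate the heap
        # property, so that is the only place we recurse into.
        max_heapify(A, largest, n, depth + 1, log)


def build_max_heap(A, log=None):
    n = len(A)
    for i in range(n // 2, 0, -1):
        max_heapify(A, i, n, 0, log)


A = [1, 6, 2, 7, 3, 8, 4, 9, 5]
calls = []
build_max_heap(A, calls)
for depth, i in calls:
    print("    " * depth + f"MAX-HEAPIFY(A,{i})")
print(A)
```

The printed call tree has the same indentation levels as the manual trace, and the final list matches the result shown earlier.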

Related

Find length of array of functions in Julia

I want to find the length, nc, of this "vector of functions". It should be 2.
comp(x) = [([x[5], x[6], x[7], x[8],x[9], x[10]], tmp(x)) ; ([x[1],x[2]], [x[3],x[4]])];
nc = ....
I tried with length(comp) and length(comp(x)) but it doesn't work. I get "x not defined" and "no method matching length(::typeof(comp))", respectively.
Pulling together some of the comments to hopefully make things clearer:
What you have written is essentially
function comp(x)
    a = [x[5], x[6], x[7], x[8],x[9], x[10]]
    b = [x[1],x[2]]
    c = [x[3],x[4]]
    return [(a, tmp(x)); (b, c)]
end
that is, you have defined a function comp which takes one argument x and then returns a 2-element vector of 2-element tuples, with the first tuple holding values 5 to 10 of x and the result of tmp(x) (this function is not defined in your code so we don't know what it returns), and the second tuple holding the first and second, and third and fourth elements of x, respectively.
To illustrate, assume tmp(x) just sums up the elements of x, then we can pass some array (in the below example a range) of numbers to comp and see it in action:
julia> tmp(x) = sum(x)
tmp (generic function with 1 method)
julia> comp(1:20)
2-element Vector{Tuple{Vector{Int64}, Any}}:
([5, 6, 7, 8, 9, 10], 210)
([1, 2], [3, 4])
and you can get the length of the return value:
julia> length(comp(1:20))
2

Parallelize two (or more) functions in julia

I am trying to solve a wave equation problem (related to my PhD) using the finite difference method. For this, I have translated (line by line) a Fortran code (link below): (https://github.com/geodynamics/seismic_cpml/blob/master/seismic_CPML_2D_anisotropic.f90)
Inside this code, within the time loop, there are four main loops that are independent. In fact, I could arrange them into four functions.
As I have to run this code about a hundred times, it would be nice to speed up the process. In this sense, I am turning my eyes toward parallelization. See below, as an example:
function main()
    ...some common code...
    for time=1:N
        function fun1() # I want this function to run parallel...
        function fun2() # ..this function to run parallel with 1,3,4
        function fun3() # ..This function to run parallel with 1,2,4
        function fun4() # ..This function to run parallel with 1,2,3
    end
    ... more code here...
    return
end
So,
1) Is it possible to do what I mention before?
2) Will this approach speed up my code?
3) Is there a better way to think this problem?
A minimal working example could be like this:
function fun1(t)
    for i=1:1000
        for j=1:1000
            t+=(0.5)^t+(0.3)^(t-1);
        end
    end
    return t
end
function fun2(t)
    for i=1:1000
        for j=1:1000
            t+=(0.5)^t;
        end
    end
    return t
end
function fun3(r)
    for i=1:1000
        for j=1:1000
            r = (r + rand())/r;
        end
    end
    return r
end
function main()
    a = 2;
    b = 2.5;
    c = 3.0;
    for i=1:100
        a = fun1(a);
        b = fun2(b);
        c = fun3(c);
    end
    return;
end
So, as can be seen, none of the three functions above (fun1, fun2 and fun3) depends on any other, so they can surely run in parallel. Can this be achieved? Will it boost my computational speed?
Edited:
Hi @BogumiłKamiński, I have altered the finite-difference code in order to implement a "loop" (as you suggested) over the inputs and outputs of my functions. If it is not too much trouble, I would like your opinion on the parallelization design of the code:
Key elements
1) I have packed all inputs in 4 tuples: sig_xy_in and sig_xy_cros_in (for the 2 sigma functions) and vel_vx_in and vel_vy_in (for 2 velocity functions). I then packed the 4 tuples into 2 vectors for "looping" purposes...
2) I packed the 4 functions in 2 vectors for "looping" purposes...
3) I run the first parallel loop and then unpack its output tuple...
4) I run the second parallel loop(for velocities) and then unpack its output tuple...
5) Finally, I pack the output elements into the input tuples and continue the time loop until it finishes.
...code
l = Threads.SpinLock()
arg_in_sig = [sig_xy_in,sig_xy_cros_in]; # Inputs tuples x sigma funct
arg_in_vel = [vel_vx_in, vel_vy_in]; # Inputs tuples x velocity funct
func_sig = [sig_xy , sig_xy_cros]; # Vector with two sigma functions
func_vel = [vel_vx , vel_vy]; # Vector with two velocity functions
for it = 1:NSTEP # time steps
    #------------------------------------------------------------
    # Compute sigma functions
    #------------------------------------------------------------
    Threads.@threads for j in 1:2 # Start parallel run of two sigma functs
        Threads.lock(l);
        Threads.unlock(l);
        arg_in_sig[j] = func_sig[j](arg_in_sig[j]);
    end
    # Unpack tuples for sig_xy and sig_xy_cros
    # Unpack tuples for sig_xy
    sigxx = arg_in_sig[1][1]; # changed by sig_xy
    sigyy = arg_in_sig[1][2]; # changed by sig_xy
    m_dvx_dx = arg_in_sig[1][3]; # changed by sig_xy
    m_dvy_dy = arg_in_sig[1][4]; # changed by sig_xy
    vx = arg_in_sig[1][5]; # unchanged by sig_xy
    vy = arg_in_sig[1][6]; # unchanged by sig_xy
    delx_1 = arg_in_sig[1][7]; # unchanged by sig_xy
    dely_1 = arg_in_sig[1][8]; # unchanged by sig_xy
    ...more unpacking...
    # Unpack tuples for sig_xy_cros
    sigxy = arg_in_sig[2][1]; # changed by sig_xy_cros
    m_dvy_dx = arg_in_sig[2][2]; # changed by sig_xy_cros
    m_dvx_dy = arg_in_sig[2][3]; # changed by sig_xy_cros
    vx = arg_in_sig[2][4]; # unchanged by sig_xy_cros
    vy = arg_in_sig[2][5]; # unchanged by sig_xy_cros
    ...more unpacking....
    #--------------------------------------------------------
    # velocity
    #--------------------------------------------------------
    Threads.@threads for j in 1:2 # Start parallel run of two velocity funct
        Threads.lock(l)
        Threads.unlock(l)
        arg_in_vel[j] = func_vel[j](arg_in_vel[j])
    end
    # Unpack tuples for vel_vx
    vx = arg_in_vel[1][1]; # changed by vel_vx
    m_dsigxx_dx = arg_in_vel[1][2]; # changed by vel_vx
    m_dsigxy_dy = arg_in_vel[1][3]; # changed by vel_vx
    sigxx = arg_in_vel[1][4]; # unchanged by vel_vx
    sigxy = arg_in_vel[1][5];....
    # Unpack tuples for vel_vy
    vy = arg_in_vel[2][1]; # changed by vel_vy
    m_dsigxy_dx = arg_in_vel[2][2]; # changed by vel_vy
    m_dsigyy_dy = arg_in_vel[2][3]; # changed by vel_vy
    sigxy = arg_in_vel[2][4]; # unchanged by vel_vy
    sigyy = arg_in_vel[2][5]; # unchanged by vel_vy
    .....
    ...more unpacking...
    # assemble new input variables
    sig_xy_in = (sigxx,sigyy,
                 m_dvx_dx,m_dvy_dy,
                 vx,vy,....);
    sig_xy_cros_in = (sigxy,
                      m_dvy_dx,m_dvx_dy,
                      vx,vy,....;
    vel_vx_in = (vx,....
    vel_vy_in = (vy,.....
end #time loop
Here is a simple way to run your code in multithreading mode:
function fun1(t)
    for i=1:1000
        for j=1:1000
            t+=(0.5)^t+(0.3)^(t-1);
        end
    end
    return t
end
function fun2(t)
    for i=1:1000
        for j=1:1000
            t+=(0.5)^t;
        end
    end
    return t
end
function fun3(r)
    for i=1:1000
        for j=1:1000
            r = (r + rand())/r;
        end
    end
    return r
end
function main()
    l = Threads.SpinLock()
    a = [2.0, 2.5, 3.0]
    f = [fun1, fun2, fun3]
    Threads.@threads for i in 1:3
        for j in 1:4
            Threads.lock(l)
            println((thread=Threads.threadid(), iteration=j))
            Threads.unlock(l)
            a[i] = f[i](a[i])
        end
    end
    return a
end
I have added locking - just as an example how you can do it (in Julia 1.3 you would not have to do this as IO is thread safe there).
Also note that rand() is sharing data among threads prior to Julia 1.3 so it would be not safe to run these functions if all of them used rand() (again in Julia 1.3 it would be safe to do so).
To run this code first set the maximum number of threads you want to use e.g. like this on Windows: set JULIA_NUM_THREADS=4 (in Linux you should export). Here is an example of this code run (I have reduced the number of iterations done in order to shorten the output):
julia> main()
(thread = 1, iteration = 1)
(thread = 3, iteration = 1)
(thread = 2, iteration = 1)
(thread = 3, iteration = 2)
(thread = 3, iteration = 3)
(thread = 3, iteration = 4)
(thread = 2, iteration = 2)
(thread = 1, iteration = 2)
(thread = 2, iteration = 3)
(thread = 2, iteration = 4)
(thread = 1, iteration = 3)
(thread = 1, iteration = 4)
3-element Array{Float64,1}:
21.40311930108456
21.402807510451463
1.219028489573526
Now one small cautionary note: while it is relatively easy to make code multithreaded in Julia (and in Julia 1.3 it will be even simpler), you have to be careful when you do it, as you have to take care of race conditions.
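As an illustration of how race conditions are avoided in this kind of design, here is a minimal Python sketch of the same structure: independent tasks per time step, each reading only its own input and returning a new value, with a barrier before the next step. It is an analogy, not a translation of the Julia answer; `run_parallel` and `run_serial` are hypothetical helper names, and `fun3` uses a fixed constant instead of `rand()` so that the two runs can be compared. In standard CPython builds the global interpreter lock means threads will not speed up CPU-bound loops like these, so the sketch shows the structure only, not the speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def fun1(t):
    for _ in range(1000):
        t += 0.5 ** t + 0.3 ** (t - 1)
    return t


def fun2(t):
    for _ in range(1000):
        t += 0.5 ** t
    return t


def fun3(r):
    for _ in range(1000):
        r = (r + 0.5) / r  # deterministic stand-in for rand() (assumption)
    return r


def run_parallel(funcs, state, steps):
    """Run every function on its own state slot, once per time step."""
    state = list(state)
    with ThreadPoolExecutor(max_workers=len(funcs)) as pool:
        for _ in range(steps):
            # Each task gets only its own input and returns a new value,
            # so no two tasks ever write to the same location.
            futures = [pool.submit(f, x) for f, x in zip(funcs, state)]
            # Collecting the results waits for all tasks: a barrier per step.
            state = [fut.result() for fut in futures]
    return state


def run_serial(funcs, state, steps):
    """Reference version without threads, for comparison."""
    state = list(state)
    for _ in range(steps):
        state = [f(x) for f, x in zip(funcs, state)]
    return state


print(run_parallel([fun1, fun2, fun3], [2.0, 2.5, 3.0], steps=4))
```

Because no task shares mutable state with another, the threaded and serial versions produce identical values, which is a cheap way to check a parallel design for races.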

@distributed seems to work, function return is wonky

I'm just learning how to do parallel computing in Julia. I'm using @sync @distributed at the start of a 3x nested for loop to parallelize things (see code at bottom). From the line println(errCmp[row, col]) I can watch all the elements of the array errCmp be printed out. E.g.
From worker 3: 2.351134946074191e9
From worker 4: 2.3500830193505473e9
From worker 5: 2.3502416529551845e9
From worker 2: 2.3509105625656652e9
From worker 3: 2.3508352842971106e9
From worker 4: 2.3497049296121807e9
From worker 5: 2.35048428351797e9
From worker 2: 2.350742582031195e9
From worker 3: 2.350616273660934e9
From worker 4: 2.349709546599313e9
However, when the function returns, errCmp is the array of zeros I pre-allocate at the beginning.
Am I missing some closing term to collect everything?
function optimizeDragCalc(df::DataFrame)
    paramGrid = [cd*AoM for cd = range(1e-3, stop = 0.01, length = 50), AoM = range(2e-4, stop = 0.0015, length = 50)]
    errCmp = zeros(size(paramGrid))
    # totalSize = size(paramGrid, 1) * size(paramGrid, 2) * size(df.time, 1)
    @sync @distributed for row = 1:size(paramGrid, 1)
        for col = 1:size(paramGrid, 2)
            # Run the propagation here
            BC = 1/paramGrid[row, col]
            slns, _ = propWholeTraj(df, BC)
            for time = 1:size(df.time, 1)
                errDF = propError(slns[time], df, time)
                errCmp[row, col] += sum(errDF.totalErr)
            end # time
            # println("row: ", row, " of ",size(paramGrid, 1)," col: ", col, " of ", size(paramGrid, 2))
            println(errCmp[row, col])
        end # col
    end # row
    # plot(heatmap(z = errCmp))
    return errCmp, paramGrid
end
errCmp, paramGrid = @time optimizeDragCalc(df)
You did not provide a minimal working example, but I guess that might be hard. So here is my MWE. Let us assume that we want to use Distributed to calculate the means of an Array's columns:
using Distributed
addprocs(2)
@everywhere using StatsBase
data = rand(1000,2000)
res = zeros(2000)
@sync @distributed for col = 1:size(data)[2]
    res[col] = StatsBase.mean(data[:,col])
    # does not work!
    # ... because data is created locally and never returned!
end
In order to correct the above code you need to provide an aggregator function (I keep the example intentionally simplified - a further optimization is possible).
using Distributed
addprocs(2)
@everywhere using Distributed,StatsBase
data = rand(1000,2000)
@everywhere function t2(d1,d2)
    append!(d1,d2)
    d1
end
res = @sync @distributed (t2) for col = 1:size(data)[2]
    [(myid(),col, StatsBase.mean(data[:,col]))]
end
Now let us see the output. We can see that some of the values have been calculated on worker 2 while others on worker 3:
julia> res
2000-element Array{Tuple{Int64,Int64,Float64},1}:
(2, 1, 0.49703681326230276)
(2, 2, 0.5035341367791002)
(2, 3, 0.5050607022354537)
⋮
(3, 1998, 0.4975699181976122)
(3, 1999, 0.5009498778934444)
(3, 2000, 0.499671315490524)
Further possible improvements/modifications:
use @spawnat to generate values at remote processes (instead of the master process and sending them)
use SharedArray - this allows to automatically distribute data among workers. From my experience requires very careful programming.
use ParallelDataTransfer.jl to send data among workers. Very easy to use, not efficient for huge number of messages.
always consider Julia threading mechanism (in some scenarios it makes life easier - again depends on the problem)
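The difference between the failing and the working version can be sketched in plain Python, with the worker processes only simulated: `worker_mutating` deep-copies its argument to imitate what shipping an array to another process amounts to, and all function names other than `t2` are hypothetical. It is meant as a rough analogy for why writes made on a worker are lost, and why returning values and combining them with a reducer works.

```python
import copy
from functools import reduce


def column_mean(data, col):
    return sum(row[col] for row in data) / len(data)


def worker_mutating(res, data, cols):
    """Simulated worker that writes into its own private copy of res."""
    local_res = copy.deepcopy(res)  # stands in for serialization to a worker
    for col in cols:
        local_res[col] = column_mean(data, col)
    # Nothing is returned, so the caller's res never sees these writes.


def worker_returning(worker_id, data, cols):
    """Simulated worker that returns its partial results instead."""
    return [(worker_id, col, column_mean(data, col)) for col in cols]


def t2(d1, d2):
    """Reducer: merge two partial result lists (mirrors t2 in the answer)."""
    d1.extend(d2)
    return d1


data = [[1.0, 2.0, 3.0, 4.0],
        [3.0, 4.0, 5.0, 6.0]]

res = [0.0] * 4
worker_mutating(res, data, [0, 1])
worker_mutating(res, data, [2, 3])
print(res)  # the pre-allocated zeros are unchanged

parts = [worker_returning(2, data, [0, 1]), worker_returning(3, data, [2, 3])]
combined = reduce(t2, parts)
print(combined)
```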

Map, reduce, filter apply to for loops and while loops

I'm new to Julia and learning use of Map, reduce, filter.
It is becoming very hard for me to comprehend how it can replace for and while loops.
For example, in the code below I would like to replace the for loop:
function addMultiplesOf3And5(N::Int)
    sumOfMultiples = 0
    if(N == 3)
        return sumOfMultiples + N
    end
    for i = 3:N-1
        if(i % 3 == 0 && i % 5 == 0)
            continue
        elseif(i % 3 == 0)
            sumOfMultiples += i
        elseif(i % 5 == 0)
            sumOfMultiples += i
        end
    end
    return sumOfMultiples
end
I would really appreciate the help.
Update :
This is what I did after going through tutorials
function addMultiplesOf3And5(N::Int)
    array = range(1,N-1)
    return reduce(+, map(x -> multiples_of_3_Or_5(x), array))
end
function multiples_of_3_Or_5(n)
    if(n % 3 == 0 && n % 5 == 0)
        return 0
    elseif(n % 3 == 0)
        return n
    elseif(n % 5 == 0)
        return n
    else
        return 0
    end
end
Final:
function addMultiplesOf3And5(N::Int)
    array = range(1,N-1)
    return reduce(+, filter(x -> ((x%3==0)$(x%5==0)), array))
end
To understand how you can replace your 'for loop + if block' code with 'map / reduce / filter' you need to know exactly how they work and why they might be chosen instead.
1. The map function
map is a function that takes a function variable and a list as arguments, and returns a new list, where each element is the result of applying the function to each element of the old list. So for example if your variable f refers to a function f(x) = x + 5 you defined earlier, and you have a list L=[1,2,3,4,5], then map(f, L) will return [f(L[1]), f(L[2]), f(L[3]), f(L[4]), f(L[5])]
So if you have code like:
f(x) = x + 5;
L = [1,2,3,4,5];
A = zeros(5);
for i in L
    A[i] = f(i);
end
You could rewrite this as a mapping operation like so:
A = map(x -> x + 5, [1,2,3,4,5]);
2. The reduce function
reduce takes a binary function variable (i.e. a function that takes two arguments) and a list as arguments. What it does is probably best explained by an example. Calling reduce with the + operator, and list [1,2,3,4,5] will do the following:
Step 1: [1, 2, 3, 4, 5] % : 5 elements
Step 2: [1+2, 3, 4, 5] % [3,3,4,5] : 4 elements
Step 3: [3+3, 4, 5] % [6, 4, 5] : 3 elements
Step 4: [6+4, 5] % [10, 5] : 2 elements
Step 5: [10+5] % [15] : 1 elements
result: 15
i.e. we have reduced the list to a single result by successively applying the binary function to the first pair of elements, consuming the list little by little.
So if you have code like:
f(x,y) = x + y
L = [1,2,3,4,5];
A = L[1];
for i in 2:length(L)
A = f(A, L[i])
end
you could rewrite this as a reduce operation like so:
A = reduce((x,y) -> x+y, [1,2,3,4,5])
3. The filter function
filter takes a predicate function (e.g. iseven, isnull, ==, or anything that takes an argument and performs a test on it, resulting in true or false) and a list, tests each element of the list with the function and returns a new list that only contains the elements that pass that test. e.g.
filter(iseven, [1,2,3,4,5]) # returns [2,4]
The answer to your problem
If I understand correctly, addMultiplesOf3And5 takes a number N (e.g. 20), and does the following:
filter out all the elements that can be divided by either 3 or 5 from the list [1,2,3,...,20]
successively add all elements of the resulting list together using a reduce function.
You should be able to use the above to figure out the exact code :)
Not sure what the function in the question is supposed to calculate, but:
addMult3or5(N) = N==3 ? 3 : sum(filter(x->((x%3==0)$(x%5==0)),3:N-1))
calculates the same thing.
sum is a reduce-like function for the + operation.
Hope this helps clarify.
Also, $ was the exclusive-or operator in Julia when this was written; since Julia 0.6 it is spelled xor(a, b) or a ⊻ b.
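For readers who want to experiment with the three building blocks outside Julia, here is a minimal Python sketch of the same ideas. The function name `add_multiples_of_3_xor_5` is made up for this example, and the special case for N == 3 from the question's first version is deliberately left out. `accumulate` is used only to expose the intermediate values that `reduce` passes through.

```python
from functools import reduce
from itertools import accumulate
from operator import add, xor

L = [1, 2, 3, 4, 5]

# map: apply a function to every element, producing a new sequence
mapped = list(map(lambda x: x + 5, L))

# reduce: fold the sequence into one value with a two-argument function;
# accumulate shows the running values that reduce goes through
running = list(accumulate(L, add))
total = reduce(add, L)

# filter: keep only the elements for which the predicate is true
evens = list(filter(lambda x: x % 2 == 0, L))


def add_multiples_of_3_xor_5(n):
    """Sum the numbers below n divisible by 3 or by 5, but not by both."""
    def keep(x):
        return xor(x % 3 == 0, x % 5 == 0)
    # The initial value 0 makes reduce safe on an empty selection.
    return reduce(add, filter(keep, range(1, n)), 0)


print(mapped, running, total, evens)
print(add_multiples_of_3_xor_5(20))
```

The loop body turns into the predicate passed to `filter`, and the accumulator variable of the loop turns into the `reduce` call.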

How to make recursive nested loops which use loop variables inside?

I need to make a nested loop with an arbitrary depth. Recursive loops seem the right way, but I don't know how to use the loop variables inside the loop. For example, once I set the depth to 3, it should work like
count = 1
for i=1, Nmax-2
    for j=i+1, Nmax-1
        for k=j+1,Nmax
            function(i,j,k,0,0,0,0....) // a function having Nmax arguments
            count += 1
        end
    end
end
I want to make a subroutine which takes the depth of the loops as an argument.
UPDATE:
I implemented the scheme proposed by Zoltan. I wrote it in Python for simplicity.
count = 0
def f(CurrentDepth, ArgSoFar, MaxDepth, Nmax):
    global count
    if CurrentDepth > MaxDepth:
        count += 1
        print(count, ArgSoFar)
    else:
        if CurrentDepth == 1:
            for i in range(1, Nmax + 2 - MaxDepth):
                NewArgs = ArgSoFar  # same list object, updated in place
                NewArgs[1-1] = i
                f(2, NewArgs, MaxDepth, Nmax)
        else:
            for i in range(ArgSoFar[CurrentDepth-1-1] + 1, Nmax + CurrentDepth - MaxDepth + 1):
                NewArgs = ArgSoFar  # same list object, updated in place
                NewArgs[CurrentDepth-1] = i
                f(CurrentDepth + 1, NewArgs, MaxDepth, Nmax)
f(1,[0,0,0,0,0],3,5)
and the results are
1 [1, 2, 3, 0, 0]
2 [1, 2, 4, 0, 0]
3 [1, 2, 5, 0, 0]
4 [1, 3, 4, 0, 0]
5 [1, 3, 5, 0, 0]
6 [1, 4, 5, 0, 0]
7 [2, 3, 4, 0, 0]
8 [2, 3, 5, 0, 0]
9 [2, 4, 5, 0, 0]
10 [3, 4, 5, 0, 0]
There may be a better way to do this, but so far this one works fine. It seems easy to do this in fortran. Thank you so much for your help!!!
Here's one way you could do what you want. This is pseudo-code, I haven't written enough to compile and test it but you should get the picture.
Define a function, let's call it fun1 which takes inter alia an integer array argument, perhaps like this
<type> function fun1(indices, other_arguments)
integer, dimension(:), intent(in) :: indices
...
which you might call like this
fun1([4,5,6],...)
and the interpretation of this is that the function is to use a loop-nest 3 levels deep like this:
do ix = 1,4
    do jx = 1,5
        do kx = 1,6
            ...
Of course, you can't write a loop nest whose depth is determined at run-time (not in Fortran anyway) so you would flatten this into a single loop along the lines of
do ix = 1, product(indices)
If you need the values of the individual indices inside the loop you'll need to unflatten the linearised index. Note that all you are doing is writing the code to transform array indices from N-D into 1-D and vice versa; this is what the compiler does for you when you can specify the rank of an array at compile time. If the inner loops aren't to run over the whole range of the indices you'll have to do something more complicated, careful coding required but not difficult.
Depending on what you are actually trying to do this may or may not be either a good or even satisfactory approach. If you are trying to write a function to compute a value at each element in an array whose rank is not known when you write the function then the preceding suggestion is dead flat wrong, in this case you would want to write an elemental function. Update your question if you want further information.
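The flatten-and-unflatten idea can be sketched in a few lines of Python. This is only an illustration under the assumption that every loop runs over its full range 1..extent; `unflatten` and `flat_loop` are hypothetical names, not part of any library.

```python
from math import prod


def unflatten(linear, extents):
    """Turn a 1-based linear index into 1-based per-loop indices.

    The last index varies fastest, like the innermost loop of the nest.
    """
    rem = linear - 1
    idx = []
    for extent in reversed(extents):
        rem, pos = divmod(rem, extent)  # peel off the fastest-varying index
        idx.append(pos + 1)
    return tuple(reversed(idx))


def flat_loop(extents):
    """One flat loop standing in for a nest of len(extents) loops."""
    for linear in range(1, prod(extents) + 1):
        yield unflatten(linear, extents)


for ix, jx in flat_loop([2, 3]):
    print(ix, jx)
```

Calling `flat_loop([4, 5, 6])` would visit the same index triples as the three-level loop nest above, with the nesting depth now decided at run time by the length of the list.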
You can define your function to take a List argument, which is initially empty:
void f(int num,List argumentsSoFar){
    // call f() for num+1..Nmax
    for(i = num+1 ; i < Nmax ; i++){
        List newArgs=argumentsSoFar.clone();
        newArgs.add(i);
        f(i,newArgs);
    }
    if (num+1==Nmax){
        // do the work with your argument list...i think you wanted to arrive here ;)
    }
}
caveat: the stack should be able to handle Nmax depth function calls
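A runnable variant of this recursive idea, sketched in Python under the assumption that the goal is every strictly increasing index tuple of a given depth (i < j < k < ...) drawn from 1..Nmax. The generator name `increasing_tuples` is made up for this example; the comparison with `itertools.combinations` is included only as a cross-check.

```python
from itertools import combinations


def increasing_tuples(depth, nmax, prefix=()):
    """Yield all index tuples i1 < i2 < ... < i_depth with values in 1..nmax."""
    if len(prefix) == depth:
        yield prefix  # all loop variables are fixed: this is the loop body
        return
    start = prefix[-1] + 1 if prefix else 1
    # Leave room for the indices that still have to be chosen after this one.
    stop = nmax - (depth - len(prefix) - 1)
    for i in range(start, stop + 1):
        yield from increasing_tuples(depth, nmax, prefix + (i,))


for count, idx in enumerate(increasing_tuples(3, 5), start=1):
    print(count, idx)

# Cross-check against the standard library
same = list(increasing_tuples(3, 5)) == list(combinations(range(1, 6), 3))
print(same)
```

Each level of recursion plays the role of one loop in the nest, so the recursion depth equals the requested loop depth rather than Nmax.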
Yet another way to achieve what you desire is based on the answer by High Performance Mark, but can be made more general:
subroutine nestedLoop(indicesIn)
    ! Input indices, of arbitrary rank
    integer,dimension(:),intent(in) :: indicesIn
    ! Internal indices, here set to length 5 for brevity, but set as many as you'd like
    integer,dimension(5) :: indices
    integer :: i1,i2,i3,i4,i5
    ! Reset on every call: initializing in the declaration would imply SAVE,
    ! so values from a previous call would carry over
    indices = 0
    indices(1:size(indicesIn)) = indicesIn
    do i1 = 0,indices(1)
        do i2 = 0,indices(2)
            do i3 = 0,indices(3)
                do i4 = 0,indices(4)
                    do i5 = 0,indices(5)
                        ! Do calculations here:
                        ! myFunc(i1,i2,i3,i4,i5)
                    enddo
                enddo
            enddo
        enddo
    enddo
endsubroutine nestedLoop
You now have nested loops explicitly coded, but these are 1-trip loops unless otherwise desired. Note that if you intend to construct arrays of rank that depends on the nested loop depth, you can go up to rank of 7, or 15 if you have a compiler that supports it (Fortran 2008). You can now try:
call nestedLoop([1])
call nestedLoop([2,3])
call nestedLoop([1,2,3,2,1])
You can modify this routine to your liking and desired applicability, add exception handling etc.
From an OOP approach, each loop could be represented by a "Loop" object - this object would have the ability to be constructed while containing another instance of itself. You could then theoretically nest these as deep as you need to.
Loop1 would execute Loop2 would execute Loop3.. and onwards.
