Why can't a recursive algorithm be parallelized efficiently? - recursion

It is very difficult to convert sequential code that contains recursion into equivalent parallel code written in OpenMP, CUDA, or MPI.
Why is that?

If a piece of code has been written as a recursive algorithm, there is a good chance that the calculations performed at each level of recursion depend on the results of the next. This implies that it is hard to perform the calculations from different recursive steps in parallel.
Another way of thinking about this is to imagine flattening the recursion out into iteration (see for example Can every recursion be converted into iteration?). A recursive algorithm is likely to produce a flattened version in which each iteration depends on the results of other iterations, making it hard to run the iterations in parallel.
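One way to make that dependency concrete is a toy sketch in Python (the function names are illustrative, not taken from any code discussed here):

```python
def chain_sum(xs):
    # Each level depends on the result of the next, deeper level:
    # no two levels can run concurrently.
    if not xs:
        return 0
    return xs[0] + chain_sum(xs[1:])

def split_sum(xs):
    # A divide-and-conquer shape, by contrast, produces two
    # *independent* subproblems per level, which is what parallel
    # frameworks can actually exploit.
    if len(xs) <= 1:
        return sum(xs)
    mid = len(xs) // 2
    left = split_sum(xs[:mid])    # independent of...
    right = split_sum(xs[mid:])   # ...this call: could run in parallel
    return left + right
```

The first shape is the problematic one: flattened out, every step needs the previous step's result. Only recursions of the second shape decompose naturally into parallel work.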

Related

Numerical optimization with MPI

I am trying to parallelize an optimization routine using MPI. The structure of the program is roughly as in the block diagram at the end of the text. Data is fed to the optimization routine, which calls an Objective function subroutine and another subroutine that calculates a matrix called the "Jacobian". The optimization routine iterates as many times as needed to reach a minimum of the Objective function and exits with a result. The Jacobian is used to decide in which direction the minimum might lie and to take a step in that direction. I don't have control over the optimization routine; I only supply the Objective function and the function calculating the Jacobian.

Most of the time is spent calculating the Jacobian. Since each matrix element of the Jacobian is independent of the rest, it seems like a good candidate for parallelization. However, I haven't been able to accomplish this. Initially I thought I could distribute the calculation of the Jacobian over a large number of nodes, each of which would calculate only some of the matrix elements. I did that, but after just one iteration all the threads on the nodes exit and the program stalls.

I am starting to think that without the source code of the optimization routine this might not be possible. The reason is that distributing the code over multiple nodes and instructing them to calculate only a fraction of the Jacobian messes up the optimization on all of them except the master. Is there a way around this, using MPI and without touching the code in the optimization routine? Can only the function calculating the Jacobian be executed on all nodes except the master? How would you do this?
It turned out to be easier than I thought. As explained in the question, the worker threads were exiting after just one iteration. The solution is to enclose the Jacobian calculation executed by the workers in an infinite while loop, and to break out of it by sending a message from the main thread (master) once it exits with the answer.
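A minimal sketch of that worker loop, in Python for readability, with a plain queue standing in for the actual MPI send/receive calls (the sentinel value and both function names are hypothetical, not from the answer's code):

```python
import queue

STOP = "stop"  # sentinel standing in for the master's exit message

def worker_loop(inbox, compute_rows):
    # Workers loop forever, computing their share of the Jacobian on
    # each request, and only break out when the master says it is done.
    results = []
    while True:
        msg = inbox.get()        # stands in for MPI receive
        if msg == STOP:          # master finished optimizing
            break
        results.append(compute_rows(msg))
    return results
```

In real MPI code the `inbox.get()` would be a blocking receive from the master, and the master would broadcast the stop message after the optimization routine returns.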

Recursive vs non-recursive [duplicate]

Possible Duplicate:
Recursion and Iteration
What is the difference between a recursive and a non-recursive function? Fibonacci to be exact.
I am looking for answers that relate to time and memory.
"Recursive" simply means that a function calls itself. This may or may not be intentional (unintentional recursion is responsible for lots of crashes).
Intentional recursion, where a function performs part of an operation, then calls itself to perform the remaining part, is often a useful programming paradigm, but requires some degree of comprehension/experience/skill to "get your head around it".
Basically, recursion can be used to replace "iteration" (loops) and to replace accompanying array allocations (with variables local to the function body). But not every iterative or array-using function can be effectively converted to its recursive equivalent.
If the problem is suitable for recursion, one can often write a recursive version that is about equivalent in execution efficiency to the non-recursive version ... maybe slightly better or worse depending on how efficient the call mechanism is compared to looping and array indexing in the language/compiler. In terms of storage, recursion is rarely more efficient, but it benefits from not having to pre-allocate (and pre-know the size of the allocation) for the particular problem at hand.
Mostly recursion is better (when it actually is) because it makes an implementation much simpler and less error-prone, and errors are by far the biggest cost in computing. (But of course improperly done it can cost you big time as well.)
When recursion is good it's very good. When recursion is bad it's very bad.
Recursive functions are procedures or subroutines implemented in a programming language, whose implementation references itself.
Non-recursive functions are procedures or subroutines implemented in a programming language whose implementation does not reference itself.
Below is a link to recursive and non-recursive Fibonacci implementations: Recursive and Non Recursive Fibonacci Series
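To make the time-and-memory contrast the question asks about concrete, here is a minimal Python sketch (assumed for illustration, not taken from the linked page):

```python
def fib_recursive(n):
    # Exponential time: each call spawns two more calls, and the
    # stack grows to depth n. This is the naive recursive version.
    if n < 2:
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)

def fib_iterative(n):
    # Linear time, constant extra memory: no call stack growth.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```

Both return the same values; the difference the question is after is that the recursive version does exponentially more work and uses O(n) stack frames, while the iterative one runs in O(n) time and O(1) space.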

Is recursion preferred compared to iteration in the multicore era?

Or, put another way: do multicore CPUs process recursion faster than iteration?
Or does it simply depend on how a given language runs on the machine?
For example, C executes function calls with a large cost compared to simple iteration.
I had this question because one day I told one of my friends that recursion isn't any amazing magic that speeds up programs, and he told me that with multicore CPUs recursion can be faster than iteration.
EDIT:
If we consider the situations where recursion is most at home (data structures, function calls),
is it even possible for recursion to be faster?
EDIT on Oct 12th:
So how are multicore CPUs performing now?
Is software nowadays generally programmed for multi-core CPUs?
There are really two ways to look at this problem:
1. Looking purely at the compiled code, iteration is faster than recursion, because recursion adds a function call (= overhead) and iteration does not. However, a common type of recursion is tail recursion, where the recursive call is made at the end of the function. Compilers that support tail-call optimization turn this into iteration, in which case it does not matter. Ergo: in some cases recursion is slower, but it is never faster.
2. From a functional programming viewpoint, most of the time recursive functions are written to be without side effects. (Having side effects in a recursive function would make it really difficult to get it to produce correct results.) If a function doesn't have side effects, then it is trivial to parallelize (thus easier to run on a multicore system). This isn't a property of recursive functions per se, but that could be the reason why your friend argues that recursion can be faster than iteration.
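A small Python sketch of point 2 (the function names are illustrative; threads are used only to keep the example self-contained, and note that CPython's GIL limits actual CPU parallelism, so real speedups would need processes or a runtime without that constraint):

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    # Pure: no side effects, and the result depends only on x,
    # so the calls can run in any order or concurrently.
    return x * x

def parallel_map(fn, xs, workers=4):
    # Because fn is side-effect-free, distributing the calls over
    # workers is trivially safe and preserves the results.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fn, xs))
```

The point is not that this is faster here, but that a side-effect-free function imposes no ordering constraints, which is exactly what a multicore scheduler needs.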
While recursion is elegant and mathematically beautiful, it consumes a lot of resources, especially memory. If you have an efficient iterative solution, you should go for that.

Recursive vs. Iterative algorithms

I'm implementing the Euclidian algorithm for finding the GCD (Greatest Common Divisor) of two integers.
Two sample implementations are given: Recursive and Iterative.
http://en.wikipedia.org/wiki/Euclidean_algorithm#Implementations
My Question:
In school I remember my professors talking about recursive functions like they were all the rage, but I have a doubt. Compared to an iterative version, don't recursive algorithms take up more stack space and therefore much more memory? Also, since calling a function incurs some overhead, aren't recursive algorithms slower than their iterative counterparts?
It depends entirely on the language. If your language has tail-call optimization support (a lot do nowadays), then the two will run at equal speed. If it does not, then the recursive version will be slower and take more (precious) stack space.
It all depends on the language and compiler. Current computers aren't really geared towards efficient recursion, but some compilers can optimize some cases of recursion to run just as efficiently as a loop (essentially, it becomes a loop in the machine code). Then again, some compilers can't.
Recursion is perhaps more beautiful in a mathematical sense, but if you feel more comfortable with iteration, just use it.
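For reference, the two Euclidean GCD variants the question compares can be sketched in Python (a sketch for illustration, not the Wikipedia implementations):

```python
def gcd_recursive(a, b):
    # One stack frame per division step; this happens to be
    # tail-recursive, so a TCO-capable compiler could turn it
    # into the loop below. CPython does not do this.
    if b == 0:
        return a
    return gcd_recursive(b, a % b)

def gcd_iterative(a, b):
    # Same arithmetic, constant stack space.
    while b:
        a, b = b, a % b
    return a
```

Both perform the identical sequence of modulo operations; the only difference is whether each step costs a stack frame.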

Most important speed issues

I am participating in Al Zimmermann's Programming Contest.
http://www.azspcs.net/Contest/SonOfDarts
I have written a recursive algorithm but it takes a long time to run. I was wondering what the most important things to consider are for the speed of recursive algorithms. I have made most of the properties global so they don't get allocated on every recursion step. Is there anything else I can do to speed up my program without changing the algorithm?
It depends on the details of your algorithm. If it is tail recursive you could transform it to an iterative algorithm fairly easily.
Recursion is generally slower than iteration, because the stack/heap/memory allocation it entails performs worse. It is often easier to implement a recursive function for a complex algorithm, but where possible an iterative version will be faster.
What language are you using for writing your program? Some languages like Haskell are tailor-made for recursive algorithms while others like Python are not.
How much time is spent within each function call, versus the number of recursive calls out of the function? Too little code executed within the function itself will certainly lead to performance loss.
Variables on the stack are usually much faster than global variables. Consider passing them from function to function rather than making them global.
Unfortunately there isn't enough context in the question to provide a better answer.
Recursive algorithms can also be designed in such a way that they are tail recursive. In such a situation, compilers support tail recursion optimization leading to much faster code.
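As an illustration of what tail-call optimization does, here is a Python sketch (the names are made up; note that CPython itself does not perform this optimization — the loop shows what a TCO-capable compiler would generate from the tail-recursive form):

```python
def fact_tail(n, acc=1):
    # Tail-recursive: the recursive call is the very last operation,
    # so no work remains pending on the stack after it returns.
    if n <= 1:
        return acc
    return fact_tail(n - 1, acc * n)

def fact_loop(n):
    # The iterative form a TCO-capable compiler would produce:
    # the accumulator replaces the chain of stack frames.
    acc = 1
    while n > 1:
        acc, n = acc * n, n - 1
    return acc
```

The accumulator argument is what makes the rewrite possible: because the caller has nothing left to do, its frame can simply be reused.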
There are probably a lot of overlapping sub-problems in your algorithm, and you aren't saving the intermediate results for each sub-problem. If you do, your program should be fast enough.
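One standard way to save those intermediate results is memoization. A minimal Python sketch, using Fibonacci as a stand-in for the contest's sub-problems:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Each sub-result is computed once and then reused, so the
    # exponential tree of recursive calls collapses to O(n)
    # distinct calls.
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)
```

The same decorator (or a hand-rolled dictionary keyed on the recursion's arguments) applies to any recursion whose sub-problems repeat.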
EDIT:
I gave the darts question some thought and felt that recursion may not be a good approach to the solution. I did some experimenting in SQL Server with the sample given in the question:
create table regions (score int)
insert into regions values (0)
insert into regions values (1)
insert into regions values (2)
insert into regions values (4)
insert into regions values (7)
insert into regions values (11)
create table results (score int)
insert into results
select distinct (s1.score+s2.score+s3.score)
from regions s1, regions s2, regions s3
select * from results
The script clearly reveals a possible solution that can be easily implemented in an imperative programming style, without taking any recursive approach.
Don't assume the problem is the recursion, or anything else, a priori. Just profile: find out what's biggest, fix it, and move on to the next. I'm not saying it won't turn out that recursion is the big deal at some point. It's just that chances are very good there are bigger problems you can fix first.
If you can submit compiled code for Intel platforms, then:
Laying out memory contents to favor the CPU cache wins over the best classical algorithms in any area. Make sure to feed Intel VTune performance analyzer output to your linker options, to keep the bodies of related functions close together in code memory.