Short question: I have a JavaScript program that recurses very deeply. How can I increase the stack size so that I can run it (something like "ulimit -s unlimited" on Unix systems)?
Long story: I have to draw a graph, and I use Cytoscape JS (http://js.cytoscape.org/) coupled with the Dagre layout extension (https://github.com/cytoscape/cytoscape.js-dagre). The drawing algorithm recurses deeply, and I end up with "Uncaught RangeError: Maximum call stack size exceeded" in Chrome and "too much recursion" in Firefox. How can I set the stack size to unlimited, or at least very large, so that I can draw the graph?
Thank you!
Chrome has a flag for this:
chromium-browser --js-flags="--stack-size=2048"
You will also want to run ulimit -s unlimited before launching Chrome, though; otherwise the operating system's own stack limit will still let your deeply recursive JavaScript crash the browser.
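Putting the two together (V8's --stack-size value is in kilobytes; 2048 is just an example value to tune):

ulimit -s unlimited                                # lift the OS stack limit for this shell
chromium-browser --js-flags="--stack-size=2048"    # give V8 a larger JavaScript stack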
You cannot alter the stack size in browsers, but you can use a trick called trampolining.
You can find a working code solution here:
How to understand trampoline in JavaScript?
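In short, a trampoline rewrites the recursion so that each step returns a thunk instead of calling itself, and a plain loop runs the thunks in constant stack space. A minimal sketch (the function names here are made up for the example):

// Keep invoking thunks until a non-function value comes back.
function trampoline(fn, ...args) {
    let result = fn(...args);
    while (typeof result === "function") {
        result = result();
    }
    return result;
}

// A summation that would overflow the stack if it called itself directly.
function sumBelow(n, acc = 0) {
    return n === 0 ? acc : () => sumBelow(n - 1, acc + n);
}

trampoline(sumBelow, 1000000);  // 500000500000, with no deep recursion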
Try changing your algorithm to use less stack space in each invocation of the function. For instance:
Set local variables to null when they are no longer needed.
Use global variables for temporary calculations where possible; a global temporary does not live on the stack.
Use fewer variables in your recursive function, reusing the same variable for different things in different parts of the function.
Break your recursive function into several functions. Some of those functions won't be recursive, so their local variables won't pile up as the recursive function calls itself.
Keep a global array of things to do and add items to it instead of calling a function recursively, using the push and pop methods of Array (see the sketch below).
Pass fewer parameters to your recursive function; pass a single object instead.
I hope these ideas help you.
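As a sketch of the work-list idea above (the node shape with a children array is just an assumption for the example):

// Depth-first traversal without recursion: an explicit array replaces the call stack.
function traverse(root, visit) {
    const todo = [root];            // pending work
    while (todo.length > 0) {
        const node = todo.pop();    // take the next item
        visit(node);
        // Schedule the children instead of recursing into them.
        for (const child of node.children || []) {
            todo.push(child);
        }
    }
}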
I am often confronted with the following situation when I debug my Julia code:
I suspect that a certain variable (often a large matrix) deep inside my code is not what I intended it to be and I want to have a closer look at it. Ideally, I want to have access to it in the REPL so I can play around with it.
What is the best practice for getting access to variables several function layers deep without passing them up the chain, i.e. without changing what the functions return?
Example:
function multiply(u)
    v = 2*u
    w = subtract(v)
    return w
end

function subtract(x)
    i = x-5
    t = 10
    return i-3t
end

multiply(10)
If I run multiply(10) and suspect that the intermediate variable i is not what I assume it should be, how would I gain access to it in the REPL?
I know that I could just write a test function and test that i has the intended properties right inside subtract(), but sometimes it would just be quicker to use the REPL.
This is much the same in any programming language. You can use a debugging tool such as ASTInterpreter2 (which has good Juno integration) to step through your code and get an interactive REPL in the current environment, or you can fall back on print debugging by running the code with @show calls inserted to display intermediate values.
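For example, using the subtract function from the question, you could either print i as it is computed or stash a copy in a global for later inspection in the REPL (captured_i is just an illustrative name):

function subtract(x)
    i = x-5
    @show i                   # prints "i = ..." every time subtract runs
    global captured_i = i     # copy it out; inspect captured_i in the REPL afterwards
    t = 10
    return i-3t
end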
I have a Tcl script running inside a Tcl shell (Synopsys PrimeTime, if it makes any difference).
The script is initiated by source <script> from the shell.
The script calls itself recursively: at the end of the script, once a specific time interval has passed, it invokes source <script> again.
My question is a bit academic: Could there be a stack-overflow issue if the script keeps calling itself in this method?
To expand the question: what happens when a Tcl script sources another script? Does it fork a child process? If so, then every call would fork another child, which would eventually stack up into a pile of processes; but since the source command itself is not parallel, my understanding is that there is no fork.
Hope the question is clear.
Thanks.
Short answer: yes.
If you're using Tcl 8.5 or before, you'll run out of C stack. There's code to try to detect it and throw a soft (catchable) error if you do. There's also a (lower) limit on the number of recursions that can be done, controllable via interp recursionlimit. Note that this is counting recursive entries to the core Tcl script interpreter engine; it's not exactly recursion levels in your script, though it is very close.
# Set the recursion limit for the current interpreter to 2000
interp recursionlimit {} 2000
The default is 1000, which is enough for nearly any non-recursive algorithm.
In Tcl 8.6, a non-recursive execution engine is used for most commands (including source). This lets your code use much greater recursion depths, limited mainly by how much general memory you have. I've successfully run code with recursion depths of over a million on conventional hardware.
You'll still need to raise the interp recursionlimit, though; the default limit of 1000 remains because it catches more bugs (i.e., unintentional recursion) than it hinders. It's just that you can now meaningfully raise it much further.
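For example, on Tcl 8.6 something like the following completes without exhausting the C stack (the depth here is chosen arbitrarily for the sketch):

# A deliberately deep procedure recursion; fine on 8.6 once the limit is raised.
proc countdown {n} {
    if {$n > 0} {
        countdown [expr {$n - 1}]
    }
}
interp recursionlimit {} 1100000
countdown 1000000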
The source command doesn't fork a new process. It acts as if the lines of the sourced file appeared in place of the source invocation; they are interpreted by the current interpreter unless you specify otherwise.
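You can see this for yourself with a tiny sourced file (demo.tcl is a hypothetical name):

# demo.tcl: runs in the caller's interpreter and process
incr x
puts "x=$x, pid=[pid]"

# In the shell:
set x 0
source demo.tcl    ;# prints x=1 and the same pid as the shell itself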
It seems that I can duplicate a kernel by getting the program object and kernel name from the kernel, and then creating a new one from those.
Is this the right way? It doesn't look so good, though.
EDIT: To answer the question properly: yes, it is the correct way; there is no other way in CL 2.0 or earlier versions.
The compilation (and therefore the slow step) in creating CL code happens at program creation (clBuildProgram + clLinkProgram).
When you create a kernel, you are just creating an object that packs:
An entry point to a function in the program code
Parameters for input + output to that function
Some memory to remember all the above data between calls
It is a simple task that should be almost free.
That is why it is preferable to have multiple kernels with different input parameters, rather than a single kernel whose parameters are changed on every loop iteration.
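As a sketch of the duplication itself, using only standard CL 1.x host API calls (error checking omitted; clone_kernel is an illustrative name):

#include <stdlib.h>
#include <CL/cl.h>

/* Duplicate an existing kernel: recover its program and entry-point name,
   then create a fresh kernel object from them. */
cl_kernel clone_kernel(cl_kernel kernel, cl_int *err) {
    cl_program program;
    size_t name_size;
    clGetKernelInfo(kernel, CL_KERNEL_PROGRAM, sizeof program, &program, NULL);
    clGetKernelInfo(kernel, CL_KERNEL_FUNCTION_NAME, 0, NULL, &name_size);
    char *name = malloc(name_size);
    clGetKernelInfo(kernel, CL_KERNEL_FUNCTION_NAME, name_size, name, NULL);
    cl_kernel copy = clCreateKernel(program, name, err);  /* cheap: no rebuild */
    free(name);
    return copy;
}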
I wrote a function in Maxima that performs foldl, similar to Haskell's:
foldl(f, ac, li) := block([acc: ac],
    /* apply f to the accumulator and the head, then recurse on the tail */
    if is(li # []) then (
        acc: apply(f, cons(acc, [first(li)])),
        acc: foldl(f, acc, rest(li))
    ),
    acc)$
And it works fine, folding the list from the left and evaluating along the way, which prevents a long unevaluated expression from accumulating in the buffer.
The problem I am facing is that running it with
foldl(lambda([x,y],x+y),0,makelist(i,i,1,97));
Error in PROGN [or a callee]: Bind stack overflow.
But if I run it with only up to 96 elements, it produces the result correctly.
I don't understand why this simple addition causes a problem, as I don't have any infinite loop or memory-hungry task going on.
Well, foldl is defined as a recursive function, and it will call itself as many times as there are elements in the list. So whether it works depends on the Lisp implementation's limit on the function-call stack. For GCL the limit seems to be relatively small; for other Lisp implementations it is greater. The only way to make it work for lists of any size is to write it iteratively.
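For instance, an iterative version needing only constant stack depth (foldl_iter is just an illustrative name):

foldl_iter(f, ac, li) := block([acc: ac],
    /* a plain loop replaces the recursion, so the list length no longer matters */
    for x in li do acc: apply(f, [acc, x]),
    acc)$

foldl_iter(lambda([x, y], x + y), 0, makelist(i, i, 1, 10000));  /* => 50005000 */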
There are built-in functions similar to foldl -- see lreduce, rreduce, xreduce, and tree_reduce.
I have started using the doMC package for R as the parallel backend for parallelised plyr routines.
The parallelisation itself seems to be working fine (though I have yet to benchmark the speedup properly). My problem is that the logging is now asynchronous and messages from different cores get mixed in together. I could create different logfiles for each core, but I think a neater solution is simply to add a different label for each core. I am currently using the log4r package for my logging needs.
I remember that when using MPI each processor got a rank, which was a way of distinguishing one process from another; is there a way to do this with doMC? I did have the idea of extracting the PID, but this seems messy and will change with every iteration.
I am open to ideas though, so any suggestions are welcome.
EDIT (2011-04-08): Following the suggestion in one answer, I still have the issue of correctly identifying which subprocess I am currently inside, since I would need either separate closures for each log() call so that each writes to the correct file, or a single log() function with some logic inside it to determine which logfile to append to. In either case, I would still need some way of labelling the current subprocess, and I am not sure how to do this.
Is there an equivalent of the mpi_rank() function in the MPI library?
I think having multiple processes write to the same file is a recipe for disaster (it's just a log, though, so maybe "disaster" is a bit strong).
Oftentimes I parallelize work over chromosomes. Here is an example of what I'd do (I've mostly been using foreach/doMC):
foreach(chr=chromosomes, ...) %dopar% {
cat("+++", chr, "+++\n")
## ... some undoubtedly amazing code would then follow ...
}
And it wouldn't be unusual to get output that tramples over each other ... something like (not exactly) this:
+++chr1+++
+++chr2+++
++++chr3++chr4+++
... you get the idea ...
If I were in your shoes, I think I'd split the logs for each process and set their respective filenames to be unique with respect to something happening in that process's loop (like chr in my case above). Collate them later if you must ... i.e., map/reduce your log files :-)
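A sketch of that per-task-file approach (the file-naming scheme and the use of Sys.getpid() are just illustrative choices):

library(foreach)
library(doMC)
registerDoMC(cores = 4)

chromosomes <- paste0("chr", 1:4)

## One log file per task, keyed on the loop variable, so concurrent
## writes never touch the same file; collate the files afterwards.
foreach(chr = chromosomes) %dopar% {
    logfile <- sprintf("run-%s.log", chr)
    cat("+++", chr, "started in pid", Sys.getpid(), "\n",
        file = logfile, append = TRUE)
    ## ... real work would go here ...
}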