Depth first traversal - recursion

I'm trying to optimise a html parser for a Roku application I'm helping to develop. The parser currently takes too long to parse the data (8 seconds) and it does this by recursivly traversing the children of each tag encountered within a for each loop.
parser (nodes):
for each node in nodes
if node.isTag
parser(node.nodes)
else if node.isBlock
text.push(node)
something akin to that, although much more convoluted! I'm assuming it's slow because it's recursive and there is no tail recursion optimisation on the platform etc
I'm not too sure how to implement a stack to remove the recursiveness from this - I've tried using GoTo but that didn't seem to work :/
Can anyone provide some insight, and or whether you think the problem might be caused by the recursion?

What are you trying to achieve?
Some boxes are slow, there is only a certain amount of optimisation you can apply. If you need to parse the whole document, it usually takes what it takes. We had similar problem, app froze for a few seconds, looking like it's crashed but we got around it by displaying asynchronous spinning icon.

Related

Asynchronous long running computation in elm

I am new to ELM and I am trying to wrap my head around asynchronous computations.
I am trying to write an ELM program to automate radiology report generation.
This program uses a graph data structure to store the state of a hierarchical tree of toggles, radio, and edit boxes that the user can edit (add, delete, move around) the toggles.
Individual toggles can also have their behavior modified so that toggling one ON, will show or hide other toggles.
As an example, when you toggled Chest ON, the lung would show, and the liver would be hidden. I do this by using the edges of the graph in the update function, which then is interpreted and rendered in the view function.
I then use a look-up table of sorts to compute the English phrase corresponding to a set of specific ON or OFF toggles, like liver, cyst, 3, mm, segment_4.
This works fine, but computing the English phrases is computationally expensive and it blocks my UI thread.
I have been playing with Tasks and Process.swap but I am misunderstanding them somehow.
I thought that I could change my ToggleOnOff message implementation so that I could spawn the long running implementation, return immediately my new state (i.e., (model, Cmd msg) ), and set the spawned processed to call the update again with the English text string when it finished with the computation. I have failed completely in implementing this.
I have a generateReport function which I turned into a Task using Task.succeed.
Than I tried to spawn this generateReportTask and chain it using Task.andThen to the msg UpdateReportFullText. And then I would need to return a new model imediately with the state of the toggles, before the long running generateReportTask runs and calls update on its own to add the english text to the model.
I believe I am still thinking about this imperatively and that I am misunderstanding FP and ELM. My experience with ELM has always shown me that the ELM way is always shorter and clearer than my clumsy imperative-style noob attempts.
Can someone please educate me and help a poor fellow see the light?
Thank you
(Sorry for the dry no code question but my program is hard to reproduce in a few lines...)
Long running computations are a bit tricky on the web platform, especially if you want them not to block the user interface.
In practice you have two options:
Use Web Workers. This is a technology that actually spins up a new process that can run entirely in parallel with your UI process. Unfortunately Web Workers are basically an entirely separate program that communicates with your main program via (mostly) serialised messages. This means that the computation you perform has to be accelerated more than the serialization/deserialization ends up costing.
In Elm, you would use a Platform.worker program, then add ports and wire things together in JavaScript. This article seems to describe this in some detail.
You can chunk up your computation into small bits and perform each small bit per frame. This can work even better in some cases where there are intermediate results that can be rendered, although this isn't necessary. You can see an example I wrote here (notice how it actually takes a few seconds to compute the layout of the graph, but it just looks like a neat little animation).

Cause stack overflow through recursion

I have been working on implementing a priority queue of strings in c++ using a binary tree.
As I think the simplicity of recursion is great. I am not going to post code as I have already spent a long time today with the debugger and I am not asking for someone to debug for me, but basically after implementing recursive methods to dequeue and insert elements and testing correct behavior with up to 1000 random strings I have used a test hub that tries to enqueue 10000 random strings and I have a stack overflow error. After this, I have changed my recursive methods for others that use a pointer cursor to scan my tree to insert and dequeue using the same logic and it has not crashed as I expected (i have coded it almost as a linked list).
The question is then, Can I cause stack overflow through recursion even if I use to pass by reference?
These recursive methods are part of a class and defined as private.
I hope the question is not vague but I am still not experienced enough in c++.
Thanks a lot for your help!
In recursion you're calling your function again and again. On every call you use the stack memory for parameters, stack variables and more. So basicly the answer is definitely yes, a deep recursion can cause a stack overflow.

Detect deadloop in PinTool

I am writing a PinTool, which can manipulate certain register/memory value. However, after manipulation, one challenge I am facing now, is the deadloop.
In particular, due to the frequent manipulation of certain register value, it is indeed common to create deadloop in the execution trace. I am thinking to detect such case, and terminate the execution.
So here is my question, what is a good practice to detect a deadloop in a PinTool? I can come up with some naive solutions, say, record the executed instructions, and if certain instruction has been executed for a large amount of times, just terminate the execution.
Could anyone help me on this issue? Thank you.
Detecting whether a program will terminate isn't a computable problem in general, so no, I don't think it's a good idea.

What is a practical difference between a loop and recursion

I am currently working in PHP, so this example will be in PHP, but the question applies to multiple languages.
I am working on this project with a fiend of mine, and as always we were held up by a big problem. Now we both went home, couldn't solve the problem. That night we both found the solution, only I used a loop to tackle the problem, and he used recursion.
Now I wanted to tell him the difference between the loop and recursion, but I couldn't come up with a solution where you need recursion over a normal loop.
I am going to make a simplified version of both, I hope someone can explain how one is different from the other.
Please forgive me for any coding errors
The loop:
printnumbers(1,10);
public function printnumbers($start,$stop)
{
for($i=$start;$i<=$stop;$i++)
{
echo $i;
}
}
Now the code above just simply prints out the numbers.
Now let's do this with recursion:
printnumbers(1,10);
public function printnumbers($start,$stop)
{
$i = $start;
if($i <= $stop)
{
echo $i;
printnumbers($start+1,$stop);
}
}
This method above will do the exact same thing as the loop, but then only with recursion.
Can anyone explain to me what there is different about using one of these methods.
Loops and recursions are in many ways equivalent. There are no programs the need one or the other, in principle you can always translate from loops to recursion or vice versa.
Recursions is more powerful in the sense that to translating recursion to a loop might need a stack that you have to manipulate yourself. (Try traversing a binary tree using a loop and you will feel the pain.)
On the other hand, many languages (and implementations), e.g., Java, don't implement tail recursion properly. Tail recursion is when the last thing you do in a function is to call yourself (like in your example). This kind of recursion does not have to consume any stack, but in many languages they do, which means you can't always use recursion.
Often, a problem is easier expressed using recursion. This is especially true when you talk about tree-like data structures (e.g. directories, decision trees...).
These data structures are finite in nature, so most of the time processing them is clearer with recursion.
When stack-depth is often limited, and every function call requires a piece of stack, and when talking about a possibly infinite data structure you will have to abandon recursion and translate it into iteration.
Especially functional languages are good at handling 'infinite' recursion. Imperative languages are focused on iteration-like loops.
In general, a recursive function will consume more stack space (since it's really a large set of function calls), while an iterative solution won't. This also means that an iterative solution, in general, will be faster because.
I am not sure if this applies to an interpreted language like PHP though, it is possible that the interpreter can handle this better.
A loop will be faster because there's always overhead in executing an extra function call.
A problem with learning about recursion is a lot of the examples given (say, factorials) are bad examples of using recursion.
Where possible, stick with a loop unless you need to do something different. A good example of using recursion is looping over each node in a Tree with multiple levels of child nodes.
Recursion is a bit slower (because function calls are slower than setting a variable), and uses more space on most languages' call stacks. If you tried to printnumbers(1, 1000000000), the recursive version would likely throw a PHP fatal error or even a 500 error.
There are some cases where recursion makes sense, like doing something to every part of a tree (getting all files in a directory and its subdirectories, or maybe messing with an XML document), but it has its price -- in speed, stack footprint, and the time spent to make sure it doesn't get stuck calling itself over and over til it crashes. If a loop makes more sense, it's definitely the way to go.
Well, I don't know about PHP but most languages generate a function call (at the machine level) for every recursion. So they have the potential to use a lot of stack space, unless the compiler produces tail-call optimizations (if your code allows it).
Loops are more 'efficient' in that sense because they don't grow the stack. Recursion has the advantage of being able to express some tasks more naturally though.
In this specific case, from a conceptual (rather than implementative) point of view, the two solutions are totally equivalent.
Compared to loops, a function call has its own overhead like allocating stack etc. And in most cases, loops are more understandable than their recursive counterparts.
Also, you will end up using more memory and can even run out of stack space if the difference between start and stop is high and there are too many instances of this code running simultaneously (which can happen as you get more traffic).
You don't really need recursion for a flat structure like that. The first code I ever used recursion in involved managing physical containers. Each container might contain stuff (a list of them, each with weights) and/or more containers, which have a weight. I needed the total weight of a container and all it held. (I was using it to predict the weight of large backpacks full of camping equipment without packing and weighing them.) This was easy to do with recursion and would have been a lot harder with loops. But many kinds of problems that naturally suit themselves to one approach can also be tackled with the other.
Stack overflow.
And no, I don't mean a website or something. I MEAN a "stack overflow".

AS3 large game performance degradation over time

I'm currently working on a very large Flash platformer game (hundreds of classes) and dealing with an issue where the game slowly grinds to a halt if you leave it on for long enough. I didn't write the game and so I'm only vaguely familiar with its internals. Some of the mysterious symptoms include,
The game will run fine for a determinate amount of time (on a given level) when all of a sudden it will exponentially start leaking memory.
The time it takes for the game to hit the point where it leaks exponentially is shorter when there's more sprites on the screen.
Even when nothing is being visibly rendered to the screen, the game slows down.
The game slows down faster with more frequent sprite collisions.
Completely disabling the collision code does slow down the degradation but doesn't prevent the game from eventually dropping frames.
Looking at the source and using Flex profiler, my prime suspects are,
There are many loitering objects, especially WeakMethodClosure, taking up large amounts of memory.
The program makes extremely extensive use of weak event listeners (dozens dispatched per frame).
BitmapData is being copied every time a new sprite is created. These are 50x50 pixel sprites that spawn at about 8 sprites per second.
I know it's next to impossible to tell me the problem without seeing the source, so I'm just looking for tidbits that might help me narrow this down. Has anybody experienced this evasive performance degradation in their own projects? What was the cause in your case?
I recently completed a optimization of an large project.
And I can give you some architectural advices:
Main principle – try do as little as possible function / events calls
Get rid off all but one
onEnterFrame / onInterval / onTimer cycles. Do everything you
need in one general event call.
You probably will need a lot of static arrays for storing processed object
references.
Do your graphics / render stuff also in one main loop
Try use big (probably prerendered) canvases instead of small sprites / bitmaps.
Usually it usable for backgrounds. But also working good for a smaller objects
(like trees, platforms, etc)
Rid off small bitmap resources, assemble it to one tile-sheet and draw your stuff
directly from it, through source–rect property
I hope its helps you!
Memory leaks – such headash.
P.S. Test your game in different browsers, IE – most leaking of all, sometimes it don`t clear memory every after refresh.
Avoid anonymous methods - change them into class level methods.
Use weak reference in addEventListener and/or make sure you remove all the listeners of an object before removing it with removeChild
Make sure you removeChild all the sprites instead of just letting them fly off the screen. Also, if possible, reuse the sprites instead of creating new ones.
You should consider object pooling if you have a lot of creation/destruction going on, especially with heavy objects like bitmapdata.
see Object Pool class
Sounds like you need to profile your application to see what is happening.
This thread had a couple of suggestions, but, ultimately, you will need to just put in code to help determine what is going on.
Profiling ActionScript-3 Code
You may want to see if you can just run some smaller parts over a period of time and see if you see a slowdown.
You may want to unit test your application, so you can run various parts rapidly, looking for memory leaks.
One framework is: http://asunit.org/, and another is: http://opensource.adobe.com/wiki/display/flexunit/
Unit testing is something I use a great deal for profiling, so you can test at the top level of your game, run it thousands of times, looking for problems, then run each part and see which has problems, and just work your way down. This is a manual process, but if the two ideas in the SO thread listed in the beginning don't help, this may be your best approach.
Are you using too much memory? Or is your cpu usage too high?
First determine if its a memory or processor limit you are hitting. It sounds like the later, seems like there are many objects doing stuff around ... likely those extra sprites aren't being freed well. Look for dependencies among objects / variables / anything in those events, make sure the sprites are removed, pay attention to any handlers of EnterFrame or recurring events.
It sounds far more likely that you are capping out on processor speed than memory. You have to try extra hard to memory cap a Flash application.
Fortunately there are a lot of easy things you can do to keep CPU down...
1) Stringently manage event listeners for everything, especially mouse listeners. Do you have $texas event listeners on all your sprite objects? That could be a problem.
2) Access arrays using int or uint instead of Number. This is huge, and this is one of those backwater Adobe tricks. Accessing array object with int and uint is way faster than Number and if you do a lot of iteration (and it sounds like you do) this could shave precious milliseconds off your frame execution.
3) In the same vein as #2, monitor your math operations and what types you are using for certain operations. The slowest thing you can do in math operations for AS3 is repetitive casting (feeding ints to a function that returns a Number), or performing basic operations like add and subtract on Number instead of int.
The great thing about having a wtfhuge program like this in Flash is that even a minor optimization change could have a major impact on performance. I once toyed with a raytrace engine in AS3 where I declared one extra variable and it killed my FPS from 30 to 23.
Flash has a rather infamous issue (many consider it a bug) that causes event listeners for timers and the ENTER_FRAME event to not get garbage collected, even if they were weakly referenced. So, even though it's good practice to use weakly referenced events, you should still remove all your event listeners when they are no longer needed.

Resources