How to debug issues with differing execution times in different contexts


The following question has been reaching me more often than most others lately. What kinds of things would you suggest I tell people to look for when they are trying to debug "performance issues" like this?
ok, get this - running this in query analyzer takes < 1 second
exec usp_MyAccount_Allowance_Activity '1/1/1900', null, 187128
debugging locally, this takes 10 seconds:
DataSet allowanceBalance =
    SqlHelper.ExecuteDataset(
        WebApplication.SQLConn(),
        CommandType.StoredProcedure,
        "usp_MyAccount_Allowance_Activity",
        Params);
same parameters

Horrible question to answer really - code in debugger vs code not in debugger introduces all manner of Heisenbug timing problems, most of which you'll never know about, into the soup of things that can muck things up for you.
Debuggers tend to put their fingers in all the tasty places that may affect performance.
Debug Events. The debugger gets special events during the application load, execution, dll load/unload, shutdown. The debugger will do whatever it wants in these events. That will be a source of slowdown.
Debug Output. OutputDebugString() and all the code that uses it (the trace output in .NET, for example) suddenly become active. This is slow.
Heap checks. The HeapAlloc() family of functions, when run under a debugger, starts to check for all sorts of heap inconsistencies, which consumes more time.
If you have Symbol Discovery turned on, there may be delays as various Symbol Servers are queried for symbols and downloaded if required (you'll notice the delay if they are downloaded).

Related

Is a preemptive multitasking OS possible on the interruptless DCPU-16?

I am looking into various OS designs in the hopes of writing a simple multitasking OS for the DCPU-16. However, everything I read about implementation of preemptive multitasking is centered around interrupts. It sounds like in the era of 16-bit hardware and software, cooperative multitasking was more common, but that requires every program to be written with multitasking in mind.
Is there any way to implement preemptive multitasking on an interruptless architecture? All I can think of is an interpreter which would dynamically switch tasks, but that would have a huge performance hit (possibly on the order of 10-20x+ if it had to parse every operation and didn't let anything run natively, I'm imagining).

Preemptive multitasking is normally implemented by having interrupt routines post status changes/interesting events to a scheduler, which decides which tasks to suspend, and which new tasks to start/continue based on priority. However, other interesting events can occur when a running task makes a call to an OS routine, which may have the same effect.
But all that matters is that some event is noted somewhere, and the scheduler decides who to run. So you can make all such event signalling/scheduling occur only on OS calls.
You can add egregious calls to the scheduler at "convenient" points in various task application code to make your system switch more often. Whether it just switches, or uses some background information such as elapsed time since the last call is a scheduler detail.
Your system won't be as responsive as one driven by interrupts, but you've already given that up by choosing the CPU you did.
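To make the idea concrete, here is a minimal sketch in Python (not DCPU-16 code). It assumes each task is a generator and that every `yield` stands for one of those calls to the scheduler at a "convenient" point. The names `worker`, `run` and `slice_len` are made up for the illustration; the scheduler here counts checkpoints rather than reading a real clock.

```python
from collections import deque

def worker(name, steps, log):
    """A task that does `steps` units of work and calls the scheduler after each one."""
    for i in range(steps):
        log.append((name, i))   # the "real work"
        yield                   # convenient point: hand control to the scheduler

def run(tasks, slice_len=2):
    """Run generator tasks round-robin, switching only after `slice_len` checkpoints."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            for _ in range(slice_len):
                next(task)      # resume the task until its next checkpoint
        except StopIteration:
            continue            # task finished, so it is not re-queued
        ready.append(task)      # slice used up: back of the queue

log = []
run([worker("A", 3, log), worker("B", 3, log)])
```

With `slice_len=2` the scheduler ignores every other checkpoint, which is the "background information" variant: the task calls in often, but the scheduler decides when a call actually results in a switch.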

Actually, yes. The most effective method is to simply patch run-times in the loader. Kernel/daemon stuff can have custom patches for better responsiveness. Even better, if you have access to all the source, you can patch in the compiler.
The patch can consist of a distributed scheduler of sorts. Each program can be patched to have a very low-latency timer; on load, it will set the timer, and on each return from the scheduler, it will reset it. A simplistic method would allow code to simply do an
if (timer - start_timer >= time_slice) yield_to_scheduler();  /* time_slice: hypothetical per-task quantum */
which doesn't yield too big a performance hit. The main trouble is finding good points to pop them in. In between every function call is a start, and detecting loops and inserting them is primitive but effective if you really need to preempt responsively.
It's not perfect, but it'll work.
The main issue is making sure that the timer return is low latency; that way it is just a comparison and branch. Also, handling exceptions - errors in the code that cause, say, infinite loops - in some way. You can technically use a fairly simple hardware watchdog timer and assert a reset on the CPU without clearing any of the RAM; an in-RAM routine would be where RESET vector points, which would inspect and unwind the stack back to the program call (thus crashing the program but preserving everything else). It's sort of like a brute-force if-all-else-fails crash-the-program. Or you could POTENTIALLY change it to multi-task this way, RESET as an interrupt, but that is much more difficult.
So...yes. It's possible but complicated, and it relies on techniques from JIT compilers and dynamic translators (emulators use them).
This is a bit of a muddled explanation, I know, but I am very tired. If it's not clear enough I can come back and clear it up tomorrow.
By the way, asserting reset on a CPU mid-program sounds crazy, but it is a time-honored and proven technique. Early versions of Windows even did it to run compatibility mode properly on, I think, 286s, because that chip had no way to switch from protected mode back to real mode short of a reset. Other processors and OSes have done it too.
EDIT: So I did some research on what the DCPU is, haha. It's not a real CPU. I have no idea if you can assert reset in Notch's emulator, I would ask him. Handy technique, that is.

I think your assessment is correct. Preemptive multitasking occurs if the scheduler can interrupt (in the non-inflected, dictionary sense) a running task and switch to another autonomously. So there has to be some sort of actor that prompts the scheduler to action. If there are no interrupting devices (in the inflected, technical sense) then there's little you can do in general.
However, rather than switching to a full interpreter, one idea that occurs is just dynamically reprogramming supplied program code. So before entry into a process, the scheduler knows full process state, including what program counter value it's going to enter at. It can then scan forward from there, substituting, say, either the twentieth instruction code or the next jump instruction code that isn't immediately at the program counter with a jump back into the scheduler. When the process returns, the scheduler puts the original instruction back in. If it's a jump (conditional or otherwise) then it also effects the jump appropriately.
Of course, this scheme works only if the program code doesn't dynamically modify itself. And in that case you can preprocess it so that you know in advance where jumps are without a linear search. You could technically allow well-written self-modifying code if it were willing to nominate all addresses that may be modified, allowing you definitely to avoid those in your scheduler's dynamic modifications.
You'd end up sort of running an interpreter, but only for jumps.
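A rough sketch of that patching scheme, in Python with a toy instruction set. The `inc`/`dec`/`jnz`/`halt` opcodes are invented for the example and are not DCPU-16 instructions, `cpu_run` only stands in for code running natively, and the sketch assumes every program ends in `halt` and never modifies itself.

```python
from collections import deque

TRAP = ("trap",)   # stands for "jump back into the scheduler"

def cpu_run(mem, regs, pc):
    """Stand-in for native execution: straight-line code runs until it hits TRAP."""
    while mem[pc] != TRAP:
        kind, reg = mem[pc]
        regs[reg] += 1 if kind == "inc" else -1
        pc += 1
    return pc

def run_slice(task, limit=20):
    """Run one task for at most `limit` instructions; return True once it has halted."""
    mem, regs, pc = task["mem"], task["regs"], task["pc"]
    op = mem[pc]
    if op[0] == "halt":
        return True
    if op[0] == "jnz":   # a jump at the entry point is emulated by the scheduler
        task["pc"] = op[2] if regs[op[1]] != 0 else pc + 1
        return False
    stop = pc + 1        # patch the next jump/halt, or the limit-th instruction
    while stop < pc + limit and mem[stop][0] not in ("jnz", "halt"):
        stop += 1
    saved, mem[stop] = mem[stop], TRAP
    try:
        task["pc"] = cpu_run(mem, regs, pc)
    finally:
        mem[stop] = saved   # put the original instruction back
    return False

def schedule(tasks, limit=20):
    """Round-robin over tasks, one patched slice at a time."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        if not run_slice(task, limit):
            ready.append(task)

# x counts how many times the loop body ran; n is the loop counter
PROGRAM = [("dec", "n"), ("inc", "x"), ("jnz", "n", 0), ("halt",)]
tasks = [
    {"mem": list(PROGRAM), "pc": 0, "regs": {"n": 3, "x": 0}},
    {"mem": list(PROGRAM), "pc": 0, "regs": {"n": 5, "x": 0}},
]
schedule(tasks)
```

Note that `cpu_run` never sees a jump: every jump is replaced by the trap before the task is entered and then carried out by the scheduler, which is exactly the "interpreter, but only for jumps" trade-off.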

Another way is to keep to small tasks based on an event queue (like current GUI apps).
This is also cooperative, but it has the effect of not needing OS calls: you just return from the task, and the queue then goes on to the next task.
If you then need to continue a task, you need to pass the next "function" and a pointer to the data it needs to the task queue.
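A small sketch of such a queue in Python, assuming a task is just a (function, data) pair. `count_up` and the contents of the state dictionary are invented for the example.

```python
from collections import deque

queue = deque()
output = []

def count_up(state):
    """One small step of a longer job; it re-queues itself until the job is done."""
    output.append((state["name"], state["i"]))
    state["i"] += 1
    if state["i"] < state["limit"]:
        queue.append((count_up, state))   # continuation: next function plus its data

def run_queue():
    """Pop tasks until none are left; a task yields control simply by returning."""
    while queue:
        func, data = queue.popleft()
        func(data)

queue.append((count_up, {"name": "A", "i": 0, "limit": 2}))
queue.append((count_up, {"name": "B", "i": 0, "limit": 2}))
run_queue()
```

The two jobs end up interleaved step by step, without either of them ever calling into a scheduler.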

Adobe AIR SQLite Async Events Not Dispatching

Working on an application that makes very heavy use of the local SQLite db. Initially it was set up for synchronous database communication, but with such heavy usage we were seeing the application "freeze" for brief periods fairly often.
After doing a refactor to asynchronous communication we are seeing a different issue. The application seems to be far less reliable. Jobs seem to simply not complete. After much debugging and tweaking, the problem seems to be that the database event handlers are not always being called. I'm seeing this specifically when beginning a transaction or closing the connection.
Here is an example:
con.addEventListener(SQLErrorEvent.ERROR, tran_ErrorHandler);
con.addEventListener(SQLEvent.BEGIN, con_beginHandler);
con.begin(SQLTransactionLockType.IMMEDIATE);
Most of the time this works just fine. But every now and then con_beginHandler isn't hit after con.begin is called. This makes it so we have an open transaction that never gets committed and can really hang up future requests. When investigating this same issue with the connection close handler, one of the solutions was to simply delay it. In that context it was OK to wait even several seconds.
setTimeout(function():void{ con.begin(SQLTransactionLockType.IMMEDIATE); }, 1000);
Changing to something like this does seem to make the transaction more reliable; however, it really stretches out the time it takes for the application to complete actions. This is a very db heavy application, so even adding 200ms has a noticeable effect. But something as short as 200ms also doesn't seem to fully solve the issue. It has to be 500-1000ms or higher in order for me to stop seeing this issue.
I've written a separate AIR application to try and stress test our code and the transactions, but am unable to reproduce this in that environment. I even have it try to do something that will "freeze" the application (long loops that do some math or other processing) to see if application strain is what makes them misfire, but everything seems reliable.
I'm at a loss for how to resolve this at this point. I even tried running con.begin off of a binding event, just to add more time. The only thing that seems to work is excessively long timers/timeouts, which I don't think is an acceptable solution.
Has anybody else run into this? Is there some trick to async that I'm missing?

I had a few more ideas to try after the refreshing weekend, none of which panned out; however, during these attempts and more investigations I finally found a pattern to the issue. Even though it doesn’t happen consistently, when it does happen it is fairly consistent on where it happens. There are 1 or 2 spots during the problematic processes that try to compact the DB after doing data clearing, in order to help keep the file sizes smaller. I think the issue here is compact wasn’t worked into the async flow properly. So while we are trying to compact the db, we are also trying to start up the new transaction. So if the compact takes a bit of time every once in a while, then we get a hang up. I think the assumed behavior was for async event handling to dispatch when the transaction is finally started instead of just never happening at all, but this does make some amount of sense.
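For what it's worth, the general fix for this kind of overlap is to funnel every connection operation through one queue, so the next one starts only after the previous one has finished. This is a language-neutral sketch in Python with asyncio, not AIR code: `SerialQueue` is a made-up helper, and the sleeps only stand in for the real compact and begin calls.

```python
import asyncio

class SerialQueue:
    """Runs submitted operations one at a time, in the order they were submitted."""
    def __init__(self):
        self._lock = asyncio.Lock()
        self.log = []

    async def run(self, name, duration):
        async with self._lock:               # wait for the previous operation to finish
            self.log.append(f"{name} start")
            await asyncio.sleep(duration)    # stand-in for the real database work
            self.log.append(f"{name} done")

async def main():
    q = SerialQueue()
    # compact is slow; begin is requested straight away but must wait its turn
    await asyncio.gather(q.run("compact", 0.05), q.run("begin", 0.0))
    return q.log

log = asyncio.run(main())
```

In AIR the equivalent would be to call begin from the handler for the compact operation's completion event instead of right after requesting the compact.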

Removing Flex DisplayObjects from view AND memory pool

I have an issue with a Flex application, one that I didn't build, so I can provide all my findings but sorry for any lack of clarity.
There is a Flex app with 7 main views. And there is a memory issue when navigating between views.
All these views were in a ViewStack, but due to some involving 3D objects I assumed it was too much to have it all in the display list. I'm now clearing all children from the stack and adding/removing them when needed. This gave a small performance increase, but still becomes unresponsive with use. The strange thing is, with this and the original method, the CPU climbs with use but eventually levels out somewhere. Now I'm creating new instances of each screen when they are navigated to and setting the previous variable to null. Now it looks like CPU is spiking when the view is created, but leveling out to something much much lower than it was. This felt like progress, but now the available memory keeps climbing where it wasn't before....
My understanding was calling remove child or remove all children would mark the object for deletion when the garbage collector next ran. I can't see any other references to the instance. My code is along the lines of
this.parentApplication.viewstack.removeAllChildren();
this.parentApplication.viewstack.addChild(new HomeScreen);
I have a function for each button to add a new instance like the above.
The only thing I can see, and I feel silly asking but need confirmation, is that each view extends a class called "Screen", and this class holds references to some core components obtained through singletons:
this.model = PancakeApplication.instance.model;
this.meaModel = MeaApplication.instance.meaModel;
this.meaModel.addEventListener(ScreenChangeEvent.SCREEN_CHANGE, electedScreenChangeHandler);
Would this trick the garbage collector into thinking it was still needed?
General advice on clearing objects from the memory pool would be awesome!!! I've never needed to analyze the Flash Player in such depth.
SOLVED: I think it's an error with sound drivers; removing all sound and she's purring like a kitten. It works fine on my machine with Windows XP, but not on the touch pad the application is crashing on, which runs Windows 7 (unsure of the drivers, looking into them now).
UPDATE: Now I'm thinking it's not the drivers; I tried 3 different versions, all with no improvement. I did discover the sound was fading in and out with the TweenLite lib. It doesn't look like there are any memory leaks in TweenLite, as it works fine on other machines. Just the use of volumeEasingFunction seems to consume increasing amounts of CPU until it freaks out. It is crappy hardware running Windows 7, which doesn't help.

The first thing that comes to mind is that you should be setting the useWeakReference parameter to true in your event listener. It is the fifth parameter, so in your example:
this.meaModel.addEventListener(ScreenChangeEvent.SCREEN_CHANGE, electedScreenChangeHandler, false, 0, true);
Grant Skinner has a great 3 part series on AS3 Resource Management that would probably help you get a better idea of what to look for. You can find the details about weakly referenced listeners in part 3 or in a standalone article written before part 3 was posted.
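The effect is easy to demonstrate outside Flash. Here is a minimal sketch in Python, using the standard `weakref` module as a stand-in for useWeakReference. `Dispatcher` and `Screen` are made-up classes that only mimic the shape of the question's code.

```python
import gc
import weakref

class Dispatcher:
    """Minimal event source that can hold listeners strongly or weakly."""
    def __init__(self):
        self._listeners = []

    def add_listener(self, method, weak=False):
        # A WeakMethod does not keep the object that owns the method alive
        self._listeners.append(weakref.WeakMethod(method) if weak else method)

    def dispatch(self):
        """Call every listener that is still alive and return how many were called."""
        called = 0
        for entry in self._listeners:
            method = entry() if isinstance(entry, weakref.WeakMethod) else entry
            if method is not None:
                method()
                called += 1
        return called

class Screen:
    def on_change(self):
        pass

def screen_survives(weak):
    """Drop the only direct reference to a Screen and report whether it is still alive."""
    model = Dispatcher()
    screen = Screen()
    probe = weakref.ref(screen)
    model.add_listener(screen.on_change, weak=weak)
    del screen
    gc.collect()
    return probe() is not None

leaked_with_strong = screen_survives(weak=False)
leaked_with_weak = screen_survives(weak=True)
```

With the strong listener the long-lived model keeps the screen reachable, so it is never collected; with the weak one it goes away as soon as nothing else refers to it.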

How to increase my Web Application's Performance?

I have an ASP.NET web application (.NET 2008) using MS SQL Server 2005. I want to increase the performance of the web site. Does anyone know of an article containing steps to do that, step by step, in SQL (indexes, etc.), and in the code?

Performance tuning is a very specific process. I don't know of any articles that discuss directly how to achieve this, but I can give you a brief overview of the steps I follow when I need to improve performance of an application/website.
Profile.
Start by gathering performance data. At the end of the tuning process you will need some numbers to compare to actually prove you have made a difference. This means you need to choose some specific processes that you monitor and record their performance and throughput.
For example, on your site you might record how long a login takes. You need to keep this very narrow. Pick a specific action that you want to record and time it. (Use a tool to do the timing, or put some Stopwatch code in your app to report times.) Also, don't just run it once. Run it multiple times. Try to ensure you know all the environment setup so you can duplicate this again at the end.
Try to make this as close to your production environment as possible. Make sure your code is compiled in release mode, and running on real separate servers, not just all on one box etc.
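As a rough illustration of "time it, and run it multiple times", here is a sketch in Python rather than C# (`time.perf_counter` plays the role of Stopwatch). `fake_login` is a hypothetical stand-in for whatever action you choose to measure.

```python
import statistics
import time

def measure(action, runs=5):
    """Time `action` several times and return the individual durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        action()
        durations.append(time.perf_counter() - start)
    return durations

def fake_login():
    # Hypothetical stand-in for the action under test
    sum(i * i for i in range(50_000))

durations = measure(fake_login)
baseline = statistics.median(durations)   # the number to beat after tuning
```

Keeping all the individual durations, not just the median, lets you see how noisy the environment is before you trust any before/after comparison.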
Instrument.
Now you know what action you want to improve, and you have a target time to beat, you can instrument your code. This means injecting (manually or automatically) extra code that times each method call, or each line, and records times and/or memory usage right down the call stack.
There are lots of tools out there that can help you with this and automate some of it (Microsoft's CLR Profiler (free), Redgate ANTS (commercial), the profiling tools built into the higher editions of Visual Studio, and loads more). But you don't have to use automatic tools; it's perfectly acceptable to just use the Stopwatch class to time each block of your code. What you are looking for is a bottleneck. The likelihood is that you will find a high proportion of the overall time is spent in a very small bit of code.
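Manual instrumentation can be as simple as the following sketch, again in Python rather than C#. The `timed` helper, the labels, and the body of `handle_request` are all invented for the example; the sleep just pretends to be a slow database call.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

totals = defaultdict(float)   # label -> accumulated seconds

@contextmanager
def timed(label):
    """Add the time spent inside the with-block to totals[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        totals[label] += time.perf_counter() - start

def handle_request():
    with timed("load"):
        time.sleep(0.05)      # pretend database call
    with timed("render"):
        sum(range(1000))      # pretend page rendering

handle_request()
bottleneck = max(totals, key=totals.get)
```

Because the totals accumulate per label, you can call the instrumented code many times and still read off which block dominates.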
Tune.
Now you have some timing data, you can start tuning.
There are two approaches to consider here. Firstly, take an overall perspective. Consider if you need to re design the whole call stack. Are you repeating something unnecessarily? Or are you just doing something you don't need to?
Secondly, now you have an idea of where your bottleneck is, you can try to figure out ways to improve this bit of code. I can't offer much advice here, because it depends on what your bottleneck is, but just look to optimise it. Perhaps you need to cache data so you don't have to loop over it twice. Or batch up SQL calls so you can do just one. Or tighten your query filters so you return less data.
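To show what "batch up SQL calls" means in practice, here is a sketch using Python's built-in sqlite3 as a stand-in for SQL Server. The `users` table and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ann"), (2, "bob"), (3, "cy")])

wanted = [1, 3]

# One query (and, against a real server, one round trip) per id
one_by_one = [
    conn.execute("SELECT name FROM users WHERE id = ?", (user_id,)).fetchone()[0]
    for user_id in wanted
]

# A single query for all ids, still parameterised
placeholders = ",".join("?" * len(wanted))
batched = [
    row[0]
    for row in conn.execute(
        f"SELECT name FROM users WHERE id IN ({placeholders}) ORDER BY id", wanted
    )
]
conn.close()
```

Both versions return the same rows; the difference only shows up in the timings once each query has to cross a network, which is why you re-profile rather than assume.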
Re-profile.
This is the most important step that people often miss out. Once you have tuned your code, you absolutely must re-profile it in the same environment that you ran your initial profiling in. It is very common to make minor tweaks that you think might improve performance and actually end up degrading it because of some unknown way that the CLR handles something. This is much more common in managed languages because you often don't know exactly what is going on under the covers.
Now just repeat as necessary.
If you are likely to be performance tuning often I find it good to have a whole batch of automated performance tests that I can run that check the performance and throughput of various different activities. This way I can run these with every release and record performance changes each release. It also means that I can check that after a performance tuning session I know I haven't made the performance of some other area any worse.
When you are profiling, don't always just think about the time to run a single action. Also consider profiling under load, with lots of users logged in. Sometimes apps perform great when there's just one user connected, but when they hit a certain number of users suddenly the whole thing grinds to a halt, perhaps because they are suddenly spending more time context switching or swapping memory in and out to disk. If it's throughput you want to improve, you need to figure out what is causing the limit on throughput.
Finally, check out this huge MSDN article on Improving .NET Application Performance and Scalability. Specifically, you might want to look at chapter 6 and chapter 17.

I think the best we can do from here is give you some pointers:
query less data from the sql server (caching, appropriate query filters)
write better queries (indexing, joins, paging, etc)
minimise any inappropriate blockages such as locks between different requests
make sure session-state hasn't exploded in size
use bigger metal / more metal
use appropriate looping code etc
But to stress; from here anything is guesswork. You need to profile to find the general area for the suckage, and then profile more to isolate the specific area(s); but start by looking at:
sql trace between web-server and sql-server
network trace between web-server and client (both directions)
cache / state servers if appropriate
CPU / memory utilisation on the web-server

I think first of all you have to find your bottlenecks and then try to improve those.
This lets you work exactly where you have a serious problem.
In addition, you need to improve your connection to the DB, for example by using a lazy singleton pattern and by creating batch requests instead of single requests.
This helps you decrease the number of DB connections.
Check your cache and use suitable loop structures.
Another thing is to use appropriate types; for example, if you need an int, do not create a long.
Finally, you can use a profiler (especially for SQL) and check whether your queries are implemented as well as possible.

Adobe Flex App page file usage going through the roof!

I have been working on an Adobe Flex application for some months now, and the application is meant to run 24/7 for days (weeks!) continuously. However, I'm now seeing that after a few days of running nonstop, the computer it runs on tells me that the system is low on virtual memory and gives me an error about Page File usage. Once I close the Flex app, the Page File usage goes down from 1.9 GB to 100 MB (or less). It seems that it's using up all this memory and not freeing it, although I have been very careful in my app to not keep huge arrays.
The app does some graphing and draws a lot of shapes (to create a 'gauge') and then gets rid of them by re-declaring that object as another 'gauge'.
Any idea why my page file usage is climbing so high?!

You most probably have eventListeners that are not being removed. They keep references to objects and prevent them from being garbage collected.

You can use the profiler in Flex Builder Professional to see where your memory usage is going. Like another poster mentioned, event listeners are often the culprits in cases like this, but more generally, just because you think you are getting rid of (destroying or deleting) a variable doesn't mean that it is really getting taken care of by the garbage collector. If any reference (like an event listener) to that variable (or object) still exists, it will not be collected. The profiler will point out these things.

I've heard rumors that putting anything on the Stage will create memory leaks. In other words, you can be as careful as possible with your code, but you'll still leak memory. This has not been validated by Adobe, as far as I know. A good test might be to instantiate a Shape and a Sprite and a MovieClip, add them to the display list, and then let the app run overnight. Would love to hear the results if you do end up testing this.
