I've seen a lot of other developers refer to threads in ActionScript functions. As a newbie I have no idea what they are referring to so:
What is a thread in this sense?
How would I run more than one thread at a time?
How do I ensure that I am only running one thread at a time?
Thanks
~mike
Threads represent a way to have a program appear to perform several jobs concurrently. Although whether or not the jobs can actually occur simultaneously is dependent on several factors (most importantly, whether the CPU the program is running on has multiple cores available to do the work). Threads are useful because they allow work to be done in one context without interfering with another context.
An example will help to illustrate why this is important. Suppose that you have a program which fetches the list of everyone in the phone book whose name matches some string. When people click the "search" button, it will trigger a costly and time-consuming search, which might not complete for a few seconds.
If you have only a single-threaded execution model, the UI will hang and be unresponsive until the search completes. Your program has no choice but to wait for the results to finish.
But if you have several threads, you can offload the search operation to a different thread, and then have a callback -- a trigger which is invoked when the work is completed -- to let you know that things are ready. This frees up the UI and allows it to continue to respond to events.
Unfortunately, because ActionScript's execution model doesn't support threads natively, it's not possible to get true threading. There is a rough approximation called "green threads", which are threads that are controlled by an execution context or virtual machine rather than a larger operating system, which is how it's usually done. Several people have taken a stab at it, although I can't say how widespread their usage is. You can read more at Alex Harui's blog here and see an example of green threads for ActionScript here.
It really depends on what you mean. The execution model for ActionScript is single-threaded, meaning it can not run a process in the background.
If you are not familiar with threading, it is essentially the ability to have something executed in the background of a main process.
So, if you needed to do a huge mathematical computation in your flex/flash project, with a multi-threaded program you could do that in the background while you simultaneously updated your UI. Because ActionScript is not multi-threaded you can not do such things. However, you can create a pseudo-threading class as demonstrated here:
http://blogs.adobe.com/aharui/pseudothread/PseudoThread.as
The others have described what threading is, and you'd need threading if you were getting hardcore into C++ and 3D game engines, among many other computationally-expensive operations, and languages that support multi-threading.
Actionscript doesn't have multi-threading. It executes all code in one frame. So if you create a for loop that processes 100,000,000 items, it will cause the app to freeze. That's because the Flash Player can only execute one thread of code at a time, per frame.
You can achieve pseudo-threading by using:
Timers
Event.ENTER_FRAME
Those allow you to jump around and execute code.
Tween engines like TweenMax can operate on 1000's of objects at once over a few seconds by using Timers. You can also do this with Event.ENTER_FRAME. There is something called "chunking" (check out Grant Skinner's AS3 Optimizations Presentation), which says "execute computationally expensive tasks over a few frames", like drawing complex bitmaps, which is a pseudo-multi-threading thing you can do with actionscript.
A lot of other things are asynchronous, like service calls. If you make an HTTPService request in Flex, it will send a request to the server and then continue executing code in that frame. Once it's done, the server can still be processing that request (say it's saving a 30mb video to a database on the server), and that might take a minute. Then it will send something back to Flex and you can continue code execution with a ResultEvent.RESULT event handler.
So Actionscript basically uses:
Asynchronous events, and
Timers...
... to achieve pseudo-multi-threading.
a thread allows you to execute two or more blocks of actionscrpt simultaniously by default you will always be executing on the same default thread unless you explcitly start a new thread.
Related
I'm assuming that the definition of asynchronous is as follows.
Let's start with a relationship between two 'things': X and Y.
They can be anything, e.g. X can be you and Y can be your washing machine.
Let's say X requests something of Y.
This can also be anything: a question, a task.
Let's say we live in a world where Y cannot immediately respond with the answer / completion status.
What happens?
In a Synchronous relationship, you 'wait around' in some way.
This could involve just sitting there or asking repeatedly.
In an Asynchronous relationship, you go on with your life.
Y will ping you when it's done.
From the perspective of a user's API, node.js and asyncio seem asynchronous. For example, in node.js you can register callbacks upon completion of certain events. And in asyncio, the callback logic goes right after some await my_io().
But here's my question - are node.js and asyncio actually truly asynchronous? Implementation-wise, do they just engage in a bunch of frantic non-blocking "hey, is this file descriptor free yet?" calls or is it actually interrupt-driven?
Yes, they're truly asynchronous by your definition, you (i.e. the Node engine/Python interpreter) go on doing other work while waiting for the task to complete.
How it's implemented doesn't matter to you--you trust that the designers made the right design decisions on your behalf. If there is frantic "hey is this file descriptor free yet?" going on (that's called "polling" or "spin waiting"), that's an implementation detail handled by the engine which has dispatched a thread to wait.
Incidentally, busy waiting is sometimes an efficient way to wait for a resource in certain circumstances (generally, when you expect it to be available very soon) as an alternative to an interrupt or notification.
Think of it this way: if you're waiting for your washing machine to finish a load and you want to be notified precisely (or as precisely as possible) when it finishes as you go about other tasks, you could call your friend to watch the laundry for you. The contract is that your friend notifies you as soon as possible when the laundry is done, but you don't care how they do it. Maybe they stand next to the machine and check if it's done constantly (a good idea if the load is almost complete), or maybe they're clever and have devised a system so they can do other work without constantly checking. Either way, it doesn't impact your ability to do other tasks.
The "friend" is like a thread dispatched by the Node engine and the "laundry" might be a file, HTTP response, etc. From your perspective, it's truly asynchronous--the work spent to fulfill the resource request is being done by a thread spawned by the runtime, the OS or network and runs in parallel to your process.
See this answer for diagrams showing the distinction between asynchrony and synchrony.
I understand the benefits of async on the frontend (some kind of UI). With async the UI does not block and the user can "click" on other things while waiting for another operation to execute.
But what is the benefit of async programming on the backend?
The main benefit is that there could be various slow operations on the backend, which can prevent other requests from using the cpu at the same time. These operations could be: 1. Database operations 2. File operations 3. Remote calls to other services (servers), etc. You don't want to block the cpu while these operations are in progress.
First of all, there a benefit of handling more than one request at a time. Frameworks like ASP.net or django create new (or reuse existing) threads for each requests.
If you mean async operations from the thread of particular request, that's more complicated. In most cases, it does not help at all, because of the overhead of spawning new thread. But, we have things like Schedulers in C# for example, which help a lot. When correctly used, they free up a lot of CPU time normally wasted on waiting.
For example, you send a picture to a server. Your request is handled in new thread. This thread can do everything on it's own: upack the picture and save it to disk, then update the database.
Or, you can write to disk AND update the database at the same time. The thread that is done first is our focus here. When used without scheduler, it starts spinning a loop, checking if the other thread is done, which takes CPU time. If you use scheduler, it frees that thread. When the other task is done, it uses propably yet another precreated thred to finish handling of your request.
That scenario does make it seem like it's not worth the fuss, but it is easy to imagine more coplicated tasks, that can be done in the same time instead of sequentailly. On top of that, schedulers are rather smart and will make it so the total time needed will be lowest, and CPU usage moderate.
I keep hearing that using async programming patterns will make my code run faster. Why is that true? Doesn't the same exact code have to run either way, whether it runs now or it runs later?
It's not faster, it just doesn't waste time.
Synchronous code stops processing when waiting for I/O. Which means that when you're reading a file you can't run any other code. Now, if you have nothing else to do while that file is being read then asynchronous code would not buy you anything much.
Usually the additional CPU time that you can use is useful for servers. So the question is why do asynchronous programming instead of starting up a new thread for each client?
It turns out that starting and tearing down threads is expensive. Some time back in the early 2000s a web server benchmark found that tclhttpd compared favorably to Apache for serving static image files. This is despite the fact that tclhttpd was written in tcl and Apache was written in C and tcl was known to be 50 times slower than C. Tcl managed to hold its own against Apache because tcl had an easy to use asynchronous I/O API. So tclhttpd used it.
It's not that C doesn't have asynchronous I/O API. It's just that they're rarely used. So Apache didn't use it. These days, Apache2 uses asynchronous I/O internally along with thread pools. The C code ends up looking more complicated but it's faster - lesson learned.
Which leads us to the recent obsession with asynchronous programming. Why are people obsessed with it? (most answers on Stackoverflow about javascript programming for example insist that you should never use synchronous versions of asynchronous functions).
This goes back to how you rarely see asynchronous programs in C even though it's the superior way of doing things (GUI code is an exception because UI libraries learned early on to rely on asynchronous programming and Events). There are simply too many functions in C that are synchronous. So even if you wanted to do asynchronous programming you'll end up calling a synchronous function sooner or later. The alternative is to abandon stdlib and write your own asynchronous libraries for everything - from file I/O to networking to SQL.
So, in languages like javascript where asynchronous programming ended up as the default style there is pressure from other programmers to not mess it up and accidentally introduce synchronous functions which would be hard to integrate with asynchronous code without losing a lot of performance. So in the end, like taxes, asynchronous code has become a social contract.
It's not always faster. In fact, just setting up and tearing down the async environment adds a lot of time to your code. You have to spin off a new process/thread, set up an event queue/message pump, and clean up everything nicely in the end. (Even if your framework hides all these details from you, they're happening in the background).
The advantage is blocking. Lot's of our code depends on external resources. We need to query a database for the records to process, or download the latest version of something from a website. From the moment you ask that resource for information until you get an answer, your code has nothing to do. It's blocking, waiting for an answer. All the time your program spends blocking is totally wasted.
That's what async is designed for. By spinning the "wait for this blocking operation" code off into an async request, you let the rest of your non-blocking code keep running.
As a metaphor, imagine a manager telling his employee what to do that day. One the tasks is a phone call to a company with long wait times. If he told her to make the call synchronously she would call and wait on hold without doing anything else. Make it async and she can work on a lot of other tasks while the phone sits on hold in the background.
It runs the same code , but it does not wait for time taking task to finish . It will continue to execute code until async function is done.
The ASP.NET runtime is meant for short work loads that can be run in parallel. I need to be able to schedule periodic events and background tasks that may or may not run for much longer periods.
Given the above I have the following problems to deal with:
The AppDomain can shutdown due to changes (Web.config, bin, App_Code, etc.)
IIS recycles the AppPool on a regular basis (daily)
IIS itself might restart, or for that matter the server might crash
I'm not convinced that running this code inside ASP.NET is not the right thing to do, becuase it would allow for a simpler programming model. But doing so would require that an external service periodically makes requests to the app so that the application is keept running and that all background tasks are programmed with utter most care. They will have to be able to pause and resume thier work, in the event of an unexpected error.
My current line of thinking goes something like this:
If all jobs are registered in the database, it should be possible to use the database as a bookkeeping mechanism. In the case of an error, the database would contain all state necessary to resume the operation at the next opportunity given.
I'd really appriecate some feedback/advice, on this matter. I've been considering running a windows service and using some RPC solution as well, but it doesn't have the same appeal to me. And I'd instead have a lot of deployment issues and sycnhronizing tasks and code cross several applications. Due to my business needs this is less than optimial.
This is a shot in the dark since I don't know what database you use, but I'd recommend you to consider dialog timers and activation. Assuming that most of the jobs have to do some data manipulation, and is likely that all have to do only data manipulation, leveraging activation and timers give an extremely reliable job scheduling solution, entirely embedded in the database (no need for an external process/service, not dependencies outside the database bounds like msdb), and is a solution that ensures scheduled jobs can survive restarts, failover events and even disaster recovery restores. Simply put, once a job is scheduled it will run even if the database is restored one week later on a different machine.
Have a look at Asynchronous procedure execution for a related example.
And if this is too radical, at least have a look at Using Tables as Queues since storing the scheduled items in the database often falls under the 'pending queue' case.
I recommend that you have a look at Quartz.Net. It is open source and it will give you some ideas.
Using the database as a state-keeping mechanism is a completely valid idea. How complex it will be depends on how far you want to take it. In many cases you will ended up pairing your database logic with a Windows service to achieve the desired result.
FWIW, it is typically not a good practice to manually use the thread pool inside an ASP.Net application, though (contrary to what you may read) it actually works quite nicely other than the huge caveat that you can't guarantee it will work.
So if you needed a background thread that examined the state of some object every 30 seconds and you didn't care if it fired every 30 seconds or 29 seconds or 2 minutes (such as in a long app pool recycle), an ASP.Net-spawned thread is a quick and very dirty solution.
Asynchronously fired callbacks (such as on the ASP.Net Cache object) can also perform a sort of "behind the scenes" role.
I have faced similar challenges and ultimately opted for a Windows service that uses a combination of building blocks for maximum flexibility. Namely, I use:
1) WCF with implementation-specific types OR
2) Types that are meant to transport and manage objects that wrap a job OR
3) Completely generic, serializable objects contained in a custom wrapper. Since they are just a binary payload, this allows any object to be passed to the service. Once in the service, the wrapper defines what should happen to the object (e.g. invoke a method, gather a result, and optionally make that result available for return).
Ultimately, the web site is responsible for querying the service about its state. This querying can be as simple as polling or can use asynchronous callbacks with WCF (though I believe this also uses some sort of polling behind the scenes).
I tell you what I have do.
I have create a class called Atzenta that have a timer (1-2 second trigger).
I have also create a table on my temporary database that keep the jobs. The table knows the jobID, other parameters, priority, job status, messages.
I can add, or delete a job on this class. When there is no action to be done the timer is stop. When I add a job, then the timer starts again. (the timer is a thread by him self that can do parallel work). I use the System.Timers and not other timers for this.
The jobs can have different priority.
Now let say that I place a job on this table using the Atzenta class. The next time that the timer is trigger is check the query on this table and find the first available job and just run it. No other jobs run until this one is end.
Every synchronize and flags are done from the table. In the table I have flags for every job that show if its |wait to run|request to run|run|pause|finish|killed|
All jobs are all ready known functions or class (eg the creation of statistics).
For stop and start, I use the global.asax and the Application_Start, Application_End to start and pause the object that keep the tasks. For example when I do a job, and I get the Application_End ether I wait to finish and then stop the app, ether I stop the action, notify the table, and start again on application_start.
So I say, Atzenta.RunTheJob(Jobs.StatisticUpdate, ProductID); and then I add this job on table, open the timer, and then on trigger this job is run and I update the statistics for the given product id.
I use a table on a database to synchronize many pools that run the same web app and in fact its work that way. With a common table the synchronize of the jobs is easy and you avoid 2 pools to run the same job at the same time.
On my back office I have a simple table view to see the status of all jobs.
I've programmed in a number of languages, but I am not aware of deadlocks in my code.
I took this to mean it doesn't happen.
Does this happen frequently (in programming, not in the databases) enough that I should be concerned about it?
Deadlocks could arise if two conditions are true: you have mutilple theads, and they contend for more than one resource.
Do you write multi-threaded code? You might do this explicitly by starting your own threads, or you might work in a framework where the threads are created out of your sight, and so you're running in more than one thread without you seeing that in your code.
An example: the Java Servlet API. You write a servlet or JSP. You deploy to the app server. Several users hit your web site, and hence your servlet. The server will likely have a thread per user.
Now consider what happens if in servicing the requests you want to aquire some resources:
if ( user Is Important ){
getResourceA();
}
getResourceB();
if (today is Thursday ) {
getResourceA();
}
// some more code
releaseResourceA();
releaseResoruceB();
In the contrived example above, think about what might happen on a Thursday when an important user's request arrives, and more or less simultaneously an unimportant user's request arrives.
The important user's thread gets Resoruce A and wants B. The less important user gets resource B and wants A. Neither will let go of the resource that they already own ... deadlock.
This can actually happen quite easily if you are writing code that explicitly uses synchronization. Most commonly I see it happen when using databases, and fortunately databases usually have deadlock detection so we can find out what error we made.
Defense against deadlock:
Acquire resources in a well defined order. In the aboce example, if resource A was always obtained before resource B no deadlock would occur.
If possible use timeouts, so that you don't wait indefinately for a resource. This will allow you to detect contention and apply defense 1.
It would be very hard to give an idea of how often it happens in reality (in production code? in development?) and that wouldn't really give a good idea of how much code is vulnerable to it anyway. (Quite often a deadlock will only occur in very specific situations.)
I've seen a few occurrences, although the most recent one I saw was in an Oracle driver (not in the database at all) due to a finalizer running at the same time as another thread trying to grab a connection. Fortunately I found another bug which let me avoid the finalizer running in the first place...
Basically deadlock is almost always due to trying to acquire one lock (B) whilst holding another one (A) while another thread does exactly the same thing the other way round. If one thread is waiting for B to be released, and the thread holding B is waiting for A to be released, neither is willing to let the other proceed.
Make sure you always acquire locks in the same order (and release them in the reverse order) and you should be able to avoid deadlock in most cases.
There are some odd cases where you don't directly have two locks, but it's the same basic principle. For example, in .NET you might use Control.Invoke from a worker thread in order to update the UI on the UI thread. Now Invoke waits until the update has been processed before continuing. Suppose your background thread holds a lock with the update requires... again, the worker thread is waiting for the UI thread, but the UI thread can't proceed because the worker thread holds the lock. Deadlock again.
This is the sort of pattern to watch out for. If you make sure you only lock where you need to, lock for as short a period as you can get away with, and document the thread safety and locking policies of all your code, you should be able to avoid deadlock. Like all threading topics, however, it's easier said than done.
If you get a chance take a look at first few chapters in Java Concurrency in Practice.
Deadlocks can occur in any concurrent programming situation, so it depends how much concurrency you deal with. Several examples of concurrent programming are: multi-process, multi-thread, and libraries introducing multi-thread. UI frameworks, event handling (such as timer event) could be implemented as threads. Web frameworks could spawn threads to handle multiple web requests simultaneously. With multicore CPUs you might see more concurrent situations visibly than before.
If A is waiting for B, and B is waiting for A, the circular wait causes the deadlock. So, it also depends on the type of code you write as well. If you use distributed transactions, you can easily cause that type of scenario. Without distributed transactions, you risk bank accounts from stealing money.
All depends on what you are coding. Traditional single threaded applications that do not use locking. Not really.
Multi-threaded code with multiple locks is what will cause deadlocks.
I just finished refactoring code that used seven different locks without proper exception handling. That had numerous deadlock issues.
A common cause of deadlocks is when you have different threads (or processes) acquire a set of resources in different order.
E.g. if you have some resource A and B, if thread 1 acquires A and then B, and thread 2 acquires B and then A, then this is a deadlock waiting to happen.
There's a simple solution to this problem: have all your threads always acquire resources in the same order. E.g. if all your threads acquire A and B in that order, you will avoid deadlock.
A deadlock is a situation with two processes are dependent on each other - one cannot finish before the other. Therefore, you will likely only have a deadlock in your code if you are running multiple code flows at any one time.
Developing a multi-threaded application means you need to consider deadlocks. A single-threaded application is unlikely to have deadlocks - but not impossible, the obvious example being that you may be using a DB which is subject to deadlocking.