When using "async", why is "await" used?

We use the term "Async" to allow the code stream to continue without blocking the main stream. This is OK. We want the main stream to continue without waiting for that process to complete. But usually, "Async" and "Await" are used together.
My question is: when we add "await", we wait for the "Async" operation in the code stream. In that case, I do not understand the benefit of using "Async". Can someone explain this? Thank you.

It's not clear which technology you're referring to. I'll assume you're referring to the async and await keywords that have been added to many languages in the last few years.
In that case, this statement is actually incorrect:
We use the term "Async" to allow the code stream to continue without blocking the main stream.
In C#, JavaScript, Python, F#, and Visual Basic, the async keyword does not act asynchronously. It has two effects:
1. Enable the await keyword within that method/lambda.
2. Transform that method/lambda into a state machine.
So async does not mean "run this code on a background thread" or anything like that, which is a common but incorrect assumption.
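A minimal sketch of this point in Python (one of the languages listed above); the function names are made up for illustration. It only checks that a function marked async, when awaited, runs on the same thread as its caller. The details differ between languages, so treat this as a demonstration for Python's asyncio only:

```python
import asyncio
import threading


async def where_am_i():
    # Marked async, but nothing here starts a thread:
    # the body simply runs on whichever thread drives the coroutine.
    return threading.get_ident()


async def main():
    caller_thread = threading.get_ident()
    callee_thread = await where_am_i()
    return caller_thread == callee_thread


same_thread = asyncio.run(main())
print(same_thread)  # True
```

Any real concurrency comes from what is awaited (timers, sockets, thread pools), not from the async marker itself.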
The first point above is important for backwards compatibility. Each of these languages uses await as a contextual keyword, so the text "await" in a program is only a keyword if it is in a method marked with async. This allows the languages to introduce the new keywords without breaking any existing code (e.g., someone's code that used await as a variable name).
See this post of mine which collected a lot of the discussion around the keywords as they were originally designed for C#/VB. Those same design decisions carried over almost exactly to other languages as they adopted the keywords, too. One of the resources linked from that post is Eric Lippert's post on why we need an async keyword in addition to an await keyword.

Related

Integrating both synchronous and asynchronous libraries

Can synchronous and asynchronous functions be integrated into one call/interface whilst maintaining static typing? If possible, can it remain neutral with inheritance, i.e. not wrapping sync methods in async or vice versa (though this might be the best way).
I've been reading around and see it's generally recommended to keep these separate (http://www.tagwith.com/question_61011_pattern-for-writing-synchronous-and-asynchronous-methods-in-libraries-and-keepin and Maintain both synchronous and asynchronous implementations). However, the reason I want to do this is that I'm creating a behaviour tree framework for the Dart language and am finding it hard to mix both sync and async 'nodes' together to iterate through. It seems these might need to be kept separate, meaning nodes that would suit a sync approach would have to be async, or the opposite, if they are to be within the same 'tree'.
I'm looking for a solution particularly for Dart lang, although I know this is firmly in the territory of general programming concepts. I'm open to this not being able to be achieved, but worth a shot.
Thank you for reading.
You can of course use sync and async functions together. What you can't do is go back to sync execution after a call of an async function.
Maintaining both sync and async methods is in my opinion mostly a waste of time. Sometimes sync versions are convenient, so that you don't have to make an async call for some simple operation, but in general async is an integral part of Dart. If you want to use Dart you have to get used to it.
With the new async/await feature you can write code that uses async functions almost the same as when only sync functions are used.
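To make the "you can't go back to sync" point concrete, here is a rough sketch in Python rather than Dart (the await semantics are close enough for the idea). The node functions and tick_all are hypothetical names, not part of any behaviour tree library. The loop accepts both kinds of node, but because it may have to await one of them, the loop itself has to be async:

```python
import asyncio
import inspect


def sync_node(state):
    # An ordinary function: returns its result directly.
    state.append("sync")
    return True


async def async_node(state):
    # An async function: calling it returns an awaitable.
    await asyncio.sleep(0)
    state.append("async")
    return True


async def tick_all(nodes, state):
    """Tick sync and async nodes in order through one interface."""
    results = []
    for node in nodes:
        outcome = node(state)
        if inspect.isawaitable(outcome):
            # Only async nodes need awaiting; sync results pass through.
            outcome = await outcome
        results.append(outcome)
    return results


state = []
results = asyncio.run(tick_all([sync_node, async_node, sync_node], state))
print(state)    # ['sync', 'async', 'sync']
print(results)  # [True, True, True]
```

The sync nodes stay sync, but everything above them in the call chain becomes async, which is the trade-off described above.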

Usage of F# async workflows

As this question is huge, I will give my view on it so that you can simply tell me whether I am right or not and, if not, what to correct. In my understanding, to write an async program, you need to put async code into an "async" block like async{expression}, and use "let!" or "use!" to bind names to primitives; then you need to use a method to run this async expression, like "Async.Run". In addition, you can use exception handling to deal with exceptions, and cancellation to cancel when necessary. I also know there are several primitives defined in the F# core libraries, and F# extensions for I/O operations. I just need to be sure of the relation between these things. If you think my view of async workflows is superficial, please give an overview of their usage like the one I have given above. Thank you very much!
This question is huge, so at best, I can highlight some ideas and point you to learning resources and examples.
The description in the question isn't wrong (though there is no Async.Run function). But the main point about Asyncs is how they execute and why the way they execute is useful.
An async block defines a piece of code that becomes an Async<'T> object, which can be seen as a computation that can be executed at a later time. The Async returns an object of type 'T when its execution has completed -- if it has neither failed nor been cancelled.
let!, do! and use! are used inside of an async block to run another Async and, in the cases of let! and use!, bind its result to a name inside the current async. Unlike for example normal let, which simply binds any value to a name, the versions with an exclamation mark explicitly "import" the result of another async.
When an Async depends on another and waits for its result, such as with a let! binding, it normally does not block a thread. Asyncs utilize the .NET thread pool for easy parallel execution, and after an Async completes that another Async depends on, a continuation runs the remainder of the dependent Async.
The Async functions offer many ready-made ways to run Asyncs, such as Async.Start, which is a simple dispatch of an Async with no result, Async.RunSynchronously, which runs the Async and returns its result as if it were a normal function, Async.Parallel, which combines a sequence of Asyncs into one that executes them in parallel, or Async.StartAsTask, which runs an Async as an independent task. Further methods allow composition of Asyncs in terms of cancellation, or explicit control over continuation after an exception or cancellation.
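For readers who don't have F# at hand, a loose analogy in Python's asyncio may help; it is only an analogy, and the function names are invented. A coroutine object plays the role of a not-yet-started computation, asyncio.run is roughly comparable to Async.RunSynchronously, and asyncio.gather to Async.Parallel. One difference worth noting: a Python coroutine object can only be run once.

```python
import asyncio


async def fetch_length(text):
    # asyncio.sleep(0) is a stand-in for real non-blocking I/O.
    await asyncio.sleep(0)
    return len(text)


async def total_length(words):
    # gather runs the inner computations concurrently and
    # returns their results in the original order.
    lengths = await asyncio.gather(*(fetch_length(w) for w in words))
    return sum(lengths)


# Creating the coroutine object does not execute any of its body yet.
computation = total_length(["async", "workflow"])

# Running it to completion and getting the result back as a plain value.
total = asyncio.run(computation)
print(total)  # 13
```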
Asyncs are very useful wherever waiting times are involved: calls that would otherwise block, for example I/O-bound functions, can use Asyncs to avoid blocking execution.
The best introductions to F# Asyncs I know are written, or co-written, by Don Syme, the lead designer of F#:
The chapter Reactive, Asynchronous, and Parallel Programming in the book Expert F#
A blog post with examples for asynchronous agents
The blog post introducing Asyncs in late 2007

Whether to use TPL or async /await

There is an existing third-party REST API that accepts one set of inputs and returns the corresponding output. (Think of it as Bing's geocoding service, which accepts an address and returns location details.)
I need to call this API multiple times (say 500-1000) for a single ASP.NET request, and each call may take close to 500 ms to return.
I can think of three approaches. I need your input on which is the best approach, with speed as the criterion.
1. Using Http Request in a for loop
Write a simple for loop and for each input call the REST API and add the output to the result. This by far could be the slowest. But there is no overhead of threads or context switching.
2. Using async and await
Use the async and await mechanisms to call the REST API. It could be efficient, as the thread continues to do other activities while waiting for the REST call to return. The problem I am facing is that, as per the recommendations, I should use await all the way up to the topmost caller, which is not possible in my case. Not following this may lead to deadlocks in ASP.NET, as mentioned here: http://msdn.microsoft.com/en-us/magazine/jj991977.aspx
3. Using Task Parallel Library
Use Parallel.ForEach with the synchronous API to invoke the server in parallel, and use a ConcurrentDictionary to hold the results. But this may result in thread overhead.
Also, let me know if there is any better way to handle this. I understand people might suggest measuring the performance of each approach, but I would like to understand how people have solved this problem before.
The best solution is to use async and await, but in that case you will have to take it async all the way up the call stack to the controller action.
The for loop keeps it all sequential and synchronous, so it would definitely be the slowest solution. Parallel will block multiple threads per request, which will negatively impact your scalability.
Since the operation is I/O-based (calling a REST API), async is the most natural fit and should provide the best overall system performance of these options.
First, I think it's worth considering some issues that you didn't mention in your question:
500-1000 API calls sounds like quite a lot. Isn't there a way to avoid that? Doesn't the API have some kind of bulk query functionality? Or can't you download their database and query it locally? (The more open organizations like Wikimedia or Stack Exchange often support this, the more closed ones like Microsoft or Google usually don't.)
If those options are not available, then at least consider some kind of caching, if that makes sense for you.
The number of concurrent requests to the same server allowed at the same time in ASP.NET is only 10 by default. If you want to make more concurrent requests, you will need to set ServicePointManager.DefaultConnectionLimit.
Making this many requests could be considered abuse by the service provider and could lead to blocking of your IP. Make sure the provider is okay with this kind of usage.
Now, to your actual question: I think that the best option is to use async-await, even if you can't use it all the way. You can avoid deadlocks either by using ConfigureAwait(false) at every await (which is the correct solution) or by using something like Task.Run(() => /* your async code here */).Wait() to escape the ASP.NET context (which is the simple solution).
Using something like Parallel.ForEach() is not great, because it unnecessarily wastes ThreadPool threads.
If you go with async, you should probably also consider throttling. A simple way to achieve that is by using SemaphoreSlim.
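The throttling idea, sketched in Python's asyncio rather than C# (asyncio.Semaphore standing in for SemaphoreSlim). call_api is a hypothetical placeholder for the real REST call, and the limit of 10 is an arbitrary choice for the example. The in_flight and peak counters exist only to show that the limit holds:

```python
import asyncio


async def call_api(address):
    # Hypothetical stand-in for the real REST request.
    await asyncio.sleep(0.01)
    return f"location of {address}"


async def geocode_all(addresses, max_concurrent=10):
    limiter = asyncio.Semaphore(max_concurrent)
    in_flight = 0
    peak = 0

    async def throttled(address):
        nonlocal in_flight, peak
        # At most max_concurrent callers get past this line at once.
        async with limiter:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await call_api(address)
            finally:
                in_flight -= 1

    # All calls are started up front; the semaphore paces them.
    results = await asyncio.gather(*(throttled(a) for a in addresses))
    return results, peak


addresses = [f"addr {i}" for i in range(50)]
results, peak = asyncio.run(geocode_all(addresses, max_concurrent=10))
print(len(results), peak <= 10)  # 50 True
```

No thread is held while a call is waiting, which is the advantage over the Parallel.ForEach approach.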

How to use non-blocking or asynchronous IO with Boost Spirit?

Does Spirit provide any capabilities for working with non-blocking IO?
To provide a more concrete example: I'd like to use Boost's Spirit parsing framework to parse data coming in from a network socket that's been placed in non-blocking mode. If the data is not completely available, I'd like to be able to use that thread to perform other work instead of blocking.
The trivial answer is to simply read all the data before invoking Spirit, but potentially gigabytes of data would need to be received and parsed from the socket.
It seems that, in order to support non-blocking I/O while parsing, Spirit would need some ability to partially parse the data and be able to pause and save its parse state when no more data is available. Additionally, it would need to be able to resume parsing from the saved parse state when data does become available. Or maybe I'm making this too complicated?
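To illustrate what "pause and save the parse state" means, independent of Spirit, here is a toy push parser in Python. The class name and the input format (newline-terminated integers) are invented for the example; it says nothing about what Spirit itself can do:

```python
class LineIntParser:
    """Parses newline-terminated integers from chunks that arrive piecemeal."""

    def __init__(self):
        self._pending = ""  # saved parse state: the unfinished line
        self.values = []

    def feed(self, chunk):
        # Resume from the saved state, consume every complete line,
        # and keep the trailing partial line for the next call.
        self._pending += chunk
        *complete, self._pending = self._pending.split("\n")
        self.values.extend(int(line) for line in complete if line)


parser = LineIntParser()
for chunk in ["12\n3", "4\n5", "6\n"]:
    # Between feed() calls the thread is free to do other work.
    parser.feed(chunk)

print(parser.values)  # [12, 34, 56]
```

For a grammar this simple the saved state is one string; for a backtracking grammar it would be far more involved.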
TODO Will post an example for a simple single-threaded 'event-based' parsing model. This is largely trivial but might just be what you need.
For anything less trivial, please heed the following considerations/hints/tips:
How would you be consuming the result? You wouldn't have the synthesized attributes any earlier anyway, or are you intending to use semantic actions on the fly?
That doesn't usually work well due to backtracking. The caveats could be worked around by careful and judicious use of qi::hold, qi::locals and putting semantic actions with side-effects only at stations that will never be backtracked. In other words:
this is bound to be very error-prone
this naturally applies to a limited set of grammars only (those grammars with rich contextual information will not lend themselves well for this treatment).
Now, everything can be forced, of course, but in general, experienced programmers should have learned to avoid swimming upstream.
Now, if you still want to do this:
You should be able to make the Spirit library thread-safe / reentrant by defining BOOST_SPIRIT_THREADSAFE and linking to libboost_thread. Note this makes the globals used by Spirit thread-safe (at the cost of fine-grained locking) but not your parsers: you can't share your own parsers/rules/sub grammars/expressions across threads. In fact, you can only share your own (Phoenix/Fusion) functors iff they are thread-safe, and any other extensions defined outside the core Spirit library should be audited for thread-safety.
If you manage the above, I think by far the best approach would be to:
use boost::spirit::istream_iterator (or, for binary/raw character streams I'd prefer to define a similar boost::spirit::istreambuf_iterator using the boost::spirit::multi_pass<> template class) to consume the input. Note that depending on your grammar, quite a bit of memory could be used for buffering and the performance is suboptimal
run the parser on its own thread (or logical thread, e.g. Boost Asio 'strands' or its famous 'stackless coroutines')
use coarse-grained semantic actions like shown above to pass messages to another logical thread that does the actual processing.
Some more loose pointers:
you can easily 'fuse' some functions to handle lazy evaluation of your semantic action handlers using BOOST_FUSION_ADAPT_FUNCTION and friends; This reduces the amount of cruft you have to write to get simple things working like normal C++ overload resolution in semantic actions - especially when you're not using C++0X and BOOST_RESULT_OF_USE_DECLTYPE
Because you will want to avoid semantic actions with side-effects, you should probably look at Inherited Attributes and qi::locals<> to coordinate state across rules in 'pure functional fashion'.

API design: is "fault tolerance" a good thing?

I've consolidated many of the useful answers and came up with my own answer below
For example, I am writing an API Foo which needs explicit initialization and termination. (Should be language agnostic but I'm using C++ here)
class Foo
{
public:
    static void InitLibrary(int someMagicInputRequiredAtRuntime);
    static void TermLibrary(int someOtherInput);
};
Apparently, our library doesn't care about multi-threading, reentrancy or whatnot. Let's suppose our Init function should only be called once, calling it again with any other input would wreak havoc.
What's the best way to communicate this to my caller? I can think of two ways:
Inside InitLibrary, I assert some static variable which will blame my caller for init'ing twice.
Inside InitLibrary, I check some static variable and silently abort if my lib has already been initialized.
Method #1 obviously is explicit, while method #2 makes it more user friendly. I am thinking that method #2 probably has the disadvantage that my caller wouldn't be aware of the fact that InitLibrary shouldn't be called twice.
What would be the pros/cons of each approach? Is there a cleverer way to subvert all these?
Edit
I know that the example here is very contrived. As #daemon pointed out, I should initialize myself and not bother the caller. Practically however, there are places where I need more information to properly initialize myself (note the use of my variable name someMagicInputRequiredAtRuntime). This is not restricted to initialization/termination but applies to other instances where the dilemma exists whether I should choose to be quote-unquote "fault tolerant" or fail loudly.
I would definitely go for approach 1, along with an easy-to-understand exception and good documentation that explains why this fails. This will force the caller to be aware that this can happen, and the calling class can easily wrap the call in a try-catch statement if needed.
Failing silently, on the other hand, will lead your users to believe that the second call was successful (no error message, no exception) and thus they will expect that the new values are set. So when they try to do something else with Foo, they don't get the expected results. And it's darn near impossible to figure out why if they don't have access to your source code.
Serenity Prayer (modified for interfaces)
SA, grant me the assertions
to accept the things devs cannot change
the code to except the things they can,
and the conditionals to detect the difference
If the fault is in the environment, then you should try and make your code deal with it. If it is something that the developer can prevent by fixing their code, it should generate an exception.
A good approach would be to have a factory that creates an initialized library object (this would require you to wrap your library in a class). Multiple create-calls to the factory would create different objects. This way, the initialize-method would not be a part of the public interface of the library, and the factory would manage initialization.
If there can be only one instance of the library active, make the factory check for existing instances. This would effectively make your library-object a singleton.
I would suggest that you should flag an exception if your routine cannot achieve the expected post-condition. If someone calls your init routine twice, and the system state after calling it the second time would be the same as if it had just been called once, then it is probably not necessary to throw an exception. If the system state after the second call would not match the caller's expectation, then an exception should be thrown.
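A small sketch of this post-condition rule, written in Python for brevity rather than the question's C++. The Library class, its method names, and the choice of RuntimeError are all illustrative assumptions. A repeated init with the same input is accepted because the resulting state is what the caller expects; a conflicting input raises:

```python
class Library:
    _config = None  # None means "not initialized yet"

    @classmethod
    def init(cls, magic_input):
        if cls._config is None:
            cls._config = magic_input
        elif cls._config != magic_input:
            # The caller expects the new value to be in effect; it cannot be.
            raise RuntimeError(
                f"already initialized with {cls._config!r}; "
                f"cannot re-initialize with {magic_input!r}"
            )
        # Same input again: the post-condition already holds, nothing to do.

    @classmethod
    def term(cls):
        # Terminating twice is harmless: the desired state is "terminated".
        cls._config = None


Library.init(42)
Library.init(42)  # fine, state matches the caller's expectation

try:
    Library.init(11)
    conflict_detected = False
except RuntimeError:
    conflict_detected = True

print(conflict_detected)  # True
```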
In general, I think it's more helpful to think in terms of state than in terms of action. To use an analogy, an attempt to open as "write new" a file that is already open should either fail or result in a close-erase-reopen. It should not simply perform a no-op, since the program will be expecting to be writing into an empty file whose creation time matches the current time. On the other hand, trying to close a file that's already closed should generally not be considered an error, because the desire is that the file be closed.
BTW, it's often helpful to have available a "Try" version of a method that might throw an exception. It would be nice, for example, to have a Control.TryBeginInvoke available for things like update routines (if a thread-safe control property changes, the property handler would like the control to be updated if it still exists, but won't really mind if the control gets disposed; it's a little irksome not being able to avoid a first-chance exception if a control gets closed when its property is being updated).
Have a private static counter variable in your class. If it is 0 then do the logic in Init and increment the counter, If it is more than 0 then simply increment the counter. In Term do the opposite, decrement until it is 0 then do the logic.
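The counter scheme might look like this, again sketched in Python rather than C++. The class and attribute names are hypothetical, the setup_runs/teardown_runs counters only stand in for the real work, and raising on an unmatched term() is an added assumption, not something the answer specifies:

```python
class RefCountedLibrary:
    _users = 0          # how many callers currently hold the library
    setup_runs = 0      # stands in for the real initialization logic
    teardown_runs = 0   # stands in for the real termination logic

    @classmethod
    def init(cls):
        if cls._users == 0:
            cls.setup_runs += 1  # real initialization would go here
        cls._users += 1

    @classmethod
    def term(cls):
        if cls._users == 0:
            # Assumption: an unmatched term() is treated as a caller bug.
            raise RuntimeError("term() called without a matching init()")
        cls._users -= 1
        if cls._users == 0:
            cls.teardown_runs += 1  # real teardown would go here


RefCountedLibrary.init()
RefCountedLibrary.init()  # second user: no re-initialization
RefCountedLibrary.term()  # one user left: no teardown yet
RefCountedLibrary.term()  # last user gone: teardown runs once

print(RefCountedLibrary.setup_runs, RefCountedLibrary.teardown_runs)  # 1 1
```

Note that this only works when every caller passes the same initialization input, since only the first init does any real work.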
Another way is to use a Singleton pattern, here is a sample in C++.
I guess one way to subvert this dilemma is to fulfill both camps. Ruby has the -w warning switch, it is custom for gcc users to -Wall or even -Weffc++ and Perl has taint mode. By default, these "just work," but the more careful programmer can turn on these strict settings themselves.
One example against the "always complain the slightest error" approach is HTML. Imagine how frustrated the world would be if all browsers would bark at any CSS hacks (such as drawing elements at negative coordinates).
After considering many excellent answers, I've come to this conclusion for myself: When someone sits down, my API should ideally "just work." Of course, for anyone to be involved in any domain, he needs to work at one or two levels of abstraction lower than the problem he is trying to solve, which means my user must learn about my internals sooner or later. If he uses my API for long enough, he will begin to stretch the limits, and too much effort to "hide" or "encapsulate" the inner workings will only become a nuisance.
I guess fault tolerance is most of the time a good thing, it's just that it's difficult to get right when the API user is stretching corner cases. I could say the best of both worlds is to provide some kind of "strict mode" so that when things don't "just work," the user can easily dissect the problem.
Of course, doing this is a lot of extra work, so I may be just talking ideals here. Practically it all comes down to the specific case and the programmer's decision.
If your language doesn't allow this error to surface statically, chances are good the error will surface only at runtime. Depending on the use of your library, this means the error won't surface until much later in development. Possibly only when shipped (again, depends on a lot).
If there's no danger in silently eating an error (which isn't a real error anyway, since you catch it before anything dangerous happens), then I'd say you should silently eat it. This makes it more user friendly.
If however someMagicInputRequiredAtRuntime varies from call to call, I'd raise the error whenever possible, or presumably the library will not function as expected ("I init'ed the lib with value 42, but it's behaving as if I init'ed with 11!?").
If this Library is a static class, (a library type with no state), why not put the call to Init in the type initializer? If it is an instantiatable type, then put the call in the constructor, or in the factory method that handles instantiation.
Don't allow public access to the Init function at all.
I think your interface is a bit too technical. No programmer wants to learn what concept you have used while designing the API. Programmers want solutions for their actual problems and don't want to learn how to use an API. Nobody wants to init your API; that is something that the API should handle in the background as far as possible. Find a good abstraction that shields the developer from as much low-level technical stuff as possible. That implies that the API should be fault tolerant.

Resources