rust tokio: calling async function from sync closure - asynchronous

I have the following problem:
I'm trying to call sync closure from async function, but sync closure has to later call another async function.
I cannot make async closure, because they are unstable at the moment:
error[E0658]: async closures are unstable
so I have to do it this way somehow.
I found a couple of questions related to the problem, such as this, but when I tried to implement it, im receiving the following error:
Cannot start a runtime from within a runtime.
This happens because a function (like `block_on`)
attempted to block the current thread while the
thread is being used to drive asynchronous tasks.'
here is playground link which hopefully can show what I'm having problem with.
I'm using tokio as stated in the title.

As the error message states, Tokio doesn't allow you to have nested runtimes.
There's a section in the documentation for Tokio's Mutex which states the following (link):
[It] is ok and often preferred to use the ordinary Mutex from the standard library in asynchronous code. [...] the feature that the async mutex offers over the blocking mutex is that it is possible to keep the mutex locked across an .await point, which is rarely necessary for data.
Additionally, from Tokio's mini-Redis example:
A Tokio mutex is mostly intended to be used when locks need to be held
across .await yield points. All other cases are usually best
served by a std mutex. If the critical section does not include any
async operations but is long (CPU intensive or performing blocking
operations), then the entire operation, including waiting for the mutex,
is considered a "blocking" operation and tokio::task::spawn_blocking
should be used.
If the Mutex is the only async thing you need in your synchronous call, it's most likely better to make it a blocking Mutex. In that case, you can lock it from async code by first calling try_lock(), and, if that fails, attempting to lock it in a blocking context via spawn_blocking.

Related

Can I use atomics in an asynchronous context?

Is there a way to use atomic types in an asynchronous context instead of an asynchronous Mutex or RwLock? Can standard atomics be used as is in an asynchronous context?
Or, for example, is there an asynchronous equivalent of std::sync::atomic::AtomicUsize with load / store methods, which could replace something like tokio::sync::RwLock<usize> with read().await / write().await methods?
Yes, using Atomics in an async context is no problem.
Most of them are lock-free (=can't even block).
Even if you would block, it is still recommended to use normal, blocking synchronization primitives over async ones, unless you hold that lock during an await.
For more info I quote the respective chapter of the tokio tutorial:
On using std::sync::Mutex
Note, std::sync::Mutex and not tokio::sync::Mutex is used to guard the HashMap. A common error is to unconditionally use tokio::sync::Mutex from within async code. An async mutex is a mutex that is locked across calls to .await.
A synchronous mutex will block the current thread when waiting to acquire the lock. This, in turn, will block other tasks from processing. However, switching to tokio::sync::Mutex usually does not help as the asynchronous mutex uses a synchronous mutex internally.
As a rule of thumb, using a synchronous mutex from within asynchronous code is fine as long as contention remains low and the lock is not held across calls to .await. Additionally, consider using parking_lot::Mutex as a faster alternative to std::sync::Mutex.
Note that this is of course only true if all the threads accessing said lock/atomic are also async worker threads. If you have external threads that could block the mutex, you have to consider waiting for that mutex a blocking operation and treat it as such.

Why 'await' cannot be used in ordinary functions?

Correction: as follows from two answers, the problem looks to be specific to dart as C# implements Task.Wait allowing to achieve the desired effect. I would be grateful for any further information on the reasons why this is not possible in dart or how to achieve it there. I would be happy to close the question in its current form.
There are situations where it could make sense to have synchronous functions use asynchronous calls internally waiting for their completion before delivering the result. This is explicitly not allowed in both C# and Dart: anything that uses await must be marked as async (thus returning a future).
A simple example in Dart, but the same is valid for C#:
Disclaimer: this is not how I would do things, so do not suggest a "correct" solution of converting all methods to async; the code here is just to illustrate the question.
I have a fully asynchronous implementation of a generic wrapper for HTTP requests to some particular REST API. It does some login, authentication etc.:
class RequestProcessor {
Future post(String path, [payload]) async {
return request("POST", path, payload);
}
}
I would like to implement a high-level user-facing API translating to requests to the REST-API. Some of those methods I'd like to have synchronous. This is what I would like and am not allowed to have:
class API {
final RequestProcessor processor;
API(this.processor);
// this takes long and should be asynchronous, happy with a future (OK)
Future getQueryResults(query) async {
return await processor.post("/query", query);
}
// here I would prefer to have it synchronous (COMPILE ERROR)
String getVersion() {
return await processor.post("/version");
}
}
Question: Why is this not allowed?
It's a design choice.
Dart is single-threaded - each synchronous function call will run to completion before allowing any other code to run. If you allow a synchronous function to block until a future is complete, then no other code will run and the future will never complete (because the code that would complete it can't run).
It is possible to have a system that allows other code to run while one synchronous function is blocked. That requires either having more than one active stack (parallel or not), or implicitly transforming the entire program to be async. Neither choice is great when you compile to JavaScript, so Dart did not try to do that.
The await is an asynchronous wait so it does not make sense to have it in a synchronous function. To be asynchronous the method needs to return when the awaited operation is initiated, whereas the synchronous method needs to run to completion.
You can use a synchronous wait instead, in which case your method does not need to be async. In C# you can do this by calling the Wait() method on the returned task.
Because waiting and returning a future is fundamentally different.
Returning a future frees the current thread and optionally schedules a continuation for later, waiting blocks the current thread.
For example, in a UI thread, returning a future is fine but waiting will cause the UI to hang, in a web server thread waiting will block the thread and prevent the web server from handling more request, etc.
It's even possible waiting for what should be an async operation will cause a deadlock.
.net provide Task.Wait for when you know waiting is fine, I don't know about dart - but this is not automatic (except for await inside catch clauses in async methods in C#)
In Dart marking a function/method async tells the VM that it needs to rewrite the code before executing it.
await doesn't change an async function to a sync one, the function is still async, it just allows nicer syntax than then(() { return ...then(() { return...})}) chains.
If you make an async call everything following is async, there is nothing you can do about it.

How is asynchronous callback implemented?

How do all the languages implements asynchronous callbacks?
For example in C++, one need to have a "monitor thread" to start a std::async. If it is started in main thread, it has to wait for the callback.
std::thread t{[]{std::async(callback_function).get();}}.detach();
v.s.
std::async(callback_function).get(); //Main thread will have to wait
What about asynchronous callbacks in JavaScript? In JS callbacks are massively used... How does V8 implement them? Does V8 create a lot of threads to listen on them and execute callback when it gets message? Or does it use one thread to listen on all the callbacks and keep refreshing?
For example,
setInterval(function(){},1000);
setInterval(function(){},2000);
Does V8 create 2 threads and monitor each callback state, or it has a pool thing to monitor all the callbacks?
V8 does not implement asynchronous functions with callbacks (including setInterval). Engine simply provides a way to execute JavaScript code.
As a V8 embedder you can create setInterval JavaScript function linked to your native C++ function that does what you want. For example, create thread or schedule some job. At this point it is your responsibility to call provided callback when it is necessary. Only one thread at a time can use V8 engine (V8 isolate instance) to execute code. This means synchronization is required if a callback needs to be called from another thread. V8 provides locking mechanism is you need this.
Another more common approach to solve this problem is to create a queue of functions for V8 to execute and use infinite queue processing loop to execute code on one thread. This is basically an event loop. This way you don't need to use execution lock, but instead use another thread to push callback function to a queue.
So it depends on a browser/Node.js/other embedder how they implement it.
TL;DR: To implement asynchronous callback is basically to allow the control flow to proceed without blocking for the callback. Before the callback function is finally called, the control flow is free to execute anything that has no dependence on the callback's result, e.g., the caller can proceed as if the callback function has returned, or the caller may yield its control to other functions.
Since the question is for general implementation rather than a specific language, my answer tries to be as general as to cover the implementation commonalities.
Different languages have different implementations for asynchronous callbacks, but the principles are the same. The key is to decouple the control flow from the code executed. They correspond to the execution context (like a thread of control with a runtime stack) and the executed task. Traditionally the execution context and the executed task are usually 1:1 associated. With asynchronous callbacks, they are decoupled.
1. The principles
To decouple the control flow from the code, it is helpful to think of every asynchronous callback as a conditional task. When the code registers an asynchronous callback, it virtually installs the task's condition in the system. The callback function is then invoked when the condition is satisfied. To support this, a condition monitoring mechanism and a task scheduler are needed, so that,
The programmer does not need to track the callback's condition;
Before the condition is satisfied, the program may proceed to execute other code that does not depend on the callback's result, without blocking on the condition;
Once the condition is satisfied, the callback is guaranteed to execute. The programmer does not need to schedule its execution;
After the callback is executed, its result is accessible to the caller.
2. Implementation for Portability
For example, if your code needs to process the data from a network connection, you do not need to write the code checking the connection state. You only registers a callback that will be invoked once the data is available for processing. The dirty work of connection checking is left to the language implementation, which is known to be tricky especially when we talk about scalability and portability.
The language implementation may employ asynchronous io, nonblocking io or a thread pool or whatever techniques to check the network state for you, and once the data is ready, the callback function is then scheduled to execute. Here the control flow of your code looks like directly going from the callback registration to the callback execution, because the language hides the intermediate steps. This is the portability story.
3. Implementation for Scalability
To hide the dirty work is only part of the whole story. The other part is that, your code itself does not need to block waiting for the task condition. It does not make sense to wait for one connection's data when you have lots of network connections simultaneously and some of them may already have data ready. The control flow of your code can simply register the callback, and then moves on with other tasks (e.g., the callbacks whose conditions have been satisfied), knowing that the registered callbacks will be executed anyway when their data are available.
If to satisfy the callback's condition does not involve much of the CPU (e.g., waiting for a timer, or waiting for the data from network), and the callback function itself is light-weighted, then single CPU (or single thread) is able to process lots of callbacks concurrently, such as incoming network requests processing. Here the control flow may look like jumping from one callback to another. This is the scalability story.
4. Implementation for Parallelism
Sometimes, the callbacks are not pending for non-blocking IO condition, but for blocking operations such as page fault; or the callbacks do not rely on any condition, but are pure computation logics. In this case, asynchronous callback does not save you the CPU waiting time (because there is no idle waiting). But since asynchronous callback implies that the callback function can be executed in parallel with the caller or other callbacks (subject to certain data sharing and synchronization constraints), the language implementation can dispatch the callback tasks to different threads, achieving the benefits of parallelism, if the platform has more than one hardware thread context. It still improves scalability.
5. Implementation for Productivity
The productivity with asynchronous callback may not be very positive when the code need to deal with chained callbacks, i.e., when callbacks register other callbacks in recursive way, known as callback hell. There are ways to rescue.
The semantics of an asynchronous callback can be explored so as to substitute the hopeless nested callbacks with other language constructs. Basically there can be two different views of callbacks:
From data flow point of view: asynchronous callback = event + task.
To register a callback essentially generates an event that will emit
when the task condition is satisfied. In this view, the chained
callbacks are just events whose processing triggers other event
emission. It can be naturally implemented in event-driven
programming, where the task execution is driven by events. Promise
and Observable may also be regarded as event-driven concept. When
multiple events are ready concurrently, their associated tasks can
be executed concurrently as well.
From control flow point of view: to register a callback yields the
control to other code, and the callback execution just resumes the
control flow once its condition is satisfied. In this view, chained
asynchronous callbacks are just resumable functions. Multiple
callbacks can be written as one after another in traditional
"synchronous" way, with yield operation in between (or await). It
actually becomes coroutine.
I haven't discussed the implementation of data passing between the asynchronous callback and its caller, but that is usually not difficult if using shared memory where caller and callback can share data. Actually Golang's channel can also be considered in line of yield/await but with its focus on data passing.
The callbacks that are passed to browser APIs, like setTimeout, are pushed into the same browser queue when the API has done its job.
The engine can check this queue when the stack is empty and push the next callback into the JS stack for execution.
You don’t have to monitor the progress of the API calls, you asked it to do a job and it will put your callback in the queue when it’s done.

Meteor threading style clarification

Meteor's documentation states:
In Meteor, your server code runs in a single thread per request, not in the asynchronous callback style typical of Node
Do they actually mean?
A) the server is running multiple threads in parallel (which seems unusual within the Node.js ecosystem)
or
B) There is still only a single thread within an evented server and each request is processed sequentially, at least until it makes calls to resources outside the server - like the datastore, at which point the server itself is handling the callbacks while it processes with other requests, so you don't have to write/administer the callbacks yourself.
Brad, your B is correct.
Meteor uses fibers internally. As you said, there's only one thread inside an evented server, but when you do (eg) a database read, Fibers yields and control quickly gets back to the event loop. So your code looks like:
doc = MyCollection.findOne(id);
(with a hidden "yield to the event loop, come back when the doc is here") rather than
MyCollection.findOne(id, function (err, doc) {
if (err)
handle(err);
process(doc);
});
Error handling in the fiber version also just uses standard JavaScript exceptions instead of needing to check an argument every time.
I think this leads to an easier style of code to read for business logic which wants to take a bunch of actions which depend on each other in series. However, most of Meteor's synchronous APIs optionally take callbacks and become asynchronous, if you'd like to use the async style.

Async: Why AsyncDownloadString?

Alright... I'm getting a bit confused here. The async monad allows you to use let! which will start the computation of the given async method, and suspend the thread, untill the result is available.. thats all fine, I do understand that.
Now what I dont understand is why they made an extension for the WebClient class, thats named AsyncDownloadString - Couldn't you just wrap the normal DownloadString inside an async block? I'm pretty sure, I'm missing an important point here, since I've done some testing that shows DownloadString wrapped inside an async block, still blocks the thread.
There is an important difference between the two:
The DownloadString method is synchronous - the thread that calls the method will be blocked until the whole string is downloaded (i.e. until the entire content is transferred over the internet).
On the other hand, AsyncDownloadString doesn't block the thread for a long time. It asks the operating system to start the download and then releases the thread. When the operating system receives some data, it picks a thread from the thread pool, the thread stores the data to some buffer and is again released. When all data is downloaded, the method will read all data from the buffer and resume the rest of the asynchronous workflow.
In the first case, the thread is blocked during the entire download. In the second case, threads are only busy for very short period of time (when processing received responses, but not when waiting for the server).
Internally, the AsyncDownloadString method is just a wrapper for DownloadStringAsync, so you can also find more information in the MSDN documentation.
The important point to note is that async programming is about doing operations that are not CPU bound i.e those which are IO bound. These IO bound operations are performed on IO threads (using overlapped IO feature of operating system). What this implies is that even if you wrap some factorial function inside a async block and run it inside another async block using let! binding, you won't get any benefit out of it as it will be running on CPU bound thread and the main purpose of doing async programming is to not take up a CPU bound thread when something which is of IO nature, as that CPU bound thread can be used for other purpose in the meantime the IO completes.
If you look at the various IO classes in .NET like File, Socket etc. They all have blocking as well as non blocking read and write operations. The blocking operations will wait for the IO to complete on the CPU thread and hence blocking the CPU thread till IO is done, where as the non blocking operations uses the overlapped IO API calls to perform the operation.
Async have a method to make a async block out of these non blocking APIs of Files, Socket etc. In your case calling DownloadString will block the CPU thread as it uses the blocking API of the underlying class where as AsyncDownloadString uses the non blocking - io overlapped - based API call.

Resources