Using dyn async traits (with async-trait crate) in spawned tokio task - asynchronous

I'm working on an asynchronous Rust application which uses tokio. I'd also like to define some trait methods as async, and I have opted for the async-trait crate rather than the feature in the nightly build so that I can use them as dyn objects. However, I'm running into issues trying to use these objects in a task spawned with tokio::spawn. Here's a minimal complete example:
use std::time::Duration;
use async_trait::async_trait;
#[tokio::main]
async fn main() {
    // These two lines are based on the examples for dyn traits in the async-trait crate
    let value = MyStruct::new();
    let object = &value as &dyn MyTrait;
    tokio::spawn(async move {
        object.foo().await;
    });
}
#[async_trait]
trait MyTrait {
    async fn foo(&self);
}
struct MyStruct {}
impl MyStruct {
    fn new() -> MyStruct {
        MyStruct {}
    }
}
#[async_trait]
impl MyTrait for MyStruct {
    async fn foo(&self) {
        tokio::time::sleep(Duration::from_secs(1)).await;
    }
}
When I compile this I get the following output:
error: future cannot be sent between threads safely
   --> src/main.rs:11:18
    |
11  |       tokio::spawn(async move {
    |  __________________^
12  | |         object.foo().await;
13  | |     });
    | |_____^ future created by async block is not `Send`
    |
    = help: the trait `Sync` is not implemented for `dyn MyTrait`
note: captured value is not `Send` because `&` references cannot be sent unless their referent is `Sync`
   --> src/main.rs:12:9
    |
12  |         object.foo().await;
    |         ^^^^^^ has type `&dyn MyTrait` which is not `Send`, because `dyn MyTrait` is not `Sync`
note: required by a bound in `tokio::spawn`
   --> /home/wilyle/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.25.0/src/task/spawn.rs:163:21
    |
163 |         T: Future + Send + 'static,
    |                     ^^^^ required by this bound in `spawn`
error: could not compile `async-test` due to previous error
(The results are similar when making object boxed with let object: Box<dyn MyTrait> = Box::new(MyStruct::new()); and when moving the construction fully inside the tokio::spawn call)
By messing around and trying a few things I found that I could solve the issue by boxing object and adding additional trait bounds. Replacing the first two lines of main in my example with the following seems to work just fine:
let object: Box<dyn MyTrait + Send + Sync> = Box::new(MyStruct::new());
So I have two questions:
Why doesn't my original example work? Is it some inconsistency between the two libraries I'm trying to use, or am I approaching async programming in Rust incorrectly?
Is adding the additional trait bounds the right way to solve this? I'm rather new to Rust and have only been programming with it for a few months, so I wouldn't be surprised to hear I'm just approaching this wrong.

If you're not sure what Send and Sync mean, check out those documentation links. Something to note is that if T is Sync, then &T is Send.
Question #2 is simple: yes this is the right way to do it. async-trait uses Pin<Box<dyn Future + Send>> as its return type for basically the same reasons. Note that you can only add auto traits to trait objects.
For question #1, there are two issues: Send and 'static.
Send
When you cast something as dyn MyTrait, you're removing all the original type information and replacing it with the type dyn MyTrait. That means you lose the auto-implemented Send and Sync traits on MyStruct. The tokio::spawn function requires Send.
This issue isn't inherent to async, it's because tokio::spawn will run the future on its threadpool, possibly sending it to another thread. You can run the future without tokio::spawn, for example like this:
fn main() {
    let runtime = tokio::runtime::Runtime::new().unwrap();
    let value = MyStruct::new();
    let object = &value as &dyn MyTrait;
    runtime.block_on(object.foo());
}
The block_on function runs the future on the current thread, so Send is not necessary. It also blocks until the future is done, so 'static is not needed either. This is great for things that are created at runtime and contain the entire logic of the program, but for dyn Trait types you usually have other things going on that make this less useful.
'static
When something requires 'static, it means that all references need to live as long as 'static. One way of satisfying that is to remove all references. In an ideal world you could do:
let object = value as dyn MyTrait;
However, Rust doesn't support dynamically sized types on the stack or as function arguments. We're trying to remove all references, so &dyn MyTrait isn't going to work (unless you leak the value or have a static variable). Box lets you own a dynamically sized type by putting it on the heap, which eliminates the lifetime.
You need Send for this because the upgrade from Sync to Send only happens with &, not Box. Instead, Box<T> is Send when T is Send.
Sync is more subtle. While spawn doesn't require Sync, the async block needs the trait object to be Send + Sync in order for the block itself to be Send. Since foo takes &self, it returns a Future that holds &self. That future is then polled, and between polls it (together with the &self inside it) could be moved to another thread. And as before, &T is Send only if T is Sync. However, if you change it to foo(&mut self) it compiles without + Sync. That makes sense, since the compiler can now check that the object is not being used concurrently, but it seems to me like the &self version could be allowed in the future.
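To illustrate that the Send requirement is not specific to async, here is a minimal sketch that uses std::thread::spawn in place of tokio::spawn. That substitution is an assumption made so the example needs no external crates; std::thread::spawn places the same Send + 'static bound on what it runs. The trait here is a plain, non-async stand-in for MyTrait, and assert_send, run_boxed and run_shared are hypothetical helper names.

```rust
use std::sync::Arc;
use std::thread;

// Non-async stand-in for the trait in the question.
trait MyTrait {
    fn foo(&self) -> u32;
}

struct MyStruct;

impl MyTrait for MyStruct {
    fn foo(&self) -> u32 {
        42
    }
}

// Hypothetical compile-time probe: only accepts values whose type is Send.
fn assert_send<T: Send>(_: &T) {}

// Owning the trait object in a Box removes the borrowed lifetime, and
// `+ Send` restores the auto trait that was erased by the cast to dyn.
fn run_boxed() -> u32 {
    let object: Box<dyn MyTrait + Send> = Box::new(MyStruct);
    assert_send(&object);
    thread::spawn(move || object.foo()).join().unwrap()
}

// Sharing the object between threads additionally needs Sync:
// Arc<T> is Send only when T is Send + Sync.
fn run_shared() -> u32 {
    let object: Arc<dyn MyTrait + Send + Sync> = Arc::new(MyStruct);
    let for_thread = Arc::clone(&object);
    let from_thread = thread::spawn(move || for_thread.foo()).join().unwrap();
    from_thread + object.foo()
}

fn main() {
    println!("{}", run_boxed());
    println!("{}", run_shared());
}
```

Removing `+ Send` from the Box type makes both assert_send and thread::spawn fail to compile, which mirrors the error tokio::spawn reports. Unlike the async case, the thread version does not need Sync for the boxed object, because no &self is held across a point where the work can move to another thread.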

Related

Is it safe to use mutable let bindings inside task CEs or async CEs?

When you think you know, you don't know, is what went through my head earlier today. While going over someone's code I noticed something similar to:
task {
    let mutable pleaseContinue = true
    let mutable state = MyState.Running
    while pleaseContinue do
        match x with
        | Ok ->
            // do something
            do! runSomeTasks()
            state <- MyState.Running
            pleaseContinue <- true
        | Error err ->
            do! Log.Fatal err "Something bad"
            state <- MyState.Crashed
            pleaseContinue <- false
        | ... // originally many other states
    return
}
Basically, whenever I see mutable in someone else's code I tend to want to get rid of it. But short of that, I found myself wondering whether the mutable variables inside task and the like are properly part of the closure and are safe to read/update/write, as long as they aren't defined outside the task CE builder.
Is this a correct and safe assumption?
This is in F# 6.0, and using Microsoft.FSharp.Control.TaskBuilder and friends.
This looks fine to me.
See https://learn.microsoft.com/en-us/dotnet/fsharp/language-reference/computation-expressions.
The builder implements the body of the while loop as a recursive function. It can modify state and pleaseContinue since the variables are in scope.
I'm just not clear where the x that you refer to is declared or mutated - presumably as part of the omitted code before the return statement. In any case, if it is in scope, it should be fine.

Rust, std::cell::Cell - get immutable reference to inner data

Looking through the documentation for std::cell::Cell, I don't see anywhere how I can retrieve a non-mutable reference to inner data. There is only the get_mut method: https://doc.rust-lang.org/std/cell/struct.Cell.html#method.get_mut
I don't want to use this function because I want to take &self instead of &mut self.
I found an alternative solution of taking the raw pointer:
use std::cell::Cell;
struct DbObject {
    key: Cell<String>,
    data: String
}
impl DbObject {
    pub fn new(data: String) -> Self {
        Self {
            key: Cell::new("some_uuid".into()),
            data,
        }
    }
    pub fn assert_key(&self) -> &str {
        // setup key in the future if is empty...
        let key = self.key.as_ptr();
        unsafe {
            let inner = key.as_ref().unwrap();
            return inner;
        }
    }
}
fn main() {
    let obj = DbObject::new("some data...".into());
    let key = obj.assert_key();
    println!("Key: {}", key);
}
Is there any way to do this without using unsafe? If not, perhaps RefCell will be more practical here?
Thank you for help!
First off, if you have a &mut T, you can trivially get a &T out of it. So you can use get_mut to get a &T.
But to get a &mut T from a Cell<T> you need that cell to be mutable, as get_mut takes a &mut self parameter. And this is by design the only way to get a reference to the inner object of a cell.
By requiring the use of a &mut self method to get a reference out of a cell, you make it possible to check for exclusive access at compile time with the borrow checker. Remember that a cell enables interior mutability, and has a method set(&self, val: T), that is, a method that can modify the value of a non-mut binding! If there was a get(&self) -> &T method, the borrow checker could not ensure that you do not hold a reference to the inner object while setting the object, which would not be safe.
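As a small sketch of what Cell does allow through a shared reference, the following moves whole values in and out instead of borrowing the contents. The function names rename and key_len are hypothetical and only serve the illustration.

```rust
use std::cell::Cell;

// Swap in a new key through a shared reference and get the old one back.
fn rename(key: &Cell<String>, new_key: &str) -> String {
    key.replace(new_key.to_string())
}

// Inspect the contents without ever holding a reference into the cell:
// move the value out, look at it, then move it back in.
fn key_len(key: &Cell<String>) -> usize {
    let value = key.take(); // leaves an empty String behind
    let len = value.len();
    key.set(value);
    len
}

fn main() {
    let key = Cell::new(String::from("some_uuid"));
    println!("length: {}", key_len(&key));
    let old = rename(&key, "another_uuid");
    println!("old: {}, new: {}", old, key.into_inner());
}
```

Because no reference into the cell ever exists, set and replace can never invalidate one, which is why these operations are safe on a non-mut Cell.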
TL;DR: By design, you can't get a &T out of a non-mut Cell<T>. Use get_mut (which requires a mut cell), or set/replace (which work on a non-mut cell). If this is not acceptable, then consider using RefCell, which can get you a &T out of a non-mut instance, at some runtime cost.
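A minimal sketch of the RefCell route, reusing the DbObject shape from the question. Two things here are assumptions for illustration: an empty string means "no key yet", and assert_key returns a Ref guard instead of a plain &str, since a RefCell can only hand out guarded borrows.

```rust
use std::cell::{Ref, RefCell};

struct DbObject {
    key: RefCell<String>,
    data: String,
}

impl DbObject {
    fn new(data: String) -> Self {
        Self {
            key: RefCell::new(String::new()),
            data,
        }
    }

    // Returns a guard that dereferences to str. The borrow is tracked at
    // runtime, so a conflicting borrow_mut() while the guard is alive panics.
    fn assert_key(&self) -> Ref<'_, str> {
        // Finish the shared borrow before taking the mutable one.
        let needs_key = self.key.borrow().is_empty();
        if needs_key {
            *self.key.borrow_mut() = "some_uuid".to_string();
        }
        Ref::map(self.key.borrow(), |key| key.as_str())
    }
}

fn main() {
    let obj = DbObject::new("some data...".into());
    println!("Key: {}", &*obj.assert_key());
    println!("Data: {}", obj.data);
}
```

No unsafe is needed, at the cost of the runtime borrow check mentioned above.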
In addition to @mcarton's answer: to keep interior mutability sound, that is, to prevent a mutable reference from coexisting with other references, there are three different approaches:
1. Use unsafe, with the possibility of Undefined Behavior. This is what UnsafeCell does.
2. Add runtime checks, which incur runtime overhead. This is the approach RefCell, RwLock and Mutex use.
3. Restrict the operations that can be done with the abstraction. This is what Cell, Atomic* and (the unstable) OnceCell (and thus Lazy, which uses it) do (note that the thread-safe types also have runtime overhead because they need to provide some sort of locking). Each provides a different set of allowed operations:
   - Cell and Atomic* do not let you get a reference to the contained value; you can only replace it as a whole (basically get() and set(), though convenience methods are provided on top of these, such as swap()). Projection (cell-of-slice to slice-of-cells) is also available for Cell (field projection is possible, but not provided as part of std).
   - OnceCell allows you to assign only once and only then take a shared reference, guaranteeing that when you assign you have no references, and that while you have shared references you cannot assign anymore.
Thus, when you need to be able to take a reference into the content, you cannot choose Cell as it was not designed for that - the obvious choice is RefCell, indeed.
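If the key is only ever assigned once, the OnceCell option described above can return a plain &str from &self. This is a sketch that assumes Rust 1.70 or later, where std::cell::OnceCell is available in the standard library (it was still unstable when this answer was written), and it again reuses the DbObject shape from the question.

```rust
use std::cell::OnceCell;

struct DbObject {
    key: OnceCell<String>,
    data: String,
}

impl DbObject {
    fn new(data: String) -> Self {
        Self {
            key: OnceCell::new(),
            data,
        }
    }

    // The first call initializes the key; later calls return the same value.
    // Once set, the key can no longer change through &self, so handing out
    // a shared reference is sound.
    fn assert_key(&self) -> &str {
        self.key.get_or_init(|| "some_uuid".to_string())
    }
}

fn main() {
    let obj = DbObject::new("some data...".into());
    println!("Key: {}", obj.assert_key());
    println!("Data: {}", obj.data);
}
```

The trade-off compared with RefCell is that the key cannot be replaced later without &mut access to the object.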

Pointers as function arguments when implementing a structure

Why is there a & symbol before self in the full_name() function but not in the to_tuple() function? When I look at them, self is used similarly in both functions, so why does one need &? Also, when I add & to to_tuple() or delete it from full_name(), I get an error. Can someone explain it?
fn full_name(&self) -> String {
    format!("{} {}", self.first_name, self.last_name)
}
fn to_tuple(self) -> (String, String) {
    (self.first_name, self.last_name)
}
full_name does not consume self, it uses a reference via &self: The members are only used via references as arguments to format!(), so a reference suffices.
to_tuple (as the name to_... suggests) consumes self: It moves the members from self into the returned tuple. Since the original self is no longer valid memory after the move (self no longer owns the memory), it has to be consumed, hence a move via self.
You could change full_name to take self, that is, to take ownership. That would be inconvenient, though, because calling the function would consume the struct for no reason.
to_tuple could be changed to not consume self, but it would then need to .clone() (copy) the members, which is costly.
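A short sketch of the three variants side by side. The Person struct and the to_tuple_cloned method are hypothetical names, added only to make the example self-contained.

```rust
struct Person {
    first_name: String,
    last_name: String,
}

impl Person {
    // Borrows self: the caller can keep using the struct afterwards.
    fn full_name(&self) -> String {
        format!("{} {}", self.first_name, self.last_name)
    }

    // Consumes self: the Strings are moved into the tuple, no copies made.
    fn to_tuple(self) -> (String, String) {
        (self.first_name, self.last_name)
    }

    // Borrowing alternative: has to clone both Strings.
    fn to_tuple_cloned(&self) -> (String, String) {
        (self.first_name.clone(), self.last_name.clone())
    }
}

fn main() {
    let person = Person {
        first_name: "Ada".to_string(),
        last_name: "Lovelace".to_string(),
    };
    println!("{}", person.full_name());
    let copy = person.to_tuple_cloned(); // person is still usable
    let moved = person.to_tuple(); // person is moved here
    // person.full_name(); // would not compile: use of moved value
    println!("{:?} {:?}", copy, moved);
}
```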

How to return reference to locally allocated struct/object? AKA error: `foo` does not live long enough

Here's a simplified example of what I'm doing:
struct Foo ...
impl io::Read for Foo ...
fn problem<'a>() -> io::Result<&'a mut dyn io::Read> {
    // foo does not live long enough, because it gets allocated on the stack
    let mut foo = Foo{ v: 42 };
    Ok(&mut foo)
}
Rust playground is here.
Obviously, the problem is that foo is allocated on the stack, so if we return a reference to it, the reference outlives the object.
In C, you'd get around this by using malloc to allocate the object on the heap, and the caller would need to know to call free when appropriate. In a GCed language, this would just work since foo would stick around until there are no references to it. Rust is really clever, and kind of in-between, so I'm not sure what my options are.
I think one option would be to return a managed pointer type. Is Box the most appropriate? (I found a guide to pointers in rust, but it is way outdated.)
The reason I'm returning a reference is that in reality I need to return any of several structs which implement Read. I suppose another option would be to create an enum to wrap each of the possible structs. That would avoid heap allocation, but seems needlessly awkward.
Are there other options I haven't thought of?
Replacing the reference with a Box compiles successfully:
fn problem() -> io::Result<Box<dyn io::Read>> {
    let foo = Foo{ v: 42 };
    Ok(Box::new(foo))
}
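As a fuller sketch of the Box approach, the following returns one of several readers behind a single return type. The body of Foo (including the done field), the open function and the read_all helper are assumptions made up for the example; the question elides the real definitions.

```rust
use std::io::{self, Read};

// Hypothetical reader that yields its single byte once, then reports EOF.
struct Foo {
    v: u8,
    done: bool,
}

impl Read for Foo {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if self.done || buf.is_empty() {
            return Ok(0);
        }
        buf[0] = self.v;
        self.done = true;
        Ok(1)
    }
}

// The caller owns the boxed reader, so no lifetime parameter is needed,
// and different concrete types can be returned from the same function.
fn open(use_foo: bool) -> io::Result<Box<dyn Read>> {
    if use_foo {
        Ok(Box::new(Foo { v: 42, done: false }))
    } else {
        Ok(Box::new(io::Cursor::new(vec![1, 2, 3])))
    }
}

// Drains whichever reader was chosen into a Vec.
fn read_all(use_foo: bool) -> io::Result<Vec<u8>> {
    let mut reader = open(use_foo)?;
    let mut out = Vec::new();
    reader.read_to_end(&mut out)?;
    Ok(out)
}

fn main() -> io::Result<()> {
    println!("{:?}", read_all(true)?);
    println!("{:?}", read_all(false)?);
    Ok(())
}
```

The enum alternative mentioned in the question avoids the heap allocation, but each new reader type then needs its own variant and a match arm in the Read implementation.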
Can you use a static variable instead? In both C and Rust, a static variable lasts as long as the program does, even if it's a static local.
http://rustbyexample.com/scope/lifetime/static_lifetime.html

interfacing Rust with Berkeley DB

I have an existing C++ program that uses Berkeley DB as a storage backend. I would like to rewrite it in Rust. Is there a way to write a Foreign Function Interface in Rust to use Berkeley DB? I have found the tutorial Rust Foreign Function Interface, but it seems too simple an example for the complicated C structs used in BDB; for example, to open a database I need to declare a DB struct and call DB->open(). But I don't know how to do this using the example shown in the tutorial.
Can anyone help with this?
Well, looking into the C API of BDB, I found out that it consists of C structures whose members are pointers to functions. This is not explained in the tutorial (which is very strange), but Rust currently supports pointers to foreign functions. It is also mentioned in the Rust reference manual.
You can create all the required structures roughly based on the ones defined in db.h. As long as the Rust structs are marked #[repr(C)], their memory layout matches the C one, so you can pass these structures to and from the library and expect correct pointers to be present in them.
For example, your DB->open() call could look like this:
struct DB {
    open: extern "C" fn()
}
let db = ... // Get DB from somewhere
(db.open)() // Parentheses around db.open are needed to disambiguate field access
This, however, really should be wrapped in some kind of impl-based interface, because calling extern functions is an unsafe operation and you do not want your users to put unsafe around all database interactions.
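A minimal sketch of such an impl-based wrapper. Everything in it is hypothetical: RawDb is a drastically simplified stand-in for the real DB struct, the open signature is invented, and fake_open is a Rust function playing the role of the C library so that the example runs without linking against Berkeley DB.

```rust
use std::os::raw::c_int;

// Simplified stand-in for a C struct whose members are function pointers.
#[repr(C)]
struct RawDb {
    open: unsafe extern "C" fn(db: *mut RawDb, flags: c_int) -> c_int,
    is_open: c_int,
}

// Stand-in for the function the C library would install in the struct.
// Returns 0 on success and echoes back negative flags as an error code.
extern "C" fn fake_open(db: *mut RawDb, flags: c_int) -> c_int {
    if flags < 0 {
        return flags;
    }
    // Safety: the wrapper below only passes valid, exclusive pointers.
    unsafe { (*db).is_open = 1 };
    0
}

// Safe wrapper: callers never write unsafe themselves.
pub struct Db {
    raw: Box<RawDb>,
}

impl Db {
    pub fn new() -> Db {
        Db {
            raw: Box::new(RawDb { open: fake_open, is_open: 0 }),
        }
    }

    pub fn open(&mut self, flags: c_int) -> Result<(), c_int> {
        let ptr: *mut RawDb = &mut *self.raw;
        // Parentheses are needed to call a field rather than a method.
        let rc = unsafe { ((*ptr).open)(ptr, flags) };
        if rc == 0 { Ok(()) } else { Err(rc) }
    }

    pub fn is_open(&self) -> bool {
        self.raw.is_open != 0
    }
}

fn main() {
    let mut db = Db::new();
    println!("{:?} {}", db.open(0), db.is_open());
}
```

The unsafe blocks are confined to the wrapper, and the C-style return code is turned into a Result at the boundary.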
Given the sheer size and complexity of the DB struct, there doesn't appear to be a "clean" way to expose the whole thing to Rust. A tool similar to C2HS to generate the FFI from C headers would be nice, but alas we don't have one.
Note also that the Rust FFI can't currently call into C++ libraries, so you'll have to use the C API instead.
I'm not familiar with the DB APIs at all, but it appears plausible to create a small support library in C to actually create an instance of the DB struct, then expose the public members of the struct __db via getter and setter functions.
Your implementation might look something like this:
use std::os::raw::c_void;
// Link against the small C support library described above.
#[link(name = "rust_dbhelper")]
extern "C" {
    fn create_DB() -> *mut c_void;
    fn free_DB(db: *mut c_void);
}
pub struct DB {
    db: *mut c_void,
}
impl Drop for DB {
    fn drop(&mut self) {
        // The pointer came from create_DB and is freed exactly once.
        unsafe { free_DB(self.db) };
    }
}
#[repr(C)]
struct DBAppMembers {
    pgsize: u32,
    priority: DBCachePriority,
    // Additional members omitted for brevity
}
impl DB {
    pub fn new() -> DB {
        DB {
            db: unsafe { create_DB() },
        }
    }
    pub fn set_pgsize(&mut self, pgsize: u32) {
        unsafe {
            // Reinterpret the opaque pointer as the application-visible members.
            let x = self.db as *mut DBAppMembers;
            (*x).pgsize = pgsize;
        }
    }
    // Additional methods omitted for brevity
}
You can save yourself from some additional work by specifically calling C functions with the DB.db member as a parameter, but that requires working in an unsafe context, which should probably be avoided where possible. Otherwise, each function exported by libdb will need to have its own wrapper in your native struct DB.

Resources