Call async function in non-async for rustr - asynchronous

Hi I have a system which pulls some data from different sources. All sources until today were sync so i designed the following API:
trait GraphPuller {
pub fn Graph() -> Result<Vec<i64>>
}
struct GitPuller;
Struct TcpPuller;
impl GraphPuller for GitPuller {
pub fn Graph() -> Result<Vec<i64>> {
...
}
}
impl GraphPuller for Tc[Puller {
pub fn Graph() -> Result<Vec<i64>> {
...
}
}
Now I have to implement the Puller for AWS. AWS API is async, so i have to call async function in non sync system.
How Should I design the trait and api to avoid issues?

The easiest way is to keep the trait the same and to just block on the Amazon API calls. I'd recommend pollster::block_on because it implements just the very minimum instead of pulling in the more heavyweight futures crate.

Related

Is it possible to implement a feature like Java's CompletableFuture::complete in Rust?

I'm a beginner in rust, and I'm trying to use rust's asynchronous programming.
In my requirement scenario, I want to create a empty Future and complete it in another thread after a complex multi-round scheduling process. The CompletableFuture::complete of Java can meet my needs very well.
I have tried to find an implementation of Rust, but haven't found one yet.
Is it possible to do it in Rust?
I understand from the comments below that using a channel for this is more in line with rust's design.
My scenario is a hierarchical scheduling executor.
For example, Task1 will be splitted to several Drivers, each Driver will use multi thread(rayon threadpool) to do some computation work, and the former driver's state change will trigger the execution of next driver, the result of the whole task is the last driver's output and the intermedia drivers have no output. That is to say, my async function cannot get result from one spawn task directly, so I need a shared stack variable or a channel to transfer the result.
So what I really want is this: the last driver which is executed in a rayon thread, it can get a channel's tx by it's identify without storing it (to simplify the state change process).
I found the tx and rx of oneshot cannot be copies and they are not thread safe, and the send method of tx need ownership. So, I can't store the tx in main thread and let the last driver find it's tx by identify. But I can use mpsc to do that, I worte 2 demos and pasted it into the body of the question, but I have to create mpsc with capacity 1 and close it manually.
I wrote 2 demos, as bellow.I wonder if this is an appropriate and efficient use of mpsc?
Version implemented using oneshot, cannot work.
#[tokio::test]
pub async fn test_async() -> Result<()>{
let mut executor = Executor::new();
let res1 = executor.run(1).await?;
let res2 = executor.run(2).await?;
println!("res1 {}, res2 {}", res1, res2);
Ok(())
}
struct Executor {
pub pool: ThreadPool,
pub txs: Arc<DashMap<i32, RwLock<oneshot::Sender<i32>>>>,
}
impl Executor {
pub fn new() -> Self {
Executor{
pool: ThreadPoolBuilder::new().num_threads(10).build().unwrap(),
txs: Arc::new(DashMap::new()),
}
}
pub async fn run(&mut self, index: i32) -> Result<i32> {
let (tx, rx) = oneshot::channel();
self.txs.insert(index, RwLock::new(tx));
let txs_clone = self.txs.clone();
self.pool.spawn(move || {
let spawn_tx = txs_clone.get(&index).unwrap();
let guard = block_on(spawn_tx.read());
// cannot work, send need ownership, it will cause move of self
guard.send(index);
});
let res = rx.await;
return Ok(res.unwrap());
}
}
Version implemented using mpsc, can work, not sure about performance
#[tokio::test]
pub async fn test_async() -> Result<()>{
let mut executor = Executor::new();
let res1 = executor.run(1).await?;
let res2 = executor.run(2).await?;
println!("res1 {}, res2 {}", res1, res2);
// close channel after task finished
executor.close(1);
executor.close(2);
Ok(())
}
struct Executor {
pub pool: ThreadPool,
pub txs: Arc<DashMap<i32, RwLock<mpsc::Sender<i32>>>>,
}
impl Executor {
pub fn new() -> Self {
Executor{
pool: ThreadPoolBuilder::new().num_threads(10).build().unwrap(),
txs: Arc::new(DashMap::new()),
}
}
pub fn close(&mut self, index:i32) {
self.txs.remove(&index);
}
pub async fn run(&mut self, index: i32) -> Result<i32> {
let (tx, mut rx) = mpsc::channel(1);
self.txs.insert(index, RwLock::new(tx));
let txs_clone = self.txs.clone();
self.pool.spawn(move || {
let spawn_tx = txs_clone.get(&index).unwrap();
let guard = block_on(spawn_tx.value().read());
block_on(guard.deref().send(index));
});
// 0 mock invalid value
let mut res:i32 = 0;
while let Some(data) = rx.recv().await {
println!("recv data {}", data);
res = data;
break;
}
return Ok(res);
}
}
Disclaimer: It's really hard to picture what you are attempting to achieve, because the examples provided are trivial to solve, with no justification for the added complexity (DashMap). As such, this answer will be progressive, though it will remain focused on solving the problem you demonstrated you had, and not necessarily the problem you're thinking of... as I have no crystal ball.
We'll be using the following Result type in the examples:
type Result<T> = Result<T, Box<dyn Error + Send + Sync + 'static>>;
Serial execution
The simplest way to execute a task, is to do so right here, right now.
impl Executor {
pub async fn run<F>(&self, task: F) -> Result<i32>
where
F: FnOnce() -> Future<Output = Result<i32>>,
{
task().await
}
}
Async execution - built-in
When the execution of a task may involve heavy-weight calculations, it may be beneficial to execute it on a background thread.
Whichever runtime you are using probably supports this functionality, I'll demonstrate with tokio:
impl Executor {
pub async fn run<F>(&self, task: F) -> Result<i32>
where
F: FnOnce() -> Result<i32>,
{
Ok(tokio::task::spawn_block(task).await??)
}
}
Async execution - one-shot
If you wish to have more control on the number of CPU-bound threads, either to limit them, or to partition the CPUs of the machine for different needs, then the async runtime may not be configurable enough and you may prefer to use a thread-pool instead.
In this case, synchronization back with the runtime can be achieved via channels, the simplest of which being the oneshot channel.
impl Executor {
pub async fn run<F>(&self, task: F) -> Result<i32>
where
F: FnOnce() -> Result<i32>,
{
let (tx, mut rx) = oneshot::channel();
self.pool.spawn(move || {
let result = task();
// Decide on how to handle the fact that nobody will read the result.
let _ = tx.send(result);
});
Ok(rx.await??)
}
}
Note that in all of the above solutions, task remains agnostic as to how it's executed. This is a property you should strive for, as it makes it easier to change the way execution is handled in the future by more neatly separating the two concepts.

How to deal with non-Send futures in a Tokio spawn context?

Tokio's spawn can only work with a Send future. This makes the following code invalid:
async fn async_foo(v: i32) {}
async fn async_computation() -> Result<i32, Box<dyn std::error::Error>> {
Ok(1)
}
async fn async_start() {
match async_computation().await {
Ok(v) => async_foo(v).await,
_ => unimplemented!(),
};
}
#[tokio::main]
async fn main() {
tokio::spawn(async move {
async_start().await;
});
}
The error is:
future cannot be sent between threads safely
the trait `Send` is not implemented for `dyn std::error::Error`
If I understand correctly: because async_foo(v).await might yield, Rust internally have to save all the context which might be on a different thread - hence the result of async_computation().await must be Send - which dyn std::error::Error is not.
This could be mitigated if the non-Send type can be dropped before the .await, such as:
async fn async_start() {
let result;
match async_computation().await {
Ok(v) => result = v,
_ => return,
};
async_foo(result).await;
}
However once another .await is needed before the non-Send type is dropped, the workarounds are more and more awkward.
What a is a good practice for this - especially when the non-Send type is for generic error handling (the Box<dyn std::error::Error>)? Is there a better type for errors that common IO ops implements (async or not) in Rust? Or there is a better way to group nested async calls?
Most errors are Send so you can just change the return type to:
Box<dyn std::error::Error + Send>
It's also common to have + Sync.

How to extract values from async functions to a non-async one? [duplicate]

I am trying to use hyper to grab the content of an HTML page and would like to synchronously return the output of a future. I realized I could have picked a better example since synchronous HTTP requests already exist, but I am more interested in understanding whether we could return a value from an async calculation.
extern crate futures;
extern crate hyper;
extern crate hyper_tls;
extern crate tokio;
use futures::{future, Future, Stream};
use hyper::Client;
use hyper::Uri;
use hyper_tls::HttpsConnector;
use std::str;
fn scrap() -> Result<String, String> {
let scraped_content = future::lazy(|| {
let https = HttpsConnector::new(4).unwrap();
let client = Client::builder().build::<_, hyper::Body>(https);
client
.get("https://hyper.rs".parse::<Uri>().unwrap())
.and_then(|res| {
res.into_body().concat2().and_then(|body| {
let s_body: String = str::from_utf8(&body).unwrap().to_string();
futures::future::ok(s_body)
})
}).map_err(|err| format!("Error scraping web page: {:?}", &err))
});
scraped_content.wait()
}
fn read() {
let scraped_content = future::lazy(|| {
let https = HttpsConnector::new(4).unwrap();
let client = Client::builder().build::<_, hyper::Body>(https);
client
.get("https://hyper.rs".parse::<Uri>().unwrap())
.and_then(|res| {
res.into_body().concat2().and_then(|body| {
let s_body: String = str::from_utf8(&body).unwrap().to_string();
println!("Reading body: {}", s_body);
Ok(())
})
}).map_err(|err| {
println!("Error reading webpage: {:?}", &err);
})
});
tokio::run(scraped_content);
}
fn main() {
read();
let content = scrap();
println!("Content = {:?}", &content);
}
The example compiles and the call to read() succeeds, but the call to scrap() panics with the following error message:
Content = Err("Error scraping web page: Error { kind: Execute, cause: None }")
I understand that I failed to launch the task properly before calling .wait() on the future but I couldn't find how to properly do it, assuming it's even possible.
Standard library futures
Let's use this as our minimal, reproducible example:
async fn example() -> i32 {
42
}
Call executor::block_on:
use futures::executor; // 0.3.1
fn main() {
let v = executor::block_on(example());
println!("{}", v);
}
Tokio
Use the tokio::main attribute on any function (not just main!) to convert it from an asynchronous function to a synchronous one:
use tokio; // 0.3.5
#[tokio::main]
async fn main() {
let v = example().await;
println!("{}", v);
}
tokio::main is a macro that transforms this
#[tokio::main]
async fn main() {}
Into this:
fn main() {
tokio::runtime::Builder::new_multi_thread()
.enable_all()
.build()
.unwrap()
.block_on(async { {} })
}
This uses Runtime::block_on under the hood, so you can also write this as:
use tokio::runtime::Runtime; // 0.3.5
fn main() {
let v = Runtime::new().unwrap().block_on(example());
println!("{}", v);
}
For tests, you can use tokio::test.
async-std
Use the async_std::main attribute on the main function to convert it from an asynchronous function to a synchronous one:
use async_std; // 1.6.5, features = ["attributes"]
#[async_std::main]
async fn main() {
let v = example().await;
println!("{}", v);
}
For tests, you can use async_std::test.
Futures 0.1
Let's use this as our minimal, reproducible example:
use futures::{future, Future}; // 0.1.27
fn example() -> impl Future<Item = i32, Error = ()> {
future::ok(42)
}
For simple cases, you only need to call wait:
fn main() {
let s = example().wait();
println!("{:?}", s);
}
However, this comes with a pretty severe warning:
This method is not appropriate to call on event loops or similar I/O situations because it will prevent the event loop from making progress (this blocks the thread). This method should only be called when it's guaranteed that the blocking work associated with this future will be completed by another thread.
Tokio
If you are using Tokio 0.1, you should use Tokio's Runtime::block_on:
use tokio; // 0.1.21
fn main() {
let mut runtime = tokio::runtime::Runtime::new().expect("Unable to create a runtime");
let s = runtime.block_on(example());
println!("{:?}", s);
}
If you peek in the implementation of block_on, it actually sends the future's result down a channel and then calls wait on that channel! This is fine because Tokio guarantees to run the future to completion.
See also:
How can I efficiently extract the first element of a futures::Stream in a blocking manner?
As this is the top result that come up in search engines by the query "How to call async from sync in Rust", I decided to share my solution here. I think it might be useful.
As #Shepmaster mentioned, back in version 0.1 futures crate had beautiful method .wait() that could be used to call an async function from a sync one. This must-have method, however, was removed from later versions of the crate.
Luckily, it's not that hard to re-implement it:
trait Block {
fn wait(self) -> <Self as futures::Future>::Output
where Self: Sized, Self: futures::Future
{
futures::executor::block_on(self)
}
}
impl<F,T> Block for F
where F: futures::Future<Output = T>
{}
After that, you can just do following:
async fn example() -> i32 {
42
}
fn main() {
let s = example().wait();
println!("{:?}", s);
}
Beware that this comes with all the caveats of original .wait() explained in the #Shepmaster's answer.
This works for me using tokio:
tokio::runtime::Runtime::new()?.block_on(fooAsyncFunction())?;

Rust Trait warning: method references the `Self` type in its `where` clause

We're trying to create a pointer to a trait object and getting a warning that the trait cannot be made into an object.
use async_trait::async_trait;
use std::sync::{Arc, Weak};
fn main() {
println!("Hello, world!");
}
pub type MyTraitPtr = Arc<dyn MyTrait>;
#[async_trait]
pub trait MyTrait {
async fn foo(&self) {}
}
pub struct Parent {
child: MyTraitPtr,
}
This produces the following warning:
warning: the trait `MyTrait` cannot be made into an object
--> src/main.rs:14:14
|
14 | async fn foo(&self) {}
| ^^^
|
= note: `#[warn(where_clauses_object_safety)]` on by default
= warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
= note: for more information, see issue #51443 <https://github.com/rust-lang/rust/issues/51443>
note: for a trait to be "object safe" it needs to allow building a vtable to allow the call to be resolvable dynamically; for more information visit <https://doc.rust-lang.org/reference/items/traits.html#object-safety>
--> src/main.rs:14:14
|
13 | pub trait MyTrait {
| ------- this trait cannot be made into an object...
14 | async fn foo(&self) {}
| ^^^ ...because method `foo` references the `Self` type in its `where` clause
= help: consider moving `foo` to another trait
Full code: https://github.com/lunar-mining/trait_ptr
Update: if we add Sync to the Trait definition then it compiles, like so:
use std::sync::{Arc, Weak};
fn main() {
println!("Hello, world!");
}
pub type MyTraitPtr = Arc<dyn MyTrait>;
#[async_trait]
pub trait MyTrait: Sync {
async fn foo(&self) {}
}
pub struct Parent {
child: MyTraitPtr,
}
However if the trait's methods take an Arc reference to Self, then it does not compile:
use async_trait::async_trait;
use std::sync::{Arc, Weak};
fn main() {
println!("Hello, world!");
}
pub type MyTraitPtr = Arc<dyn MyTrait>;
#[async_trait]
pub trait MyTrait: Sync {
async fn foo(self: Arc<Self>) {}
}
pub struct Parent {
child: MyTraitPtr,
}
Discussion of the problem here: https://github.com/rust-lang/rust/issues/51443
From async-trait docs:
Dyn traits
Traits with async methods can be used as trait objects as long as they meet the usual requirements for dyn – no methods with type parameters, no self by value, no associated types, etc.
...
The one wrinkle is in traits that provide default implementations of async methods. In order for the default implementation to produce a future that is Send, the async_trait macro must emit a bound of Self: Sync on trait methods that take &self and a bound Self: Send on trait methods that take &mut self. An example of the former is visible in the expanded code in the explanation section above.
If you make a trait with async methods that have default implementations, everything will work except that the trait cannot be used as a trait object. Creating a value of type &dyn Trait will produce an error that looks like this:
error: the trait `Test` cannot be made into an object
--> src/main.rs:8:5
|
8 | async fn cannot_dyn(&self) {}
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
For traits that need to be object safe and need to have default implementations for some async methods, there are two resolutions. Either you can add Send and/or Sync as supertraits (Send if there are &mut self methods with default implementations, Sync if there are &self methods with default implementations) to constrain all implementors of the trait such that the default implementations are applicable to them:
#[async_trait]
pub trait ObjectSafe: Sync { // added supertrait
async fn can_dyn(&self) {}
}
let object = &value as &dyn ObjectSafe;
or you can strike the problematic methods from your trait object by bounding them with Self: Sized:
#[async_trait]
pub trait ObjectSafe {
async fn cannot_dyn(&self) where Self: Sized {}
// presumably other methods
}
let object = &value as &dyn ObjectSafe;
Arc is considered the same as mutable references in this regard, and requires a Send bound. See also the bug report I posted to async-trait.
You can implement that by moving the implementation out of the trait:
use async_trait::async_trait;
use std::sync::{Arc, Weak};
pub type MyTraitPtr = Arc<dyn MyTrait>;
#[async_trait]
pub trait MyTrait {
async fn foo(&self);
}
impl dyn MyTrait {
async fn foo(&self) { }
}
pub struct Parent {
child: MyTraitPtr,
}
You can also implement that directly to Arc:
pub type MyTraitPtr = Arc<dyn MyTrait>;
#[async_trait]
pub trait MyTrait: Sync + Send {
async fn foo(&self);
}
#[async_trait]
impl MyTrait for MyTraitPtr {
async fn foo(&self) { }
}
pub struct Parent {
child: MyTraitPtr,
}

Rust - How to use a synchronous and an asynchronous crate in one application

I began writing a program using the Druid crate and Crabler crate to make a webscraping application whose data I can explore. I only realized that merging synchronous and asynchronous programming was a bad idea long after I had spent a while building this program. What I am trying to do right now is have the scraper run while the application is open (preferably every hour).
Right now the scraper doesn't run until after the application is closed. I tried to use Tokio's spawn to make a separate thread that starts before the application opens, but this doesn't work because the Crabler future doesn't have the "Send" trait.
I tried to make a minimal functional program as shown below. The title_handler doesn't function as expected but otherwise it demonstrates the issue I'm having well.
Is it possible to allow the WebScraper to run while the application is open? If so, how?
EDIT: I tried using task::spawn_blocking() to run the application and it threw out a ton of errors, including that druid doesn't implement the trait Send.
use crabler::*;
use druid::widget::prelude::*;
use druid::widget::{Align, Flex, Label, TextBox};
use druid::{AppLauncher, Data, Lens, WindowDesc, WidgetExt};
const ENTRY_PREFIX: [&str; 1] = ["https://duckduckgo.com/?t=ffab&q=rust&ia=web"];
// Use WebScraper trait to get each item with the ".result__title" class
#[derive(WebScraper)]
#[on_response(response_handler)]
#[on_html(".result__title", title_handler)]
struct Scraper {}
impl Scraper {
// Print webpage status
async fn response_handler(&self, response: Response) -> Result<()> {
println!("Status {}", response.status);
Ok(())
}
async fn title_handler(&self, _: Response, el: Element) -> Result<()> {
// Get text of element
let title_data = el.children();
let title_text = title_data.first().unwrap().text().unwrap();
println!("Result is {}", title_text);
Ok(())
}
}
// Run scraper to get info from https://duckduckgo.com/?t=ffab&q=rust&ia=web
async fn one_scrape() -> Result<()> {
let scraper = Scraper {};
scraper.run(Opts::new().with_urls(ENTRY_PREFIX.to_vec()).with_threads(1)).await
}
#[derive(Clone, Data, Lens)]
struct Init {
tag: String,
}
fn build_ui() -> impl Widget<Init> {
// Search box
let l_search = Label::new("Search: ");
let tb_search = TextBox::new()
.with_placeholder("Enter tag to search")
.lens(Init::tag);
let search = Flex::row()
.with_child(l_search)
.with_child(tb_search);
// Describe layout of UI
let layout = Flex::column()
.with_child(search);
Align::centered(layout)
}
#[async_std::main]
async fn main() -> Result<()> {
// Describe the main window
let main_window = WindowDesc::new(build_ui())
.title("Title Tracker")
.window_size((400.0, 400.0));
// Create starting app state
let init_state = Init {
tag: String::from("#"),
};
// Start application
AppLauncher::with_window(main_window)
.launch(init_state)
.expect("Failed to launch application");
one_scrape().await
}

Resources