How to process a vector as an asynchronous stream?

In my RSS reader project, I want to read my RSS feeds asynchronously. Currently, they're read synchronously by this code block:
self.feeds = self
    .feeds
    .iter()
    .map(|f| f.read(&self.settings))
    .collect::<Vec<Feed>>();
I want to make that code asynchronous, because it will allow me to better handle poor web server responses.
I understand I can create a Stream from my Vec using stream::from_iter(...), which transforms the code into something like
self.feeds = stream::from_iter(self.feeds.iter())
    .map(|f| f.read(&self.settings))
    // ???
    .collect::<Vec<Feed>>();
But then, I have two questions:
How to have the results joined into a Vec (which is a synchronous struct)?
How to execute that stream? I was thinking about using task::spawn, but it doesn't seem to work...

How to execute that stream? I was thinking about using task::spawn, but it doesn't seem to work
In the async/await world, asynchronous code is meant to be executed by an executor, which is not part of the standard library but is provided by third-party crates such as tokio. task::spawn only schedules a future onto an executor; it does not run the future to completion by itself, and it only works from within a running runtime.
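For illustration, here is a minimal sketch (assuming tokio 0.2 with the "full" feature, the same setup as the example below) of spawning a task on a runtime and awaiting its JoinHandle to obtain the result:
#[tokio::main]
async fn main() {
    // spawn only hands the future to the runtime; nothing is awaited yet
    let handle = tokio::task::spawn(async { 21 * 2 });
    // awaiting the JoinHandle is what actually waits for the task's result
    let answer = handle.await.unwrap();
    assert_eq!(answer, 42);
}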
How to have the results joined into a Vec (which is a synchronous struct)
The bread and butter of your RSS reader seems to be f.read. It should be turned into an asynchronous function. Then the vector of feeds will be mapped into a vector of futures, which need to be polled to completion.
The futures crate has futures::stream::futures_unordered::FuturesUnordered to help you do that. FuturesUnordered itself implements the Stream trait. This stream is then collected into the result vector and awaited to completion, like so:
//# tokio = { version = "0.2.4", features = ["full"] }
//# futures = "0.3.1"
use tokio::time::delay_for;
use futures::stream::StreamExt;
use futures::stream::futures_unordered::FuturesUnordered;
use std::error::Error;
use std::time::{Duration, Instant};
#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let start = Instant::now();
    let feeds = (0..10).collect::<Vec<_>>();
    let res = read_feeds(feeds).await;
    dbg!(res);
    dbg!(start.elapsed());
    Ok(())
}
async fn read_feeds(feeds: Vec<u32>) -> Vec<u32> {
    feeds.iter()
        .map(read_feed)
        .collect::<FuturesUnordered<_>>()
        .collect::<Vec<_>>()
        .await
}
async fn read_feed(feed: &u32) -> u32 {
    delay_for(Duration::from_millis(500)).await;
    feed * 2
}
delay_for simulates the potentially expensive operation. It also helps to demonstrate that these reads indeed happen concurrently, without any explicit thread-related logic.
One nuance here: unlike with the synchronous counterpart, the results of reading the RSS feeds are no longer in the same order as the feeds themselves; whichever read finishes first comes first. You need to deal with that somehow.
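If the original order matters, one option (a sketch that is not part of the original answer, reusing the read_feed helper from above) is futures::future::join_all, which also polls all the futures concurrently but yields the results in the order of the input:
use futures::future::join_all;
async fn read_feeds_in_order(feeds: Vec<u32>) -> Vec<u32> {
    // join_all drives every future concurrently but collects the results
    // in the same order as the input vector
    join_all(feeds.iter().map(read_feed)).await
}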

Related

Testing Async Functions

I am using Wasm-Pack and I need to write a unit test for an asynchronous function that references a JavaScript library. I tried using futures::executor::block_on in order to get the asynchronous function to return so I could make an assert. However, blocking is not supported in the wasm build target. I can't test in a different target because the asynchronous function I am testing references a JavaScript library. I also don't think I can spawn a new thread and handle the future there, because it needs to return to the assert statement in the original thread. What is the best way to go about testing this asynchronous function?
Code being tested in src/lib.rs
use wasm_bindgen::prelude::*;
#[wasm_bindgen]
pub async fn func_to_test() -> bool {
    return some_long_running_function().await;
}
Testing code in tests/web.rs
#![cfg(target_arch = "wasm32")]
extern crate wasm_bindgen_test;
use test_crate;
use futures::executor::block_on;
#[wasm_bindgen_test]
fn can_return_from_async() {
    let ret = block_on(test_crate::func_to_test());
    assert!(ret);
}
How do I test an async function if I can't use any blocking?
Rust can handle tests that are async functions themselves. Just change the test function to be async and throw in an await.
#[wasm_bindgen_test]
async fn can_return_from_async() {
    let ret = test_crate::func_to_test().await;
    assert!(ret);
}

Is there a way to poll several futures simultaneously in Rust async

I'm trying to execute several sqlx queries in parallel, given by an iterator.
This is probably the closest I've got so far.
let mut futures = HashMap::new() // placeholder, filled HashMap in reality
    .iter()
    .map(async move |(_, item)| -> Result<(), sqlx::Error> {
        let result = sqlx::query_file_as!(
            // omitted
        )
        .fetch_one(&pool)
        .await?;
        channel.send(Enum::Event(result)).ignore();
        Ok(())
    })
    .collect();
futures::future::join_all(futures);
All queries and sends are independent of each other, so if one of them fails, the others should still get processed.
Furthermore, the current async closure is not possible like this.
Rust doesn't yet have async closures. You instead need to have the closure return an async block:
move |(_, item)| async move { ... }
Additionally, make sure you .await the future returned by join_all in order to ensure the individual tasks are actually polled.
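Putting both points together, a sketch of the overall shape (with the sqlx query replaced by a hypothetical placeholder async fn, since the original query is omitted) might look like this:
use futures::future::join_all;
async fn process_item(item: u32) -> Result<u32, String> {
    // placeholder for the sqlx query; any fallible async work goes here
    Ok(item * 2)
}
async fn run(items: Vec<u32>) {
    let futures: Vec<_> = items
        .into_iter()
        // the closure itself is synchronous and returns an async block
        .map(|item| async move { process_item(item).await })
        .collect();
    // awaiting join_all is what actually polls the futures; each result is
    // independent, so one failure does not stop the others
    for result in join_all(futures).await {
        if let Err(e) = result {
            eprintln!("query failed: {}", e);
        }
    }
}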

Should I return await in Rust?

In JavaScript, async code is written with Promises and async/await syntax similar to that of Rust. It is generally considered redundant (and therefore discouraged) to return and await a Promise when it can simply be returned (i.e., when an async function is executed as the last thing in another function):
async function myFn() { /* ... */ }
async function myFn2() {
    // do setup work
    return await myFn()
    // ^ this is not necessary when we can just return the Promise
}
I am wondering whether a similar pattern applies in Rust. Should I prefer this:
pub async fn my_function(
    &mut self,
) -> Result<()> {
    // do synchronous setup work
    self.exec_command(
        /* ... */
    )
    .await
}
Or this:
pub fn my_function(
    &mut self,
) -> impl Future<Output = Result<()>> {
    // do synchronous setup work
    self.exec_command(
        /* ... */
    )
}
The former feels more ergonomic to me, but I suspect that the latter might be more performant. Is this the case?
One semantic difference between the two variants is that in the first variant the synchronous setup code will run only when the returned future is awaited, while in the second variant it will run as soon as the function is called:
let fut = x.my_function();
// in the second variant, the synchronous setup has finished by now
...
let val = fut.await; // in the first variant, it runs here
For the difference to be noticeable, the synchronous setup code must have side effects, and there needs to be a delay between calling the async function and awaiting the future it returns.
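As a runnable sketch of that difference (hypothetical function names, with a tokio runtime used only to drive the futures), a println! in the synchronous part makes the timing visible:
use std::future::Future;
async fn compute() -> u32 {
    42
}
async fn variant_one() -> u32 {
    println!("setup (runs only when the future is awaited)");
    compute().await
}
fn variant_two() -> impl Future<Output = u32> {
    println!("setup (runs as soon as the function is called)");
    compute()
}
#[tokio::main]
async fn main() {
    let fut = variant_two(); // "setup" has already been printed here
    let _ = fut.await;
    let fut = variant_one(); // nothing printed yet
    let _ = fut.await;       // "setup" is printed only now
}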
Unless you have a specific reason to execute the preamble immediately, go with the async function, i.e. the first variant. It makes the function slightly more predictable and makes it easier to add more awaits later as the function is refactored.
There is no real difference between the two since async just resolves down to impl Future<Output=Result<T, E>>. I don't believe there is any meaningful performance difference between the two, at least in my empirical usage of both.
If you are asking for preference in style then in my opinion the first one is preferred as the types are clearer to me and I agree it is more ergonomic.

How do I execute an async/await function without using any external dependencies?

I am attempting to create the simplest possible example that can get async fn hello() to eventually print out Hello World!. This should happen without any external dependency like tokio, just plain Rust and std. Bonus points if we can get it done without ever using unsafe.
#![feature(async_await)]
async fn hello() {
    println!("Hello, World!");
}
fn main() {
    let task = hello();
    // Something beautiful happens here, and `Hello, World!` is printed on screen.
}
I know async/await is still a nightly feature, and it is subject to change in the foreseeable future.
I know there is a whole lot of Future implementations, I am aware of the existence of tokio.
I am just trying to educate myself on the inner workings of standard library futures.
My helpless, clumsy endeavours
My vague understanding is that, first off, I need to Pin task down. So I went ahead and
let pinned_task = Pin::new(&mut task);
but
the trait `std::marker::Unpin` is not implemented for `std::future::GenFuture<[static generator#src/main.rs:7:18: 9:2 {}]>`
so I thought, of course, I probably need to Box it, so I'm sure it won't move around in memory. Somewhat surprisingly, I get the same error.
What I could get so far is
let pinned_task = unsafe {
    Pin::new_unchecked(&mut task)
};
which is obviously not something I should do. Even so, let's say I got my hands on the Pinned Future. Now I need to poll() it somehow. For that, I need a Waker.
So I tried to look around on how to get my hands on a Waker. On the doc it kinda looks like the only way to get a Waker is with another new_unchecked that accepts a RawWaker. From there I got here and from there here, where I just curled up on the floor and started crying.
This part of the futures stack is not intended to be implemented by many people. The rough estimate that I have seen is that maybe there will be 10 or so actual implementations.
That said, you can fill in the basic aspects of an extremely limited executor by following the function signatures needed:
async fn hello() {
    println!("Hello, World!");
}
fn main() {
    drive_to_completion(hello());
}
use std::{
    future::Future,
    ptr,
    task::{Context, Poll, RawWaker, RawWakerVTable, Waker},
};
fn drive_to_completion<F>(f: F) -> F::Output
where
    F: Future,
{
    let waker = my_waker();
    let mut context = Context::from_waker(&waker);
    let mut t = Box::pin(f);
    loop {
        match t.as_mut().poll(&mut context) {
            Poll::Ready(v) => return v,
            Poll::Pending => panic!("This executor does not support futures that are not ready"),
        }
    }
}
type WakerData = *const ();
unsafe fn clone(_: WakerData) -> RawWaker {
    my_raw_waker()
}
unsafe fn wake(_: WakerData) {}
unsafe fn wake_by_ref(_: WakerData) {}
unsafe fn drop(_: WakerData) {}
static MY_VTABLE: RawWakerVTable = RawWakerVTable::new(clone, wake, wake_by_ref, drop);
fn my_raw_waker() -> RawWaker {
    RawWaker::new(ptr::null(), &MY_VTABLE)
}
fn my_waker() -> Waker {
    unsafe { Waker::from_raw(my_raw_waker()) }
}
Starting at Future::poll, we see we need a Pinned future and a Context. Context is created from a Waker which needs a RawWaker. A RawWaker needs a RawWakerVTable. We create all of those pieces in the simplest possible ways:
Since we aren't trying to support NotReady cases, we never need to actually do anything for that case and can instead panic. This also means that the implementations of wake can be no-ops.
Since we aren't trying to be efficient, we don't need to store any data for our waker, so clone and drop can basically be no-ops as well.
The easiest way to pin the future is to Box it, but this isn't the most efficient possibility.
If you wanted to support NotReady, the simplest extension is to have a busy loop, polling forever. A slightly more efficient solution is to have a global variable that indicates that someone has called wake and block on that becoming true.
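A sketch of that busy-loop extension (a hypothetical drive_to_completion_spin variant reusing my_waker from above; it still burns CPU while waiting because the waker is ignored) would only change the Pending arm:
fn drive_to_completion_spin<F>(f: F) -> F::Output
where
    F: Future,
{
    let waker = my_waker();
    let mut context = Context::from_waker(&waker);
    let mut t = Box::pin(f);
    loop {
        match t.as_mut().poll(&mut context) {
            Poll::Ready(v) => return v,
            // keep polling instead of panicking; yield_now keeps the spin
            // from monopolising the thread, but it is still a busy wait
            Poll::Pending => std::thread::yield_now(),
        }
    }
}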

IRC server doesn't respond to Rust IRC Client identify requests

I'm working on an IRC bot using TcpStream from the standard library.
I'm able to read all the lines that come in, but the IRC server doesn't seem to respond to my identify requests. I thought I was sending the request too soon, so I tried sleeping before sending the IDENT, but that doesn't work. I have tried using BufReader and BufWriter as well as calling read and write directly on the stream, to no avail.
use std::net::TcpStream;
use std::io::{BufReader, BufWriter, BufRead, Write, Read};
use std::{thread, time};
struct Rusty {
    name: String,
    stream: TcpStream,
    reader: BufReader<TcpStream>,
    writer: BufWriter<TcpStream>,
}
impl Rusty {
    fn new(name: &str, address: &str) -> Rusty {
        let stream = TcpStream::connect(address).expect("Couldn't connect to server");
        let reader = BufReader::new(stream.try_clone().unwrap());
        let writer = BufWriter::new(stream.try_clone().unwrap());
        Rusty {
            name: String::from(name),
            reader: reader,
            writer: writer,
            stream: stream,
        }
    }
    fn write_line(&mut self, string: String) {
        let line = format!("{}\r\n", string);
        &self.writer.write(line.as_bytes());
    }
    fn identify(&mut self) {
        let nick = &self.name.clone();
        self.write_line(format!("USER {} {} {} : {}", nick, nick, nick, nick));
        self.write_line(format!("NICK {}", nick));
    }
    fn read_lines(&mut self) {
        let mut line = String::new();
        loop {
            self.reader.read_line(&mut line);
            println!("{}", line);
        }
    }
}
fn main() {
    let mut bot = Rusty::new("rustyrusty", "irc.rizon.net:6667");
    thread::sleep_ms(5000);
    bot.identify();
    bot.read_lines();
}
It's very important to read the documentation for the components we use when programming. For example, the docs for BufWriter states (emphasis mine):
Wraps a writer and buffers its output.
It can be excessively inefficient to work directly with something that implements Write. For example, every call to write on TcpStream results in a system call. A BufWriter keeps an in-memory buffer of data and writes it to an underlying writer in large, infrequent batches.
The buffer will be written out when the writer is dropped.
Said another way, the entire purpose of a buffered reader or writer is that read or write requests don't have a one-to-one mapping to the underlying stream.
That means when you call write, you are only writing to the buffer. You also need to call flush if you need to ensure that the bytes are written to the underlying stream.
Additionally, you should:
Handle the errors that can arise from read, write, and flush.
Re-familiarize yourself with what each function does. For example, read and write don't guarantee that they read or write as much data as you ask them to. They may perform a partial read or write, and it's up to you to handle that. That's why there are helper methods like read_to_end or write_all.
Clear the String that you are reading into; otherwise the output just repeats every time the loop cycles (see the read-loop sketch at the end of this answer).
Use write! instead of building up a string that is immediately thrown away.
fn write_line(&mut self, string: &str) {
    write!(self.writer, "{}\r\n", string).unwrap();
    self.writer.flush().unwrap();
}
With these changes, I was able to get a PING message from the server.
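For the reading side, a sketch (not from the original answer) of a read_lines that clears the buffer and handles errors, reusing the Rusty struct above, could look like this:
fn read_lines(&mut self) {
    let mut line = String::new();
    loop {
        line.clear(); // otherwise every iteration reprints the previous contents
        match self.reader.read_line(&mut line) {
            Ok(0) => break, // 0 bytes read means the server closed the connection
            Ok(_) => print!("{}", line), // read_line keeps the trailing newline
            Err(e) => {
                eprintln!("read error: {}", e);
                break;
            }
        }
    }
}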

Resources