I am working on a simple Lambda function and I was wondering if I could pass a client (DynamoDB in this case) into the handler, so we do not reconnect for every request.
The macro is defined here:
https://docs.rs/lambda_http/0.1.1/lambda_http/macro.lambda.html
My function so far:
fn main() -> Result<(), Box<dyn Error>> {
    simple_logger::init_with_level(log::Level::Debug)?;
    info!("Starting up...");
    let dynamodb_client = DynamoDbClient::new(Region::EuCentral1);
    lambda!(router);
    Ok(())
}
fn router(req: Request, ctx: Context) -> Result<impl IntoResponse, HandlerError> {
    let h_req = HReq {
        http_path: req.uri().path(),
        http_method: req.method(),
    };
    match h_req {
        HReq {
            http_path: "/login",
            http_method: &Method::POST,
        } => user_login(req, ctx),
        _ => {
            error!(
                "Unsupported HTTP method or path: {}, {}",
                h_req.http_path, h_req.http_method
            );
            let mut resp = Response::default();
            *resp.status_mut() = StatusCode::METHOD_NOT_ALLOWED;
            Ok(resp)
        }
    }
}
Is it possible to extend this macro with a second parameter so I can pass the client all the way down to the functions which actually talk to the database?
DynamoDB is a web service; each request to it is treated as a distinct API call.
There is no way to keep a client connection alive in the same way you would with a regular database connection (e.g. MySQL).
My Rust knowledge is a little lacking, so I don't know whether HTTP keep-alive is enabled by default in the DynamoDbClient, but making sure keep-alive is set will help performance.
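For what it's worth, here is a minimal sketch of constructing the client explicitly, assuming the rusoto_core/rusoto_dynamodb crates of that era (the constructor names are from those crates; the keep-alive remark reflects hyper's default connection pooling, which rusoto's dispatcher wraps):

use rusoto_core::{HttpClient, Region};
use rusoto_core::credential::DefaultCredentialsProvider;
use rusoto_dynamodb::DynamoDbClient;

fn build_client() -> DynamoDbClient {
    // rusoto's default dispatcher wraps hyper, whose connection pool
    // keeps TCP connections alive between requests by default, so
    // reusing one client avoids a fresh TLS handshake per call.
    let dispatcher = HttpClient::new().expect("failed to create HTTP client");
    let credentials = DefaultCredentialsProvider::new().expect("failed to create credentials provider");
    DynamoDbClient::new_with(dispatcher, credentials, Region::EuCentral1)
}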
After considering all the options I decided to implement this with lazy_static.
#[macro_use]
extern crate lazy_static;
lazy_static! {
    static ref DYNAMODB_CLIENT: DynamoDbClient = DynamoDbClient::new(Region::EuCentral1);
}
This is instantiated lazily at run time, on first access, and can be used anywhere in the module without any problems.
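A minimal usage sketch, assuming a pre-async rusoto version whose RusotoFuture exposes a blocking .sync() (the list_tables call is just for illustration):

use rusoto_dynamodb::{DynamoDb, ListTablesInput};

fn log_tables() {
    // The first access to DYNAMODB_CLIENT creates it; every later call,
    // across all requests handled by this process, reuses the same
    // client and its underlying connection pool.
    match DYNAMODB_CLIENT.list_tables(ListTablesInput::default()).sync() {
        Ok(output) => info!("tables: {:?}", output.table_names),
        Err(e) => error!("list_tables failed: {}", e),
    }
}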
Related
I am finding it difficult to understand why and when I need to explicitly do something with the Context and/or its Waker passed to the poll method on an object for which I am implementing Future. I have been reading the documentation from Tokio and the Async Book, but I feel the examples/methods are too abstract to be applied to real problems.
For example, I would have thought the following MRE would deadlock, since the future generated by new_inner_task would not know when a message has been passed on the MPSC channel. However, this example seems to work fine. Why is this the case?
use std::{future::Future, pin::Pin, task::{Context, Poll}, time::Duration};
use futures::{FutureExt, StreamExt}; // 0.3
use tokio::sync::mpsc; // 1.2
use tokio_stream::wrappers::UnboundedReceiverStream; // 0.1

async fn new_inner_task(rx: mpsc::UnboundedReceiver<()>) {
    let mut requests = UnboundedReceiverStream::new(rx);
    while let Some(_) = requests.next().await {
        eprintln!("received request");
    }
}

pub struct ActiveObject(Pin<Box<dyn Future<Output = ()> + Send>>);

impl ActiveObject {
    pub fn new() -> (Self, mpsc::UnboundedSender<()>) {
        let (tx, rx) = mpsc::unbounded_channel();
        (Self(new_inner_task(rx).boxed()), tx)
    }
}

impl Future for ActiveObject {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        eprintln!("[polled]");
        self.get_mut().0.as_mut().poll(cx)
    }
}

async fn delayed_send(delay: u64, sender: mpsc::UnboundedSender<()>) {
    tokio::time::sleep(Duration::from_millis(delay)).await;
    sender.send(()).unwrap();
    eprintln!("sent request");
}

#[tokio::main]
async fn main() {
    let (obj, tx) = ActiveObject::new();
    let ds = delayed_send(500, tx.clone());
    let ds2 = delayed_send(1000, tx);
    tokio::join!(obj, ds, ds2);
}
The output that I get from running this example locally is:
[polled]
[polled]
sent request
[polled]
received request
[polled]
sent request
[polled]
received request
So, although I haven't done anything with Context or Waker, ActiveObject appears to get polled at a reasonable rate, that is, more frequently than required, but not busy-waiting. What is causing ActiveObject to be woken up/polled at this rate?
You are passing the same Context (and thus Waker) to the poll() method of the Future returned by new_inner_task, which passes it down the chain to the poll() of the Future returned by UnboundedReceiverStream::next(). The implementation of that arranges to call wake() on this Waker at the appropriate time (when new elements appear in the channel). When that is done, Tokio polls the top-level future associated with this Waker - the join!() of the three futures.
If you omitted the line that polls the inner task and just returned Poll::Pending instead, you would get the expected situation, where your Future would be polled once and then "hang" forever, as nothing would wake it again.
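Concretely, that variant would look like this (only the poll body changes from the example above):

impl Future for ActiveObject {
    type Output = ();

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        eprintln!("[polled]");
        // The Waker in `_cx` is never handed to the inner future and
        // wake() is never called, so after this first poll the executor
        // has no reason to poll this task again: it hangs forever.
        Poll::Pending
    }
}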
Background
I'm developing a database manager in Rust that talks to the database over HTTP. The database has an authentication-expiration strategy, and I need to keep the connection in my database manager alive by re-logging-in after the expiration time.
Pseudocode
My database manager is Database, and it has these methods:
struct Database {
    access_token: Option<String>,
}

impl Database {
    async fn login(&mut self, auth: Option<String>) {
        if let Some(auth) = auth {
            // If the user provides auth, log in with it
            let response = do_log_in(auth).await;
            // and store the access_token from the response
            self.access_token = Some(response.access_token);
        } else {
            // If the user doesn't provide auth, log in with the stored access_token
            do_log_in(self.access_token.clone()).await;
        }
        self.restart_timer();
    }

    // restart the login timer
    fn restart_timer(&mut self) {
        self.cancel_timer();
        self.do_start_timer();
    }
}
My implementation
I use tokio to implement the restart_timer thing:
First, I add a field timer: Option<tokio::task::JoinHandle<()>> to Database.
Then, when do_start_timer() is called, use tokio::spawn to spawn a new task and assign the returning join handle to self.timer.
Inside the tokio::spawn closure, I await the tokio::time::delay_for, then await the login.
To cancel the timer, I just assign None to self.timer, making the task detached.
Problem
In the detailed code of the above implementation, there is a tricky part:
self.timer = Some(tokio::spawn(async move {
    tokio::time::delay_for(/* ... */).await;
    self.login().await;
}));
The above snippet is inside do_start_timer, but it doesn't compile:
expected `&mut Database`
   found `&mut Database`
note: but the lifetime must be valid for the `'static` lifetime
I don't know what to do.
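Not the asker's final code, but a sketch of one common way around this error, assuming tokio 0.2 (which the delay_for call suggests): tokio::spawn requires a 'static future, so the task cannot capture &mut self; handing it an owned, shared handle satisfies the bound.

use std::sync::Arc;
use std::time::Duration;
use tokio::sync::Mutex;

// Hypothetical restructuring: the spawned future owns an
// Arc<Mutex<Database>> rather than borrowing `self`, so it is 'static.
fn do_start_timer(db: Arc<Mutex<Database>>) -> tokio::task::JoinHandle<()> {
    tokio::spawn(async move {
        tokio::time::delay_for(Duration::from_secs(60)).await;
        // Re-login with the stored access_token once the timer fires.
        db.lock().await.login(None).await;
    })
}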
I have a function which may or may not be called as an asynchronous goroutine.
func APICall(request *HTTPRequest) *HTTPResponse
*HTTPRequest is a pointer to a struct which contains various pieces of data required in order to build a request:
type HTTPRequest struct {
    // Represents a request to the twitter API
    method       string
    baseurl      string
    urlParams    map[string]string
    bodyParams   map[string]string
    authParams   map[string]string
    responseChan chan *HTTPResponse
}
If called as a goroutine, i.e. a channel is passed in, we build the request and write the response as a *HTTPResponse into the provided channel. What is the most graceful/idiomatic way to accept a call to the function without a channel (i.e. not async)?
At the moment, we do something like this within the body of APICall to deal with both kinds of function call:
if request.responseChan != nil {
    // If a response channel has been specified, write to that channel
    request.responseChan <- &HTTPResponse{body, nil}
    return nil // Not returning a struct
} else {
    return &HTTPResponse{body, nil} // Return a pointer to a new struct representing the response
}
Are we along the right lines?
The idiomatic approach is to provide a synchronous API:
type HTTPRequest struct {
    // Represents a request to the twitter API
    method     string
    baseurl    string
    urlParams  map[string]string
    bodyParams map[string]string
    authParams map[string]string
}

func APICall(request *HTTPRequest) *HTTPResponse {
    ...
    return &HTTPResponse{body, nil}
}
The caller can easily create a goroutine if it needs to run the call concurrently. For example:
r := make(chan *HTTPResponse)
go func() {
    r <- APICall(req)
}()
// ... do some other work
resp := <-r
Synchronous APIs are idiomatic for a couple of reasons:
Synchronous APIs are easier to use and understand.
Synchronous APIs don't make incorrect assumptions about how the application is managing concurrency. For example, the application may want to use a wait group to wait for completion instead of receiving on a channel as assumed by the API.
I want to do message broadcasting: when one of the clients sends a message, the server writes it to every socket. My main problem is that I can't figure out how to send the Vec to the threads. I can't use a Mutex because that will lock the access of other threads to the Vec for reading, and I can't clone and send because TcpStream can't be cloned and sent. Here's my attempt so far:
use std::net::{TcpStream, TcpListener};
use std::io::prelude::*;
use std::sync::{Arc, Mutex};
use std::thread;
use std::sync::mpsc::{channel, Receiver};
use std::cell::RefCell;

type StreamSet = Arc<RefCell<Vec<TcpStream>>>;
type StreamReceiver = Arc<Mutex<Receiver<StreamSet>>>;

fn main() {
    let listener = TcpListener::bind("0.0.0.0:8000").unwrap();
    let mut connection_set: StreamSet = Arc::new(RefCell::new(vec![]));
    let mut id = 0;
    let (tx, rx) = channel();
    let rx = Arc::new(Mutex::new(rx));
    for stream in listener.incoming() {
        let receiver = rx.clone();
        let stream = stream.unwrap();
        (*connection_set).borrow_mut().push(stream);
        println!("A connection established with client {}", id);
        thread::spawn(move || handle_connection(receiver, id));
        id += 1;
        tx.send(connection_set.clone()).unwrap();
    }
}

fn handle_connection(rx: StreamReceiver, id: usize) {
    let streams;
    {
        streams = *(rx.lock().unwrap().recv().unwrap()).borrow();
    }
    let mut connection = &streams[id];
    loop {
        let mut buffer = [0; 512];
        if let Err(_) = connection.read(&mut buffer) {
            break;
        };
        println!("Request: {}", String::from_utf8_lossy(&buffer[..]));
        if let Err(_) = connection.write(&buffer[..]) {
            break;
        };
        if let Err(_) = connection.flush() {
            break;
        };
    }
}
Another idea is to spawn a single "controller" thread and a thread for every socket. Each thread would own the socket and have a channel to send data back to the controller. The controller would own a Vec of channels to send to each thread. When a thread receives data, you send it to the controller which duplicates it and sends it back to each worker thread. You can wrap the data in an Arc to prevent unneeded duplication, and you should provide an ID to avoid echoing the data back to the original sender (if you need to).
This moves the ownership completely within a single thread, which should avoid the issues you are experiencing.
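A minimal sketch of that layout, under the stated assumptions (the Event enum, thread split, and 512-byte buffer are illustrative, not from the question):

use std::io::{Read, Write};
use std::net::TcpListener;
use std::sync::mpsc::{channel, Sender};
use std::sync::Arc;
use std::thread;

// What the controller can receive: a new client joining (carrying the
// sending half of that client's personal channel) or data from a client.
enum Event {
    Join(Sender<Arc<Vec<u8>>>),
    Message(usize, Arc<Vec<u8>>),
}

fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("0.0.0.0:8000")?;
    let (to_controller, events) = channel::<Event>();

    // Controller: owns the Vec of per-client senders and fans every
    // message out to all clients except its originator.
    thread::spawn(move || {
        let mut clients: Vec<Sender<Arc<Vec<u8>>>> = Vec::new();
        for event in events {
            match event {
                Event::Join(tx) => clients.push(tx),
                Event::Message(from, data) => {
                    for (id, tx) in clients.iter().enumerate() {
                        if id != from {
                            // Cloning the Arc shares the buffer without copying it.
                            let _ = tx.send(Arc::clone(&data));
                        }
                    }
                }
            }
        }
    });

    for (id, stream) in listener.incoming().enumerate() {
        let stream = stream?;
        let mut read_half = stream.try_clone()?;
        let (tx, rx) = channel::<Arc<Vec<u8>>>();
        let to_controller = to_controller.clone();
        to_controller.send(Event::Join(tx)).unwrap();

        // Reader thread: owns one half of the socket, forwards to the controller.
        thread::spawn(move || {
            let mut buffer = [0; 512];
            while let Ok(n) = read_half.read(&mut buffer) {
                if n == 0 {
                    break;
                }
                let _ = to_controller.send(Event::Message(id, Arc::new(buffer[..n].to_vec())));
            }
        });

        // Writer thread: owns the other half, drains this client's channel.
        let mut write_half = stream;
        thread::spawn(move || {
            for data in rx {
                if write_half.write_all(&data).is_err() {
                    break;
                }
            }
        });
    }
    Ok(())
}

Each connection gets two threads here because a blocking read() and a channel recv() cannot share one thread; try_clone() splits the stream between them.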
You may also wish to look into Tokio, which should allow you to do something similar but without the need to spawn threads in a 1-1 mapping.
I can't use Mutex because that will lock the access of other threads
You can always try a different locking mechanism, such as an RwLock, which allows any number of concurrent readers.
because TcpStream can't be cloned
Sure it can: TcpStream::try_clone.
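Combining the two suggestions, a sketch (the type alias and function name are illustrative):

use std::io::Write;
use std::net::TcpStream;
use std::sync::{Arc, RwLock};

// RwLock instead of Mutex: many threads may hold the read lock at once;
// it only blocks while a writer is adding a new connection.
type SharedStreams = Arc<RwLock<Vec<TcpStream>>>;

fn broadcast(streams: &SharedStreams, message: &[u8]) -> std::io::Result<()> {
    for stream in streams.read().unwrap().iter() {
        // try_clone() gives each write its own handle to the same socket.
        let mut writer = stream.try_clone()?;
        writer.write_all(message)?;
    }
    Ok(())
}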
I want to build a web crawler. Currently I am reading a txt file with 12,000 URLs, and I want to use concurrency in this process, but the requests don't work.
typealias escHandler = (URLResponse?, Data?) -> Void

func getRequest(url: URL, _ handler: @escaping escHandler) {
    let session = URLSession(
        configuration: .default,
        delegate: nil,
        delegateQueue: nil)
    var request = URLRequest(url: url)
    request.httpMethod = "GET"
    let task = session.dataTask(with: request) { (data, response, error) in
        handler(response, data)
    }
    task.resume()
}
for sUrl in textFile.components(separatedBy: "\n") {
    let url = URL(string: sUrl)!
    getRequest(url: url) { response, data in
        print("RESPONSE REACHED")
    }
}
If you have your URLSessions working correctly, all you need to do is create a separate OperationQueue, create an Operation for each async task you want completed, add it to your operation queue, and set the queue's maxConcurrentOperationCount to control how many of your tasks can run at one time. Pseudocode:
let operationQueue = OperationQueue()
operationQueue.qualityOfService = .utility
// Limit how many operations run at once (tune to your needs)
operationQueue.maxConcurrentOperationCount = 4
let exOperation = BlockOperation(block: {
    // Your URLSessions go here.
})
exOperation.completionBlock = {
    // A completion block, if needed
}
// The queue starts operations itself; don't call start() on an
// operation that has been added to a queue.
operationQueue.addOperation(exOperation)
Using an OperationQueue subclass and an Operation subclass will give you additional utilities for dealing with multiple threads.