Does async programming make sense for AWS Lambda?

Async programming helps increase the number of requests a server can handle concurrently, but not necessarily the time to produce a response.
Since AWS cost depends only on the number of requests and on the execution time, does it make sense to use async programming?
In my case, I have a Java Lambda that needs to call a single HTTP server. No more than that.
I could use either a sync or an async HTTP library.
The sync code is more readable. I understand that async code is generally better under concurrency, but that probably doesn't apply to AWS Lambda.

What does async have to do with AWS Lambda? I think you are mixing things up here. You can invoke an AWS Lambda function both synchronously and asynchronously (for AWS resources as callers, there are some limitations on which invocation type they can use). AWS Lambda is really for stateless, short-running, concurrently scalable compute requests. How you invoke it depends on your need.
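To make the single-call case concrete, here is a minimal sketch using the standard java.net.http client (Java 11+); the class name and URL are placeholders. With only one outbound call and nothing to overlap it with, the async variant still waits for the response before the handler can return, and Lambda bills that wall-clock time either way.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SingleCallExample {
    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    // Synchronous: the handler thread blocks until the response arrives.
    static String callSync(String url) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).build();
        return CLIENT.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    // Asynchronous: returns immediately with a future, but with no other
    // work to do, the handler just joins on it and gains nothing.
    static String callAsync(String url) {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).build();
        return CLIENT.sendAsync(request, HttpResponse.BodyHandlers.ofString())
                .thenApply(HttpResponse::body)
                .join();
    }
}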


Asynchronous gRPC?

I am working on designing a new system that will take an array of hashes of car data, use this data to call a separate API that returns a Boolean, and then return to the original caller the car model and either true or false.
The system needs to be callable from other applications, so I am looking into gRPC to solve the problem. My question revolves around how to implement this solution in gRPC, and whether something like RabbitMQ would be better.
Would it make sense to build a bidirectional streaming gRPC solution where the client streams in the list of cars, and on the server's end I spawn off, say, a delayed job for each request? And then, when each delayed job finishes processing, I return that value to the original caller in the stream?
Is this an elegant solution or are there better ways to achieve my goal? Thanks.
The streaming system of gRPC is typically designed for asynchronous communication, so it should fit your use case neatly.
The general design philosophy in this case is to consider each individual message sent in the stream as independent. Basically, make sure your proto message contains all the information it needs to be parsed and processed by your application without needing any context from previous calls.
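As a rough sketch of that shape in grpc-java (the CarRequest/CarResult messages and the generated CarCheckGrpc base class are hypothetical, assuming a proto along the lines of rpc CheckCars(stream CarRequest) returns (stream CarResult)):

import io.grpc.stub.StreamObserver;

public class CarCheckService extends CarCheckGrpc.CarCheckImplBase {

    @Override
    public StreamObserver<CarRequest> checkCars(StreamObserver<CarResult> responses) {
        return new StreamObserver<CarRequest>() {
            @Override
            public void onNext(CarRequest request) {
                // Each message carries everything needed to process it,
                // so every car can be checked and answered independently.
                boolean ok = callDownstreamApi(request);
                responses.onNext(CarResult.newBuilder()
                        .setModel(request.getModel())
                        .setApproved(ok)
                        .build());
            }

            @Override
            public void onError(Throwable t) {
                // the client stream failed; nothing more to send
            }

            @Override
            public void onCompleted() {
                responses.onCompleted();
            }
        };
    }

    // Placeholder for the external API that returns the Boolean.
    private boolean callDownstreamApi(CarRequest request) {
        return true;
    }
}

This sketch answers inline in onNext; if each check is handed to a delayed job instead, the job's completion callback can emit the result later (with calls to the observer serialized), since the stream stays open until onCompleted.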

Design to support fast and slow clients

I have a situation where I host a high-RPS, highly available service that receives requests, aka commands. These commands have to be sent to N downstream clients, who actually execute them. Each downstream client is a separate microservice and has different constraints, like mode (sync, async), execution cadence, etc.
Should a slow downstream client build the logic to receive all requests and execute them in batches as it wants? Or should my service build the logic to talk to slow and fast clients by maintaining state for commands across downstream clients? Share your opinions.
Not enough info to give any prescriptive advice, but I'd start with dividing the tasks into async and sync first. Those are 2 completely different workloads that, most likely, would require different implementation stacks. I'll give you an idea of what you can start with in the world of AWS...
Not knowing what you mean by async, I'd default to a message-bus setup. In that case you can use something like Amazon Kinesis or Kafka for ingestion, kicking off a Lambda function or an EC2 instance. If the clients need to be notified of a finished job, they can either long-poll an SQS queue, subscribe to an SNS topic, or use MQTT over WebSockets for a long-running connection.
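For instance, the long-poll option could look roughly like this with the AWS SDK for Java v2 (the queue URL shown is a placeholder):

import software.amazon.awssdk.services.sqs.SqsClient;
import software.amazon.awssdk.services.sqs.model.Message;
import software.amazon.awssdk.services.sqs.model.ReceiveMessageRequest;

public class ResultPoller {
    public static void main(String[] args) {
        String queueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/job-results";
        try (SqsClient sqs = SqsClient.create()) {
            ReceiveMessageRequest request = ReceiveMessageRequest.builder()
                    .queueUrl(queueUrl)
                    .waitTimeSeconds(20)       // long poll: hold the connection open up to 20s
                    .maxNumberOfMessages(10)
                    .build();
            for (Message message : sqs.receiveMessage(request).messages()) {
                System.out.println("job finished: " + message.body());
            }
        }
    }
}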
The sync tasks are easier, since it's all about processing power. Just make sure you have your EC2 instances in an auto-scaling group behind an ALB or API Gateway to scale out, and in, appropriately.
This is a very simple answer since I don't have any details needed to be more precise, but this should give you an idea of where to get started.

How does Datastax implement its async API driver for Cassandra?

I'm trying to convince a coworker of the benefits of using Session#executeAsync.
However, since we are using the driver from Scala, it would be rather easy to wrap the sync call Session#execute in a Future, and that would be all it takes to turn it into an async call. That would already be an improvement, because it would let us avoid blocking the current thread (in our case, blocking the threads that handle HTTP requests in Play, with a huge impact on the number of requests that can be handled concurrently).
My argument is that if all the work needed to implement an async driver were wrapping calls in a Future, implementations like ReactiveMongo and the async API for Cassandra from Datastax would not exist.
So:
What are the benefits of using the async API?
How is the async API implemented in the Datastax driver, and what libraries and OS features does it rely on?
What kinds of problems had to be solved beyond the asynchronous network calls? (I mean, implementing the async driver must involve more than just using Java NIO.)
How is the async API implemented in the Datastax driver, and what libraries and OS features does it rely on?
The Datastax Java driver is based on the Netty networking framework. Netty itself is based on an event-driven model. For some operating systems, Netty also provides native transports to improve performance, e.g. epoll on Linux.
What are the benefits of using the async API?
I'm not a Scala expert, but as far as I know, Scala's Future is based on a thread model (execution contexts). It means you need to submit a request to another thread to execute it asynchronously. For IO tasks, you just need to make a request to another system and wait for its response. With a large number of requests, all the threads in your pool will be busy yet not doing anything useful. A thread is a fairly expensive resource, and having thousands of threads on the same physical machine can be a problem. Threads are good for parallel computation tasks, but not for IO tasks.
On the other hand, the Datastax Java driver is based on an event-driven model (Netty). It means each request is submitted to an event-loop queue. On each iteration of the event loop, Netty determines the state of the request and executes the handlers associated with it.
This approach avoids the memory overhead of threads and allows you to have thousands of IO requests in flight at the same time. But in this case, you should run slow or blocking request callbacks on another thread to avoid blocking the event loop.
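To make the contrast concrete, here is a minimal sketch against the 3.x Java driver API that the question refers to (the CQL string is a placeholder):

import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

public class AsyncContrast {

    // Wrapping the sync call: the query still blocks, just on a pool
    // thread; each in-flight query pins one thread for its whole duration.
    static CompletableFuture<ResultSet> wrapped(Session session, ExecutorService pool) {
        return CompletableFuture.supplyAsync(
                () -> session.execute("SELECT * FROM users WHERE id = 42"), pool);
    }

    // Native async: the request is written from Netty's event loop and no
    // thread waits on it, so thousands of queries can be in flight at once.
    static ResultSetFuture nativeAsync(Session session) {
        return session.executeAsync("SELECT * FROM users WHERE id = 42");
    }
}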

Multithreaded comet server library

I'm looking for a multithreaded comet server library. What I need is async IO (using epoll) running on a thread pool (4-8 threads). Tornado would be ideal if it were multithreaded.
Why multithreaded? I need to process and serve data that could come from any connected user. It could be synchronised between Tornado instances using a database, but even NoSQL would be too big a slowdown: almost every request would end up with a database write/update, which isn't a good idea even with async drivers. I can store everything in local volatile memory, so it can be very fast, but it must run in a single process to avoid inter-process communication. I don't need to scale (a single box is enough), but it MUST be fast. Some data will be stored in MongoDB, but mongo queries will be perhaps 5% of all requests.
And one important thing: semaphores (and other higher-level approaches) are not rocket science for me, so I'm not afraid of synchronisation.
Requirements:
async io
non-blocking
thousands of concurrent connections
FAST
basic HTTP features (GET, POST, cookies)
ability to process request asynchronously (do something, async call with callback (ex. database query), process callback, return data)
thread pool
C++/Java/Python
simple and lightweight
It would be nice to have an async mongo driver too
I've looked into Boost.Asio and it seems capable of doing what I need, but I want to focus on the application, not on writing HTTP request processing.
I've read about Tornado (seems ideal, but is single-threaded), Simple (not sure whether it can process a request asynchronously and return data after an async call), and Boost.Asio (very nice, but too low-level).
Well, after more digging I decided to change technology... I decided to create my own protocol on top of TCP, using Netty.
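For reference, the skeleton of such a server in Netty 4 looks roughly like this: a small boss group accepts connections while a fixed pool of worker event loops does the async IO (MyProtocolHandler and the port are placeholders for the custom protocol logic):

import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;

public class CometServer {
    public static void main(String[] args) throws InterruptedException {
        EventLoopGroup boss = new NioEventLoopGroup(1);    // accepts connections
        EventLoopGroup workers = new NioEventLoopGroup(4); // 4 IO threads, per the requirements
        try {
            ServerBootstrap bootstrap = new ServerBootstrap()
                    .group(boss, workers)
                    .channel(NioServerSocketChannel.class)
                    .childHandler(new ChannelInitializer<SocketChannel>() {
                        @Override
                        protected void initChannel(SocketChannel ch) {
                            ch.pipeline().addLast(new MyProtocolHandler());
                        }
                    });
            bootstrap.bind(8080).sync().channel().closeFuture().sync();
        } finally {
            boss.shutdownGracefully();
            workers.shutdownGracefully();
        }
    }
}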

How can a LuaSocket server handle several requests simultaneously?

The problem is the inability of my Lua server to accept multiple requests simultaneously.
I attempted to make each client message be processed in its own coroutine, but this seems to have failed.
while true do
    -- blocks until a client connects
    local client = server:accept()
    -- create a coroutine and run it right away; resume() does not return
    -- until GiveMessage() finishes
    coroutine.resume(coroutine.create(function()
        GiveMessage(client)
    end))
end
This code seems to not actually accept more than one client message at the same time. What is wrong with this method? Thank you for helping.
You will not be able to create true simultaneous handling with coroutines only — coroutines are for cooperative multitasking. Only one coroutine is executed at the same time.
The code that you've written is no different from calling GiveMessage() in a loop directly. You need to write a coroutine dispatcher and find a sensible reason to yield from GiveMessage() for that approach to work.
There are at least three solutions, depending on the specifics of your task:
Spawn several forks of your server and handle operations in coroutines in each fork. Control the coroutines either with Copas, with lua-ev, or with a home-grown dispatcher; nothing wrong with that. I recommend this way.
Use Lua states instead of coroutines: keep a pool of states, a pool of worker OS threads, and a queue of tasks, and execute each task in a free Lua state on a free worker thread. This requires some low-level coding and is messier.
Look for existing, more specialized solutions. There are several, but to advise on that I'd need to know better what kind of server you're writing.
Whatever you choose, avoid using a single Lua state from several threads at the same time. (It is possible, with the right amount of coding, but a bad idea.)
AFAIK coroutines don't play nicely with LuaSocket out of the box, but there is Copas, which you can use.
