How to get the internal queueing time of requests sent to grpc? - grpc

I want to get the time spent by a request waiting in the internal queue that GRPC maintains before it's picked up by a worker thread for execution on the RPC code.
I took a look at the interceptor offered by grpc (say to add a timestamp when the request is received by the server) but after looking at the server implementation of GRPC it seems like the interceptor pipeline is executed after the request is polled from the completion queue.
Any suggestion to get the queueing time for each request before it's executed is appreciated. Or an explanation, if my understanding above is wrong, is too.

Related

How to mock "io.grpc.StatusRuntimeException: UNAVAILABLE: Network closed for unknown reason" when doing grpc call?

I'm implementing a retry logic for my grpc call, when it saw StatusRuntimeException it will do retry several times.
My question is how can I mock the call will throw StatusRuntimeException?
My thought is to set the alive time and alive timeout really small, like 5mills, is that works? Or is there any other good way to do that
NettyChannelBuilder.forAddress()
.keepAliveTime(5, TimeUnit.MILLISECONDS)
.keepAliveTimeout(5, TimeUnit.MILLISECONDS)
.keepAliveWithoutCalls(true)
grpc-java has retry feature out-of-box, you might try:
https://github.com/grpc/grpc-java/tree/master/examples/src/main/java/io/grpc/examples/retrying
gRPC A8 suggests never set keepalive time within 1min, you would see GOAWAY from server if so.
Suggests for clients to avoid configuring their keepalive much below one minute (see Server Enforcement section for additional details)
Actually, if you simply no not start a server, you see StatusRuntimeException: UNAVAILABLE: io exception

In async grpc server is there a way to measure the time a request has spent in the completion queue?

We have an async grpc c++ server and we'd like to get stats about how long a request has spent in the completion queue before it has been picked up by an application. Is there a way to get this stat in grpc?
No, gRPC C++ doesn't have metrics for that.

High response time vs queuing

Say I have a webserivce used internally by other webservices with an average response time of 1 minute.
What are the pros and cons of such a service with "synchronous" responses versus making the service return id of the request, process it in the background and make the clients poll for results?
Is there any cons with HTTP connections which stay active for more than one minute? Does the default keep alive of TCP matters here?
Depending on your application it may matter. Couple of things worth mentioning are !
HTTP protocol is sync
There is very wide misconception that HTTP is async. Http is synchronous protocol but your client could deal it async. E.g. when you call any service using http, your http client may schedule is on the background thread (async). However The http call will be waiting until either it's timeout or response is back , during all this time the http call chain is awaiting synchronously.
Sockets
Since HTTP uses socket and there is hard limit on sockets. Every HTTP connection (if created new every time) opens up new socket . if you have hundreds of requests at a time you can image how many http calls are scheduled synchronously and you may run of sockets. Not sure for other operation system but on windows even if you are done with request sockets they are not disposed straight away and stay for couple of mins.
Network Connectivity
Keeping http connection alive for long is not recommended. What if you loose network partially or completely ? your http request would timeout and you won't know the status at all.
Keeping all these things in mind it's better to schedule long running tasks on background process.
If you keep the user waiting while your long job is running on server, you are tying up a valuable HTTP connection while waiting.
Best practice from RestFul point of view is to reply an HTTP 202 (Accepted) and return a response with the link to poll.
If you want to hang the client while waiting, you should set a request timeout at the client end.
If you've some Firewalls in between, that might drop connections if they are inactive for some time.
Higher Response Throughput
Typically, you would want your OLTP (Web Server) to respond quickly as possible, Since your queuing the task on the background, your web server can handle more requests which results to higher response throughput and processing capabilities.
More Memory Friendly
Queuing long running task on background jobs via messaging queues, prevents abusive usage of web server memory. This is good because it will increase the Out of memory threshold of your application.
More Resilient to Server Crash
If you queue task on the background and something goes wrong, the job can be queued to a dead-letter queue which helps you to ultimately fix problems and re-process the request that caused your unhandled exceptions.

Multiple consumer on single JMS queue

JMS Queue is having 2 consumers, synchronous and asynchronous Java application process waiting for the response.
1)Synchronous application send request and will be waiting for the response for 60 seconds based on the JMS correlation ID.
2)Asynchronous thread will be constantly listening on the same queue.
In this scenario, when the response is received on the queue within 60 second I would expect load is distributed on both synchronous and asynchronous application. However, for some unknown reason almost all the response messages are consumed by synchronous process. And,only in some cases the messages are picked up asynchronous process.
Are there any factors that could cause only synchronous application to pick almost all the messages?
There is usually no guarantee that the load will be distributed evenly, especially if its synchronous versus async. consumer. The synchronous consumer will have to poll, wait, poll, wait while the async. consumer is probably waiting on the socket in a separate thread until a message arrives and then call your callback. So the async. consumer will most always be there first.
Any chance you can change to Topics and discard messages you don't wont ? Or change your sync. consumer to be async ? Another alternative would be to build a small 'asnyc' gateway in front of your synchronous consumer: a little application that makes an async consumption and then copies each message received to a second queue where the sync. consumer picks it up. Depending on your JMS provider it might support this type of 'JMS bridge' already - what are you using ?

Async Netty HttpServer and HttpClient

I have been exploring Netty for the past days, as I am writing a quick and tight HTTP server that should receive lots of requests, and Netty's HTTP server implementation is quite simple and does the job.
My next step is as part of the request handling, I need to launch an HTTP request to an external web server. My intuition is to implement an asynchronous client that can send a lot of requests simultaneously, but I am a little confused as what is the right approach. My understanding is that Netty server uses a worker thread for each incoming message, therefore that worker thread would not be freed to accept new messages until my handler finishes its work.
Here is the punch: even if I have an asynchronous HTTP client in hand, it won't matter if I need to wait for each response and process it back with my server handler - the same worker thread would remain blocking all this time. The alternative is to use the async nature of the client, returning a future object quickly to release the thread and place a listener (meaning I have to return 200 or 202 status to the client), and check my future object to indicate when the response is received and I can push it to the client.
Does this make sense? Am I way off with my assumptions? What is a good practice to implement such kind of Netty acceptor server + external client with high concurrency?
Thanks,
Assuming you're asking about Netty 4.
Netty configured with a ServerBootstrap will have a fixed number of worker threads that it uses to accept requests and execute the channel, like so:
Two threads accepting / processing requests
bootstrap.group(NioEventLoopGroup(2))
One thread accepting requests, two threads processing.
bootstrap.group(NioEventLoopGroup(1), NioEventLoopGroup(1))
In your case, you have a channel includes a bunch of Http Codec decoding/encoding stuff and your own handler which itself makes an outgoing Http request. You're right that you don't want to block the server from accepting incoming requests, or decoding the incoming Http message, and there are two things you can do to mitigate that, you've struck on the first already.
Firstly, you want to use an Async netty client to make the outgoing requests, have a listener write the response to the original requests channel when the outgoing request returns. This means you don't block and wait, meaning you can handle many more concurrent outgoing requests than the number of threads available to process those requests.
Secondly, you can have your custom handler run in its own EventExecutorGroup, which means it runs in a separate threadpool from the acceptor / http codec channel handlers, like so:
// Two separate threads to execute your outgoing requests..
EventExecutorGroup separateExecutorGroup new DefaultEventExecutorGroup(2);
bootstrap.childHandler(new ChannelInitializer<SocketChannel>() {
#Override
public void initChannel(SocketChannel ch) {
ChannelPipeline pipeline = ch.pipeline();
.... http codec stuff ....
pipeline.addLast(separateExecutorGroup, customHandler);
}
};
Meaning your outgoing requests don't hog the threads that would be used for accepting / processing incoming ones.

Resources