keep track of the request queue in grpc server

The gRPC server queues requests and serves them based on the maxWorker configuration passed when the server starts up. How do I print the metric for the number of items in the queue? Essentially, I would like to keep track of the number of requests in the waiting state.

You can pass your own executor to serverBuilder.executor(). Note that you are then responsible for shutting down the executor service after the server has terminated.

The solution for grpc-python is similar to grpc-java: you can pass your own executor to the server constructor and monitor task submission yourself.
gRPC Python API: grpc.server(thread_pool, ...)
The executor class to extend: concurrent.futures.ThreadPoolExecutor
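As a sketch of that approach (the class name and `waiting()` method are my own invention, not part of the gRPC API), a `ThreadPoolExecutor` subclass can count tasks that have been submitted but not yet picked up by a worker, which is exactly the "requests in waiting state" metric:

```python
import threading
from concurrent import futures

class MonitoredExecutor(futures.ThreadPoolExecutor):
    """ThreadPoolExecutor that counts tasks submitted but not yet started."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._waiting = 0
        self._lock = threading.Lock()

    def submit(self, fn, *args, **kwargs):
        with self._lock:
            self._waiting += 1

        def tracked(*a, **kw):
            # A worker thread picked the task up, so it left the queue.
            with self._lock:
                self._waiting -= 1
            return fn(*a, **kw)

        return super().submit(tracked, *args, **kwargs)

    def waiting(self):
        """Number of submitted tasks still waiting for a worker."""
        with self._lock:
            return self._waiting

# Hypothetical wiring into a gRPC server (requires grpcio):
# server = grpc.server(MonitoredExecutor(max_workers=4))
```

You could then poll `waiting()` from a metrics thread and export it. Note this counts tasks from the moment gRPC submits them to the executor, not from when bytes arrive on the wire.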

Related

How to get the internal queueing time of requests sent to grpc?

I want to measure the time a request spends waiting in the internal queue that gRPC maintains before it is picked up by a worker thread to execute the RPC handler.
I took a look at the interceptors offered by gRPC (say, to add a timestamp when the request is received by the server), but after looking at the gRPC server implementation it seems the interceptor pipeline only runs after the request has been polled from the completion queue.
Any suggestion for measuring each request's queueing time before it executes is appreciated. So is an explanation, if my understanding above is wrong.
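One workaround, sketched here for grpc-python and not an official gRPC feature: if you pass your own executor to `grpc.server`, you can timestamp each task at submission and measure the delay until a worker starts it. The `TimingExecutor` name and `on_wait` callback are my own; also note this only covers time after gRPC hands the work to the thread pool, so time spent in the internal completion queue before that remains invisible, matching the limitation described above.

```python
import time
from concurrent import futures

class TimingExecutor(futures.ThreadPoolExecutor):
    """Records how long each task waited between submit() and execution."""

    def __init__(self, *args, on_wait=None, **kwargs):
        super().__init__(*args, **kwargs)
        # on_wait(seconds) is called once per task; hook your metrics here.
        self._on_wait = on_wait or (lambda seconds: None)

    def submit(self, fn, *args, **kwargs):
        enqueued = time.monotonic()

        def timed(*a, **kw):
            # Worker thread just picked the task up; report the queueing delay.
            self._on_wait(time.monotonic() - enqueued)
            return fn(*a, **kw)

        return super().submit(timed, *args, **kwargs)

# Hypothetical wiring (requires grpcio):
# server = grpc.server(TimingExecutor(max_workers=4, on_wait=histogram.observe))
```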

Using connection keepalive in grpc for blockingStubs

I want to enable the connection keepalive option for gRPC API calls. The current code uses blocking stubs (synchronous calls from the Java client). I would like to know whether the connection keepalive options (described in the link below) are expected to work with the blocking stubs.
https://cs.mcgill.ca/~mxia3/2019/02/23/Using-gRPC-in-Production/
Desired behavior: blocking API calls should fail in a reasonable time if there is any issue with the server (say, the server crashes or is killed for some reason).
In grpc-java, the stubs are a thin layer on top of a more advanced API (ClientCall/ServerCall). The stub type does not affect channel-level features, so the keepalive channel option works independently of the stub type.
Keepalive will kill pending RPCs on a connection to a remote server that crashes/hangs/etc. It does not kill RPCs when the server just takes a long time to respond to the RPC.

Does each gRPC request spawn a new thread on the gRPC server side?

I'm new to gRPC. I want to know whether the server starts a new thread to process each request when a gRPC client makes one.
There may be up to one Runnable enqueued to the Server's executor for application processing. Each request may generate more than one Runnable over time, but only one at a given time. The default executor is an unbounded cached thread pool, so worst-case each request gets its own thread initially, but later requests will generally reuse previous threads.
It is good practice for high-QPS services to specify a fixed-size executor, to avoid an excessive number of threads and reduce context-switch thrashing.
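That behavior can be observed with the stdlib executor alone, no gRPC involved: with a bounded pool, many tasks are served by a fixed set of reused threads rather than each task getting its own.

```python
import threading
from concurrent import futures

# A fixed-size pool, as recommended for high-QPS services.
pool = futures.ThreadPoolExecutor(max_workers=4)

seen = set()
lock = threading.Lock()

def handler(_):
    # Record which worker thread served this "request".
    with lock:
        seen.add(threading.current_thread().name)

list(pool.map(handler, range(100)))
pool.shutdown()
print(len(seen))  # at most 4, no matter how many requests were submitted
```

In grpc-python the equivalent would be passing `ThreadPoolExecutor(max_workers=N)` to `grpc.server`; in grpc-java, a fixed-size `ExecutorService` via `serverBuilder.executor()`.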

How to continue serving requests in Java Servlet container while load balancer removes the server from the pool based on a health check?

I want to continue serving requests while my load balancer removes the server from the pool based on a health check. If I know the server is being shut down, I can change the health check to unhealthy and give it time to be removed without dropping any requests.
My attempt: contextDestroyed in ServletContextListener
I'm delaying shutdown of my web app like this in a ServletContextListener:
override def contextDestroyed(sce: ServletContextEvent) {
  logger.info(s"contextDestroyed, suspend for $ShutdownDelayMS ms")
  Thread.sleep(ShutdownDelayMS)
  logger.info("Suspend finished, shutting down")
}
I see the logs and the delay in between, but as soon as contextDestroyed is called, the server will no longer serve requests, and in stdout I immediately see:
[INFO] Stopped ServerConnector#19877453{HTTP/1.1}{0.0.0.0:8080}
Am I misunderstanding how contextDestroyed is supposed to work? Jetty 9.2.9.v20150224, javax.servlet 3.0.1.
Docs for contextDestroyed say:
Receives notification that the ServletContext is about to be shut
down. All servlets and filters will have been destroyed before any
ServletContextListeners are notified of context destruction.
If I'm reading that correctly, it means the server stops serving requests immediately. Is there another way to achieve what I'm trying to do: introduce a delay during which requests continue to be served?

Job Manager Message Queue over HTTP

I have a worker that can currently only accept jobs/tasks over HTTP. That is, instead of running a daemon listening to TCP ports and just getting raw messages, it only listens to HTTP messages. (I know HTTP is just an additional layer over TCP). So jobs have to be constructed and wrapped around HTTP messages.
I want to use a job manager to queue tasks and send these tasks over HTTP to a pool of the workers as described above.
Are there any job managers that relay tasks over HTTP? I don't mean accepting tasks over HTTP (that doesn't matter), but they must be able to send tasks to workers over HTTP.
A job manager also provides other functionality, such as fault tolerance. Even though the HTTP connection is not persistent, is it possible to replicate over HTTP all the TCP signals a worker would return to the job manager?
One solution I was considering is a proxy in between that translates the TCP messages into HTTP messages, but this seemed difficult to do.
I believe a better architecture would be a mature job queue plus a wrapper for your worker's HTTP API:
Choose a job scheduler/queue that meets your requirements (Celery, or whatever you like).
Write a wrapper script that can submit jobs to your worker, report the worker's status, etc.
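A minimal sketch of such a wrapper in Python, with the worker URL and job format made up for illustration (in a real setup the loop body would live inside a Celery task rather than draining a bare queue): a dispatcher takes jobs off a queue and delivers each one to the worker over HTTP, retrying on failure for basic fault tolerance.

```python
import json
import queue
import urllib.request

def dispatch(jobs, worker_url, retries=3):
    """Drain the job queue, POSTing each job to the worker's HTTP API."""
    while True:
        job = jobs.get()
        if job is None:          # sentinel: stop dispatching
            return
        req = urllib.request.Request(
            worker_url,
            data=json.dumps(job).encode(),
            headers={"Content-Type": "application/json"},
        )
        last_err = None
        for _ in range(retries):
            try:
                with urllib.request.urlopen(req, timeout=10) as resp:
                    resp.read()  # worker acknowledged the job
                break
            except OSError as err:
                last_err = err   # connection refused, timeout, HTTP error...
        else:
            raise last_err       # all retries failed; surface the fault
```

The worker's HTTP response stands in for the TCP-level acknowledgement: a 2xx means the job was accepted, and any other outcome is retried, so the job manager never needs raw TCP signals from the worker.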
