gRPC bi-directional stream processing with multi-threaded clients - grpc

A gRPC newbie question here.
We have a source system that exposes a bi directional gRPC stream. In order to scale our application, we wanted to process the stream data in parallel. Is it possible to have concurrent / multiple gRPC clients consuming from the stream without any conflicts in data processing / during acknowledgement process etc?
Thanks

Is this in the context of a single streaming call? In that case the answer is no. You have a single gRPC client receiving one response stream and it can use worker threads to hand off messages from the stream.
If you are thinking of multiple gRPC clients in an application talking to the same server (I don't see any advantage of doing that) each one will make a separate call and will receive a separate response stream.

Related

GRPC - Separate proto serialization cpu stage from IO stage

We have a service which sends grpc requests under heavy load.
Normally, grpc send is considered to be an IO stage, which means it should use an IO thread pool.
We've noticed that the serialization time of the proto objects consumes rather a-lot of cpu, which is not suitable to use with an IO thread pool.
Is there some way to separate the serialization step from the IO step, e.g can we somehow do something like:
val bytes = myProto.toBytes()
myService.send(bytes)
We have no control over the .proto files and the server, but we can generate a client and a stub. We currently use scalapb but java might do fine as well.
Another option is to somehow translate the client calls to http2 calls, anyone ever tried that?
Any other suggestions on how to tackle this issue?
Thanks in advance.
gRPC itself separates protobuf encoding from transmission. Protobuf encoding and decoding is performed on application threads: the threads calling gRPC and the threads delivering callbacks (e.g., channelBuilder.executor()). Then the I/O is processed asynchronously using separate network threads (Netty event loops). Just interact with gRPC on threads that can handle the protobuf CPU cost.

When does AkkaHttp backpressure kick in?

.. when the http response entity is not consumed, or the client tcp buffer becomes full, or when the rate of client taking from its tcp buffer is lower then the rate of server pushing data to it?
I am looking for a way for to achieve the following:
Let's assume that there is a backpressure-able source of data on the server, such as an Apache Kafka topic.
If I consume this source from a remote location it may be possible that the rate at which that remote location can consume is lower - this is solved if Kafka client or consumer is used.
However let's assume that the client is a browser and that exposing direct Kafka protocol / connectivity is not a possibility.
Further, let's assume that there is a possibility of getting all the value even if jumping over some messages.
For instance in case of compacted topics, getting only the latest values for each key is enough for a client, no need to go through intermediate values.
This would be equivalent to Flowable.onBackpressureLatest() or AkkaStreams.aggregateOnBackpressure or onBackpressureAggregate.
Would it be a way to expose the topic over HTTP REST (e.g. Server Side Events / chunked transfer-encoding) or over web-sockets, that would achieve this effect of skipping over intermediate values for each key?
Please advise, thanks
Akka http supports back pressure based on TCP protocol very well and you can read about using it in combination with streaming here
Kafka consumption and exposure via http with back pressure can be easily achieved in combination of akka-http, akka-stream and alpakka-kafka.
Kafka consumers need to do polling and alpakka covers back pressure with reduction of polling requests.
I don't see the necessity of skipping over the messages when back pressure is fully supported. Kafka will keep track of the offset consumed by a consumer group (the one you pick for your service or http connection) and this will guarantee eventual consumption of all messages. Of course, if you produce messages way faster in a topic, the consumer will never catch up. Let me know if this is your case.
As a final note, you may check out Confluent REST Proxy API, which allows you to read Kafka messages in a restful manner.

TCP sessions with gRPC

Sorry if this question is naive. (gRPC novice here). But, I would like to understand this.
Let's say I have a gRPC service definition like this:
service ABC {
// Update one or more entities.
rpc Write(WriteRequest) returns (WriteResponse) {
}
// Read one or more entities.
rpc Read(ReadRequest) returns (stream ReadResponse)
{
}
// Represents the bidirectional stream
rpc StreamChannel(stream StreamMessageRequest)
returns (stream StreamMessageResponse) {
}
}
Our potential use case would be the server built using C++ and the client using Java. (Not sure is that matters).
I would like to understand how the TCP sessions are managed. The Stream Channel would be used for constant telemetry data streaming between the client and the server. (Constant data transfer, but the bulk from the server to the client).
Does the StreamChannel have a separate TCP session, while for every Write and Read a new session would be established and terminated after the call is done?
Or is there a single TCP session over which all the communication happens?
Again, please excuse me if this is very naive.
Thanks for your time.
Since gRPC uses HTTP/2, it can multiplex multiple RPCs on the same TCP connection. The Channel abstraction in gRPC lets gRPC make connection decisions without the application needing to be strongly-aware.
By default, gRPC uses the "pick first" load balancing policy, which will use a single connection to the backend. All new RPCs would go over that connection.
Connections may die (due to I/O failures) or need to be shut down (various reasons), so gRPC handles reconnecting automatically. Because it can take a very long time to shut down a connection (as gRPC waits for RPCs on that connection to complete), it's still possible that gRPC would have 2 or more connections to the same backend.
So for your case, all the RPCs would initially exist on the same connection. As time goes on new RPCs may use a newer connection, and an old, long-lived StreamChannel RPC may keep the initial TCP connection alive. If that long-lived StreamChannel is closed and re-created by the application, then it could share the newer connection again.
I also posted the same question in grpc.io, and the response I got was inline with the marked answer.
Summary:
If there is no load-balancing, all the RPCs use the same session. The session remains connected across requests. The session establishment happens the first time a call is attempted on the channel.

How to cluster multiple HTTP Clients?

I have a use case which involves pulling/streaming data from numerous HTTP endpoints, in excess of over 100.
I have a standalone java app that can manage up to 30+ client requests using ning async client http library but was looking for some ideas on how I could scale this up to handle much more.
The use case is to pull the data from the end points and push them into a kafka queue (similar to jms queue) for processing by a storm topology. The bit I'm stuck on is how to best efficiently get the http end point data into the Kafka queues in the first place.
thanks

implementing a background process responding to the client in an atmosphere+netty/jetty application

We have a requirement to to support 10k+ users, where every user initiate a request and waits for a response from the server (the response can take as long as 20-30 seconds to arrive). it is only one request from the client, and after a long processing by the server, a response will be transmitted and then the connection will disconnect.
in the background, the server will do some DB search and wait for other background processes to notify on completion before responding to the client.
after doing some research i figured out we will need to use something like the atmosphere framework to support websockets/sse event/long polling along with an asynchronous server like netty (=> nettosphere) or jetty.
As for my experience - mostly Java EE world and Tomcat server.
my questions are:
what will be easier to implement in regard to my experience and our requirement: atmosphere + netty or atmoshphere+jetty? which one can scale better, has an easier learning curve and easier to implement other java technologies?
how do u implement in atmosphere a response that is sent only to the originating client and not broadcast to the rest of the clients? (all the examples i found are broadcast).
how can i implement in netty (or jetty) when using the atmosphere framework our response? i.e., the client send a request, after it is received in the server some background processes are run, and when they finish i need to locate the connection and transmit the response. is that achievable?
Some thoughts:
At 10k+ users, with 20-30 second response latency, you likely hit file descriptor limits if using just 1 network interface. Consider a solution that uses multiple network interfaces.
Your description of your request/response can be handled entirely with standard Servlet 3.0, standard HTTP/1.1, Async request handling, and large timeouts.
If your clients are web browsers, and you don't start sending a response from the server until the 20-30 second window, you might hit browser idle timeouts.
Atmosphere and Cometd do the same things, supporting long duration connections, with connection technique fallbacks, and with logical channel APIs.
I believe the AKKA framework will handle this sort of need. I am looking at using it to handle scaling issues possibly with a RabbitMQ to help off load work to potentially other servers that may be added later to scale as needed.

Resources