TCP sessions with gRPC - tcp

Sorry if this question is naive. (gRPC novice here). But, I would like to understand this.
Let's say I have a gRPC service definition like this:
service ABC {
// Update one or more entities.
rpc Write(WriteRequest) returns (WriteResponse) {
}
// Read one or more entities.
rpc Read(ReadRequest) returns (stream ReadResponse)
{
}
// Represents the bidirectional stream
rpc StreamChannel(stream StreamMessageRequest)
returns (stream StreamMessageResponse) {
}
}
Our potential use case would be the server built using C++ and the client using Java. (Not sure is that matters).
I would like to understand how the TCP sessions are managed. The Stream Channel would be used for constant telemetry data streaming between the client and the server. (Constant data transfer, but the bulk from the server to the client).
Does the StreamChannel have a separate TCP session, while for every Write and Read a new session would be established and terminated after the call is done?
Or is there a single TCP session over which all the communication happens?
Again, please excuse me if this is very naive.
Thanks for your time.

Since gRPC uses HTTP/2, it can multiplex multiple RPCs on the same TCP connection. The Channel abstraction in gRPC lets gRPC make connection decisions without the application needing to be strongly-aware.
By default, gRPC uses the "pick first" load balancing policy, which will use a single connection to the backend. All new RPCs would go over that connection.
Connections may die (due to I/O failures) or need to be shut down (various reasons), so gRPC handles reconnecting automatically. Because it can take a very long time to shut down a connection (as gRPC waits for RPCs on that connection to complete), it's still possible that gRPC would have 2 or more connections to the same backend.
So for your case, all the RPCs would initially exist on the same connection. As time goes on new RPCs may use a newer connection, and an old, long-lived StreamChannel RPC may keep the initial TCP connection alive. If that long-lived StreamChannel is closed and re-created by the application, then it could share the newer connection again.

I also posted the same question in grpc.io, and the response I got was inline with the marked answer.
Summary:
If there is no load-balancing, all the RPCs use the same session. The session remains connected across requests. The session establishment happens the first time a call is attempted on the channel.

Related

gRPC call, channel, connection and HTTP/2 lifecycle

I read the gRPC Core concepts, architecture and lifecycle, but it doesn't go into the depth I like to see. There is the RPC call, gRPC channel, gRPC connection (not described in the article) and HTTP/2 connection (not described in the article).
I'm interested in knowing how these come together. For example, what happens to the channel when a RPC throws an exception? What happens to the gRPC connection when the channel is closed? When is the channel closed? When is the gRPC connection closed? Heart beats? What if the deadline is exceeded?
Can anyone answer these questions, or point me to resources that can?
The connection is not a gRPC concept. It is not part of the normal API and is an implementation detail. This should be seen as fairly normal, like HTTP libraries providing details about HTTP exchanges but not exposing connections.
It is best to view RPCs and connections as two mostly-separate systems.
The only real guarantee is that "connections are managed by channels," for varying definitions of "managed." You must shut down channels when no longer used if you want connections and other resources to be freed. Other details are either an implementation detail or an advanced API detail.
There is no "gRPC connection." A "gRPC connection" would just be a standard "HTTP/2 connection." Except that is even an implementation detail of the transport in many gRPC implementations. That allows having alternative "connection" types like "inprocess" or QUIC (via Cronet, where there is not a classic "connection" at all).
It is the channel's job to hold all the connections and reconnect as necessary. It delegates part of that responsibility to load balancers and the load balancing APIs do have a concept of connections (subchannels). By not exposing connections to the application, load balancers have a lot of freedom to operate.
I'll note that gRPC C-core based implementations share connections across channels.
What happens to the channel when a RPC throws an exception?
The channel and connection is not impacted by a failed RPC. Note that connection-level failures typically cause RPCs to fail. But things like retries could allow the RPC to be re-sent on a new connection.
What happens to the gRPC connection when the channel is closed?
The connections are closed, eventually. Channel shutdown isn't instantaneous because existing RPCs can continue, and connection shutdown isn't instantaneous as well. But once all RPCs complete the connections are closed. Although C-core won't shut down a connection until no channels are using it.
When is the channel closed?
Only when the user closes it.
When is the gRPC connection closed?
Lots of times. The client may close it when no longer needed. For example, let's say the server IP address changes and the client need to connect to 1.1.1.2 instead of 1.1.1.1. A new connection will be created and new RPCs will go to the new IP address. The client may also close connections it thinks are dead (e.g., via keepalive timeouts).
Servers have a lot of say of when to close connections. They may close them simply because they are old, or because they have been idle, or because the server is overloaded. But those are simply use-cases; the server can shut down a connection at-will.
What if the deadline is exceeded?
Deadline only applies to RPCs and doesn't impact the channel or a connection.
I was actually waiting for Eric to answer this as he is the expert in this!
I also have been playing with gRPC for a while now, I would like to add few things here for beginners. Anyone more experienced, please feel free to edit!
Channel is an abstraction over a long-lived connection! The client application will create a channel on start up. The channel can be reused/shared among multiple threads. It is thred safe. One channel is enough (for most of the use cases) for multiple threads and multiplexing concurrent requests. It is channel's responsibility to close / reconnect / keep the connection alive etc. We as the users do not have to worry about this in general. The client application can close the channel anytime it wants. Channel creation seems to be an expensive process. So we would not open/close for every RPC.
When you use gRPC loadbalancer/nameresolver for a domain name and the nameresolver resolves the domain with multiple ip addresses, a channel creates multiple subchannels where each subchannel is an abstraction over a connection to 1 server. So a channel can also represent multiple connections!!
Adding some points to note from Eric's comment.
adding the default load balancer still only creates (approximately)
one connection if the name resolver returns multiple addresses, as the
default is pick_first. But if you change the load balancer to
round_robin or virtually any other policy, then yes, there will be
multiple connections in a channel. Even if a name resolver returns one
address, the load balancer is free to create multiple connections
(e.g., for higher throughput), but that's not common today
An underlying connection can be closed any time for any reason. For ex: remote server is shutting down gracefully for a scheduled maintenance or a connection is idle for longer duration. In that case, the server could send GOAWAY signal to the client and client might disconnect and reconnect to some other server. or Server might crash due to OOM error. In this case channel will detect connection failure and will retry for new connection for some other server etc.
A channel can keep sending PING frame to the server to keep the connection alive. These are all configurable via channel builder.
With these information above, if we look at your questions,
what happens to the channel when a RPC throws an exception?
Nothing happens to the channel. The unhandled exception on the server might the fail the RPC on the client side. But channel is still usable for any RPC calls.
What happens to the gRPC connection when the channel is closed?
Channel is an abstraction over the connection. So it will be closed. (again there is no gRPC connection as such as Eric had mentioned. It would be a HTTP2 connection)
When is the channel closed?
Any time you want. But normally when the application shuts down.
When is the gRPC connection closed?
It is not our problem. Channel takes care of this.
Heart beats?
Channel sends PING frames periodicaly to keep the connection alive.
What if the deadline is exceeded?
It is something like timeout on the client side. When the deadline exceeds, the client might cancel the request. Once again nothing happens to the channel. (But it might trigger exception on the server side which I had noticed few times. (Received DATA frame for an unknown stream. https://github.com/grpc/grpc-java/issues/3548). It seems to have been fixed now).

Using connection keepalive in grpc for blockingStubs

Want to enable connection keep alive option for the gRPC API calls. Current code makes use of blocking stubs (synchronous calls using java client). I would like to know if the connection keep alive options (described in the link below) are expected to work with he blocking stubs?
https://cs.mcgill.ca/~mxia3/2019/02/23/Using-gRPC-in-Production/
Desired behavior - Blocking API calls should fail in reasonable time if there is any issues with server (say server crashes or killed for some reason)
In grpc-java, the stubs are a thin layer on top of a more advanced API (ClientCall/ServerCall). The stub type does not impact the Channel-level features. The keepalive channel option will work independent of the stub type.
Keepalive will kill pending RPCs on a connection to a remote server that crashes/hangs/etc. It does not kill RPCs when the server just takes a long time to respond to the RPC.

how does ASP.net "HttpResponse.IsClientConnected" work?

if HTTP is connection-less, how does ASP.net response property, HttpResponse.IsClientConnected detect client is connected or not?
HTTP is not "connection-less" - you still need a connection to receive data from the server; more correctly, HTTP is stateless. Applications running on-top of HTTP will most likely actually be stateful, but HTTP itself is not.
"Connectionless" can also refer to a system using UDP as the transport instead of TCP. HTTP primarily runs over TCP and pretty much every real webserver expects, and returns, TCP messages instead of UDP. You might see HTTP-like traffic in UDP-based protocols like UPnP, but because you want your webpage to be delivered reliably, TCP will always be used instead of UDP.
As for IsClientConnected, when you access that property it calls into the current HttpWorkerRequest which is an abstract class implemented by the current host environment.
IIS7+ implements it such that if it previously received a TCP disconnect message (that sets a field) the method would now return false.
The ISAPI implementation (IIS 6) instead calls into a function within IIS that informs the caller if the TCP client on the current request/response context is still connected, though presumably it works on the same basis: when the webserver receives a TCP timeout, disconnect or connection-reset message it sets a flag and lets execution continue instead of terminating the response-generator thread.
Here's the relevant source code:
HttpResponse.IsClientConnected: http://referencesource.microsoft.com/#System.Web/HttpResponse.cs,80335a4fb70ac25f
IIS7WorkerRequest.IsClientConnected: http://referencesource.microsoft.com/#System.Web/Hosting/IIS7WorkerRequest.cs,1aed87249b1e3ac9
ISAPIWorkerRequest.IsClientConnected: http://referencesource.microsoft.com/#System.Web/Hosting/ISAPIWorkerRequest.cs,f3e25666672e90e8
It all starts with an HTTP request. Inside it, you can, for example, spawn worker threads, that can outlive the request itself. Here is where IsClientConnected comes in handy, so that the worker thread knows that the client has already received the response and disconnected or not.

Should my IoT device poll with HTTP or listen with TCP?

I'm creating an IoT Device + Server system using .NET Micro Framework and ASP.NET WebAPI (Probably in Azure).
The IoT device needs to be able to frequently update the server with stats whilst also being able to receive occasional incoming commands from the server that would change its behaviour. In this sense, the device needs to act as both client and server itself.
My concern is getting the best balance between the security of the device and the load on the server. Furthermore, there must be a relatively low amount of latency between the server needing to issue a command and the device carrying it out; of the order of a few seconds.
As I see it my options are:
Upon connection to the internet, the device establishes a persistent TCP connection to the server which is then used for both polling and receiving commands.
The device listens on a port (e.g. HttpListener) for incoming commands whilst updating the server via frequent HTTP requests.
The device only ever polls the server with HTTP requests. The server uses the response to give the device commands.
The 2nd option seems to be the least secure as the device would have open incoming ports. The 1st option looks the most difficult to reliably implement as it would require low level socket programming. The 3rd option seems easy and secure but due to the latency requirements the device would need to poll every few seconds. This impacts the scalability of the system.
So at what frequency does HTTP polling create more overhead than just constantly keeping a TCP connection open? 5s? 3s? 1s? Or am I overstating the overhead of keeping a TCP connection open in ASP.NET? Or is there a completely different way that this can be implemented?
Thanks.
So at what frequency does HTTP polling create more overhead than just constantly keeping a TCP connection open? 5s? 3s? 1s?
There is nothing to do to keep a TCP connection open. The only thing you might need to do is to use TCP keep-alive (which have nothing to do with HTTP keep-alive!) in case you want to keep the connection idle (i.e no data to send) for a long time.
with HTTP your overhead already starts with the first request, since your data need to be encapsulated into a HTTP message. This overhead can be comparable small if the message is large or it can easily be much larger than the message itself for small messages. Also, HTTP server close the TCP connection after some idle time so you might need to re-establish the TCP connection for the next data exchange which is again overhead and latency.
HTTP has the advantage to pass through most firewalls and proxies, while plain TCP does not. You also get encryption kind of free with HTTPS, i.e. there are established standards for direct encrypted connection and for tunneling through a proxy.
WebSockets is something in between: you do a HTTP request and then upgrade HTTP to WebSocket. The initial overhead is thus as large as for HTTP but for the next messages the overhead is not that much higher than TCP. And you can do also WebSockets with HTTPS (i.e. wss:// instead of ws://). It passes through most simple firewalls and proxies, but more deeper inspection firewalls might still have trouble with it.
Setting up a TCP listener will be a problem if you have your IoT device behind some NAT router, i.e. the usual setup inside private or SoHo networks. To reach the device one would need to open a tunnel at the router from outside into the network, either by administrating the router by hand or with UPnP (which is often switched off for security reasons). So you would introduce too much problems for the average user.
Which means that the thing which the fewest problems for the customer is probably HTTP polling. But this is also the one with the highest overhead. Still mostly compatible are WebSockets which have less overhead and more problems but even less overhead can be reached with simple TCP to the server. TCP listener instead would cause too much trouble.
As for resources on the server side: each HTTP polling request might use new TCP connection but you can also reuse an existing one. In this case you could decide between more overhead and latency one the client side (new TCP connection for each request) which needs few resources on the server side and less overhead and latency on the client side which needs more resources on the server side (multiple HTTP requests per TCP connection). With WebSockets and plain TCP connection you always need more server side resources, unless your client will automatically re-establish the connection on loss of connectivity.
These days you should use a IOT Specific communication protocol over TLS 2.0 for secure light weight connections. For example AWS uses MQTT http://mqtt.org/ and Azure uses AMQP https://www.amqp.org/
The idea is you get a broker you can connect to securely then you use a messaging protocol with a topic to route messages to the proper devices. Also IBM has been using MQTT for a long time and routers now typically come with port 8883 open which is MQTT over TLS.
Good Luck!
Simply use SignalR to connect client and server. It provides you minimal latency without polling. The API is very simple to use.
Physically, this runs over WebSockets which are scalable to a large number of concurrent connections. If you don't have a need for more than 100k per Windows server this would not be a concern.

Can both ends of a gRPC connection accept method calls?

From the introduction on gRPC:
In gRPC a client application can directly call methods on a server application on a different machine as if it was a local object, making it easier for you to create distributed applications and services. As in many RPC systems, gRPC is based around the idea of defining a service, specifying the methods that can be called remotely with their parameters and return types. On the server side, the server implements this interface and runs a gRPC server to handle client calls. On the client side, the client has a stub that provides exactly the same methods as the server.
The above paragraph talks about a client and a server, with the former being the one who is invoking methods to the other. What am I wondering is: can the server-end of the connection invoke methods that have been registered on the client?
No, a server cannot invoke calls on the client. gRPC works with HTTP, and HTTP has not had such semantics in the past.
There has been discussion as to various ways to achieve such a feature, but I'm unaware of any work having started or general agreement on a design. gRPC does support bidirectional streaming, which may get you some of what you need. With bidirectional streaming the client can respond to messages from server, but the client still calls the server and only one type of message can be sent for that call.
The protocol does not implement it, but you may pretend this situation.
Define a server method that returns a stream of a ServerRequest message:
import "google/protobuf/any.proto";
service FullDuplex {
rpc WaitRequests (google.protobuf.Any) returns (stream ServerRequest);
}
message ServerRequest {
float someValue = 1;
float anotherAnother = 1;
}
ServerRequest may be an Oneof, so your may receive different types of server requests.
If you need that your client sends back a response for each request, you may create a stream from your client to the server, but you will need to implement a logic in your server side that triggers a timeout waiting for that response.
service FullDuplex {
rpc WaitRequests (stream ClientResponse) returns (stream ServerRequest);
}
What you can do is start a HTTP server in both processes and use clients at each end to initiate communication. There's a bit of boilerplate involved and you have to design a simple handshaking protocol (one end registers with the other, advertising its listen address) but it's not too much work.

Resources