gRPC vs WebFlux (Project Reactor) - grpc

Please ignore this question. I had wrong setup which caused poor performance of gRPC.
Is it OK to compare GRPC vs Project Reactor?
I just wanted to compare the performance of REST and GRPC. I do not see that GRPC is any faster than reactor. In fact it is worse.
GRPC set up:
api-server --> grpc-server
This grpc server responds 1000 "Hello" using server-side streaming for each request from api-server.
The api-server returns Flux<String>
WebFlux Setup:
api-server --> rest-server
This rest server responds 1000 "Hello" as Flux<String> for each request from api-server.
The api-server returns Flux<String>
I did the performance test for 10000 iterations with 10 concurrent users.
The WebFlux setup is much much faster than GRPC. I am kind of curious if it is gRPC really faster? If yes, in which cases?
Note: The request and response payload is very small in size in both cases.

have a look at the rsocket protocol https://rsocket.io for something even closer, reactive streams but across the network. grpc has more higher level components for the rpc stuff (stubs generation, etc...) but as a result one could argue rsocket gives you more control

Related

Is this an intention of bidirectional streaming RPC in gRPC?

I am maxing out my CPU on my gRPC client doing unary RPC. I wonder if it makes any sense to try replacing unary RPC with a bidirectional streaming RPC (or set of them?) that basically lasts throughout the life of the application? I can't tell if bidirectional streaming RPC is intended for standard 1 request/1 response communication like this. The motivation would be to avoid creating new TCP connections.
I can't tell if bidirectional streaming RPC is intended for standard 1 request/1 response communication
If you are going to use a request-reply message pattern, just use unary request-reply (RPC). It is designed for that pattern and semantics for e.g. retry is well known.
The motivation would be to avoid creating new TCP connections.
gRPC uses HTTP/2, so all unary RPC requests already use the same TCP connection - since HTTP/2 multiplexes all requests over the same TCP connection.
I am maxing out my CPU on my gRPC client doing unary RPC.
This sounds a bit rare. Can you change your communication pattern, e.g. stream multiple requests before waiting for a response? Alternative batch data and send bigger requests more seldom? Or can you elaborate more about your message pattern?
Like #Jonas suggested, using a bidirectional stream just for higher throughput would be a bad idea.
Google’s gRPC team advises against it, but nevertheless, few seem to argue that theoretically, streams should have lower overhead. But that does not seem to be true.
Probably because streams ensure the messages are delivered in the order they were sent, thus creating some kind of bottleneck when there are concurrent messages.
For lower concurrent requests, both have comparable latencies. However, for higher loads, unary calls are much more performant.
A detailed analysis here: https://nshnt.medium.com/using-grpc-streams-for-unary-calls-cd64a1638c8a.

How is non-blocking IO actually works from client's perspective?

So I came across the idea of blocking and non-blocking I/O. But what I understood from the concept and some of the sample implementations is that we implement code on the server side to achieve this nature of the code.
But now my question is, if (for example postman sending HTTP request to the server) the request has to wait for the server to respond, then what's the point of non-blocking I/O? (Please correct me if I am wrong) Or the whole concept is just for the increase of throughput of the server instead of actual async nature w.r.t. to client.
For example, in one of my project what I did was created a post request to create a request in the system for processing which will return the transaction ID, now using this transaction id, I can query the server to know the outcome.
I may sound too naive, but the concept has confused me a lot. I do not understand this concept clearly. Please help.
Thanks
the request has to wait for the server to respond, then what's the point of non-blocking I/O?
There's a confusion. Waiting for a response and (non)blocking i/o are very loosely related. You always have to wait for response. That's why youve made the request to begin with. But the question is: how?
Non-blocking HTTP: "Dear server, here's my request, please process it and send me a response, I'm going to do something else in the meantime, like calculating n-th digit of Pi (I'm a weirdo)".
Blocking HTTP: "Dear server, here's my request, please process it and send me a response, I'm going to patiently wait for it doing nothing".
Or the whole concept is just for the increase of throughput of the server instead of actual async nature w.r.t. to client.
The whole concept is to be able to do other things while waiting for i/o at the same time. And to do that while minimizing the usage of threads which don't scale well.
Asynchronous systems, i.e. systems without "I'm going to wait idly" part tend to perform better at the cost of complexity.
Side note: nonblocking i/o can be used both on the server side and client side. For example almost all JS engines in browsers are built on top of some asynchronous engine. JS is often single-threaded, meaning nonblocking i/o is necessary to achieve any concurrency.
But what I understood from the concept and some of the sample implementations is that we implement code on the server side to achieve this nature of the code.
You implement code in whereever you are doing the non-blocking UI. What a server does has no bearing on whether a client uses blocking or non-blocking UI, and what a client does has no bearing on whether a server uses blocking or non-blocking UI.
if (for example postman sending HTTP request to the server) the request has to wait for the server to respond, then what's the point of non-blocking I/O?
So that you're not wasting resources.
Let's consider first a simple console application that hits the web and then does something with the results. In this case there's very little to gain with non-blocking I/O as the application is just going to be sitting around waiting for something to do anyway.
Now let's consider a simple console application that hits 50 different web resources and collates the responses. Now non-blocking I/O is more useful, because with blocking I/O it would have to either get one resource after the other, or spin up 50 threads. With non-blocking I/O one, a small number of threads is all that is needed to hit 50 resources and respond promptly to each returning a response.
Now let's consider a GUI version of this application that wants to remain responsive to user input, while also running on low-power low-memory devices in which blocked threads are all the more expensive. The advantages of the above are increased.
Finally, consider a web application that is doing I/O both with the client and also as a client to a database, file system and maybe other web applications. It may have multiple requests at the same time, and blocking on either the I/O it does with the client or any of the I/O it does with db, file or other applications would cost a thread, which would put a scalability limit on how many requests it can handle simultaneously. Not blocking on I/O allows threads to be used for other requests while the I/O is pending.

Communication between REST Microservices: Latency

The problem I'm trying to solve is latency between Microservice communication on the backend. Scenario. Client makes a request to service A, which then calls service B that calls service C before returning a response to B which goes to A and back to the client.
Request: Client -> A -> B -> C
Response: C -> B -> A -> Client
The microservices expose a REST interface that is accessed using HTTP. Where each new HTTP connection between services to submit requests is an additional overhead. I'm looking for ways to reduce this overhead without bringing in another transport mechanism into the mix (i.e. stick to HTTP and REST as much as possible). Some answers suggest using Apache Thrift but I'd like to avoid that. Other possible solutions are using Messaging Queues which I'd also like to avoid. (To keep operational complexity down).
Has anyone experience in microservices communication using HTTP Connection pooling or HTTP/2? The system is deployed on AWS where service groups are fronted by a ELB.
HTTP/1.0 working mode was to open a connection for each request, and close the connection after each response.
Using HTTP/1.0 from remote clients and clients inside microservices (e.g. those in A that call B, and those in B that call C) should be avoided because the cost of opening a connection for each request can contribute for most of the latency.
HTTP/1.1 working mode is to open a connection and then leave it open until either peer explicitly requests to close it. This allow for the connection to be reused for multiple requests, and it's a big win because it reduces the latency, it uses less resources, and in general it is more efficient.
Fortunately nowadays both remote clients (e.g. browsers) and clients inside microservices support HTTP/1.1 well, or even HTTP/2.
Surely browsers have connection pooling, and any decent HTTP client that you may use inside your microservices does also have connection pooling.
Remote clients and microservices clients should be using at least HTTP/1.1 with connection pooling.
Regarding HTTP/2, while I am a big promoter of HTTP/2 for browser-to-server usage, for REST microservices calls inside data centers I would benchmark the parameters you are interested in for both HTTP/1.1 and HTTP/2, and then see how they fare. I expect HTTP/2 to be on par with HTTP/1.1 for most cases, if not slightly better.
The way I would do it using HTTP/2 (disclaimer, I'm a Jetty committer) would be to offload TLS from remote clients using HAProxy, and then use clear-text HTTP/2 between microservices A, B and C using Jetty's HttpClient with HTTP/2 transport.
I'm not sure AWS ELB already supports HTTP/2 at the time of this writing, but if it does not please be sure to drop a message to Amazon asking to support it (many others already did that). As I said, alternatively you can use HAProxy.
For communication between microservices, you can use HTTP/2 no matter what is the protocol used by remote clients.
By using Jetty's HttpClient, you can very easily switch between the HTTP/1.1 and the HTTP/2 transports, so this gives you the maximum of flexibility.
If latency is really an issue to you then you should probably not be using service calls between your components. Rather you should minimize the number of times control passes to an out-of-band resource and be making the calls in-process, which is much faster.
However, in most cases, the overheads incurred by the service "wrappers" (channel construction, serialisation, marshalling, etc), are negligible enough and still well within adequate latency tolerances for the business process being supported.
So you should ask yourself:
Is latency really an issue for you, in respect to the business process? In my experience only engineers care about latency. Your business customers do not.
If latency is an issue, then can the latency definitively be attributed to the cost of making the service calls? Could there be another reason the calls are taking a long time?
If it is the services, then you should look at consuming the service code as an assembly, rather than out-of-band.
For the benefit of others running into this problem, apart from using HTTP/2, SSL/TLS Offloading, Co-location, consider using Caching where you can. This not only improves performance, but reduces dependency on downstream services. Also, consider data formats that are perform well.
latency between Micro-service communications is an issue for low latency applicaitons however number of call can be minimize by hybrid between micro-services and monolith
Emerging C++ microservices framework is best for low latency applications
https://github.com/CppMicroServices/CppMicroServices

Would you see a significant speedup using a single websocket connection for all requests on a website?

Imagine I'm building an ordinary old website. Not a game, not a chat program, an ordinary website. Let's say it's a stack overflow clone.
The client side would simply make service calls to the server side. The server is essentially a dumb data store and never sends down HTML. The client handles all templating via javascript.
If I established a single websocket connection and did all requests through that, would I see a significant speedup over doing ajax requests?
The obvious advantage to using a single connection is that it only has to be established once. But how much time does that actually save? I know establishing a TCP connection can be costly, but in the grand scheme of things, does it matter?
I would not recommend websockets for webpages. HTTP 1.1 can reuse a TCP-connection for multiple requests, it's only HTTP 1.0 that had to use a new TCP connection for each request.
SPDY is probably a protocol that do what you are looking for. See SPDY: An experimental protocol for a faster web, but it's only supported by Chrome.
If you use websockets, the requests will not be cached.
One HTTP connection can only by used for one HTTP request at the same time. Say that a page requested a 100Kb document, nothing else will be send from the client to the server until that 100Kb document has been transferred. This is called head-of-line blocking. The client can establish an additional connection with the server, but there is also a limit on the amount of concurrent connections with the same server.
One of the primary reasons for developing SPDY and later HTTP/2 was solving this exact problem. However, support for SPDY and HTTP/2 is not yet as widespread as for WebSocket. WebSocket can get you there earlier because it supports multiple streams in full-duplex mode.
Once HTTP/2 is better supported it will be the preferred solution for this problem, but WebSocket will still be better for real-time web applications, where server needs to push data to the client.
Have a look at the N2O framework, it was created to address the problems I described above. In N2O WebSocket is used to send all assets associated with a page.
How much speed you could gain from using WebSocket instead of standard HTTP requests pretty much depends on your specific website: how often it requests data from the server, how big is a typical response, etc.

What's the behavioral difference between HTTP Keep-Alive and Websockets?

I've been working with websockets lately in detail. Created my own server and there's a public demo. I don't have such detailed experience or knowledge re: http. (Although since websocket requests are upgraded http requests, I have some.)
On my end, the server reports details of each hit. Among them are a bunch of http keep-alive requests. My server doesn't handle them because they're not websocket requests. But it got my curiosity up.
The whole big thing about websockets is that the connection stays alive. Then you can pass messages in both directions (simultaneously even). I've read that the Keep-Alive HTTP connection is a relatively new development (I don't know how many years in people time, just that it's only included in the latest standard - 1.1 - is that actually old now?)
I guess I can assume that there's a behavioral difference between the two or there would have been no reason for a websocket standard? What's the difference?
A Keep Alive HTTP header since HTTP 1.0, which is used to indicate a HTTP client would like to maintain a persistent connection with HTTP server. The main objects is to eliminate the needs for opening TCP connection for each HTTP request. However, while there is a persistent connection open, the protocol for communication between client and server is still following the basic HTTP request/response pattern. In other word, server side can't push data to client.
WebSocket is completely different mechanism, which is used to setup a persistent, full-duplex connection. With this full-duplex connection, server side can push data to client and client should be expected to process data from server side at any time.
Quoting corresponding entries on Wikipedia for reference:
1) http://en.wikipedia.org/wiki/HTTP_persistent_connection
2) http://en.wikipedia.org/wiki/WebSocket
You should read up on COMET, a design pattern which shows the limits of HTTP Keep-Alive. Keep-Alive is over 12 years old now, so it's not a new feature of HTTP. The problem is that it's not sufficient; the client and server cannot communicate in a truly asynchronous manner. The client must always use a "hanging" request in order to get a message back from the server; the server may not just send a message to the client at any time it wants.
HTTP vs Websockets
REST (HTTP)
Resources benefit from caching when the representation of a resource changes rarely or multiple clients are expected to retrieve the resource.
HTTP methods have well-known idempotency and safety properties. A request is “idempotent” if it can be issued multiple times without resulting in unique outcomes.
The HTTP design allows for responses to describe errors with the request, with the resource, or to provide nuanced status information to differentiate between success scenarios.
Have request and response functionality.
HTTP v1.1 may allow multiple requests to reuse a single connection, there will generally be small timeout periods intended to control resource consumption.
You might be using HTTP incorrectly if…
Your design relies on a client polling the service often, without the user taking action.
Your design requires frequent service calls to send small messages.
The client needs to quickly react to a change to a resource, and it cannot predict when the change will occur.
The resulting design is cost-prohibitive. Ask yourself: Is a WebSocket solution substantially less effort to design, implement, test, and operate?
WebSockets
WebSocket design does not allow explicit or transparent proxies to cache messages, which can degrade client performance.
WebSocket protocol offers support only for error scenarios affecting the establishment of the connection. Once the connection is established and messages are exchanged, any additional error scenarios must be addressed in the messaging layer design, but WebSockets allow for a higher amount of efficiency compared to REST because they do not require the HTTP request/response overhead for each message sent and received.
When a client needs to react quickly to a change (especially one it cannot predict), a WebSocket may be best.
This makes the protocol well suited to “fire and forget” messaging scenarios and poorly suited for transactional requirements.
WebSockets were designed specifically for long-lived connection scenarios, they avoid the overhead of establishing connections and sending HTTP request/response headers, resulting in a significant performance boost
You might be using WebSockets incorrectly if..
The connection is used only for a very small number of events, or a very small amount of time, and the client does not - need to quickly react to the events.
Your feature requires multiple WebSockets to be open to the same service at once.
Your feature opens a WebSocket, sends messages, then closes it—then repeats the process later.
You’re re-implementing a request/response pattern within the messaging layer.
The resulting design is cost-prohibitive. Ask yourself: Is a HTTP solution substantially less effort to design, implement, test, and operate?
Ref: https://blogs.windows.com/buildingapps/2016/03/14/when-to-use-a-http-call-instead-of-a-websocket-or-http-2-0/

Resources