I have a handful of gRPC servers, and would like to chain some of the calls, so that the flow would become something like:
Client X calls Server A
└──> Server A
       ... some processing ... < 1 >
       calls Server B
       └──> Server B
              ... some processing ...
              returns Result B
       receives Result B
       ... some more processing ...
       returns Result A
For this to work, I have been creating a new gRPC client for the Server B connection each time Server A's RPC gets called (shown with < 1 > above). I found a similar question asked here, and I understand this approach doesn't fit with the long-lived connections needed to take advantage of gRPC's concurrency, as discussed here.
What I didn't quite get from the above two references is when I should be creating the gRPC client. If I were to chain this even further, with Server B calling Server C, etc., does each server need to create its connection at process start time? Does that introduce a start-up dependency between the servers?
Also, as a baseline, I thought using the context allows nice chaining while still keeping control of those remote calls. Is this a correct usage of gRPC?
We've improved the GoDoc for ClientConn. See: https://github.com/grpc/grpc-go/pull/3096.
Let us know what you think. Thanks.
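For illustration, here is a minimal sketch of the long-lived-connection pattern in Python (the same idea applies with grpc-go, where a single ClientConn would be created at startup and reused). The serverb_* modules, the ServerB service, and its Process RPC are hypothetical names, not an actual API:

```python
import grpc

# Hypothetical generated stubs for Server B's service (names are assumptions).
import serverb_pb2
import serverb_pb2_grpc

# Create the channel and stub once, at process start, and reuse them for every
# incoming RPC. gRPC channels are designed to be long lived and can be shared
# across concurrent calls.
channel = grpc.insecure_channel("server-b:50051")
stub = serverb_pb2_grpc.ServerBStub(channel)

def handle_server_a_rpc(payload):
    # Each request that Server A handles reuses the same underlying connection
    # to Server B; a deadline keeps the whole chain bounded.
    return stub.Process(serverb_pb2.ProcessRequest(payload=payload), timeout=2.0)
```

Creating the channel at start-up does not introduce a hard start-up dependency: the channel connects lazily, so Server A can start even if Server B is not up yet, and individual RPCs simply fail (or wait, depending on options) until it is.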
Scenario: the server is in the middle of processing an HTTP request when it shuts down. The code may have executed up to any of several points. How are such cases typically handled? A typical example: some downstream HTTP calls had to be made as part of the incoming HTTP request. How do you find out whether those calls were made before the shutdown occurred? I assume it's not possible to persist every action in the code flow. Suggestions and views are welcome.
There are two kinds of shutdowns to consider here.
There are graceful shutdowns: when the execution environment politely asks your process to stop (e.g. systemd sends a SIGTERM) and expects it to exit on its own. If your process doesn’t exit within a few seconds, the environment proceeds to kill the process in a more forceful way.
A typical way to handle a graceful shutdown is:
listen for the signal from the environment
when you receive the signal, stop accepting new requests...
...and then wait for all current requests to finish
Exactly how you do this depends on your platform/framework. For instance, Go’s standard net/http library provides a Server.Shutdown method.
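As another instance of the same pattern, here is a minimal, hedged sketch using only Python's standard library (the port and handler are made up; a real service would use its framework's built-in support, like the Server.Shutdown method mentioned above):

```python
import signal
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok\n")

class GracefulServer(ThreadingHTTPServer):
    # Use non-daemon handler threads so server_close() waits for in-flight
    # requests to finish before returning.
    daemon_threads = False

httpd = GracefulServer(("", 8080), Handler)

# Serve in a background thread; the main thread just waits for a signal.
serve_thread = threading.Thread(target=httpd.serve_forever)
serve_thread.start()

stopping = threading.Event()
signal.signal(signal.SIGTERM, lambda signum, frame: stopping.set())
signal.signal(signal.SIGINT, lambda signum, frame: stopping.set())
while not stopping.wait(timeout=1.0):      # 1. wait for the environment's signal
    pass

httpd.shutdown()                           # 2. stop accepting new requests
httpd.server_close()                       # 3. wait for current requests to finish
serve_thread.join()
```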
In a typical system, most shutdowns will be graceful. For example, when you need to restart your process to deploy a new version of code, you do a graceful shutdown.
There can also be unexpected shutdowns: e.g. when you suddenly lose power or network connectivity (a disconnected server is usually as good as a dead one). Such faults are harder to deal with. There’s an entire body of research dedicated to making distributed systems robust to arbitrary faults. In the simple case, when your server only writes to a single database, you can open a transaction at the beginning of a request and commit it before returning the response. This will guarantee that either all the changes are saved to the database or none of them are. But if you call multiple downstream services as part of one upstream HTTP request, you need to coordinate them, for example, with a saga.
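As a toy illustration of the single-database case (SQLite is used here only for brevity; the accounts table and the transfer request are made up):

```python
import sqlite3

def handle_transfer(db_path, src_account, dst_account, amount):
    """Handle one request with one transaction: either both writes commit or neither does."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                         (amount, src_account))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                         (amount, dst_account))
    finally:
        conn.close()
    # The response is returned only after the commit, so a crash can never
    # leave a half-applied request in the database.
    return {"status": "ok"}
```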
For some applications, it may be OK to ignore unexpected shutdowns and simply deal with any inconsistencies manually if/when they arise. This depends on your application.
I know that in HTTP 1.1, keep-alive is the default behavior, unless the client explicitly asks the server to close the connection by including a Connection: close header in its request, or the server decides to include a Connection: close header in its response. I am wondering whether this isn't something of an obstacle to scalability when growing servers horizontally.
My scenario: we are developing all new services following microservices patterns, either in Java or Python. It is desirable that we can design and implement them in such a way that we can scale horizontally. For instance, I can use Docker in order to easily scale up, or use Spring Boot Cloud Config. Whatever the physical host implementation, the basic idea is to favour scalability.
My understanding: I must keep server and client as agnostic of each other as possible, and when I set up HTTP keep-alive I understand there is an advantage in reusing the same HTTP connection (and saving some CPU), but I guess I am forcing the client (e.g. another service) to keep using the same connection, which may undercut the advantage of having several Docker instances of the same service, since I am encouraging the client to keep consuming the same initial connection.
Assuming my understanding is correct, I assume it is not a good idea, since we develop the service to provide responses that can be reused by different consumers with different approaches: some consumers may consume asynchronously or follow reactive design paradigms, which makes me wonder whether keeping the same connection alive makes sense. In practical terms: the connection should be freed as soon as possible in order to really balance the demand over all providers.
***edited after first comment
Let's assume I have multiple different consumer services (CS1, CS2 ... CSn) connecting to a single load balancer instance (LB), which forwards each request to one of multiple Docker instances of the same provider service (D1, D2 ... Dn). Since keep-alive is the default behaviour in HTTP 1.1+, we have keep-alive on every connection (both between CSx and LB and between LB and Dx). As far as I know, the only advantage of keep-alive is saving the CPU cost of opening/closing connections. If I send Connection: close after each request, there is no advantage at all to keep-alive. If, on the other hand, I don't send Connection: close, it means I encourage LB to stay connected to a specific Dx, reusing exactly the same connection for a while, right? (I chose the word "encourage" because I guess "force" might not be appropriate, since keep-alive has a timeout and LB might then route to another Dx anyway.) So at some moment I have CS1 -> LB -> D1 kept alive and persisted for a while, right? Coming back to my original question, isn't that against the idea of an asynchronous/parallel/reactive paradigm? For instance, I have a scenario where a single consumer service calls another service a few times before returning a single answer to a page. Today we do this sequentially, but we may decide to call in parallel; depending on the first answer there may already be an answer for the page, or we may compose an answer for the page without caring about the order. The caller service will wait for all the answers before returning to a controller, and the order doesn't matter. Isn't it strange that I have keep-alive = true?
I am forcing the client (e.g. another service) to keep using the same connection
You are not forcing. The client can easily avoid persistent connections by sending HTTP/1.0 and/or Connection: close. There are many practical HTTP applications that work just like that.
keep using the same connection, which may undercut the advantage of having several Docker instances of the same service, since I am encouraging the client to keep consuming the same initial connection
Assuming your load balancer works well, it will usually distribute connections evenly across your instances. Of course, this may not work when you only have a few connections altogether, but a few connections can hardly pose a scalability or performance problem.
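For completeness, here is a minimal sketch of a client opting out of keep-alive for a single request, using Python's standard library (the host and path are placeholders):

```python
import http.client

# Hypothetical provider service sitting behind the load balancer.
conn = http.client.HTTPConnection("provider.example.internal", 8080)
conn.request("GET", "/orders/42", headers={"Connection": "close"})
response = conn.getresponse()
body = response.read()
conn.close()
# The TCP connection is now gone; the next request opens a fresh connection,
# which the load balancer is free to route to any instance.
```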
I am implementing a hub/servers MPI application. Each of the servers can get tied up waiting for some data, then they do an MPI Send to the hub. It is relatively simple for me to have the hub waiting around doing a Recv from ANY_SOURCE. The hub can get busy working with the data. What I'm worried about is skipping data from one of the servers. How likely is this scenario:
servers 1 and 2 do Sends
hub does a Recv and ends up getting data from server 1
while the hub is busy, server 1 gets more data and does another Send
when the hub does its next Recv, it gets the more recent server 1 data rather than the older server 2 data
I don't need a guarantee that the order the Sends occur is the order in which the ANY_SOURCE Recv processes them (though it would be nice), but if I knew that in practice it would be close to the order they were sent, I might go with the above. However, if it is likely that I could skip over data from one of the servers, I need to implement something more complicated, which I think would be this pattern:
servers each do Sends
hub does an Irecv for each server
hub does a Waitany on all the server requests
upon completion of one server request, hub does a Test on all the others
of all the Irecvs that have completed, hub selects the oldest server data (there is a timing tag in the server data)
hub communicates with the server it just chose, has it start a new Send, and posts a new Irecv for it
This requires more complex code, and my first effort crashed inside the Waitany call in a way that I'm finding difficult to debug. I am using the Python bindings, mpi4py, so I have less control over the buffers being used.
It is guaranteed by the MPI standard that messages between a given pair of sender and receiver are received in the order they are sent (non-overtaking messages). See also this answer to a similar question.
However, there is no guarantee of fairness when receiving from ANY_SOURCE and when there are distinct senders. So yes, it is the responsibility of the programmers to design their own fairness system if the application requires it.
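As a rough sketch of the "one outstanding receive per server" approach, here is a hedged mpi4py example using buffer-based Irecv/Waitany (the rank layout, tag, message count, and two-double message format are assumptions, and the "pick the oldest by timestamp" step from the question is left out for brevity):

```python
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
nservers = comm.Get_size() - 1       # assumed layout: rank 0 is the hub
MSGS_PER_SERVER = 10                 # fixed message count, just for the sketch

if rank == 0:
    # Keep one pending receive per server so no server's data can be skipped.
    bufs = [np.empty(2, dtype='d') for _ in range(nservers)]
    reqs = [comm.Irecv(bufs[i], source=i + 1, tag=0) for i in range(nservers)]
    for _ in range(nservers * MSGS_PER_SERVER):
        idx = MPI.Request.Waitany(reqs)          # index of a completed receive
        timestamp, value = bufs[idx]
        # ... process (timestamp, value) from server idx + 1 ...
        # Re-post the receive for that server only.
        reqs[idx] = comm.Irecv(bufs[idx], source=idx + 1, tag=0)
else:
    for i in range(MSGS_PER_SERVER):
        data = np.array([MPI.Wtime(), float(i)])
        comm.Send(data, dest=0, tag=0)
```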
Let's say we have
Client node with HTTP gateway outbound service
Server node with HTTP gateway inbound service
I am considering a situation where MSMQ itself stops for some reason on the client node. In the current implementation, the Rebus HTTP gateway will catch the exception.
What do you think about the idea that, instead of just being caught, the MessageQueueException could also be sent to the server node and put on the error queue? (The name of the error queue could be gathered from the headers.)
That way, without additional infrastructure, the server would know that the client has a problem, so someone could react.
UPDATE:
I guessed the problems described in the answer would be raised. I should have explained my scenario in more depth :) Sorry about that. Here it is:
I'm going to modify the HTTP gateway so that the InboundService can do both - send and receive messages. The OutboundService would then be the only one to initiate the connection (periodically, e.g. once every 5 minutes) in order to get new messages from the server and send its own messages to the server. That is because the client node is not considered a server but one of many clients sitting behind NAT.
Indeed, the server itself is not interested in client health, but I thought that instead of creating a separate alerting service on the client side that would reuse the HTTP gateway code, the HTTP gateway itself could do this, since having both sides running is very much the HTTP gateway's business.
What if the client can't reach the server at all?
Since MSMQ would be dead, I thought about using an in-process, standalone, persistent queue object like http://ayende.com/blog/4540/building-a-managed-persistent-transactional-queue (just an example implementation; I'm not sure what kind of license it has) to aggregate exceptions on the client side until the server is reachable.
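Just to make that idea concrete, here is a sketch of "buffer errors durably on the client, flush when the server is reachable" (shown in Python with sqlite3 as the durable store and a hypothetical HTTP endpoint; the real thing would of course live in the .NET/Rebus gateway):

```python
import sqlite3
import urllib.request

DB = "client_errors.db"
SERVER_ERROR_ENDPOINT = "http://server.example/errors"   # hypothetical URL

def init(db=DB):
    conn = sqlite3.connect(db)
    with conn:
        conn.execute("CREATE TABLE IF NOT EXISTS pending_errors ("
                     "id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)")
    conn.close()

def record_error(message, db=DB):
    # Called when MSMQ (or anything else) fails locally; the row survives restarts.
    conn = sqlite3.connect(db)
    with conn:
        conn.execute("INSERT INTO pending_errors (body) VALUES (?)", (message,))
    conn.close()

def flush(db=DB):
    # Called on the same schedule as message synchronization (e.g. every 5 minutes).
    conn = sqlite3.connect(db)
    try:
        rows = conn.execute("SELECT id, body FROM pending_errors ORDER BY id").fetchall()
        for row_id, body in rows:
            req = urllib.request.Request(SERVER_ERROR_ENDPOINT,
                                         data=body.encode("utf-8"), method="POST")
            urllib.request.urlopen(req, timeout=10)   # raises if the server is unreachable
            conn.execute("DELETE FROM pending_errors WHERE id = ?", (row_id,))
            conn.commit()                             # each delivered error is gone for good
    finally:
        conn.close()
```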
And how often will the client notify the server that it has experienced an error?
I'm not sure about that part - I thought it could be tied to the scheduled message synchronization time, e.g. once every 5 minutes, but what about the case where there is no scheduled time, just like in the current implementation (a while(true) loop)? Maybe it could simply be set in config?
I like to have a consistent strategy about handling errors which usually involves plain old NLog logging
Since the client nodes will be on the Internet behind NAT, standard monitoring techniques won't work. I thought about using a queue as the NLog transport, but since MSMQ would be dead, that wouldn't work.
I also thought about using HTTP as the NLog transport, but on the server side it would require a queue (not really, but I would like to store it in a queue), so we are back to the service bus and the HTTP gateway... that kind of NLog transport would be a de facto clone of the HTTP gateway.
UPDATE2: HTTP as the NLog transport (by transport I mean target) would also require a client-side queue, like I described in the "What if the client can't reach the server at all?" section. It would be a clone of the HTTP gateway embedded into NLog. Madness :)
The whole point is that the client is unreliable, so I want to have all the information about the client on the server side and log it there.
UPDATE3
An alternative solution could be to create a separate service which would nevertheless be part of the HTTP gateway (e.g. OutboundAlertService). Then three goals would be fulfilled:
shared sending loop code
no additional server infrastructure required
no negative impact on OutboundService (no added complexity from an in-process queue)
It wouldn't take exceptions from the OutboundService; instead, it would check MSMQ periodically itself.
Yet another alternative would be simply to use a queue other than MSMQ as the NLog target, but that's ugly overkill.
Regarding your scenario, my initial thought is that it should never be the server's problem that a client has a problem, so I probably wouldn't send a message to the server when the client fails.
As I see it, there would be multiple problems/obstacles/challenges with that approach, because, e.g., what if the client can't reach the server at all? And how often will the client notify the server that it has experienced an error?
Of course I don't know the details of your setup, so it's hard to give specific advice, but in general I like to have a consistent strategy about handling errors which usually involves plain old NLog logging and configuring the WARN and ERROR levels to go to the Windows Event Log.
This allows for setting up various tools (like e.g. System Center Operations Manager or similar) to monitor all of your machines' event logs and raise error flags when something goes wrong.
I hope I've said something you can use :)
UPDATE
After thinking about it some more, I think I'm beginning to understand your problem, and I think I would prefer a solution where the client lets the HTTP listener at the other end know that it's having a problem, and then the HTTP listener at the other end could (maybe?) log that as an error.
Another option is that the HTTP listener at the other end could have an event, ReceivedClientError or something, that one could attach to and then do whatever is right in the given situation.
In your case, you might put a message in an error queue. I would just avoid putting anything in the error queue as a general solution because I think it confuses the purpose of the error queue - the "thing" in the error queue wouldn't be a message, and as such it would not be retryable etc.
My question is two-part:
What exactly is meant by an 'asynchronous server', which is what people usually call Tornado? Can someone please provide a concrete example to illustrate the concept/definition?
In the case of Tornado, what exactly is meant by 'non-blocking'? Is this related to the asynchronous nature above? In addition, I read somewhere that it always uses a single thread for handling all requests; does this mean that requests are handled sequentially one by one, or in parallel? If the latter, how does Tornado do it?
Tornado uses asynchronous, non-blocking I/O to solve the C10K problem. That means all I/O operations are event driven, that is, they use callbacks and event notification rather than waiting for the operation to return. Node.js and Nginx use a similar model. The exception is tornado.database, which is blocking. The Tornado IOLoop source is well documented if you want to look at it in detail. For a concrete example, see below.
Non-blocking and asynchronous are used interchangeably in Tornado, although in other cases there are differences; this answer gives an excellent overview. Tornado uses one thread and handles requests sequentially, albeit very very quickly as there is no waiting for IO. In production you'd typically run multiple Tornado processes.
As far as a concrete example goes: say you have an HTTP request for which Tornado must fetch some data (asynchronously) and then respond. Here's (very roughly) what happens:
Tornado receives the request and calls the appropriate handler method in your application
Your handler method makes an asynchronous database call, with a callback
Database call returns, callback is called, and response is sent.
What's different about Tornado (versus for example Django) is that between step 2 and 3 the process can continue handling other requests. The Tornado IOLoop simply holds the connection open and continues processing its callback queue, whereas with Django (and any synchronous web framework) the thread will hang, waiting for the database to return.
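For illustration, here is a minimal sketch of such a handler. It uses Tornado's modern coroutine style rather than the explicit callbacks described above, and an asynchronous HTTP fetch stands in for the asynchronous database call; the URL and route are placeholders:

```python
import tornado.ioloop
import tornado.web
from tornado.httpclient import AsyncHTTPClient

class DataHandler(tornado.web.RequestHandler):
    async def get(self):
        # While this await is pending, the single-threaded IOLoop keeps
        # serving other requests instead of blocking on the fetch.
        client = AsyncHTTPClient()
        upstream = await client.fetch("http://example.com/some-data")
        self.write(upstream.body)

if __name__ == "__main__":
    app = tornado.web.Application([(r"/data", DataHandler)])
    app.listen(8888)
    tornado.ioloop.IOLoop.current().start()
```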
This is my test of the performance of web.py (CherryPy) and Tornado.
How does CherryPy work here? It handles requests well compared with Tornado when concurrency is low.