HTTP Server-Push: Service to Service, without Browser - http

I am developing a cloud-based back-end HTTP service that will be exposed for integration with some on-prem systems. Client systems are custom-made by external vendors, they are back-end systems with their own databases. These systems are deployed in companies of our clients, we don't have access to them and don't control them. We are providing vendors our API specifications and they implement client code.
The data format which my service exchanges with clients is based on XML and follows a certain standard. Vendors implement their client systems in different programming languages and new vendors will appear over time. I want as many of clients to be able to work with my service as possible.
Most of my service API is REST-like: it receives HTTP requests, processes them, and sends back HTTP responses.
Additionally, my service accumulates some data state changes and needs to regularly push this data to client systems. Because of the below limitations, this use-case does not seem to fit the traditional client-server HTTP request-response model.
Due to the nature of the business, the client systems cannot afford to have their own HTTP API endpoints open and so my service can't establish an outbound HTTP connection to them for delivering data state notifications. I.e. use of WebHooks is not an option.
At the same time my service stakeholders need recorded acknowledgment that data state notifications were accepted by the client system, therefore fire-and-forget systems like Amazon SNS don't seem to apply.
I was considering few approaches to this problem but I'm not sure if I'm missing some simple options or some technologies that already address the problem. Hence this question.
The question text updated: options moved to my own answer.
Related questions and resources
REST API with active push notifications from server to client
Is ReST over websockets possible?
Can we use Web-Sockets for Communication between Microservices?
What is difference between grpc and websocket? Which one is more suitable for bidirectional streaming connection?
https://www.smashingmagazine.com/2018/02/sse-websockets-data-flow-http2/

I eventually found answers to my question myself and with some help from my team. For people like me who come here with a question "how do I arrange notifications delivery from my service to its clients" here's an overview of available options.
WebHooks
This is when the client opens endpoint iself. The service calls client's endpoints whenever the service has some notification to deliver. This way the client also acts as a service and so the client and the service swap roles during notification delivery.
With WebHooks the client must be able to open the endpoint with a well-known address. This is complicated if the client's software is working behind NAT or firewall or if the client is Browser or a mobile application.
The service needs to be prepared that client's WebHook endpoints may not always be online and may not always be healthy.
Another issue is flow control: special measures should be taken in the service not to overwhelm the client with high volume of connections, requests and/or data.
Polling
In this case the client is still the client and the service is still the service, unlike WebHooks. The service offers an endpoint where the client can continuously request new notifications. The advantage of this option is that it does not change connection direction and request-response direction and so it works well with HTTP-based services.
The caveat is that polling API should have some rich semantics to be reasonably reliable if loss of notifications is not acceptable. Good examples could be Google Pub/Sub pull and Amazon SQS.
Here are few considerations:
Receiving and deleting notification should be separate operations. Otherwise, if the service deletes notification just before giving it to the client and the client fails to process the notification, the notification will be lost forever. When deletion operation is separate from receiving, the client is forced to do deletion explicitly which normally happens after successful processing.
In case the client received the notification and has not yet deleted it, it might be undesirable to let the same notification to be processed by some other actor (perhaps a concurrent process of the same client). Therefore the notification must be hidden from receiving after it was first received.
In case the client failed to delete the notification in reasonable time because of error, network loss or process crash, the service has to make notification visible for receiving again. This is retry mechanism which allows the notification to be ultimately processed.
In case the service has no notifications to deliver, it should block the client's call for some time by not delivering empty response immediately. Otherwise, if the client polls in a loop and response comes immediately, the loop iteration will be short and clients will make excessive requests to the service increasing network, parsing load and requests counts. A nice-to have feature is for the service to unblock and respond to the client as soon as some notification appears for delivery. This is sometimes called "long polling".
HTTP Server-sent Events
With HTTP Server-sent Events the client opens HTTP connection and sends a request to the service, then the service can send multiple events (notifications) instead of a single response. The connection is long-living and the service can send events as soon as they are ready.
The downside is that the communication is one-way, the client has no way to inform the service if it successfully processed the event. Because this feedback is absent, it may be difficult for the service to control the rate of events to prevent overwhelming the client.
WebSockets
WebSockets were created to enable arbitrary two-way communication and so this is viable option for the service to send notifications to the client. The client can also send processing confirmation back to the service.
WebSockets have been around for a while and should be supported by many frameworks and languages. WebSocket connection begins as HTTP 1.1 connection and so WebSockets over HTTPS should be supported by many load balancers and reverse proxies.
WebSockets are often used with browsers and mobile clients and more rarely in service-to-service communication.
gRPC
gRPC is similar to WebSockets in a way that it enables arbitrary two-way communication. The advantage of gRPC is that it is centered around protocol and message format definition files. These files are used for code generation that is essential for client and service developers.
gRPC is used for service-to-service communication plus it is supported for Browser clients with grpc-web.
gRPC is supported on multiple popular programming languages and platforms, yet the support is narrower than for HTTP.
gRPC works on top of HTTP/2 which might cause difficulties with reverse proxies and load balancers around things like TLS termination.
Message queue (PubSub)
Finally, the service and the client can use a message queue as a delivery mechanism for notifications. The service puts notifications on the queue and the client receives them from the queue. A queue can be provided by one of many systems like RabbitMQ, Kafka, Celery, Google PubSub, Amazon SQS, etc. There's a wide choice of queuing systems with different properties and choosing one is a challenge on its own. The queue can also be emulated by using database for example.
It has to be decided between the service and the client who owns the queue, i.e. who pays for it. Either way, the queuing system and the queue should be available whenever the service needs to push notifications to it otherwise notifications will be lost (unless the service buffers them internally, with another queue).
Queues are typically used for service-to-service communication but some technologies also allow Browsers as clients.
It is worth noting that an "implicit" internal queue might be used on the service side in other options listed above. One reason is to prevent loss of notifications when there's no client available to receive them. There are many other good reasons like letting clients handle notifications at their pace, allowing to maximize processing throughput, allowing to handle spiky traffic with fixed capacity.
In this option the queue is used "explicitly" as delivery mechanism, i.e. the service does not put any other mechanism (HTTP, gRPC or WebSocket endpoint) in front of the queue and lets the client receive notifications from the queue directly.
Message passing is popular in organizing microservice communications.
Common considerations
In all options it has to be decided whether the loss of notifications is tolerable for the service, the client and the business. Some simpler technical choices are possible if it is ok to lose notifications due to processing errors, unavailability, etc.
It is valuable to have a monitoring for client processing errors from the service side. This way service owners know which clients are more broken without having to ask them.
If the queue is used (implicitly or explicitly) it is valuable to monitor the length of the queue and the age of the oldest notifications. It lets service owners judge how stale data may be in the client.
In case the delivery of notification is organized in a way that notification gets deleted only after a successful processing by the client, the same notification could be stuck in infinite receive loop when the client fails to process it. Such notification is sometimes called "poison message". Poison messages should be removed by the service or the queuing system to prevent clients being stuck in infinite loop. A common practice is to move poison messages to a special place, sometimes called "dead letter queue", for the later human intervention.

One alternative to WebSockets for the problem of server→client notifications with acks from the client seems to be gRPC.
It supports bidirectional communication between server and client in bidirectional streaming mode.
It works on top of HTTP 2.0. In our case functioning over HTTP ports is essential.
There are client and server generators for multiple popular languages and platforms. A nice thing is that I can share protocol definition file with vendors and can be sure my service and their clients will talk the same language.
Drawbacks:
Not as many languages and platforms are supported compared to HTTP. Alternative C from the question will be more accessible if based on HTTP 1.1. WebSockets have also been around longer and I would expect broader adoption than gRPC.
Not all gRPC implementations seem to currently support XML format for data according to FAQ. In order to transport XML my service and its clients will have to transfer XML message as byte arrays inside of gRPC protobuf message.
With gRPC, TLS termination cannot be done on general-purpose HTTP 1.1 load balancer. An application-layer HTTP/2-aware reverse proxy (load balancer) such as Traefik is required.
There are approaches like this and this to allow HTTP 1.1 compatible protocols but they have their own restrictions like limited amount of available clients or necessary client customizations.

Related

UI Design pattern / technology for monitoring when an async process finishes

My frontend kicks off an async process with an http POST.
By whatever means, (kafka, threading, database insert and another system monitoring the table), some process is completed after an unknown amount of time and finishes in some quantifiable way (you can make a http call and determine if its done or not).
Are there any design patterns/technologies for notifying the frontend without it having to make repeated requests to some service?
You can take a look on WebSockets, a bi-directional data channel that is generally used for real-time web applications.
The way you can use would be straight-forward: when you make the HTTP Post request and the async process is started on the backend, you also open a websocket connection with the front-end, for that particular request. When the async process is finished, the backend will notify the front-end through the websocket.
You can even use the same websocket connection to transport data for multiple requests (initiated by the same user), which is a kind of a multiplexing technique.
If you need the overall system to be scalable, you should think about having a cluster of VMs that manage the websocket connections (fully separated from the backend of your application).

How to serve synchronous client server communication with asynchronous Inter-service communication?

We have a service which communicates with another service using asynchronous communication(message queue) but still the client server communication happens through REST. Now we have a requirement that when user perform a persists call on service A it should trigger some modification on service B. Client should get success response only if both services successful completed their respective tasks. Reason for that is we need to make sure that user should not able to perform next step on partially completed data. We are not interested in concurrent users due to nature of the domain. We have very limited capability on replacing REST client server implementation with a protocol such as web socket, AMQP
What are the ways to achieve this type of a scenario with asynchronous interservice communication? Is there any reputed pattern/style of programming for implementing this type of requirement? Any framework or lib on Java?
Or is it a bad idea to rely on asynchronous communication to achieve blocking business requirement?

Web sockets with redis backplane scaleout - multiple redis channels per user or one redis channel for all users

I am connecting clients to our servers using SignalR (same as socketio websockets) so I can send them notifications for activities in the system. It is NOT a chat application. So messages when sent will be for a particular user only.
These clients are connected on multiple web servers and these servers are subscribed to a redis backplane. Like mentioned in this article - http://www.asp.net/signalr/overview/performance/scaleout-in-signalr
My question here is for this kind of notification system, in redis pubsub - should i have multiple channels - one per user in the backplane and the app server listening to each users notification channel. Or have one channel for all these notifications and the app server parses each message and figure out if they have that userid connected and send the message to that user.
Based on the little I know about the details of your application, I think you should create channels/lists in the backplane/Redis on a per-client basis. This would be cheap in Redis, and it gives the server side process handling a specific client only the notifications they are supposed to have.
This should save your application iteration or handling of irrelevant data, which could have implications of performance at scale, and if security is at all a concern (don't know what the domain or application is), then it would be best to never retrieve/receive information unnecessarily that wasn't intended for a particular client.
I will pose a final question and some thoughts which I think support my opinion. If you don't do this on a client-by-client basis, then how will you handle when the user is not present to receive a message? You would either have to throw that message away, or have the application server handle that un-received message for every single client, every time they poll or otherwise receive information from Redis. This could really add up. Although, without knowing the details of the application, I'm not sure if this paragraph is relevant.
At the end of the day, though approaches and opinions may vary depending on the application, I would think about the architecture in terms of the entities and you outlined. You have clients, and they send and receive messages directly to one another. Those messages should be associated with each of the parties involved somehow, and they should be stored in a manner that will be efficient for lookup and which helps define/outline the structure of the application.
Hope my 2c helps!

Did server successfully receive request

I am working on a C# mobile application that requires major interaction with a PHP web server. However, the application also needs to support an "offline mode" as connection will be over a cellular network. This network may drop requests at random times. The problem that I have experienced with previous "Offline Mode" applications is that when a request results in a Timeout, the server may or may not have already processed that request. In cases where sending the request more than once would create a duplicate, this is a problem. I was walking through this and came up with the following idea.
Mobile sets a header value such as UniqueRequestID: 1 to be sent with the request.
Upon receiving the request, the PHP server adds the UniqueRequestID to the current user session $_SESSION['RequestID'][] = $headers['UniqueRequestID'];
Server implements a GetRequestByID that returns true if the id exists for the current session or false if not. Alternatively, this could returned the cached result of the request.
This seems to be a somewhat reliable way of seeing if a request successfully contacted the server. In mobile, upon re-connecting to the server, we check if the request was received. If so, skip that pending offline message and go to the next one.
Question
Have I reinvented the wheel here? Is this method prone to failure (or am I going down a rabbit hole)? Is there a better way / alternative?
-I was pitching this to other developers here and we thought that this seemed very simple implying that this "system" would likely already exist somewhere.
-Apologies if my Google skills are failing me today.
As you correctly stated, this problem is not new. There have been multiple attempts to solve it at different levels.
Transport level
HTTP transport protocol itself does not provide any mechanisms for reliable data transfer. One of the reasons is that HTTP is stateless and don't care much about previous requests and responses. There have been attempts by IBM to make a reliable transport protocol called HTTPR what was based on HTTP, but it never got popular. You can read more about it here.
Messaging level
Most Web Services out there still uses HTTP as a transport protocol and SOAP messaging protocol on top of it. SOAP over HTTP is not sufficient when an application-level messaging protocol must also guarantee some level of reliability and security. This is why WS-Reliability and WS-ReliableMessaging protocols where introduced. Those protocols allow SOAP messages to be reliably delivered between distributed applications in the presence of software component, system, or network failures. At the same time they provide additional security. You can read more about those protocols here and here.
Your solution
I guess there is nothing wrong with your approach if you need a simple way to ensure that message has not been already processed. I would recommend to use database instead of session to store processing result for each request. If you use $_SESSION['RequestID'][] you will run in to trouble if the session is lost (user is offline for specific time, server is restarted or has crashed, etc). Also, if you use database instead of session, you can scale-up easier later on just by adding extra web server.

What's the behavioral difference between HTTP Keep-Alive and Websockets?

I've been working with websockets lately in detail. Created my own server and there's a public demo. I don't have such detailed experience or knowledge re: http. (Although since websocket requests are upgraded http requests, I have some.)
On my end, the server reports details of each hit. Among them are a bunch of http keep-alive requests. My server doesn't handle them because they're not websocket requests. But it got my curiosity up.
The whole big thing about websockets is that the connection stays alive. Then you can pass messages in both directions (simultaneously even). I've read that the Keep-Alive HTTP connection is a relatively new development (I don't know how many years in people time, just that it's only included in the latest standard - 1.1 - is that actually old now?)
I guess I can assume that there's a behavioral difference between the two or there would have been no reason for a websocket standard? What's the difference?
A Keep Alive HTTP header since HTTP 1.0, which is used to indicate a HTTP client would like to maintain a persistent connection with HTTP server. The main objects is to eliminate the needs for opening TCP connection for each HTTP request. However, while there is a persistent connection open, the protocol for communication between client and server is still following the basic HTTP request/response pattern. In other word, server side can't push data to client.
WebSocket is completely different mechanism, which is used to setup a persistent, full-duplex connection. With this full-duplex connection, server side can push data to client and client should be expected to process data from server side at any time.
Quoting corresponding entries on Wikipedia for reference:
1) http://en.wikipedia.org/wiki/HTTP_persistent_connection
2) http://en.wikipedia.org/wiki/WebSocket
You should read up on COMET, a design pattern which shows the limits of HTTP Keep-Alive. Keep-Alive is over 12 years old now, so it's not a new feature of HTTP. The problem is that it's not sufficient; the client and server cannot communicate in a truly asynchronous manner. The client must always use a "hanging" request in order to get a message back from the server; the server may not just send a message to the client at any time it wants.
HTTP vs Websockets
REST (HTTP)
Resources benefit from caching when the representation of a resource changes rarely or multiple clients are expected to retrieve the resource.
HTTP methods have well-known idempotency and safety properties. A request is “idempotent” if it can be issued multiple times without resulting in unique outcomes.
The HTTP design allows for responses to describe errors with the request, with the resource, or to provide nuanced status information to differentiate between success scenarios.
Have request and response functionality.
HTTP v1.1 may allow multiple requests to reuse a single connection, there will generally be small timeout periods intended to control resource consumption.
You might be using HTTP incorrectly if…
Your design relies on a client polling the service often, without the user taking action.
Your design requires frequent service calls to send small messages.
The client needs to quickly react to a change to a resource, and it cannot predict when the change will occur.
The resulting design is cost-prohibitive. Ask yourself: Is a WebSocket solution substantially less effort to design, implement, test, and operate?
WebSockets
WebSocket design does not allow explicit or transparent proxies to cache messages, which can degrade client performance.
WebSocket protocol offers support only for error scenarios affecting the establishment of the connection. Once the connection is established and messages are exchanged, any additional error scenarios must be addressed in the messaging layer design, but WebSockets allow for a higher amount of efficiency compared to REST because they do not require the HTTP request/response overhead for each message sent and received.
When a client needs to react quickly to a change (especially one it cannot predict), a WebSocket may be best.
This makes the protocol well suited to “fire and forget” messaging scenarios and poorly suited for transactional requirements.
WebSockets were designed specifically for long-lived connection scenarios, they avoid the overhead of establishing connections and sending HTTP request/response headers, resulting in a significant performance boost
You might be using WebSockets incorrectly if..
The connection is used only for a very small number of events, or a very small amount of time, and the client does not - need to quickly react to the events.
Your feature requires multiple WebSockets to be open to the same service at once.
Your feature opens a WebSocket, sends messages, then closes it—then repeats the process later.
You’re re-implementing a request/response pattern within the messaging layer.
The resulting design is cost-prohibitive. Ask yourself: Is a HTTP solution substantially less effort to design, implement, test, and operate?
Ref: https://blogs.windows.com/buildingapps/2016/03/14/when-to-use-a-http-call-instead-of-a-websocket-or-http-2-0/

Resources