How webpush work in TCP/IP network layers - firebase

Please explain to me how webpush work in TCP/IP network layers (especially layer 4-5).
I understand that HTTP is stateless protocol:
the protocol is opening TCP / layer 4 connection,
'state' is 'made to work' with cookie/session,
then client send HTTP request (plaintext/compressed "HTTP/1.1 /url/here ... Content-Length: ..."),
then server respond with HTTP request (plaintext/compressed "200 OK ... ..."),
Therefore it's understandable that for a user behind NAT to be able to view webpage of a remote host (because the user behind NAT is the one initiating the connection); but the webserver cannot initiate TCP connection with the client (browser process).
However there are some exceptions like 'websocket' where client (browser) initiate a connection, then leave it open (elevate to just TCP, not HTTP anymore). In this architecture, webserver may send / initiate sending message to client (for example "you have new chat message" notification).
What I don't understand is the new term 'webpush'.
I observed that it can send notification from server to client/browser (from user, it 'feels' like the server is the one initiating the connection)
webpush can send notification anytime, even when browser is closed / not opened yet (as when the device was just freshly turned on), or when it's just connected to internet
How does it work? How do they accomplish this? Previously I think that:
either a javascript in a page is continously (ex: 5 second interval) checking if there's a new notification in server,
or a javascript initiate a websocket (browser initiate/open TCP connection) and keep it alive, when server need to send something, it's sent from webserver to client/browser through this connection
Is this correct? Or am I missing something? Since both of my guess above won't work behind NAT'd network
Is Firebase web notification also this kind of webpush?
I have searched the internet for explanation on what make it work on client side, but there seems only explanation on 'how to send webpush', 'how to market your product with webpush', those articles only explain the server side (communication of app server with push service server) or articles about marketing.
Also, I'm interested in understanding what application layer protocol they're running on (as in what text/binary data the client/server send to each other), if it's not HTTP

Web Push works because there is a persistent connection between the browser (e.g. Chrome) and the browser push service (e.g. FCM).
When your application server needs to send a notification to a browser, it cannot reach the browser directly with a connection, instead it contacts the browser push service (e.g. FCM for Chrome) and then it's the browser push service that delivers the notification to the user browser.
This is possible because the browser constantly tries to keep an open connection with the server (e.g. FCM for Chrome). This means that there isn't any problem for NAT, since it's the clients that starts the connection. Also consider that any TCP connection is bi-directional: so any side of the connection can start sending data at any time. Don't confuse higher level protocols like HTTP with a normal TCP connection.
If you want more details I have written this article that explains in simple words how Web Push works. You can also read the standards: Push API and IETF Web Push in particular.
Note: Firebase (FCM) is two different things, even if that is not clear from the documentation. It is both the browser push service required to deliver notifications to Chrome (like Mozilla autopush for Firefox, Windows Push Notification Services for Edge and Apple Push Notification service for Safari), but it is also a proprietary service with additional features to send notifications to any browser (like Pushpad, Onesignal and many others).

Related

How to send http request from server to user's device

I have a question regarding http requests and responses.
I know that I can send a request to a server from my device (I can build and send a GET request to http://google.com for example). But what if I am Google and I want to send a request from the server to the user's device? How do I do that?
I understand that when the server receives a request, it can answer it, but in this case I want the server to send the request to the user's device. Just like WhatsApp does when you receive a new message.
Thanks for the help!
There are several options for sending information from the server to a client:
Push notifications - depends on the platform you are using
Constructing a Websocket connection that allows bi-directional communication
I'm sure there are more options but those are the two that come up to my mind right away.
It really depends on your application use case. For example, a chat application would like to have a socket open between it and the server so it can update frequently on new messages, etc. On the other had some simple Calendar applications might want to use push notifications to send reminders on certain dates and times.

Why and how SSE (Server-Sent Events) are unidirectional

https://developer.mozilla.org/en-US/docs/Web/API/EventSource
The EventSource interface is web content's interface to server-sent events. An EventSource instance opens a persistent connection to an HTTP server, which sends events in text/event-stream format. The connection remains open until closed by calling EventSource.close().
From what I understand server-sent events require persistent HTTP connection (Connection: keep-alive) so similarly to keeping the connection alive like in case of web sockets.
If the connection is persistent, why server-sent events are unidirectional? Web socket connections are persistent as well.
In this case, what happens if I send a request to my HTTP service and I have persistent connection opened due to EventSource. Will it re-use HTTP connection opened by EventSource or open a new connection?
If it re-uses the connection opened by EventSource how is it considered unidirectional?
Might be trivial, but I had to ask because it is not clear. Because nothing mentions what happens to subsequent HTTP requests when there's existing connection opened by EventSource.
For example, it seems possible to me to implement centralized chat app using SSE:
User 1 sends message to User 2(by sending it to HTTP server). Server sends event to user 2 with a new message, user 2 sends another request to HTTP server with message for User 1, server sends event to user 1.
How is that not considered bi-directional?
Related:
What's the behavioral difference between HTTP Stay-Alive and Websockets?
SSE is unidirectional because when you open a SSE connection, only the server can send data to the client (browser, etc.). The client cannot send any data. SSE is a bit older than WebSockets, hence may be the difference between the unidirectional and bi-directional support between these two technos.
In your use-case, if you open a SSE connection (which is an HTTP connection), only the server will be able to send data. If you wish to send a request to your HTTP service, you will need to open a new "classical" HTTP connection. You will see your browser opening two HTTP connections: 1 for the SSE connection and 1 for the classical HTTP request (short live).
You can implement a chat with SSE. You can have a SSE connection (hence HTTP) to let the user receives the messages from the server. And you can use POST HTTP requests to enable the user to send his/her messages.
Note that most of the browsers can open around 6 HTTP/1.x connections to the same host. So, if you use 1 SSE connection, it will remain potentially 5 HTTP/1.x connections. This is only true with HTTP/1.x. With HTTP 2.x, the connections to the same host are multiplexed: so, in theory, you can send as many HTTP requests at the same time as you wish or you can open as many SSE connections as you wish and thus, by passing the limitation of the 6 connections.
You can have a look at this article (https://streamdata.io/blog/push-sse-vs-websockets/) and this video (https://www.youtube.com/watch?v=NDDp7BiSad4) to get an insight about this technology and whether it could fit your needs. They summarize pros & cons of both SSE and WebSockets.

How can a third person read the HTTP request headers, if those are transported via HTTP (insecure)?

My question is about networking. I'm just looking for a simple answer, yet I couldn't find one after 1 hour research. I know there are techniques such as Wi-Fi Hotspot, man-in-the-middle-attack, local network, echo switch, etc. But I couldn't find an answer to my specific question.
Let's say, client A wants to communicate with server B, and server B says client A must authenticate himself via HTTP basic authentication first. My question is, what happens if client A sends the authentication credentials via HTTP layer (insecure), who can read the HTTP headers that the client A sends to server B over the internet? Would it be easy to do that? Like placing a breakpoint between two arbitrary routers, which help to transfer the packets across the internet, in order to read those headers? How does it work in general?
Thank you!
PS.: I am not trying to learn and do it. I just want to know, how dangerous it would be, if the HTTP basic auth is made via the insecure HTTP layer.
Who can read the HTTP headers that the client A sends to server B over
the internet?
Your Network Provider (e.g Wi-fi hotspot Provider).
Your Domain Name System server (DNS, as 192.168.1.1).
Your Internet Service Provider (ISP).
Your Virtual Private Network if using one (VPN server).
Yourself Or a Virus.
and here comes the HTTPS (HTTP + SSL Encryption)
SSL is about communicating in a language that you and the server only understand.
How dangerous it would be if the HTTP basic auth is made via the insecure HTTP layer?
Well, from above, You can totally get that a simple virus or even a public Wi-fi Hotspot Device can capture and see all of your data if the communication was done in a plain HTTP Socket.
A Simple packet may contain all of your Device information including its basic contents as your passwords, credit cards information, The HTML form for the signup/login that you've just completed with all its data, VoIP Calls and messages being sent to the server + upcoming/received ones.
that's why we need SSL encryption and the server should have a valid SSL certificate too.
By the way, your device may have sent thousands of packets while you read this now!
Capturing the packets that your device sends or even the packets that other devices on your network send can be done through any packet capturing tool or software as Wireshark.

Not able to access Server-Sent-Events over Mobile 3g Network

I am having an issue with Server Sent events.
My endpoint is not available on mobile 3G network.
One observation I have is that a https endpoint like the one below which is available on my mobile network.
https://s-dal5-nss-32.firebaseio.com/s1.json?ns=iot-switch&sse=true
But the same endpoint when proxy passed using an nginx and accessed over http (without ssl) is not available on my mobile network.
http://aws.arpit.me/live/s1.json?ns=iot-switch&sse=true
This is available on my home/office broadband network though. Only creates an issue over my mobile 3g network.
Any ideas what might be going on?
I read that mobile networks use broken transparent proxies that might be causing this. But this is over HTTP.
Any help would be appreciated.
I suspect the mobile network is forcing use of an HTTP proxy that tries to buffer files before forwarding them to the browser. Buffering will make SSE messages wait in the buffer.
With SSE there are a few tricks to work around such proxies:
Close the connection on the server after sending a message. Proxies will observe end of the "file" and forward all messages they've buffered.
This will be equivalent to long polling, so it's not optimal. To avoid reducing performance for all clients you could do it only if you detect it's necessary, e.g. when a client connects always send a welcome message. The client should expect that message and if the message doesn't arrive soon enough report the problem via an AJAX request to the server.
Send between 4 and 16KB of data in SSE comments before or after a message. Some proxies have limited-size buffers, and this will overflow the buffer forcing messages out.
Use HTTPS. This bypasses all 3rd party proxies. It's the best solution if you can use HTTPS.

Can I reuse my existing TCP-Server?

At the moment I have an existing application which basically consists of a desktop GUI and a TCP server. The client connects to the server, and the server notifies the client if something interesting happens.
Now I'm supposed to replace the desktop GUI by a web GUI, and I'm wondering if I have to rewrite the server to send http packets instead of tcp packets or if I can somehow use some sort of proxy to grab the tcp packets and forward them to the web client?
Do I need some sort of comet server?
If you can make your client ask something like "Whats new pal?" to your server from time to time you can start implementing HTTP server emulator over TCP - its fun and easy process. And you could have any web based GUI.
You can just add to your TCP responds Http headers - itll probably do=)
So I mean HTTP is just a TCP with some headers like shown in here.
You should probably install fiddler and monitor some http requests/ responses you normally do on the web and you'll get how to turn your TCP server into http emulator=)
If you want keep sockets based approche use flash (there is some socket api) or silverlight (there is socket API and you can go for NetTcpBinding or Duplexbinding something like that - it would provide you with ability to receive messages from server when server wants you to receive them (server pushes messages))
So probably you should tall us which back end you plan to use so we could recomend to you something more usefull.

Resources