When I read about websockets, heartbeats are usually mentioned as a must have. MDN even writes about a special opcode for heartbeats.
But are heartbeats a mandatory part of websockets? Do I have to implement them, or else my websockets will be terminated by browsers or because of some other standard?
The RFC 6455, the current reference for the WebSocket protocol, defines some control frames to communicate state about the WebSocket:
Close: 0x8
Ping: 0x9
Pong: 0xA
Ping and Pong are used for heartbeats and allow you to check whether the other endpoint is still responsive. See the quote below:
A Ping frame may serve either as a keepalive or as a means to
verify that the remote endpoint is still responsive.
But when an endpoint (for example, the client) receives a Ping, it must send a Pong back to the sender. See the quote:
Upon receipt of a Ping frame, an endpoint MUST send a Pong frame in
response, unless it already received a Close frame. It SHOULD
respond with Pong frame as soon as is practical.
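To make the frame layout concrete, here is a minimal sketch in Python of how a Ping frame and the matching Pong could be encoded by hand. It assumes the RFC 6455 base framing (FIN bit plus opcode in the first byte, mask bit plus 7-bit length in the second, a 4-byte masking key for client-to-server frames) and the rule that a Pong carries the same application data as the Ping it answers. The function names are made up for illustration and are not part of any library.

```python
import os
from typing import Tuple

OPCODE_PING = 0x9
OPCODE_PONG = 0xA


def build_control_frame(opcode: int, payload: bytes = b"", mask: bool = False) -> bytes:
    """Encode a single unfragmented control frame (hypothetical helper)."""
    if len(payload) > 125:
        raise ValueError("control frame payload must be at most 125 bytes")
    first = 0x80 | opcode  # FIN=1, RSV1-3=0, then the 4-bit opcode
    if not mask:
        # Server-to-client frames are sent unmasked.
        return bytes([first, len(payload)]) + payload
    # Client-to-server frames are masked with a random 4-byte key.
    key = os.urandom(4)
    masked = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
    return bytes([first, 0x80 | len(payload)]) + key + masked


def parse_control_frame(frame: bytes) -> Tuple[int, bytes]:
    """Decode a control frame produced by build_control_frame."""
    opcode = frame[0] & 0x0F
    is_masked = bool(frame[1] & 0x80)
    length = frame[1] & 0x7F
    if length > 125:
        raise ValueError("not a valid control frame")
    if is_masked:
        key = frame[2:6]
        data = frame[6:6 + length]
        payload = bytes(b ^ key[i % 4] for i, b in enumerate(data))
    else:
        payload = frame[2:2 + length]
    return opcode, payload


def pong_for(ping_frame: bytes, mask: bool = False) -> bytes:
    """Build the Pong that answers a Ping, echoing its application data."""
    opcode, payload = parse_control_frame(ping_frame)
    if opcode != OPCODE_PING:
        raise ValueError("expected a Ping frame")
    return build_control_frame(OPCODE_PONG, payload, mask=mask)
```

In practice a WebSocket library does this for you; the sketch is only meant to show how small these frames are on the wire.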
Bottom line
When designing both client and server, supporting heartbeats is up to you. But if you need to check if the connection is still alive, Ping and Pong frames are the standard way to do it.
Just keep in mind that if a Ping is sent and a Pong is not sent back, one peer may assume that the other peer is not alive anymore.
Whether it is mandatory depends on the client and server implementations. If you are connected to a server that requires you to answer a PING with a PONG, you will probably be disconnected if you don't reply. The same applies if you are the server and a client is sending you PINGs.
Server and client implementations vary (there are myriad of them), but
the browser's JavaScript client does not send PINGs and does not provide any API to do so, although it replies to PINGs with PONGs.
Pings and Pongs are not mandatory. They are useful, since they allow the detection of dropped connections. (Without some traffic on the wire, there is no way to detect a dropped connection.)
Note that in the browser, WebSocket heartbeats are not accessible. If you require your browser client code to detect dropped connections, then you have to implement heartbeating at the application level.
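As a rough illustration of application-level heartbeating, the bookkeeping can be as small as the sketch below. It is written in Python for brevity, and the same logic carries over to browser JavaScript. The class name, the default intervals and the injectable clock are all assumptions made for the example, not part of any WebSocket API.

```python
import time


class HeartbeatMonitor:
    """Tracks when to send a heartbeat and when to give up on the peer."""

    def __init__(self, interval: float = 30.0, timeout: float = 10.0,
                 clock=time.monotonic):
        self.interval = interval    # seconds between our heartbeat messages
        self.timeout = timeout      # extra grace period before declaring loss
        self._clock = clock
        now = clock()
        self._last_received = now   # last time any message arrived
        self._last_sent = now       # last time we sent a heartbeat

    def on_message(self) -> None:
        """Call for every message received, heartbeat reply or not."""
        self._last_received = self._clock()

    def ping_due(self) -> bool:
        """True when it is time to send the next heartbeat message."""
        return self._clock() - self._last_sent >= self.interval

    def mark_ping_sent(self) -> None:
        self._last_sent = self._clock()

    def connection_lost(self) -> bool:
        """True when nothing has arrived for interval + timeout seconds."""
        return self._clock() - self._last_received > self.interval + self.timeout
```

The application would call ping_due() from a timer, send its own "ping" message over the WebSocket when it returns True, and close and reconnect once connection_lost() returns True.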
As far as I understand from "Is it possible to handle TCP flags with TCP socket?" and what I've read so far, a server application does not handle and cannot access TCP flags at all.
And from what I've read in RFCs the PSH flag tells the receiving host's kernel to forward data from receive buffer to the application.
I've found this interesting read https://flylib.com/books/en/3.223.1.209/1/ and it mentions that "Today, however, most APIs don't provide a way for the application to tell its TCP to set the PUSH flag. Indeed, many implementors feel the need for the PUSH flag is outdated, and a good TCP implementation can determine when to set the flag by itself."
"Most Berkeley-derived implementations automatically set the PUSH flag if the data in the segment being sent empties the send buffer. This means we normally see the PUSH flag set for each application write, because data is usually sent when it's written."
If my understanding is correct and the TCP stack decides by itself, based on various conditions, when to set the PSH flag, then what can I do if the TCP stack doesn't set the PSH flag when it should?
I have a server application written in Java and a client written in C. There are 1000 clients, each on a separate host, and they all connect to the server. A mechanism that acts as a keep-alive has the server send each client a request for some info every 60 seconds. The response is always smaller than the MTU (1500 bytes), so the response segments should always have the PSH flag set.
At some point a client sent 50 replies to a single request, none of them with the PSH flag set. The buffer probably filled up before the client had even sent the same reply a third or fourth time, and the receiving app threw an exception because it got more data from the host's receive buffer than it was expecting.
My question is: what can I do in such a situation if I cannot communicate with the TCP stack at all?
P.S. I know that the client should not send more than one reply, but in normal operation all the replies have the PSH flag set, and in this particular situation they didn't, which is not the application's fault.
SSEs are advertised as a unidirectional communication tool to be used from server to client. I have a requirement to broadcast data to all clients, so I was wondering how SSEs behave at a low level. I cannot seem to find any low-level information about SSEs online.
Primarily I would like to know whether, after sending the data, the server waits for a response from the client confirming that it has received the data before finishing the "send". That would mean that doing a broadcast using a for loop would be quite dangerous and slow, in which case websockets might be the better option.
Perhaps the implementation depends entirely on the language and framework? Is it not standardized?
Broadcast usually uses UDP, which does not wait for a response. An answer to "Broadcasting ip:port by socket server" says:
UDP Packet: First four bytes as a magic number, next four bytes an IPv4 address (and you might want to add other things like a server name).
The magic number is just in case there is a collision with another application using the same port. Check both the length of the packet and the magic number.
Server would broadcast the packet at something like 30 second time intervals. (Alternatively you could have the server send a response only when a client sends a request via broadcast.)
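The packet layout described in that quote could look roughly like this in Python. It is only a sketch: the magic value and the function names are invented for the example, and the optional extras such as a server name are left out.

```python
import socket
import struct
from typing import Optional

# Hypothetical magic number; any value agreed on by client and server works.
MAGIC = 0x53525652

# Network byte order: 4-byte magic number followed by a 4-byte IPv4 address.
PACKET = struct.Struct("!I4s")


def encode_announcement(ip: str) -> bytes:
    """Build the 8-byte announcement the server would broadcast."""
    return PACKET.pack(MAGIC, socket.inet_aton(ip))


def decode_announcement(data: bytes) -> Optional[str]:
    """Return the announced IPv4 address, or None if the packet is not ours."""
    if len(data) != PACKET.size:    # check the length first
        return None
    magic, raw_ip = PACKET.unpack(data)
    if magic != MAGIC:              # then check the magic number
        return None
    return socket.inet_ntoa(raw_ip)
```

The server would pass the result of encode_announcement() to sendto() on a UDP socket that has SO_BROADCAST enabled, and clients would run decode_announcement() on whatever recvfrom() returns.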
So the client app would have to send a request back to the server app.
Different protocols would get different responses according to the underlying technology, e.g. HTTP uses responses extensively.
SSE and WebSockets are both over TCP, so there could be a wait before the socket could be used to send further data.
However, each client is a dedicated socket. So server-side you would be using threads or async coding (depending on the server-side language and its conventions). So looping through all the sockets to send a message to each client would be fine and quick.
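For what it's worth, here is one way the "loop over all clients" idea could be sketched with asyncio in Python. The broadcast loop only drops the already formatted event into a per-client queue and never waits for any client, and each connection's handler would drain its own queue and write to its own socket. The Broadcaster class, the queue size and the policy of dropping slow clients are assumptions made for the example. The "data:" and "event:" lines followed by a blank line are the standard text/event-stream framing.

```python
import asyncio
from typing import Optional, Set


def format_sse(data: str, event: Optional[str] = None) -> bytes:
    """Encode one Server-Sent Event in the text/event-stream format."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    # Each line of the payload gets its own "data:" field.
    for line in data.splitlines() or [""]:
        lines.append(f"data: {line}")
    # A blank line terminates the event.
    return ("\n".join(lines) + "\n\n").encode("utf-8")


class Broadcaster:
    """Fans one message out to a queue per connected client."""

    def __init__(self, max_pending: int = 100):
        self._queues: Set[asyncio.Queue] = set()
        self._max_pending = max_pending

    def subscribe(self) -> asyncio.Queue:
        queue: asyncio.Queue = asyncio.Queue(maxsize=self._max_pending)
        self._queues.add(queue)
        return queue

    def unsubscribe(self, queue: asyncio.Queue) -> None:
        self._queues.discard(queue)

    def broadcast(self, data: str, event: Optional[str] = None) -> int:
        """Queue the event for every client without blocking on any of them."""
        payload = format_sse(data, event)
        delivered = 0
        for queue in list(self._queues):
            try:
                queue.put_nowait(payload)
                delivered += 1
            except asyncio.QueueFull:
                # A client that cannot keep up is dropped rather than waited on.
                self._queues.discard(queue)
        return delivered
```

Each client handler would call subscribe(), then repeatedly await queue.get() and write the bytes to its response stream, so a slow client only delays itself.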
I think it relates just to the TCP layer, but I describe my setup in the following paragraph:
On Google Compute Engine I set up an HTTP and websocket server (python, geventwebsocket+gevent.WSGIServer). At home I have my computer (esp8266) that connects to it using websockets.
I use websockets because I need bidirectional communication (a couple of messages a day, it goes like this: a message from server, a response from client.) The connection itself is initiated by the client, as it's behind a NAT.
The problem is that a couple of seconds after the last packet exchange, messages from the server no longer arrive at the client. However, the client can still send packets to the server even minutes later (and possibly much longer). Interestingly, once it does, the (probably retransmitted) packets from the server finally arrive.
I verified with Wireshark that the packets are indeed sent from the server (and retransmitted if not ACKed), and I log all network communication on the client, so the problem probably isn't the application software. I get no exceptions in the applications. The connections are open.
I tested how long after the connection initiation or the last delivered packet the server can still send packets, and it's between 6 and 20 seconds, varying between tests. In the test the server sends out packets with a set, fixed delay between them.
In a test (a couple of packets) with a single set delay, usually either all packets arrive or none (if one doesn't arrive, the next won't either).
I suspect that might be because of the NAT. But then the one solution I see would be to periodically (every 6 seconds or less) send out keep alive packets (Pings and Pongs in websocket, or the TCP's keepalive) from the client. But that doesn't seem elegant, as there should be only a few data messages in a day.
And the similar thing happens when ssh'ing from my desktop to the server: after a couple seconds of inactivity at my and server side, the server stops sending anything (tested e.g. with watch -n20 date. Sometimes it just freezes and doesn't update until I press a key = send a packet from client. But the update is not instant in case of the ssh, it takes a couple of seconds after the keypress to see new stuff. Edit: of course that must be due to the retransmission timer algorithm)
So I studied the purpose of TCP keep-alive packets etc., and the thing is that routers and NATs forget connections or mappings after some time, or keep only the newest ones. (So I guess in the client->server direction the mapping is simply recreated, as the destination IP is public and is the actual server. In the opposite direction that is not possible, so it doesn't work.)
But I didn't think it could be as bad as 6 seconds. The websockets almost reduce to polling (although with a possibly smaller lag).
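For reference, on a client with a full socket API, TCP keepalive can be switched on per socket as in the Python sketch below. The 5-second values are only chosen to stay under the roughly 6-second window described above, and the function name is made up. The TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT options are platform-specific (they exist on Linux), so the sketch skips whichever ones the platform does not define. Whether an esp8266 TCP stack exposes equivalent settings is a separate question this does not answer.

```python
import socket


def enable_keepalive(sock: socket.socket, idle: int = 5,
                     interval: int = 5, count: int = 3) -> None:
    """Turn on TCP keepalive and, where supported, tune its timing."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    tuning = (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # unanswered probes before giving up
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:        # not every platform defines these
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```

The keepalive probes are empty segments, so they cost very little traffic, but they still wake the NAT mapping every few seconds just as application-level pings would.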
It seems that the router's NAT mechanism may be causing the problem. Maybe you can use a small tool such as NAT-PMP or UPnP to open a port and map it to your local client. The mapping should last long enough for you to do bidirectional communication.
I'm an application developer looking to learn more about the transport layer of my requests that I've been making all these years. I've also been learning more of the backend and am building my own live data service with websockets, which has me curious about how data actually moves around.
As such I've learned about TCP, and I understand how it works, but there's still one term that confuses me-- a "TCP Connection". I have seen it everywhere, and actually there was a thread opened with the exact same question... but as the OP said in the comments, nobody actually answered the question:
TCP vs UDP - What is a TCP connection?
"when we say that there is a connection established between two hosts,
what does that mean? If I could get a magic microscope and inspect the
server or the client, and - a-ha! - find the connection, what would I
be looking at? Some variable allocated by the OS code? Some entry in
some kind of table? How and when does that gets there, and how and
when it is removed from there"
I've been reading to try to figure this out on my own,
Here is a nice resource that details HTTP flow, also mentions "TCP Connection"
https://blog.catchpoint.com/2010/09/17/anatomyhttp/
Here is another thread about HTTP Keep-alive, same "TCP Connection":
HTTP Keep Alive and TCP keep alive
My understanding:
When a client wants data from a server, the SYN/ACK handshake happens, this "connection" is established, and both parties agree on the starting sequence number, maximum packet size, etc.
As long as this "connection" is still open, the client can request/receive data without doing another handshake. TCP Keep-alive sends a heartbeat to keep this "connection" open.
1) Somehow a HTTP Header "Keep-alive" also keeps this TCP "connection" open, even though HTTP headers are part of the packet payload and it doesn't seem to make sense that the TCP layer would parse the HTTP headers?
To me it seems like a "connection" between two machines in the literal sense can never be closed, because a client is always free to hit a server with packets (like the first SYN packet, for example)
2) Is a TCP "connection" just the client and server saving the sequence number from the other's IP address? maybe it's just a flag that's saying "hey this client is cool, accept messages from them without a handshake"? So would closing a connection just be wiping that data out from memory?
... both parties agree on the starting sequence number
No, they don't "agree" on a number. Each direction has its own sequence numbering. The client sends, in its SYN to the server, the initial sequence number (ISN) for the data from client to server; the server sends, in its SYN, the ISN for the data from server to client.
Somehow a HTTP Header "Keep-alive" also keeps this TCP "connection" open ...
Not really. With HTTP keep-alive the client just asks the server nicely not to close the connection after the HTTP response has been sent, so that another HTTP request can be sent over the same TCP connection. The server may or may not follow the client's wish.
To me it seems like a "connection" between two machines in the literal sense can never be closed,
Each side can send a packet with the FIN flag to signal that it will no longer send any data. If both sides have sent a FIN, then the connection is considered closed, since no one will send anything and thus nothing can be received. If one side decides that it does not want to receive any more data, it can send a packet with the RST flag.
Is a TCP "connection" just the client and server saving the sequence number from the other's IP address?
Kind of. Each side saves the current state of the connection, i.e. the IPs and ports involved, the currently expected sequence number for receiving, the current sequence number for sending, outstanding bytes which have not been ACKed yet ... If no such state exists (for example because one side crashed), then there is no connection.
... maybe it's just a flag that's saying "hey this client is cool, accept messages from them without a handshake"
If a packet is received which fits an existing state, then it is considered part of the connection, i.e. it will be processed and the state will be updated.
So would closing a connection just be wiping that data out from memory?
Closing means telling the other side that no more data will be sent (using FIN), and once both sides have done so, both can basically remove the state and then there is no connection anymore.
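To answer the "magic microscope" part more concretely, the per-connection state described above can be pictured as an entry in a table keyed by the two IP/port pairs. The Python sketch below is a deliberately simplified toy model and not how any real kernel lays out its data: the class and field names are invented, and real TCP has more states (for example TIME_WAIT) and many more fields.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# (local IP, local port, remote IP, remote port) identifies one connection.
ConnKey = Tuple[str, int, str, int]


@dataclass
class ConnState:
    snd_nxt: int                 # next sequence number we will send
    rcv_nxt: int                 # next sequence number we expect to receive
    unacked_bytes: int = 0       # sent but not yet acknowledged
    fin_sent: bool = False
    fin_received: bool = False


class ConnectionTable:
    """Toy model: a "connection" exists only while its entry exists."""

    def __init__(self) -> None:
        self._table: Dict[ConnKey, ConnState] = {}

    def open(self, key: ConnKey, local_isn: int, remote_isn: int) -> None:
        # Each direction has its own ISN; the SYN itself consumes one number.
        self._table[key] = ConnState(snd_nxt=local_isn + 1,
                                     rcv_nxt=remote_isn + 1)

    def lookup(self, key: ConnKey) -> Optional[ConnState]:
        """An incoming packet belongs to a connection only if this finds state."""
        return self._table.get(key)

    def fin_sent(self, key: ConnKey) -> None:
        self._table[key].fin_sent = True
        self._remove_if_closed(key)

    def fin_received(self, key: ConnKey) -> None:
        self._table[key].fin_received = True
        self._remove_if_closed(key)

    def _remove_if_closed(self, key: ConnKey) -> None:
        state = self._table[key]
        if state.fin_sent and state.fin_received:
            # Both directions are finished: wipe the state, connection gone.
            del self._table[key]
```

A packet whose addresses and ports match no entry is simply not part of any connection, which is why a crashed and rebooted host no longer "has" its old connections.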
I have a client socket that pushes image data to a server socket after the connection handshake is done, and the server socket processes the data without sending any response.
It works well for a few minutes, but after some time the server socket stops receiving the data, and I couldn't figure out why. Is there any rule in TCP that says that if the client keeps pushing data, the server must say something or else the conversation will stop?
I wrote this code years ago, and to make it work I made the server return the string "ACK" as a response. However, it also works if I change that to any other string.
But now I want to figure out why, so that I can restructure the program.
"One-way" communication with TCP is totally fine unless you need an acknowledgment from the receiver on the sending side. But that's your application-level protocol. At the transport level the packets still flow both ways: TCP keeps sequence numbers in both directions and acknowledges them to the other side. This allows for detecting dropped/duplicate packets and for re-transmission, thus providing reliability of the stream. The window sizes advertised during the connection handshake and updated during the life of the conversation allow TCP to slow down a fast sender that would otherwise overwhelm a slow receiver.
What you really need to do is to record the TCP connection with a sniffer like tcpdump(1) or wireshark and find out what happens on the wire at the point when "socket stops getting those Data".