ZeroMQ: Internal queuing policy before having a valid connection - networking

Say if I have an inproc-PAIR messaging system. Before the Receiver connects to the Sender, the Sender is already bound and starts sending out messages right away.
Now, before the connection succeeds, will ZMQ choose to do one of the following:
queue up those messages internally
block Sender from sending those messages
simply discard them?
I know that ZMQ has an internal queue of 1000 messages, and read that with PubSub it will send out messages right after binding happens and thus will lose messages. But I'm not sure about other protocols.

From the ZeroMQ documentation :
When a PAIR socket enters the mute state due to having reached the high water mark for the connected peer, or if no peer is connected, then any send operations on the socket will block until the peer becomes available for sending; messages are not discarded.
So your first two options are correct; it's queuing messages till you reach hwm then blocks the sender.
It is not dicarding messages.

Related

Why do XMPP messages sometimes get lost on mobile devices

This question asks what to do about loosing XMPP messages on mobile devices when they don't have a stable connection, but I don't really see why the packages get lost in the first place.
I remember having read that the stream between the server and the client stays open when the connection is suddenly lost and will only be destroyed once the connection times out. This means that the server sends arriving messages over the stream, even though the disconnected client can't receive those messages anymore.
I was happy with that explanation for some time, but started wondering why core XMPP would be lacking such an important feature. Eventually I noticed that ensuring correct transmission in the XMPP protocol would be redundant, as the underlying TCP should already ensure the proper transmission of the message, but as the various problems that arise from the lost message it seems that this isn't true.
Why isn't TCP enough to ensure that the message is either correctly sent or fails properly so the server knows it has to send the message later?
Why isn't TCP enough to ensure a proper transmission (or proper error handling, so the server knows the message has to be sent again) in this scenario?
Application gives the data that needs to be sent across to its TCP. TCP segments the data as needed and sends them out on established connection. Application passes over the burden of ensuring the packet reaches the other end to TCP. ( This does not mean,application should not have re-transmissions. Application level protocol can define re-send of messages if right response didn't come)
TCP has the mechanism of the Re-transmissions. Every packet sent to peer needs to be acknowledged. Until the acknowledgements come in TCP shall have the packets in its sendQ. Once the acknowledgement for the packets sent is received, they are removed.
If there is packet loss, acknowledgements don't arrive. TCP does the re-transmissions. Eventually gives up.Notifies application which needs to take action. Packet loss can happen beyond TCPs control. Thus TCP provides best-effort reliable service.

BLE indications

As I understand, BLE indications are a reliable communications method. How do you know if your indications was not communicated. I am writing code for the peripheral/server and currently when I send a notifications, I get a manual response from the central. I read that if I use indications, the acknowledges take place in the L2CAP layer automatically and communications is therefore faster, but how does my embedded controller know the Bluetooth module was not successful at getting the packet across the link? We are using the Microchip RN4030 Bluetooth module.
Let's make things clear.
The BLE stack looks roughly like the following. The stack has these layers in this order:
Link Layer
HCI (if controller and host are separated)
L2CAP
ATT
GATT
Application
The Link Layer is a reliable protocol in the sense that all packets are protected by a CRC and every packet is acknowledged by the receiving device. If a packet is not acknowledged, it is resent until an acknowledge is received. There can also only be one outstanding packet, which means no reordering of packets are possible. Acknowledges are normally not being presented to upper layers.
The HCI layer is the communication protocol between the controller and the host.
The L2CAP layer does almost nothing if you use the standard MTU size of 23. It has a length header and a type code ("channel") which indicates what type of packet is being sent (usually ATT).
On the ATT level, there are two types of packets that are sent from the server that are not sent as a response to a client request. Those are notifications and indications. Sending one notification or indication has the same "performance" since the type is just a tag of a packet that's sent over the lower layers. The differences are listed below:
Indications:
When an indication packet is sent to the client, the client must acknowledge the packet by sending a confirmation packet when it has received the indication packet. Even if the indication packet is invalid, a confirmation shall be sent back.
Upper layers are not involved sending back the confirmation.
The server may not send a new indication until it has received a confirmation from the previous one.
The only time you won't receive a confirmation after an indication is if the link is dropped (you should then get a disconnected event), or there is some bug in some of the BLE stacks.
After the app has sent an indication, most BLE stack confirms to the app that that a confirmation has been received by the client as that the indication operation has completed.
Notifications:
No ATT layer acknowledges are sent.
"Commands and notifications that are received but cannot be processed, due to buffer overflows or other reasons, shall be discarded. Therefore, those PDUs must be considered to be unreliable." However I have never noticed an implementation actually following this rule, i.e. all notifications are delivered to the application. Or I've never hit the max buffer size.
The GATT layer is mostly a definition of how the attribute protocol should be used and defines a DB structure of characteristics.
Implications
According to my opinion, there are several flaws or issues with the BLE standard. There are two things that might make indications useless and unreliable in practice:
There are no way to send back an error response instead of a confirmation.
The fact that it is the ATT layer that sends back the confirmation directly when it has received the indication, and not the app when it has successfully handled the indication.
This means that if for example, some bug or other issue causing that the BLE stack couldn't send the indication to the app, or your app crashed, or your app found your value to be invalid, there is no way your peripheral can aware of that. Since it got the confirmation it thinks everything is fine.
I can't understand why they defined indications this way. Since the app doesn't send the confirmation but a lower layer does, there seems to be no point at all in having an ATT layer acknowledge instead of just using the Link Layer acknowledge. Both are just acknowledges that the packet has been received halfway of its destination.
If we draw a parallel to a HTTP POST and internet, we could consider the Link Layer acknowledge as when the network card of the destination receives the request and the ATT confirmation as a confirmation that the TCP stack received the packet. You have no way of knowing that your web server software actually did receive your request, and it processed it with success.
The fact that notifications are allowed to be dropped by the receiver is also bad. Normally notifications are used if the peripheral wants to stream a lot of data to the central and then you don't want dropped packets. They should have designed the flow control so that the Link Layer stopped acknowledge incoming packets instead until the app are ready to process the next notifications. This is even already implemented at the LL + HCI + Host layers.
Windows
One interesting thing about at least the Windows BLE stack is, if it receives indications faster than the app processes them it starts to drop the indications as well, even though only notifications should be allowed to be dropped due to "buffer overflows or other reasons". It buffers at most 512 indications in my tests.
That said
Just use notifications and if you want some kind of confirmation, let the client send a write packet when it has received your data and successed processing it.

How to identify pushback under Win IOCP?

How can I identify TCP pushback when using IOCP? I.e. how can I find out that the receiver is not receiving, that tx/rx buffers on both sides of connection are full and that the sender should cease to send more data?
With any async TCP send operation the way to determine the rate that the peer is receiving data is to monitor the rate of send completions on the sender.
I've written about this in depth here. In summary, when the receiver's buffers fill and TCP flow control is in operation and the TCP window is reduced the sender cannot send which causes the sender's TCP buffers to fill. This then means that async send requests can not complete. If you track the number of outstanding send requests that are pending you can spot this situation and throttle the sender.
When this happens, the send won't complete.

Disconnect and Reconnect a connected datagram socket

Iam trying to create an iterative server based on datagram sockets (UDP).
It calls connect to the first client which it gets from the first recvfrom() call (yes I know this is no real connect).
After having served this client, I disconnect the UDP socket (calling connect with AF_UNSPEC)
Then I call recvfrom() to get the first packet from the next client.
Now the problem is, that the call of recvfrom() in the second iteration of the loop is returning 0. My clients never send empty packets, so what could be going on.
This is what Iam doing (pseudocode):
s = socket(PF_INET, SOCK_DGRAM, 0)
bind(s)
for(;;)
{
recvfrom(s, header, &client_address) // get first packet from client
connect(s,client_address) // connect to this client
serve_client(s);
connect(s, AF_UNSPEC); // disconnect, ready to serve next client
}
EDIT: I found the bug in my client accidently sending an empty packet.
Now my problem is how to make the client wait to get served instead of sending a request into nowhere (server is connected to another client and doesn't serve any other client yet).
connect() is really completely unnecessary on SOCK_DGRAM.
Calling connect does not stop you receiving packets from other hosts, nor does it stop you sending them. Just don't bother, it's not really helpful.
CORRECTION: yes, apparently it does stop you receiving packets from other hosts. But doing this in a server is a bit silly because any other clients would be locked out while you were connect()ed to one. Also you'll still need to catch "chaff" which float around. There are probably some race conditions associated with connect() on a DGRAM socket - what happens if you call connect and packets from other hosts are already in the buffer?
Also, 0 is a valid return value from recvfrom(), as empty (no data) packets are valid and can exist (indeed, people often use them). So you can't check whether something has succeeded that way.
In all likelihood, a zero byte packet was in the queue already.
Your protocol should be engineered to minimise the chance of an errant datagram being misinterpreted; for this reason I'd suggest you don't use empty datagrams, and use a magic number instead.
UDP applications MUST be capable of recognising "chaff" packets and dropping them; they will turn up sooner or later.
man connect:
...
If the initiating socket is not connection-mode, then connect()
shall set the socket’s peer address, and no connection is made.
For SOCK_DGRAM sockets, the peer address identifies where all
datagrams are sent on subsequent send() functions, and limits
the remote sender for subsequent recv() functions. If address
is a null address for the protocol, the socket’s peer address
shall be reset.
...
Just a correction in case anyone stumbples across this like I did. To disconnect connect() needs to be called with the sa_family member of sockaddr set to AF_UNSPEC. Not just passed AF_UNSPEC.

How does TCP/IP report errors?

How does TCP/IP report errors when packet delivery fails permanently? All Socket.write() APIs I've seen simply pass bytes to the underlying TCP/IP output buffer and transfer the data asynchronously. How then is TCP/IP supposed to notify the developer if packet delivery fails permanently (i.e. the destination host is no longer reachable)?
Any protocol that requires the sender to wait for confirmation from the remote end will get an error message. But what happens for protocols where a sender doesn't have to read any bytes from the destination? Does TCP/IP just fail silently? Perhaps Socket.close() will return an error? Does the TCP/IP specification say anything about this?
TCP/IP is a reliable byte stream protocol. All your bytes will get to the receiver or you'll get an error indication.
The error indication will come in the form of a closed socket. Regardless of what the communication pattern (who does the sending), if the bytes can't be delivered, the socket will close.
So the question is, how do you see the socket close? If you're never reading, you'd eventually get an error trying to write to the closed socket (with ECONNRESET errno, I think).
If you have a need to sleep or wait for input on another file handle, you might want to do your waiting in a select() call where you include the socket in the list of sources you're waiting on (even if you never expect to receive anything). If the select() indicates that the socket is ready for a read call, you may get a -1 return (with ECONNRESET, I think). An EOF would indicate an orderly close (other side did a shutdown() or close().
How to distinguish this error close from a clean close (other program exiting, for example)? The errno values may be enough to distinguish error from orderly close.
If you want an unambiguous indication of a problem, you'll probably need to build some sort of application level protocol above the socket layer. For example, a short "ack" message sent by the receiver back to the sender. Then the violation of that higher level application protocol (sender didn't see an ack) would be a confirmation that it was an error close vs a clean close.
The sockets API has no way of informing the writer exactly how many bytes have been received as acknowledged by the peer. There are no guarantees made by the presence of a successful shutdown or close either.
The TCP/IP specification says nothing about the application interface (which is nearly always the sockets API).
SCTP is an alternative to TCP which attempts to address these shortcomings, among others.
In C, if you write to a socket that has failed with send(), you will get back the number of bytes that were sent. If this does not match the number of bytes you meant to send, then you have a problem. But also, when you write to a failed socket, you get SIGPIPE back. Before you start socket handling, you need to have a signal handler in place that will alert you when you get SIGPIPE.
If you are reading from a socket, you really should wrap it with an alarm so you can timeout. Like "alarm(timeout_val); recv(); alarm(0)". Check the return code of recv, and if it's 0, that indicates that the connection has been closed. A negative return result indicates a read failure and you need to check errno.
TCP is built upon the IP protocol, which is the centerpiece for the Internet, providing much of the interoperability that drives Routing, which is what determines how to get packets from their source to their destination. The IP protocol specifies that error messages should be sent back to the sender via Internet Control Message Protocol(ICMP) in the case of a packet failing to get to the sender. Some of these reasons include the Time To Live(TTL) field being decremented to zero, often meaning that the packet got stuck in a routing loop, or the packet getting dropped due to switch contention causing buffer overruns. As others have said, it is the responsibility of the Socket API that is being used to relay these errors at the IP layer up to the application interacting with the network at the TCP layer.
TCP/IP packets are either raw, UDP, or TCP. TCP requires each byte to be acked, and it will re-transmit bytes that are not acked in time. raw, and UDP are connectionless (aka best effort), so any lost packets (barring some ICMP cases, but many of these get filtered for security) are silently dropped. Upper layer protocols can add reliability, such as is done with some raw OSPF packets.

Resources