Understanding NATS clustering - nats.io

Section NATS Server Clustering states that:
Note that NATS clustered servers have a forwarding limit of one hop.
This means that each gnatsd instance will only forward messages that
it has received from a client to the immediately adjacent gnatsd
instances to which it has routes. Messages received from a route will
only be distributed to local clients. Therefore a full mesh cluster,
or complete graph, is recommended for NATS to function as intended and
as described throughout the documentation.
Let's assume that I have a NATS cluster of 3 nodes: A -> B -> C (-> denotes a route). Would you please let me know what will happen with NATS clients in the following scenario:
A message sent to node A
Node A suddenly terminates before delivering the message to node B
Thanks in advance

In the case you described, the message will be dropped.
Core NATS provides an "at most once" delivery guarantee, so if you cannot tolerate lost messages, your application needs to detect that a message never arrived at its destination and resend it. You might detect this via a timeout when using the request/reply pattern, or implement your own remediation for lost messages.
Alternatively, you can use NATS Streaming, which provides log-based persistence and sits atop NATS. It guarantees that a message will be delivered "at least once".
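A minimal pure-Python sketch of that detect-and-resend idea (the `LossyChannel` class and all names here are illustrative stand-ins, not the NATS client API):

```python
def send_with_retry(channel, msg, max_attempts=5):
    """Application-level remediation for at-most-once delivery:
    keep resending until an acknowledgement comes back, or give up."""
    for attempt in range(1, max_attempts + 1):
        ack = channel.request(msg)   # None models a timeout / dropped message
        if ack is not None:
            return attempt           # delivered and acknowledged
    raise TimeoutError(f"no ack after {max_attempts} attempts")

class LossyChannel:
    """Toy stand-in for a core NATS request/reply that drops some messages."""
    def __init__(self, drop_first_n):
        self.drops_left = drop_first_n
    def request(self, msg):
        if self.drops_left > 0:
            self.drops_left -= 1
            return None              # the message (or its reply) was lost
        return b"+ACK"

channel = LossyChannel(drop_first_n=2)
attempts = send_with_retry(channel, b"hello")  # succeeds on the 3rd try
```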

Related

Are Published messages sent to all nodes in a NATS cluster? Or only to nodes with local Subscribers to the message's Subject?

Let's say I have a simple 2-server NATS cluster with servers A and B.
If a client on Server A publishes a message to a Subject for which there are no subscribers on Server B, does that message still get sent from Server A to Server B?
I'm not sure what the goal is here; more details would help clarify the context. If you're asking specifically whether the nodes can pass messages published on another server to a subscribed client: yes. How long will a message stay? That depends on the configured expiration time. And what if a server goes down before it shares a message with the mesh? Unlikely, but it can happen.
Also, though you're probably just illustrating, an even number of nodes is generally not advisable in clusters, to avoid split-brain situations where it's impossible to determine the source of truth (the seed server in this case).
Yes. NATS clustered servers will forward messages received from a client to the immediately adjacent nats-server instances to which they have routes. Messages received from a route will only be distributed to local clients. There are no subscriber checks in the mesh.
It's not a black-and-white answer, because NATS supports many deployment architectures, with server-to-client communication, server-to-server cluster communication, and cluster-to-cluster communication through gateways to form super-clusters. Gateways, for example, optimize interest-graph propagation (i.e. they only send data to other clusters if there is interest). See the super-cluster with gateways documentation.
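The one-hop forwarding rule can be illustrated with a toy simulation (the `Server` class is a made-up model, not nats-server code). With a chain A -> B -> C instead of a full mesh, a message published on A never reaches C's clients:

```python
class Server:
    """Toy model of one clustered server and its one-hop forwarding rule."""
    def __init__(self, name):
        self.name = name
        self.routes = []       # adjacent servers this server has routes to
        self.clients = []      # local subscribers (plain lists collecting messages)

    def publish_from_client(self, msg):
        # A client-published message is delivered locally and forwarded one hop.
        self.deliver_local(msg)
        for neighbor in self.routes:
            neighbor.receive_from_route(msg)

    def receive_from_route(self, msg):
        # Messages from a route go to local clients only -- never re-forwarded.
        self.deliver_local(msg)

    def deliver_local(self, msg):
        for client in self.clients:
            client.append(msg)

a, b, c = Server("A"), Server("B"), Server("C")
a.routes = [b]
b.routes = [c]                       # chain A -> B -> C, not a full mesh
client_a, client_b, client_c = [], [], []
a.clients = [client_a]
b.clients = [client_b]
c.clients = [client_c]

a.publish_from_client("hi")          # reaches A's and B's clients, never C's
```

This is exactly why the docs recommend a full mesh: with a route from A to C as well, C's clients would get the message directly.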

What is the difference between DEALER and ROUTER socket archetype in ZeroMQ?

What is the difference between the ROUTER and the DEALER socket archetypes in zmq?
And which should I use, if I have a server, which is receiving messages and a client, which is sending messages? The server will never send a message to a client.
EDIT: I forgot to say that there can be several instances of the client.
For details on the ROUTER/DEALER Formal Communication Pattern, do not hesitate to consult the API documentation. ROUTER/DEALER (XREQ/XREP) has many important features, but none of them benefit your indicated use-case.
Many clients just send, one server just listens?
Given N clients that purely .send() messages to 1 server, which exclusively .recv()-s messages but never sends any message back,
the design may benefit from a PUB/SUB Formal Communication Pattern.
In case some other preferences outweigh this trivial approach, one may set up a more complex wiring using another one-way type of infrastructure based on PUSH/PULL, together with a reversed PUB/SUB setup as a signalling channel: each new client (the PUB side) .connect()-s to the SUB side, where a server-side .bind() access point sits on a known, static IP address. The client self-advertises on this signalling channel that it is alive (a keep-alive carrying its IP-address:port#), whereupon the server initiates a new PUSH-to-PULL .connect() onto the client-advertised, .bind()-ready PULL-side access point.
Complex? Rather a limitless tool; only our imagination is the limit.
After some time, one realises all the powers of a multi-functional SIG/MSG infrastructure, so do not hesitate to experiment and re-use the elementary archetypes in more complex, mutually cooperating distributed computing systems.
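For the simple case, here is a minimal pyzmq sketch of the N-senders/one-receiver fan-in using PUSH/PULL (it assumes the `pyzmq` package and uses the `inproc` transport so the whole sketch runs in one process; over a network you would bind/connect `tcp://` endpoints instead):

```python
import zmq

ctx = zmq.Context.instance()

# The one server exclusively receives: it binds a PULL socket.
sink = ctx.socket(zmq.PULL)
sink.bind("inproc://sink")

# N clients exclusively send: each connects a PUSH socket and never receives.
senders = []
for i in range(3):
    push = ctx.socket(zmq.PUSH)
    push.connect("inproc://sink")
    push.send_string(f"message from client {i}")
    senders.append(push)

# The server drains all three messages (sorted only to make the result stable).
received = sorted(sink.recv_string() for _ in range(3))

for s in senders + [sink]:
    s.close()
ctx.term()
```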

Does TCP (Transmission Control Protocol) provide at-most-once, at-least-once or exactly-once delivery

I've heard it said that providing exactly-once delivery is almost impossible. At the same time, TCP is said to provide guaranteed delivery. If TCP does not provide exactly-once guaranteed delivery, then does it provide at-most-once or at-least-once delivery?
We could say that TCP provides at-least-once delivery and exactly-once processing, with regard to the following definitions:
At-least-once delivery: a TCP message will be delivered at least once to the destination. More specifically, the sender will keep retransmitting with specific timeouts until an ACK (acknowledgement) is received, so the message will eventually be delivered. However, if some of these retransmissions were not lost (but just delayed), then more than one copy of the message will be delivered.
Exactly-once processing: each TCP message will be processed by the destination node exactly once. More specifically, the destination watches out for duplicate messages (checking the sequence numbers of each received segment). So, even if a message is delivered twice, the destination node will only process it (pass it to the application level) once and ignore the duplicates received later.
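That duplicate suppression can be sketched as a toy receiver that tracks sequence numbers (real TCP numbers bytes and uses a sliding window, not per-message IDs; this only shows the idea):

```python
def process_stream(segments):
    """Exactly-once processing on top of at-least-once delivery:
    drop any segment whose sequence number was already seen."""
    seen = set()
    delivered_to_app = []
    for seq, payload in segments:
        if seq in seen:
            continue                 # duplicate from a retransmission: ignore it
        seen.add(seq)
        delivered_to_app.append(payload)
    return delivered_to_app

# Segment 2 arrives twice (a delayed retransmission), but is processed once.
wire = [(1, "a"), (2, "b"), (2, "b"), (3, "c")]
delivered = process_stream(wire)     # ["a", "b", "c"]
```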
Exactly once is clearly impossible. What if the network connection is severed and never recovers?

How is the shortest path in IRC ensured?

RFC 2810 says the following about one-to-one communication:
Communication on a one-to-one basis is usually performed by clients,
since most server-server traffic is not a result of servers talking
only to each other. To provide a means for clients to talk to each
other, it is REQUIRED that all servers be able to send a message in
exactly one direction along the spanning tree in order to reach any
client. Thus the path of a message being delivered is the shortest
path between any two points on the spanning tree.
(Emphasis mine.)
What does this "one direction" mean? To only one client? And how does this "reach any client" and find "the shortest path between any two points [hosts on the IRC network]"?
And why not simply cut the crap and store IP addresses of clients and let IP do its job? After all, IRC is built on top of TCP/IP.
Johannes alludes to the solution but doesn't fully answer your questions. He is however correct in that graph theory is a large part of the answer.
Because each child node in the server maps of EFnet and IRCnet has only one parent, the shortest path is the only path between two servers on the graph; the same vertex cannot be visited twice without backtracking. This is called a spanning tree: all nodes are connected, but no loops exist.
IRC is not necessarily unicast like TCP/IP; it communicates with multiple clients on different servers by broadcasting. The important thing to note is that the client says 'send 'hi' to everyone on #coding', and the message travels from the client to its connected server. That server passes the message to any connected servers, and those servers pass it on to any clients subscribed to #coding and then on to any further connected servers.
There isn't really anything like 'client-to-client' communication; one-to-one is accomplished by sending a message to a user with the specified nickname, not an IP address. NickServ helps prevent people from hijacking nicknames: it temporarily associates a nickname with an IP, refuses to authenticate other IP addresses, and protects the nickname with a password once the authentication expires.
In much the same way as sending a channel message, the user sends a message to the server, 'send 'hi' to #nicky', and the server simply passes this message on until #nicky is listed as a client connected to the server receiving the message. Bots provide a means for #nicky to receive messages while offline; they sign in under that username.
EDIT: IRC actually opens an invite-only personal channel for client-client communications.
Essentially, the shortest path guarantee is a result of IRCs broadcast policy; the moment a message propagates near the desired user's server, it is forwarded to the desired user. Timestamps presumably prevent echoed messages if there are loops in the graph of servers.
In the architecture section, we find evidence that 'spanning tree' is being used in the proper sense. Servers are aware of each other so as to prevent loops (guaranteeing shortest paths) and to connect efficiently:
6.1 Scalability
It is widely recognized that this protocol does not scale
sufficiently well when used in a large arena. The main problem comes
from the requirement that all servers know about all other servers,
clients and channels and that information regarding them be updated
as soon as it changes.
and this one below is a result of having no alternate paths/detours to take:
6.3 Network Congestion
Another problem related to the scalability and reliability issues, as
well as the spanning tree architecture, is that the protocol and
architecture for IRC are extremely vulnerable to network congestions.
IRC networks are designed to be IP agnostic, and follow the shortest path because messages propagate the whole graph, stopping when they reach an endpoint. Clients and servers have enough information to discard duplicate broadcasts. IRC is a very simple, but effective chatting protocol that makes no assumptions about security, IP, or hardware. You could literally use a networked telegraph machine to connect to IRC.
Every IRC server is connected to one or more servers in the same network. A client connects to one of the servers. Let's suppose we have the following setup:
A
/ \
B C
/ / \
D E F
Let's suppose a client on server A wants to send a message to a user on server E. In that case, server A only sends a message to server C, which will send this message to server E, but not to F.
If a client on A sends a message to a channel with users on servers B and E, then A will send the message to servers B and C. B will send the message to the users in that channel that are connected to B, and C will send the message to server E, which will send the message to its clients in that channel.
Servers D and F will never see the message, because nobody in that channel is connected to them, but C will see the message even though nobody in that channel is connected to C, because it has to relay the message to E.
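This example can be checked with a small script: in a tree there is exactly one loop-free path between any two servers, so the set of servers that handle a message is just the union of the unique paths from the sender's server to each member's server (server names follow the diagram above; the `path` helper is illustrative, not IRC code):

```python
def path(tree, src, dst, parent=None):
    """Unique path between two nodes of a tree: no loops means no alternatives."""
    if src == dst:
        return [src]
    for nxt in tree[src]:
        if nxt == parent:
            continue                 # never backtrack toward where we came from
        rest = path(tree, nxt, dst, src)
        if rest:
            return [src] + rest
    return None                      # dst is not in this subtree

# Adjacency of the spanning tree from the diagram.
tree = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "E", "F"],
        "D": ["B"], "E": ["C"], "F": ["C"]}

# The sender's client is on A; channel members sit on servers B and E.
touched = set(path(tree, "A", "B")) | set(path(tree, "A", "E"))
# touched is {"A", "B", "C", "E"}: C relays despite having no members,
# while D and F never see the message.
```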

How can I have my ZeroMQ app reject additional connections?

I have a C++ 0MQ application that does a bind() and sends messages using a PUSH socket. I want to ensure that these messages get sent to no more than one client.
Is there a way to allow just one client to .connect(), and then reject connections from all subsequent clients?
If your server application uses a ROUTER socket instead of PUSH, it has more control over the connections. The first frame of each message contains the id of the sender, so the server can treat one connection specially.
To make this work, the protocol has to be a little more complicated than a simple PUSH/PULL. One way is for the connections to be DEALER sockets, whose first action is to send an "I'm here" message to the server. The server then knows the id of each connection and treats the first one specially. Any other connection can be rejected with a "You shouldn't be here" message, which of course the clients must understand and act on by disconnecting themselves.
After the first "I'm here" message, the clients do not need to send any more messages. They can just sit there waiting for messages from the server, exactly the same as PUSH/PULL.
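A minimal pyzmq sketch of that handshake (it assumes the `pyzmq` package; the socket identities and message texts are illustrative, and `inproc` keeps it in one process):

```python
import zmq

ctx = zmq.Context.instance()

# The server uses ROUTER instead of PUSH, so it sees who each message is from.
router = ctx.socket(zmq.ROUTER)
router.bind("inproc://ctrl")

# Two would-be clients; each DEALER introduces itself first.
dealers = []
for i in range(2):
    d = ctx.socket(zmq.DEALER)
    d.setsockopt_string(zmq.IDENTITY, f"client-{i}")
    d.connect("inproc://ctrl")
    d.send_string("I'm here")
    dealers.append(d)

accepted = None
for _ in range(2):
    ident, _msg = router.recv_multipart()   # first frame is the sender's id
    if accepted is None:
        accepted = ident                    # the first connection is special
        router.send_multipart([ident, b"welcome"])
    else:
        router.send_multipart([ident, b"You shouldn't be here"])

# Each client reads its verdict; a real rejected client would now disconnect.
replies = sorted(d.recv() for d in dealers)

for s in dealers + [router]:
    s.close()
ctx.term()
```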
Yes, there is.
While the genuine ZeroMQ messaging framework has a lot of built-in features, it also allows you to integrate additional abstraction layers that can solve your task and many other custom-specific needs. So do not worry that there is no direct API call for doing what you need.
How to do it?
Assuming your formal architecture is as given, a viable approach would be to re-use a networking security trick known as "port-knocking".
This trick adds an "introduction" phase on a publicly known aPortToKnockAt. Once the condition is met (in your case, being the first client to have completed a .connect()), another, working port is used privately for the "transport" phase, and the original port is closed.
This way your application wastes neither local nor remote resources: aPortToKnockAt serves only to protect the single-client handshake, and any later attempt to knock there finds a .close()-ed door (and handles that on the remote side), so a very efficient passive reject is achieved.
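A rough sketch of that two-phase idea with plain TCP sockets rather than ZeroMQ (all names are illustrative, and a real implementation would add error handling and authentication): the first client to knock is told a private port, then the knock socket closes so later knockers are refused.

```python
import socket
import threading

def one_shot_server(ready, ports):
    # "Knock" socket: the publicly known access point, introduction phase only.
    knock = socket.socket()
    knock.bind(("127.0.0.1", 0))
    knock.listen(1)
    # Private "transport" socket, handed out only to the first client.
    data = socket.socket()
    data.bind(("127.0.0.1", 0))
    data.listen(1)
    ports["knock"] = knock.getsockname()[1]
    ports["data"] = data.getsockname()[1]
    ready.set()
    conn, _ = knock.accept()                      # first client wins
    conn.sendall(str(ports["data"]).encode())     # tell it the private port
    conn.close()
    knock.close()                                 # later knockers get refused
    dconn, _ = data.accept()                      # transport phase begins
    dconn.sendall(b"welcome")
    dconn.close()
    data.close()

ready = threading.Event()
ports = {}
t = threading.Thread(target=one_shot_server, args=(ready, ports))
t.start()
ready.wait()

# First client knocks, learns the private port, and moves to it.
c1 = socket.create_connection(("127.0.0.1", ports["knock"]))
private_port = int(c1.recv(16).decode())
c1.close()
d = socket.create_connection(("127.0.0.1", private_port))
greeting = d.recv(16)                             # b"welcome"
d.close()
t.join()

# A second knock now finds a closed door: the passive reject.
try:
    late = socket.create_connection(("127.0.0.1", ports["knock"]), timeout=1)
    late.close()
    rejected = False
except OSError:
    rejected = True
```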
