Multicast IPC options in Unix

Among the following list of IPC options, which could perform multicast (i.e. 1 sender and multiple receivers):
signals
half duplex pipe
named pipe
System V message queue
unix domain socket
memory mapped files
From my understanding, it might be possible with a named pipe (I'm not sure).

None of these facilities is as conceptually flexible as true multicast, but with a few limitations some of them can do what you want.
Signals may be delivered to a process group. The other IPC mechanisms you list have a sender/receiver model and are not suitable for multicast, barring local extensions like Linux's multicast AF_UNIX sockets, as @Barmar points out in the comments.
If you only need to send a single "signal" to descendant processes, and only once, you may use an inherited FIFO: all receivers inherit the read end of the FIFO but not the write end. The process holding the write end closes it at some point, and all receivers then detect EOF on their copies of the read end.
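A minimal sketch of that trick (shown with an anonymous pipe for brevity; the same idea works with an inherited FIFO, and the three-children setup is just for illustration):

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void) {
        int fds[2];
        if (pipe(fds) == -1) { perror("pipe"); exit(1); }

        for (int i = 0; i < 3; i++) {
            if (fork() == 0) {              /* child: receiver */
                close(fds[1]);              /* must drop its copy of the write end */
                char buf[1];
                /* read() blocks until the last write end closes, then returns 0 */
                while (read(fds[0], buf, 1) > 0)
                    ;
                printf("child %d: got EOF, signal received\n", i);
                _exit(0);
            }
        }

        close(fds[0]);                      /* parent keeps only the write end */
        sleep(1);                           /* ... do work ... */
        close(fds[1]);                      /* "broadcast": every child sees EOF */

        while (wait(NULL) > 0)
            ;
        return 0;
    }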


How to check if MPI one-sided communication has finished?

I am using the MPI_Raccumulate function, a one-sided communication from source to destination with a predefined aggregation operation.
I want to check whether all the MPI_Raccumulate calls have finished (the sender has sent the data and the receiver has received it successfully) at the end of the program. MPI_Wait, however, does not seem to solve this problem; it only checks whether the source buffer can be reused (is available to the user again).
Is there any way (1) to check whether a specific MPI one-sided communication call has completely finished (on both the sender and receiver side)? (2) to confirm that there are no outstanding send/receive MPI requests on each processor?
My application must use one-sided communication, but it needs to confirm that there are no more outstanding communications at the end of a specific task.
thanks
Completing RMA requests only ensures local completion and thus buffer reuse. Remote completion requires one of:
MPI_Win_complete, in the PSCW usage model
MPI_Win_fence, in the BSP usage model
MPI_Win_unlock(_all) or MPI_Win_flush(_all) in the passive-target usage model.
You probably don't want to use request-based RMA. The regular functions are sufficient for nearly all usage models. The only request-based RMA operation that is obviously useful is MPI_Rget (or MPI_Rget_accumulate with MPI_NO_OP, which is the atomic equivalent of MPI_Rget). And I say this as the person most responsible for these features being part of MPI-3.
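Concretely, in the passive-target model the sequence might look like this (a hedged sketch in C; window creation and setup are omitted, and plain MPI_Accumulate stands in for MPI_Raccumulate):

    #include <mpi.h>

    void accumulate_and_complete(MPI_Win win, double *src, int count, int target) {
        MPI_Win_lock_all(0, win);

        /* the regular (non-request) accumulate is usually all you need */
        MPI_Accumulate(src, count, MPI_DOUBLE,
                       target, 0 /* displacement */, count, MPI_DOUBLE,
                       MPI_SUM, win);

        /* Remote completion: after this returns, the update is visible
         * at all targets, not merely "local buffer reusable". */
        MPI_Win_flush_all(win);

        MPI_Win_unlock_all(win);
    }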

What's the difference between pipes and sockets?

I found a couple of answers, but they seem to be specifically relating to Windows machines.
So my question is what are the differences between pipes and sockets, and when/how should you choose one over the other?
what are the differences between pipes and sockets, and when/how should you choose one over the other?
Both pipes and sockets handle byte streams, but they do it in different ways...
pipes only exist within a specific host; they provide buffering between virtual files, or connect the output / input of processes within that host. There is no concept of packets within pipes.
sockets packetize communication using IPv4 or IPv6; that communication can extend beyond localhost. Note that different sockets can share the same IP address; however, they must listen on different TCP / UDP ports to do so.
Usage:
Use pipes:
when you want to read / write data as a file within a specific server. If you're using C, you read() and write() to a pipe.
when you want to connect the output of one process to the input of another process... see popen()
Use sockets to send data between different IPv4 / IPv6 endpoints. Very often this happens between different hosts, but sockets can also be used within the same host.
BTW, you can use netcat or socat to join a socket to a pipe.
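For the popen() case above, a small sketch (the "ls -l" command is just an example):

    #include <stdio.h>

    int main(void) {
        FILE *p = popen("ls -l", "r");   /* read the command's stdout as a stream */
        if (p == NULL) { perror("popen"); return 1; }

        char line[256];
        while (fgets(line, sizeof line, p) != NULL)
            fputs(line, stdout);         /* the pipe behaves like a file here */

        return pclose(p) == -1 ? 1 : 0;
    }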
To complete the answer given by Mike, it is important to mention the existence of UNIX domain sockets, which are available on any POSIX-compliant operating system. Although very similar to "normal" internet sockets in terms of usage semantics, they are purely local to the machine (of course, internet sockets can also work locally) and thus behave almost like a pipe. Almost, because a UNIX pipe is by definition unidirectional:
Pipes and FIFOs (also known as named pipes) provide a unidirectional interprocess communication channel. A pipe has a read end and a write end. Data written to the write end of a pipe can be read from the read end of the pipe. (excerpt from the pipe(7) man page)
UNIX domain sockets also have a very unusual feature, as besides data, they also allow sending file descriptors: this way, an unprivileged process can access any file whose descriptor has been sent over the socket. This technique, according to Wikipedia, is used by the ClamAV antivirus scanning daemon.
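A minimal sketch of that descriptor-passing technique, assuming an already-connected Unix domain socket; send_fd()/recv_fd() are illustrative names, not standard library functions:

    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    int send_fd(int sock, int fd) {
        char data = 'x';                       /* must send at least one real byte */
        struct iovec iov = { .iov_base = &data, .iov_len = 1 };
        char ctrl[CMSG_SPACE(sizeof(int))];
        struct msghdr msg = {
            .msg_iov = &iov, .msg_iovlen = 1,
            .msg_control = ctrl, .msg_controllen = sizeof ctrl,
        };
        struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type  = SCM_RIGHTS;         /* ancillary payload is a descriptor */
        cmsg->cmsg_len   = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));
        return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
    }

    int recv_fd(int sock) {
        char data;
        struct iovec iov = { .iov_base = &data, .iov_len = 1 };
        char ctrl[CMSG_SPACE(sizeof(int))];
        struct msghdr msg = {
            .msg_iov = &iov, .msg_iovlen = 1,
            .msg_control = ctrl, .msg_controllen = sizeof ctrl,
        };
        if (recvmsg(sock, &msg, 0) <= 0) return -1;
        struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
        if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET
                         || cmsg->cmsg_type != SCM_RIGHTS) return -1;
        int fd;
        memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
        return fd;                             /* a usable descriptor in this process */
    }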

Why can only related processes communicate using pipe() (IPC)?

Why is there a limitation that with pipe() only parent and child processes can communicate; why not unrelated processes?
Why can't two children of a process communicate using pipe()?
There is indeed such a limitation in practice.
A pipe is accessed through file descriptors, and file descriptors are per-process: each process maintains its own fd table, a child inherits that table on fork(), and each inherited fd refers to the same open file in the kernel as in the parent.
Processes that communicate via the same pipe should therefore be related: both processes must hold file descriptors for the pipe's two ends.
The Linux Programming Interface (TLPI) says:
The pipe should be created by a common ancestor before the series of fork() calls that led to the existence of the processes.
There is no such limitation. Any two processes which have a means of obtaining references to each end of the pipe can communicate. A process can even communicate with itself using a pipe.
Any process could obtain a reference to one of the ends of a pipe using any of the following generic means of communicating file descriptors between processes. Pipes are not special in this respect.
The process itself called pipe() and obtained file descriptors for both ends.
The process received the file descriptor as SCM_RIGHTS ancillary data through a socket.
The process obtained the file descriptor from another arbitrary process using platform-specific means like /proc/<pid>/fd on Linux.
The process inherited the file descriptor from an ancestor (direct or indirect) that obtained it using one of the aforementioned methods.
(There might be other methods.)
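As a small illustration of the inheritance case, here is a sketch of two children of the same process talking over a pipe created by their common ancestor:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void) {
        int fds[2];
        if (pipe(fds) == -1) { perror("pipe"); exit(1); }

        if (fork() == 0) {                     /* child 1: writer */
            close(fds[0]);
            const char *msg = "hello sibling\n";
            write(fds[1], msg, strlen(msg));
            _exit(0);
        }
        if (fork() == 0) {                     /* child 2: reader */
            close(fds[1]);                     /* or EOF would never arrive */
            char buf[64];
            ssize_t n = read(fds[0], buf, sizeof buf - 1);
            if (n > 0) { buf[n] = '\0'; printf("child 2 read: %s", buf); }
            _exit(0);
        }

        close(fds[0]);                         /* parent drops both ends */
        close(fds[1]);
        while (wait(NULL) > 0)
            ;
        return 0;
    }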

What are some tips for buffer usage and tuning in custom TCP services?

I've been researching a number of networking libraries and frameworks lately such as libevent, libev, Facebook Tornado, and Concurrence (Python).
One thing I notice in their implementations is the use of application-level per-client read/write buffers (e.g. IOStream in Tornado) -- even HAProxy has such buffers.
In addition to these application-level buffers, there's the OS kernel TCP implementation's buffers per socket.
I can understand the app/lib's use of a read buffer I think: the app/lib reads from the kernel buffer into the app buffer and the app does something with the data (deserializes a message therein for instance).
However, I have confused myself about the need/use of a write buffer. Why not just write to the kernel's send/write buffer? Is it to avoid the overhead of system calls (write)? I suppose the point is to be ready with more data to push into the kernel's write buffer when the kernel notifies the app/lib that the socket is "writable" (e.g. EPOLLOUT). But, why not just do away with the app write buffer and configure the kernel's TCP write buffer to be equally large?
Also, consider a service for which disabling the Nagle algorithm makes sense (e.g. a game server). In such a configuration, I suppose I'd want the opposite: no kernel write buffer but an application write buffer, yes? When the app is ready to send a complete message, it writes the app buffer via send() etc. and the kernel passes it through.
Help me to clear up my head about these understandings if you would. Thanks!
Well, speaking for haproxy: it makes no distinction between read and write buffers; a single buffer is used for both purposes, which saves a copy. However, this makes some changes really painful. For instance, sometimes you have to rewrite an HTTP header, and you have to manage to move the data correctly for your rewrite while saving some state about the previous header's value. In haproxy, the Connection header can be rewritten, and its previous and new states are saved because they are needed later, after the rewrite. With separate read and write buffers, you don't have this complexity, as you can always look back in your read buffer if you need any of the original data.
Haproxy is also able to make use of splicing between sockets on Linux. This means that it neither reads nor writes data itself; it just tells the kernel what to take from where and where to move it. The kernel then moves the data by adjusting pointers rather than copying, transferring TCP segments from one network card to another when possible; the data never reach user space, which avoids a double copy.
You're completely right about the fact that in general you don't need to copy data between buffers. It's a waste of memory bandwidth. Haproxy runs at 10Gbps with 20% CPU with splicing, but without splicing (2 more copies), it's close to 100%. But then consider the complexity of the alternatives, and make your choice.
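For reference, the splicing idea looks roughly like this on Linux (a hedged sketch, not haproxy's actual code; splice(2) requires a pipe in the middle, so data flows socket -> pipe -> socket without entering user space):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <unistd.h>

    /* forward up to `len` bytes from sock_in to sock_out via an existing pipe;
     * returns the number of bytes moved, 0 on EOF, or -1 on error */
    ssize_t zero_copy_forward(int sock_in, int sock_out, int pipefd[2], size_t len) {
        ssize_t n = splice(sock_in, NULL, pipefd[1], NULL, len,
                           SPLICE_F_MOVE | SPLICE_F_MORE);
        if (n <= 0)
            return n;
        ssize_t left = n;
        while (left > 0) {                 /* drain the pipe into the out socket */
            ssize_t m = splice(pipefd[0], NULL, sock_out, NULL, left,
                               SPLICE_F_MOVE | SPLICE_F_MORE);
            if (m <= 0)
                return -1;
            left -= m;
        }
        return n;
    }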
Hoping this helps.
When you use asynchronous socket I/O, the read/write operation returns immediately. Since a single asynchronous call is not guaranteed to transfer all the data (i.e., to put all the required data into the TCP socket buffer, or to get all of it out), the remaining data must survive across multiple operations. You therefore need application-level buffer space to hold the data for as long as the I/O operations last.
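A minimal sketch of such an application-level write buffer on a non-blocking socket (the names and the fixed buffer size are illustrative):

    #include <errno.h>
    #include <string.h>
    #include <sys/socket.h>

    struct wbuf {
        char   data[65536];   /* illustrative fixed-size buffer */
        size_t len;           /* bytes still waiting to be sent */
    };

    /* Called when epoll/select reports the socket writable (e.g. EPOLLOUT). */
    int flush_wbuf(int sock, struct wbuf *b) {
        while (b->len > 0) {
            ssize_t n = send(sock, b->data, b->len, 0);
            if (n < 0) {
                if (errno == EAGAIN || errno == EWOULDBLOCK)
                    return 0;              /* kernel buffer full: try again later */
                return -1;                 /* real error */
            }
            /* partial write: shift the unsent remainder to the front */
            memmove(b->data, b->data + n, b->len - (size_t)n);
            b->len -= (size_t)n;
        }
        return 0;                          /* buffer fully drained */
    }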

TCP Socket Piping

Suppose that you have 2 sockets (each connected to a different TCP peer), both residing in the same process. How could these sockets be bound together, meaning the input stream of each is bound to the output stream of the other? The sockets will carry data continuously; no waiting should happen. Normally a thread can solve this problem, but rather than creating threads, is there a more efficient way of piping sockets?
If you need to connect both ends of the socket to the same process, use the pipe() function instead. This function returns two file descriptors, one used for writing and the other used for reading. There isn't really any need to involve TCP for this purpose.
Update: Based on your clarification of your use case, no, there isn't any way to tell the OS to connect the ends of two different sockets together. You will have to write code to read from one socket and write the same data to the other. Depending on the architecture of your process, you may or may not need an additional thread to do this work. For example, if your application is based on a select() loop, then creating another thread is not necessary.
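A sketch of such a select() loop, relaying bytes between two already-connected sockets without an extra thread (error handling kept minimal):

    #include <sys/select.h>
    #include <unistd.h>

    /* copy whatever is readable on `from` to `to`; returns 0 on EOF or error */
    static int pump(int from, int to) {
        char buf[4096];
        ssize_t n = read(from, buf, sizeof buf);
        if (n <= 0) return 0;
        return write(to, buf, n) == n;     /* a short write is treated as failure */
    }

    void relay(int sock_a, int sock_b) {
        for (;;) {
            fd_set rfds;
            FD_ZERO(&rfds);
            FD_SET(sock_a, &rfds);
            FD_SET(sock_b, &rfds);
            int maxfd = (sock_a > sock_b ? sock_a : sock_b) + 1;
            if (select(maxfd, &rfds, NULL, NULL, NULL) < 0)
                return;
            if (FD_ISSET(sock_a, &rfds) && !pump(sock_a, sock_b))
                return;
            if (FD_ISSET(sock_b, &rfds) && !pump(sock_b, sock_a))
                return;
        }
    }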
You can avoid threads with an event queue within the process. The Wikipedia Message queue article assumes you want interprocess message passing, but if you are using sockets within one process, you are effectively doing the same kind of message passing inside that process.
