Asynchronous I/O with IO Uring and Pipes - asynchronous

I wanted to compare the speed of epoll with io_uring, using pipes to write and read data.
I created 2 io_uring rings.
ring(for write in parent process)
ring2(for read in child process)
But I ran into a case where the child process's io_uring reads the pipe before the parent process has written to it, so the CQEs report that the asynchronous task failed. I'm not sure which io_uring options or flags make the read wait, when the pipe is empty, until the parent process has written to it.
int BUF_SIZE = 1024;
first I create pipes:
//write:
pipe2(&pipes[0], O_NONBLOCK | O_CLOEXEC)
//read:
pipe2(&pipes[1], O_NONBLOCK | O_CLOEXEC)
// create a iovec buffer
buffer.iov_base = malloc(BUF_SIZE * 2);
buffer.iov_len = BUF_SIZE * 2;
memset(buffer.iov_base, 'R', BUF_SIZE * 2);
fork() creates the child process:
Parent process(write):
sqe = io_uring_get_sqe(&ring);
io_uring_prep_write(sqe, 1 , buffer.iov_base + 1, BUF_SIZE, 0);
sqe->flags |= IOSQE_FIXED_FILE|IOSQE_IO_LINK;
Child process(read):
sqe2 = io_uring_get_sqe(ring2);
io_uring_prep_read(sqe2, 0, buffer.iov_base, BUF_SIZE, 0);
sqe2->flags |= IOSQE_FIXED_FILE;
This is how I check the cqes for errors:
if (cqes2[j]->res < 0)
printf("Async task failed: %s\n", strerror(-cqes2[j]->res));
if (cqes2[j]->res != BUF_SIZE) {
printf("Mismatching read/write: %d\n", cqes2[j]->res);
}
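For what it's worth, the failure described above looks like ordinary non-blocking pipe behaviour rather than anything specific to io_uring. The post doesn't show the error text, so this is an assumption: that the negative res is EAGAIN caused by the O_NONBLOCK flag passed to pipe2(). The sketch below uses plain read() and write() (not io_uring) to show that an empty non-blocking pipe fails with EAGAIN and returns data once something has been written. The function name nonblocking_pipe_demo is made up for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if reading an empty non-blocking pipe fails with EAGAIN and
 * the same read succeeds once a byte has been written; 0 otherwise. */
int nonblocking_pipe_demo(void)
{
    int p[2];
    char buf[8];
    ssize_t n;
    int empty_gave_eagain, ok = 0;

    if (pipe(p) == -1)
        return 0;

    /* Same effect on the read end as creating the pipe with O_NONBLOCK. */
    fcntl(p[0], F_SETFL, fcntl(p[0], F_GETFL) | O_NONBLOCK);

    /* Nothing written yet: a non-blocking read does not wait, it fails. */
    n = read(p[0], buf, sizeof buf);
    empty_gave_eagain = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));

    /* After the writer has put data in the pipe, the read succeeds. */
    if (write(p[1], "R", 1) == 1) {
        n = read(p[0], buf, sizeof buf);
        ok = empty_gave_eagain && n == 1 && buf[0] == 'R';
    }

    close(p[0]);
    close(p[1]);
    return ok;
}
```

If that assumption holds, one thing to try is creating the read end without O_NONBLOCK so the read can wait for the writer. I have not verified this against the code in the question.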

Related

Using MPI_Scatter with 3 processes

I am new to MPI, and my question is how the root (for example rank 0) initializes all its values (in the array) before the other processes receive their i'th value from the root.
For example:
In the root I initialize: arr[0]=20, arr[1]=90, arr[2]=80.
If, for example, process number 2 starts a little before the root process, can MPI_Scatter send an incorrect value instead of 80?
How can I ensure the root initializes all of its memory before the others use Scatter?
Thank you!
The MPI standard specifies that
If comm is an intracommunicator, the outcome is as if the root executed n
send operations, MPI_Send(sendbuf + i*sendcount*extent(sendtype), sendcount, sendtype, i, ...), and each process executed a receive, MPI_Recv(recvbuf, recvcount, recvtype, i, ...).
This means that all the non-root processes will wait until their respective recvcount elements have been transmitted. This is what makes it a blocking routine (the process waits until its part of the communication is completed).
You as the programmer are responsible for ensuring that the data being sent is correct by the time you call any communication routine, and that it stays correct until the send buffer is available again (in this case, until MPI_Scatter returns). In an MPI-only program, this is as simple as placing the initialization code before the call to MPI_Scatter, as each process executes the program sequentially.
The following is an example based on the document's Example 5.11:
MPI_Comm comm = MPI_COMM_WORLD;
int grank, gsize, *sendbuf = NULL;
int root, *rbuf;
MPI_Comm_rank( comm, &grank );
MPI_Comm_size(comm, &gsize);
root = 0;
if( grank == root ) {
sendbuf = (int *)malloc(gsize*100*sizeof(int));
// Initialize sendbuf. None of its values are valid at this point.
for( int i = 0; i < gsize * 100; i++ )
sendbuf[i] = i;
}
rbuf = (int *)malloc(100*sizeof(int));
// Distribute sendbuf data
// At the root process, all sendbuf values are valid
// In non-root processes, sendbuf argument is ignored.
MPI_Scatter(sendbuf, 100, MPI_INT, rbuf, 100, MPI_INT, root, comm);
MPI_Scatter() is a collective operation, so the MPI library takes care of everything, and the outcome of a collective operation does not depend on which rank made the call earlier than another.
In this specific case, a non-root rank will block (at least) until the root rank calls MPI_Scatter().
This is no different from an MPI_Send() / MPI_Recv() pair.
MPI_Recv() blocks if it is called before the remote peer has called MPI_Send() with a matching message.

How to programmatically detect the number of cores and run an MPI program using all cores

I do not want to use mpiexec -n 4 ./a.out to run my program on my core i7 processor (with 4 cores). Instead, I want to run ./a.out, have it detect the number of cores and fire up MPI to run a process per core.
This SO question and answer MPI Number of processors? led me to use mpiexec.
The reason I want to avoid mpiexec is because my code is destined to be a library inside a larger project I'm working on. The larger project has a GUI and the user will be starting long computations that will call my library, which will in turn use MPI. The integration between the UI and the computation code is not trivial... so launching an external process and communicating via a socket or some other means is not an option. It must be a library call.
Is this possible? How do I do it?
This is quite a nontrivial thing to achieve in general. Also, there is hardly any portable solution that does not depend on some MPI implementation specifics. What follows is a sample solution that works with Open MPI and possibly with other general MPI implementations (MPICH, Intel MPI, etc.). It involves a second executable or a means for the original executable to directly call your library when given some special command-line argument. It goes like this.
Assume the original executable was started simply as ./a.out. When your library function is called, it calls MPI_Init(NULL, NULL), which initialises MPI. Since the executable was not started via mpiexec, it falls back to the so-called singleton MPI initialisation, i.e. it creates an MPI job that consists of a single process. To perform distributed computations, you have to start more MPI processes and that's where things get complicated in the general case.
MPI supports dynamic process management, in which one MPI job can start a second one and communicate with it using intercommunicators. This happens when the first job calls MPI_Comm_spawn or MPI_Comm_spawn_multiple. The first one is used to start simple MPI jobs that use the same executable for all MPI ranks while the second one can start jobs that mix different executables. Both need information as to where and how to launch the processes. This comes from the so-called MPI universe, which provides information not only about the started processes, but also about the available slots for dynamically started ones. The universe is constructed by mpiexec or by some other launcher mechanism that takes, e.g., a host file with a list of nodes and the number of slots on each node. In the absence of such information, some MPI implementations (Open MPI included) will simply start the executables on the same node as the original file. MPI_Comm_spawn[_multiple] has an MPI_Info argument that can be used to supply a list of key-value pairs with implementation-specific information. Open MPI supports the add-hostfile key that can be used to specify a hostfile to be used when spawning the child job. This is useful for, e.g., allowing the user to specify via the GUI a list of hosts to use for the MPI computation. But let's concentrate on the case where no such information is provided and Open MPI simply runs the child job on the same host.
Assume the worker executable is called worker. Or that the original executable can serve as worker if called with some special command-line option, -worker for example. If you want to perform computation with N processes in total, you need to launch N-1 workers. This is simple:
(separate executable)
MPI_Comm child_comm;
MPI_Comm_spawn("./worker", MPI_ARGV_NULL, N-1, MPI_INFO_NULL, 0,
MPI_COMM_SELF, &child_comm, MPI_ERRCODES_IGNORE);
(same executable, with an option)
MPI_Comm child_comm;
char *argv[] = { "-worker", NULL };
MPI_Comm_spawn("./a.out", argv, N-1, MPI_INFO_NULL, 0,
MPI_COMM_SELF, &child_comm, MPI_ERRCODES_IGNORE);
If everything goes well, child_comm will be set to the handle of an intercommunicator that can be used to communicate with the new job. As intercommunicators are kind of tricky to use and the parent-child job division requires complex program logic, one could simply merge the two sides of the intercommunicator into a "big world" communicator that replaces MPI_COMM_WORLD. On the parent's side:
MPI_Comm bigworld;
MPI_Intercomm_merge(child_comm, 0, &bigworld);
On the child's side:
MPI_Comm parent_comm, bigworld;
MPI_Comm_get_parent(&parent_comm);
MPI_Intercomm_merge(parent_comm, 1, &bigworld);
After the merge is complete, all processes can communicate using bigworld instead of MPI_COMM_WORLD. Note that child jobs do not share their MPI_COMM_WORLD with the parent job.
To put it all together, here is a complete functioning example with two separate program codes.
main.c
#include <stdio.h>
#include <mpi.h>
int main (void)
{
MPI_Init(NULL, NULL);
printf("[main] Spawning workers...\n");
MPI_Comm child_comm;
MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 2, MPI_INFO_NULL, 0,
MPI_COMM_SELF, &child_comm, MPI_ERRCODES_IGNORE);
MPI_Comm bigworld;
MPI_Intercomm_merge(child_comm, 0, &bigworld);
int size, rank;
MPI_Comm_rank(bigworld, &rank);
MPI_Comm_size(bigworld, &size);
printf("[main] Big world created with %d ranks\n", size);
// Perform some computation
int data = 1, result;
MPI_Bcast(&data, 1, MPI_INT, 0, bigworld);
data *= (1 + rank);
MPI_Reduce(&data, &result, 1, MPI_INT, MPI_SUM, 0, bigworld);
printf("[main] Result = %d\n", result);
MPI_Barrier(bigworld);
MPI_Comm_free(&bigworld);
MPI_Comm_free(&child_comm);
MPI_Finalize();
printf("[main] Shutting down\n");
return 0;
}
worker.c
#include <stdio.h>
#include <mpi.h>
int main (void)
{
MPI_Init(NULL, NULL);
MPI_Comm parent_comm;
MPI_Comm_get_parent(&parent_comm);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
printf("[worker] %d of %d here\n", rank, size);
MPI_Comm bigworld;
MPI_Intercomm_merge(parent_comm, 1, &bigworld);
MPI_Comm_rank(bigworld, &rank);
MPI_Comm_size(bigworld, &size);
printf("[worker] %d of %d in big world\n", rank, size);
// Perform some computation
int data;
MPI_Bcast(&data, 1, MPI_INT, 0, bigworld);
data *= (1 + rank);
MPI_Reduce(&data, NULL, 1, MPI_INT, MPI_SUM, 0, bigworld);
printf("[worker] Done\n");
MPI_Barrier(bigworld);
MPI_Comm_free(&bigworld);
MPI_Comm_free(&parent_comm);
MPI_Finalize();
return 0;
}
Here is how it works:
$ mpicc -o main main.c
$ mpicc -o worker worker.c
$ ./main
[main] Spawning workers...
[worker] 0 of 2 here
[worker] 1 of 2 here
[worker] 1 of 3 in big world
[worker] 2 of 3 in big world
[main] Big world created with 3 ranks
[worker] Done
[worker] Done
[main] Result = 6
[main] Shutting down
The child job has to use MPI_Comm_get_parent to obtain the intercommunicator to the parent job. When a process is not part of such a child job, the returned value will be MPI_COMM_NULL. This allows for an easy way to implement both the main program and the worker in the same executable. Here is a hybrid example:
#include <stdio.h>
#include <mpi.h>
MPI_Comm bigworld_comm = MPI_COMM_NULL;
MPI_Comm other_comm = MPI_COMM_NULL;
int parlib_init (const char *argv0, int n)
{
MPI_Init(NULL, NULL);
MPI_Comm_get_parent(&other_comm);
if (other_comm == MPI_COMM_NULL)
{
printf("[main] Spawning workers...\n");
MPI_Comm_spawn(argv0, MPI_ARGV_NULL, n-1, MPI_INFO_NULL, 0,
MPI_COMM_SELF, &other_comm, MPI_ERRCODES_IGNORE);
MPI_Intercomm_merge(other_comm, 0, &bigworld_comm);
return 0;
}
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
printf("[worker] %d of %d here\n", rank, size);
MPI_Intercomm_merge(other_comm, 1, &bigworld_comm);
return 1;
}
int parlib_dowork (void)
{
int data = 1, result = -1, size, rank;
MPI_Comm_rank(bigworld_comm, &rank);
MPI_Comm_size(bigworld_comm, &size);
if (rank == 0)
{
printf("[main] Doing work with %d processes in total\n", size);
data = 1;
}
MPI_Bcast(&data, 1, MPI_INT, 0, bigworld_comm);
data *= (1 + rank);
MPI_Reduce(&data, &result, 1, MPI_INT, MPI_SUM, 0, bigworld_comm);
return result;
}
void parlib_finalize (void)
{
MPI_Comm_free(&bigworld_comm);
MPI_Comm_free(&other_comm);
MPI_Finalize();
}
int main (int argc, char **argv)
{
if (parlib_init(argv[0], 4))
{
// Worker process
(void)parlib_dowork();
printf("[worker] Done\n");
parlib_finalize();
return 0;
}
// Main process
// Show GUI, save the world, etc.
int result = parlib_dowork();
printf("[main] Result = %d\n", result);
parlib_finalize();
printf("[main] Shutting down\n");
return 0;
}
And here is an example output:
$ mpicc -o hybrid hybrid.c
$ ./hybrid
[main] Spawning workers...
[worker] 0 of 3 here
[worker] 2 of 3 here
[worker] 1 of 3 here
[main] Doing work with 4 processes in total
[worker] Done
[worker] Done
[main] Result = 10
[worker] Done
[main] Shutting down
Some things to keep in mind when designing such parallel libraries:
MPI can only be initialised once. If necessary, call MPI_Initialized to check if the library has already been initialised.
MPI can only be finalized once. Again, MPI_Finalized is your friend. It can be used in something like an atexit() handler to implement a universal MPI finalisation on program exit.
When used in threaded contexts (usual when GUIs are involved), MPI must be initialised with support for threads. See MPI_Init_thread.
You can get the number of CPUs by using, for example, this solution, and then start the MPI processes by calling MPI_Comm_spawn. But you will need to have a separate executable file.
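The link to "this solution" did not survive, so here is one way it could look. This is a minimal sketch, assuming a Linux or other Unix-like system where sysconf() supports _SC_NPROCESSORS_ONLN (a widely available extension, not something POSIX guarantees). The function name online_cpu_count is made up for illustration.

```c
#include <unistd.h>

/* Number of processors currently online, or 1 if the system cannot say.
 * _SC_NPROCESSORS_ONLN is an extension; availability is an assumption. */
long online_cpu_count(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}
```

The result could then be passed as the process count, e.g. parlib_init(argv[0], (int)online_cpu_count()) in the hybrid example. Note that this counts logical processors, so a 4-core CPU with hyper-threading will typically report 8.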

Begin Transmission and Receiving Byte using I2C, PSOC

I'm new to the PSoC board and I'm trying to read the x,y,z values from a Digital Compass but I'm having a problem in beginning the Transmission with the compass itself.
I found some Arduino tutorial online here but since PSoC doesn't have the library I can't duplicate the code.
Also I was reading the HMC5883L datasheet here, and I'm supposed to write bytes to the compass and then read back the values, but I was unable to receive anything. All the values I received are zero, which might be caused by reading from the wrong address.
Hoping for your answer soon.
PSoC is sorta tricky when you are first starting out with it. You need to read over the documentation carefully of both the device you want to talk to and the i2c module itself.
The datasheet for the device you linked states this on page 18:
All bus transactions begin with the master device issuing the start sequence followed by the slave address byte. The
address byte contains the slave address; the upper 7 bits (bits7-1), and the Least Significant bit (LSb). The LSb of the
address byte designates if the operation is a read (LSb=1) or a write (LSb=0). At the 9th clock pulse, the receiving slave
device will issue the ACK (or NACK). Following these bus events, the master will send data bytes for a write operation, or
the slave will clock out data with a read operation. All bus transactions are terminated with the master issuing a stop
sequence.
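To make the address byte concrete, here is a small sketch of how the 7-bit address and the read/write bit combine. The 7-bit value 0x1E is inferred from the 0x3C (write) and 0x3D (read) bytes used further down; the function name i2c_address_byte is made up for illustration.

```c
#include <stdint.h>

/* 7-bit HMC5883L address, inferred from the 0x3C/0x3D address bytes. */
#define HMC5883L_ADDR_7BIT 0x1Eu

/* Build the byte that goes on the bus: the upper 7 bits are the slave
 * address, the LSb is 1 for a read and 0 for a write. */
uint8_t i2c_address_byte(uint8_t addr7, int is_read)
{
    return (uint8_t)((addr7 << 1) | (is_read ? 1u : 0u));
}
```

As far as I know the PSoC buffer functions expect the 7-bit form and add the read/write bit themselves, but check the I2C component datasheet before relying on that.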
If you use the I2C_MasterWriteBuf function, it wraps all that stuff the HMC's datasheet states above. The start command, dealing with that ack, the data handling, etc. The only thing you need to specify is how to transmit it.
If you refer to PSoC's I2C module datasheet, the MasterWriteBuf function takes in the device address, a pointer to the data you want to send, how many bytes you want to send, and a "mode". The docs list the various transfer modes:
I2C_MODE_COMPLETE_XFER Perform complete transfer from Start to Stop.
I2C_MODE_REPEAT_START Send Repeat Start instead of Start.
I2C_MODE_NO_STOP Execute transfer without a Stop
The I2C_MODE_COMPLETE_XFER transfer will send the start and stop commands for you, if I'm not mistaken.
You can also "bit-bang" this if you want, by calling I2C_MasterSendStart, WriteByte, SendStop, etc. directly. But it's just easier to call their WriteBuf functions.
Pretty much you need to write your code like follows:
// fill in your data or pass in the buffer of data you want to write
// if this is contained in a function call. I'm basing this off of HMC's docs
uint8 writeBuffer[3];
uint8 readBuffer[6];
writeBuffer[0] = 0x3C;
writeBuffer[1] = 0x00;
writeBuffer[2] = 0x70;
I2C_MasterWriteBuf(HMC_SLAVE_ADDRESS, writeBuffer, 3, I2C_MODE_COMPLETE_XFER);
while((I2C_MasterStatus() & I2C_MSTAT_WR_CMPLT) == 0u)
{
// wait for operation to finish
}
writeBuffer[1] = 0x01;
writeBuffer[2] = 0xA0;
I2C_MasterWriteBuf(HMC_SLAVE_ADDRESS, writeBuffer, 3, I2C_MODE_COMPLETE_XFER);
// wait for operation to finish
writeBuffer[1] = 0x02;
writeBuffer[2] = 0x00;
I2C_MasterWriteBuf(HMC_SLAVE_ADDRESS, writeBuffer, 3, I2C_MODE_COMPLETE_XFER);
// wait for operation to finish
CyDelay(6); // docs state 6ms delay before you can start looping around to read
for(;;)
{
writeBuffer[0] = 0x3D;
writeBuffer[1] = 0x06;
I2C_MasterWriteBuf(HMC_SLAVE_ADDRESS, writeBuffer, 2, I2C_MODE_COMPLETE_XFER);
// wait for operation to finish
// Docs don't state any different sort of bus transactions for reads.
// I'm assuming it'll be the same as a write
I2C_MasterReadBuf(HMC_SLAVE_ADDRESS, readBuffer, 6, I2C_MODE_COMPLETE_XFER);
// wait for operation to finish; wait on I2C_MSTAT_RD_CMPLT instead of I2C_MSTAT_WR_CMPLT
// You should have something in readBuffer to work with
CyDelay(67); // docs state to wait 67ms before reading again
}
I just sorta wrote that off the top of my head. I have no idea if that'll work or not, but I think that should be a good place to start and try. They have I2C example projects to look at also I think.
Another thing to look at so the WriteBuf function doesn't just seem like some magical command, if you right-click on the MasterWriteBuf function and click on "Find Definition" (after you build the project) it'll show you what it's doing.
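Once readBuffer actually holds data, the six bytes still have to be turned into x, y, z. Here is a rough sketch of that step, assuming the buffer holds the data output registers 0x03 to 0x08 in order. As I read the HMC5883L datasheet, that order is X MSB, X LSB, Z MSB, Z LSB, Y MSB, Y LSB (Z comes before Y), with each axis a 16-bit two's complement value. The names hmc_sample and hmc_decode are made up for illustration.

```c
#include <stdint.h>

/* One decoded reading; hypothetical type for this sketch. */
typedef struct {
    int16_t x, y, z;
} hmc_sample;

/* Combine the six raw bytes into signed 16-bit axis values.
 * Assumed byte order: X MSB, X LSB, Z MSB, Z LSB, Y MSB, Y LSB. */
hmc_sample hmc_decode(const uint8_t raw[6])
{
    hmc_sample s;
    s.x = (int16_t)(uint16_t)((raw[0] << 8) | raw[1]);
    s.z = (int16_t)(uint16_t)((raw[2] << 8) | raw[3]);
    s.y = (int16_t)(uint16_t)((raw[4] << 8) | raw[5]);
    return s;
}
```

With the code above that would be something like hmc_sample s = hmc_decode(readBuffer); after the read has completed.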
Following are samples of I2C read and write operations on PSoC.
Simple write operation:
// Dummy data values to write
uint8 writebuffer[3] = {0x23, 0xEF, 0x0F};
uint8 I2C_MasterWrite(uint8 slaveAddr, uint8 nbytes)
{
uint8 volatile status;
status = I2C_MasterClearStatus();
if(!(status & I2C_MSTAT_ERR_XFER))
{
status = I2C_MasterWriteBuf(slaveAddr, (uint8 *)&writebuffer, nbytes, I2C_MODE_COMPLETE_XFER);
if(status == I2C_MSTR_NO_ERROR)
{
/* wait for write complete and no error */
do
{
status = I2C_MasterStatus();
} while((status & (I2C_MSTAT_WR_CMPLT | I2C_MSTAT_ERR_XFER)) == 0u);
}
else
{
/* translate from I2C_MasterWriteBuf() error output to
* I2C_MasterStatus() error output */
status = I2C_MSTAT_ERR_XFER;
}
}
return status;
}
Read Operation:
void I2C_MasterRead(uint8 slaveaddress, uint8 nbytes)
{
uint8 volatile status;
status = I2C_MasterClearStatus();
if(!(status & I2C_MSTAT_ERR_XFER))
{
/* Then do the read */
status = I2C_MasterClearStatus();
if(!(status & I2C_MSTAT_ERR_XFER))
{
status = I2C_MasterReadBuf(slaveaddress,
(uint8 *)&(readbuffer),
nbytes, I2C_MODE_COMPLETE_XFER);
if(status == I2C_MSTR_NO_ERROR)
{
/* wait for reading complete and no error */
do
{
status = I2C_MasterStatus();
} while((status & (I2C_MSTAT_RD_CMPLT | I2C_MSTAT_ERR_XFER)) == 0u);
if(!(status & I2C_MSTAT_ERR_XFER))
{
/* Decrement all RW bytes in the EZI2C buffer, by different values */
for(uint8 i = 0u; i < nbytes; i++)
{
readbuffer[i] -= (i + 1);
}
}
}
else
{
/* translate from I2C_MasterReadBuf() error output to
* I2C_MasterStatus() error output */
status = I2C_MSTAT_ERR_XFER;
}
}
}
if(status & I2C_MSTAT_ERR_XFER)
{
/* add error handler code here */
}
}

Need help in IPC through Pipes

I am working on a lab.
A father process creates two son processes, A and B.
Son A sends a string to son B through a pipe. Son B inverts the case of the string it got from son A and sends the inverted string back to son A. After receiving the inverted string, son A prints it to the screen.
Here is the code.
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <ctype.h>
void process_A(int input_pipe[], int output_pipe[])
{
int c;
char ch;
int rc;
close(input_pipe[1]);
close(output_pipe[0]);
while ((c = getchar()) > 0) {
ch = (char)c;
rc = write(output_pipe[1], &ch, 1);
if (rc == -1) {
perror("A_TO_B: write");
close(input_pipe[0]);
close(output_pipe[1]);
exit(1);
}
rc = read(input_pipe[0], &ch, 1);
c = (int)ch;
if (rc <= 0) {
perror("A_TO_B: read");
close(input_pipe[0]);
close(output_pipe[1]);
exit(1);
}
putchar(c);
}
close(input_pipe[0]);
close(output_pipe[1]);
exit(0);
}
void process_B(int input_pipe[], int output_pipe[])
{
int c;
char ch;
int rc;
close(input_pipe[1]);
close(output_pipe[0]);
while (read(input_pipe[0], &ch, 1) > 0) {
c = (int)ch;
if (isascii(c) && isupper(c))
c = tolower(c);
else if (isascii(c) && islower(c))
c = toupper(c);
ch = (char)c;
rc = write(output_pipe[1], &ch, 1);
if (rc == -1) {
perror("B_TO_A: write");
close(input_pipe[0]);
close(output_pipe[1]);
exit(1);
}
}
close(input_pipe[0]);
close(output_pipe[1]);
exit(0);
}
int main(int argc, char* argv[])
{
/* 2 arrays to contain file descriptors, for two pipes. */
int A_TO_B[2];
int B_TO_A[2];
int pid;
int rc,i,State;
/* first, create one pipe. */
rc = pipe(A_TO_B);
if (rc == -1) {
perror("main: pipe A_TO_B");
exit(1);
}
/* create another pipe. */
rc = pipe(B_TO_A);
if (rc == -1) {
perror("main: pipe B_TO_A");
exit(1);
}
for(i=0;i<2;i++)
{
if((pid=fork()) <0){perror("fork failed\n");};
if((i==0) && (pid ==0))
{
process_A(A_TO_B, B_TO_A);
}
else if((i==1)&&(pid==0))
{
process_B(B_TO_A, A_TO_B);
}
else if(pid>0)
{
wait( &State );
}
}
return 0;
}
The problem is that when I run the program, son B blocks.
I need your help.
Thanks in advance.
OK, diagram:
initially: parent process: has
           B_TO_A[0] and [1] open,
           has A_TO_B[0] and [1] open
fork (makes copy)
parent:                               child (pid==0):
B_TO_A both open, A_TO_B both open    call process_A: close unwanted pipe ends, loop
call wait(), wait for one child       loop reads stdin, writes one pipe, reads other pipe
if we ever get here:
fork (makes copy)
parent:                               child (pid==0):
B_TO_A both open, A_TO_B both open    call process_B: close unwanted pipe ends, loop
parent: both ends of both pipes open
call wait(), wait for one child       loop reads one pipe, writes other pipe
First, you will usually not get to "if we ever get here" because the child running process_A() runs in a loop until either EOF on stdin (if that occurs first) or one of the pipe read/write calls fails (e.g., due to EOF on input_pipe[0]). Since the parent is still waiting in a wait() call, and has both ends of both pipes open, there's no EOF on the pipe (EOF on a pipe occurs after you read all the data written by all writers, and all dups of the write end have been closed). So the only way to get there is to hit EOF on stdin, so that the while loop does not run.
Second, if you do get around to forking again and doing process_B(), that child will also wait forever, because one write end of the pipe it's reading from is still open... in the parent! The parent won't close it, because the parent will be waiting forever in wait.
In general, what you need to do here is:
create two pipes (like you do now)
fork once, and run process_A() in the child
fork again (in the parent), and run process_B() in the (new) child
close both ends of both pipes (in the parent)
wait for both children now, after both have gotten started
The error handling gets a bit messy since you have to do something (such as kill() the first child) if you can't start the second child. So you need to know how far along you have gotten. You can still loop to fork twice but you can't wait inside the loop, and with just two trips around the loop, each of which do rather different steps, you might as well just write it all out without a loop.
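The steps above can be sketched roughly as follows. This is not a drop-in replacement for the lab code: to keep it short and checkable, child A sends a fixed string ("Hello") instead of reading stdin and reports through its exit status whether it got the case-swapped text back. The names child_a, child_b and run_two_children are made up for illustration.

```c
#include <ctype.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child A: send a fixed string to B, read the reply, verify it. */
static void child_a(int from_b, int to_b)
{
    const char *text = "Hello", *expected = "hELLO";
    size_t len = strlen(text), got = 0;
    char reply[16];

    if (write(to_b, text, len) != (ssize_t)len)
        _exit(1);
    close(to_b);                       /* B sees EOF after reading it all */
    while (got < len) {
        ssize_t n = read(from_b, reply + got, len - got);
        if (n <= 0)
            _exit(1);
        got += (size_t)n;
    }
    _exit(memcmp(reply, expected, len) == 0 ? 0 : 1);
}

/* Child B: swap the case of every byte and send it back until EOF. */
static void child_b(int from_a, int to_a)
{
    char ch;
    while (read(from_a, &ch, 1) == 1) {
        unsigned char c = (unsigned char)ch;
        ch = (char)(isupper(c) ? tolower(c) : islower(c) ? toupper(c) : c);
        if (write(to_a, &ch, 1) != 1)
            _exit(1);
    }
    _exit(0);
}

/* Returns 0 if both children ran and exited successfully, -1 otherwise. */
int run_two_children(void)
{
    int a_to_b[2], b_to_a[2], status, failed = 0;
    pid_t pid_a, pid_b;

    if (pipe(a_to_b) == -1)
        return -1;
    if (pipe(b_to_a) == -1) {
        close(a_to_b[0]); close(a_to_b[1]);
        return -1;
    }
    pid_a = fork();                    /* fork once: child runs A */
    if (pid_a == 0) {
        close(a_to_b[0]); close(b_to_a[1]);
        child_a(b_to_a[0], a_to_b[1]);
    }
    pid_b = (pid_a > 0) ? fork() : -1; /* fork again in the parent: B */
    if (pid_b == 0) {
        close(a_to_b[1]); close(b_to_a[0]);
        child_b(a_to_b[0], b_to_a[1]);
    }
    /* Parent: close both ends of both pipes BEFORE waiting. */
    close(a_to_b[0]); close(a_to_b[1]);
    close(b_to_a[0]); close(b_to_a[1]);

    if (pid_a <= 0 || waitpid(pid_a, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        failed = 1;
    if (pid_b <= 0 || waitpid(pid_b, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        failed = 1;
    return failed ? -1 : 0;
}
```

Because the parent closes its pipe ends before waiting, a child left without a partner fails on its read or write instead of hanging, so this sketch gets away without an explicit kill().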

A doubt on pipes in Unix

This code below is for executing ls -l | wc -l.
In the code, if I comment out close(p[1]) in the parent, the program just hangs, waiting for some input. Why is that? The child writes the output of ls to p[1], and the parent should have read that output from p[0].
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
main ()
{
int i;
int p[2];
pid_t ret;
pipe (p);
ret = fork ();
if (ret == 0)
{
close (1);
dup (p[1]);
close (p[0]);
execlp ("ls", "ls", "-l", (char *) 0);
}
if (ret > 0)
{
close (0);
dup (p[0]);
//Doubt, Commenting the line below does not work WHy?
close (p[1]);
wait (NULL);
execlp ("wc", "wc", "-l", (char *) 0);
}
}
pipe + fork creates 4 file descriptors, two are inputs
Before the fork you have a single pipe with one input and one output.
After the fork you will have a single pipe with two inputs and two outputs.
If you have two inputs for the pipe (that a proc writes to) and two outputs (that a proc reads from), you need to close the other input or the reader will also have a pipe input which never gets closed.
In your case the parent is the reader, and in addition to the output end of the pipe, it has an open other end, or input end, of the pipe that stuff could, in theory, be written to. As a result, the pipe never sends an eof, because when the child exits the pipe is still open due to the parent's unused fd.
So the parent deadlocks, waiting forever for it to write to itself.
Note that 'dup(p[1])' means you have two file descriptors pointing to the same file. It does not close p[1]; you should do that explicitly. Likewise with 'dup(p[0])'. Note that a file descriptor reading from a pipe only returns zero bytes (EOF) when there are no open write file descriptors for the pipe; until the last write descriptor is closed, the reading process will hang indefinitely. If you dup() the write end, there are two open file descriptors to the write end, and both must be closed before the reading process gets EOF.
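The EOF rule is easy to check in a single process. In this small sketch the read end is made non-blocking only so that the "would hang" case shows up as EAGAIN instead of actually hanging. The function name eof_needs_all_writers_closed is made up for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if EOF on the pipe appears only after BOTH write
 * descriptors (the original and its dup) are closed; 0 otherwise. */
int eof_needs_all_writers_closed(void)
{
    int p[2], extra, still_open, got_eof;
    char c;
    ssize_t n;

    if (pipe(p) == -1)
        return 0;
    extra = dup(p[1]);                 /* second descriptor for the write end */
    if (extra == -1) {
        close(p[0]); close(p[1]);
        return 0;
    }
    /* Non-blocking read end: "would block forever" becomes EAGAIN. */
    fcntl(p[0], F_SETFL, fcntl(p[0], F_GETFL) | O_NONBLOCK);

    close(p[1]);                       /* one write descriptor closed... */
    n = read(p[0], &c, 1);             /* ...but no EOF yet: extra is open */
    still_open = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));

    close(extra);                      /* last write descriptor closed */
    n = read(p[0], &c, 1);             /* now read returns 0, i.e. EOF */
    got_eof = (n == 0);

    close(p[0]);
    return still_open && got_eof;
}
```

With a normal blocking read end, the first read() here is the one that would wait forever, which is exactly the hang in the question.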
You also do not need or want the wait() call in your code. If the ls listing is bigger than a pipe can hold, your processes will deadlock, with the parent waiting for ls to complete and ls waiting for the parent to get on with reading the data it has written.
When the redundant material is stripped out, the working code becomes:
#include <unistd.h>
int main(void)
{
int p[2];
pid_t ret;
pipe(p);
ret = fork();
if (ret == 0)
{
close(1);
dup(p[1]);
close(p[0]);
close(p[1]);
execlp("ls", "ls", "-l", (char *) 0);
}
else if (ret > 0)
{
close(0);
dup(p[0]);
close(p[0]);
close(p[1]);
execlp("wc", "wc", "-l", (char *) 0);
}
return(-1);
}
On Solaris 10, this compiles without warning with:
Black JL: gcc -Wall -Werror -Wmissing-prototypes -Wstrict-prototypes -o x x.c
Black JL: ./x
77
Black JL:
If the parent doesn't close p[1], then that FD is open in two processes -- parent and child. The child eventually closes it (when ls exits), but the parent never does -- so the FD stays open. Therefore any reader of the pipe (the parent in this case) is going to wait forever just in case more writing is gonna be done on it... it ain't, but the reader just doesn't KNOW!-)

Resources