fork() has no homogeneous behaviour within shell - unix

I've been testing this simple fork in C:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
int main (int argc, char** argv) {
int id = fork();
pid_t p = getpid();
pid_t pp = getppid();
if (id != 0) {
printf("I'm the parent, fork() gave me %i, my pid is %i and my parent's is %i \n",id,p,pp);
return 0;
}
else {
printf("I'm the child, fork() gave me %i, my pid is %i and my parent's is %i \n",id,p,pp);
return 0;
}
}
I run it on Debian using VSCode and gcc.
Everytime I launch it, i get something like :
I'm the parent, fork() gave me 152911, my pid is 152906 and my parent's is 152890
I'm the child, fork() gave me 0, my pid is 152911 and my parent's is 1
[1] + Done "/usr/bin/gdb" --interpreter=mi --tty=${DbgTerm} 0<"/tmp/Microsoft-MIEngine-In-bdyrp1e2.wm5" 1>"/tmp/Microsoft-MIEngine-Out-vixicyih.jop"
in the VSCode console. But when I invoke this script directly from the console, sometimes something "strange" happens :
me#laptop:~/Desktop/programmesC$ ./test_fork
I'm the parent, fork() gave me 153571, my pid is 153570 and my parent's is 8738
me#laptop:~/Desktop/programmesC$ I'm the child, fork() gave me 0, my pid is 153571 and my parent's is 1
Hence it behaves as if the parent gave the control back to the shell then the child executed. And in the meanwhile the shell has got the result of the children's print() ?
Which would mean that everytime I get a "proper" output (like in case #1), it's the child process who finishes its execution before its parent ? Am I correct or did I miss something ?

The order these three things are queued and ran will appear as if it’s random.
The code returns for both the parent and child. In your case, occasionally the parent ends and the shell is resumed before the child ends.

Related

fork(), runs always the parent first and then the child

im learning about fork() but something is working wrong in my ubuntu. im running this code:
#include <stdio.h>
#include <unistd.h>
int main(int argc, char **argv)
{
printf("--beginning of program\n");
int counter = 0;
pid_t pid = fork();
if (pid == 0)
{
// child process
int i = 0;
for (; i < 5; ++i)
{
printf("child process: counter=%d\n", ++counter);
}
}
else if (pid > 0)
{
// parent process
int j = 0;
for (; j < 5; ++j)
{
printf("parent process: counter=%d\n", ++counter);
}
}
else
{
// fork failed
printf("fork() failed!\n");
return 1;
}
printf("--end of program--\n");
return 0;
}
i know the parent and the child should run their code in no specific order so the code should return something like this:
-beginning of program
parent process: counter=1
parent process: counter=2
parent process: counter=3
child process: counter=1
parent process: counter=4
child process: counter=2
parent process: counter=5
child process: counter=3
child process: counter=4
child process: counter=5
--end of program--
But every time i run the program, they run in what it seems to be the same order. here is a capture of what i get every single time i run this program:
user#user-VirtualBox:~/Documents/S-O$ ./sample
--beginning of program
parent process: counter=1
parent process: counter=2
parent process: counter=3
parent process: counter=4
parent process: counter=5
--end of program--
user#user-VirtualBox:~/Documents/S-O$
child process: counter=1
child process: counter=2
child process: counter=3
child process: counter=4
child process: counter=5
--end of program--
it seems like the parent finish first, and then the child starts. but i guess thats not true. please note that the second shell prompt is opened by the program itself. any ideas of what may be happening?
Edit:
if i put an sleep(1) it works just fine. i still think that with no delay it shouldn't always have the same execution order. even counting until 100, it giver the same thing
Yes parent and children run their code in an almost unpredictable order (remember that there is no randomness), but your scheduler probably choose, after forking, to schedule the parent first and then the child. As both have a very small code, the parent is able to run all of its code, and then the children. To experiment something else you must do something much much more longer after forking in both processes (You will be able to see different interleaving of outputs for example).
The prompt is not produced by your code, but is an effect of it. When the parent process dies, the shell that ran it get control back and prompt you for another command, while in the same time the children is working and pollutes your terminal. The child has become an orphan process. To prevent this, you must tell the parent to wait its child by calling...wait().

Process reads data before writing into pipe

I am trying to create pipe and use it with fork(). But I m confused in the order of execution.
Process reads data from pipe before anything is written into pipe. Sometimes it runs correctly. But sometimes, reads before writing but still gives correct output.
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
int main(void)
{
int fd[2], nbytes,ret;
pid_t childpid;
char string[] = "Hello, world!\n";
char readbuffer[80];
pipe(fd);
if(!fork())
{
close(fd[0]);
printf("Writing...");
write(fd[1], string, (strlen(string)+1));
exit(0);
}
else{
close(fd[1]);
printf("Reading...");
nbytes = read(fd[0], readbuffer, sizeof(readbuffer));
printf("Received string: %s", readbuffer);
wait(NULL);
}
return 0;
}
It happens because scheduler schedules the parent process before the child and thus
it results in the read operation before write. On the other hand, the output may be
as expected sometimes when scheduler will schedule the child first.
SOLUTION:
1. you need to use synchronisation techniuqes like semaphores in order to synchronise the
parent and the child.
2. or you can make the parent process wait until the child has finished. Consider the
following code :
int child_status; // declare this variable
/* add the following line to the else clause */
else
{
close(fd[1]);
wait(&child_status);
// rest of your code
}
Explanation :
If the parent process is scheduled before the child, then its execution will halt when
it finds the 'wait(&child_status)' statement and will wait for the child to finish
its execution and then proceed further.

Difficulty in redirecting output in a dup2 and pipe code in Unix

I am new in unix. In the following code, I pass three arguments from the command line "~$ foo last sort more" in order to replicate "~$ last | sort | more". I am trying to create a program that will take three argument(at least 3 for now). The parent will fork three processes. The first process will write to the pipe. The second process will read and write to and from the pipe and the third process will read from the pipe and write to the stdout(terminal). First process will exec "last", second process will exec "sort" and third process will exec "more" and the processes will sleep for 1,2 and 3 secs in order to synchronize. I am pretty sure I am having trouble creating a pipe and redirecting the input and output. I don't get any output to the terminal but I can see that the processes have been created. I would appreciate some help.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/types.h>
#include <dirent.h>
#include <unistd.h>
#include <signal.h>
#include <fcntl.h>
#include <errno.h>
#define FOUND 1
#define NOT_FOUND 0
#define FIRST_CHILD 1
#define LAST_CHILD numargc
#define PATH_1 "/usr/bin/"
#define PATH_2 "/bin/"
#define DUP_READ() \
if (dup2(fdes[READ], fileno(stdin)) == -1) \
{ \
perror("dup error"); \
exit(4); \
}
#define DUP_WRITE() \
if (dup2(fdes[WRITE], fileno(stdout)) == -1) \
{ \
perror("dup error"); \
exit(4); \
}
#define CLOSE_FDES_READ() \
close(fdes[READ]);
#define CLOSE_FDES_WRITE() \
close(fdes[WRITE]);
#define EXEC(x, y) \
if (execl(arraycmds[x], argv[y], (char*)NULL) == -1) \
{ \
perror("EXEC ERROR"); \
exit(5); \
}
#define PRINT \
printf("FD IN:%d\n", fileno(stdin)); \
printf("FD OUT:%d\n", fileno(stdout));
enum
{
READ, /* 0 */
WRITE,
MAX
};
int cmdfinder( char* cmd, char* path); /* 1 -> found, 0 -> not found */
int main (int argc, char* argv[])
{
int numargc=argc-1;
char arraycmds[numargc][150];
int i=1, m=0, sleeptimes=5, numfork;
int rc=NOT_FOUND;
pid_t pid;
int fdes[2];
if(pipe(fdes) == -1)
{
perror("PIPE ERROR");
exit(4);
}
while(i <= numargc)
{
memset(arraycmds[m], 0, 150);
rc=cmdfinder(argv[i], arraycmds[m]);
if (rc)
{
printf("Command found:%s\n", arraycmds[m]);
}
i++;
m++;
}
i=0; //array index
numfork=1; //fork number
while(numfork <= numargc)
{
if ((pid=fork()) == -1)
{
perror("FORK ERROR");
exit(3);
}
else if (pid == 0)
{
/* Child */
sleep(sleeptimes);
if (numfork == FIRST_CHILD)
{
DUP_WRITE();
EXEC(i, numfork);
}
else if (numfork == LAST_CHILD)
{
DUP_READ();
CLOSE_FDES_WRITE();
EXEC(i, numfork);
}
else
{
DUP_READ();
DUP_WRITE();
CLOSE_FDES_READ();
CLOSE_FDES_WRITE();
EXEC(i, numfork);
}
}
else
{
/* Parent */
printf("pid:%d\n", pid);
i++;
numfork++;
sleeptimes++;
}
}
PRINT;
printf("i:%d\n", i);
printf("numfork:%d\n", numfork);
printf("DONE\n");
return 0;
}
int cmdfinder(char* cmd, char* path)
{
DIR* dir;
struct dirent *direntry;
char *pathdir;
int searchtimes=2;
while (searchtimes)
{
pathdir = (char*)malloc(250);
memset(pathdir, 0, 250);
if (searchtimes==2)
{
pathdir=PATH_1;
}
else
{
pathdir=PATH_2;
}
if ((dir = opendir(pathdir)) == NULL)
{
perror("Directory not found");
exit (1);
}
else
{
while (direntry = readdir(dir))
{
if (strncmp( direntry->d_name, cmd, strlen(cmd)) == 0)
{
strcat(path, pathdir);
strcat(path, cmd);
//searchtimes--;
return FOUND;
}
}
}
closedir(dir);
searchtimes--;
}
printf("%s: Not Found\n", cmd);
return NOT_FOUND;
}
All your macros are making this harder to read than if you just wrote it straight. Especially when they refer to local variables. To find out what's going on with EXEC my eyes have to jump up from where it's used to where it's defined, find out which local arrays it uses, then jump back down to see how that access fits in the flow of main. It's a maze of macros.
And wow, cmdfinder? Your very own $PATH lookup, only it's hardcoded /usr/bin:/bin? And double wow, readdir, just to find out if a file exists whose name is already decided? Just stat it! Or don't do anything, just exec it and handle the ENOENT by trying the next one. Or use execlp that's what it's there for!
On to the main point... you don't have enough pipes, and you're not closing all the unused descriptors.
last | sort | more is a pipeline of 3 commands connected by 2 pipes. You can't do it with one pipe. The first command should write into the first pipe, the middle command should read the first pipe and write to the second pipe, and the last command should read the second pipe.
You could create both pipes first, then do all the forks, which makes things simple to follow, but requires a lot of closes in every child process since they'll all inherit all the pipe fds. Or you can use a more sophisticated loop, creating each pipe just before forking the first process that will use it, and closing each descriptor in the parent as soon as the relevant child process has been created. I'd hate to see how many macros you'd use for that.
Every successful dup should be followed by a close of the descriptor that was copied. dup is short for "duplicate", not "move". After it's done, you have an extra descriptor left over, so don't just dup2(fdes[1], fileno(stdout) - also close(fdes[1]) afterward. (To be perfectly robust you should check whether fdes[1]==fileno(stdout) already, and in that case skip the dup2 and close.)
FOLLOWUP QUESTIONS
You can't use one pipe for 3 processes because there would be no way to distinguish which data should go to which destination. When the first process writes to the pipe, while both of the other processes are trying to read from it, one of them will get the data but you won't be able to predict which one. You need the middle process to read what the first process writes, and the last process to read what the middle process writes.
You're halfway right about file descriptors being shared after a fork. The actual pipe object is shared. That's what makes the whole system work. But the file descriptors - the endpoints designated by small integers like 1 for standard output, 0 for standard input, and so on - are not coupled the way you suggest. The same pipe object may be associated with the same file descriptor number in two processes, the associations are independent. Closing fd 1 in one process does not cause fd 1 to become closed in any other process, even if they are related.
Sharing of the fd table, so that a close in one task has an effect in another task, is part of the "pthread" feature set, not the "fork" feature set.

why does ptrace singlestep return a too big instruction count when statically linking it?

So, I've already read this article Counting machine instructions of a process using PTRACE_SINGLESTEP, and i understand that dynamically linking a testprogram to my ptrace program will return an instruction count that also counts the initialization of the run-time library. However, I'm trying to get a valid count for my test program, which is:
int main(){
return 0;
}
My ptrace program first also returned 90k+ values, so I changed it to statically linking the used testprogram. The counter is now less, but still over 12k. The program I used to count the instructions is:
#include <sys/ptrace.h>
#include <unistd.h>
#include <stdio.h>
int main() {
long long counter = 1; // machine instruction counter
int wait_val; // child's return value
int pid; // child's process id
int dat;
switch (pid = fork()) { // copy entire parent space in child
case -1: perror("fork");
break;
case 0: // child process starts
ptrace(PTRACE_TRACEME,0,NULL,NULL);
/*
must be called in order to allow the
control over the child process and trace me (child)
0 refers to the parent pid
*/
execl("./returntestprog","returntestprog",NULL);
/*
executes the testprogram and causes
the child to stop and send a signal
to the parent, the parent can now
switch to PTRACE_SINGLESTEP
*/
break;
// child process ends
default: // parent process starts
wait(&wait_val);
if (ptrace(PTRACE_SINGLESTEP, pid, 0, 0) != 0)
perror("ptrace");
/*
switch to singlestep tracing and
release child
if unable call error.
*/
wait(&wait_val);
// parent waits for child to stop at next
// instruction (execl())
while (wait_val == 1407) {
counter++;
if (ptrace(PTRACE_SINGLESTEP, pid, 0, 0) != 0)
perror("ptrace");
/*
switch to singlestep tracing and
release child
if unable call error.
*/
wait(&wait_val);
// wait for next instruction to complete */
}
/*
continue to stop, wait and release until
the child is finished; wait_val != 1407
Low=0177L and High=05 (SIGTRAP)
*/
}
printf("Number of machine instructions : %lld\n", counter);
return 0;
} // end of switch
Any help would be really appreciated as I'm not quite sure if it's working right, or not at all. Once I get this thing started, I want to work on timing analysis with ptrace, but first things first and try to count the number of executed instructions
thanks!
I've kind of figured it out by now. So even when statically linking your code, you try to prevent the libraries to be dynamically linked to your program and thus also included into your count. The other side of it however is that executing your file, or basically invoking under the operating system also takes an enormous amount of instructions. So basically, as long as the instruction count is constant and the same under the same conditions, you can subtract this count (when using a return 0 program for instance) from your original program, to count the real number of instructions.

A doubt on pipes in Unix

This code below is for executing ls -l | wc -l.
In the code, if I comment close(p[1]) in parent then the program just hangs, waiting for some input. Why it is so? The child writes output of ls on p1 and parent should have taken that output from p0.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
main ()
{
int i;
int p[2];
pid_t ret;
pipe (p);
ret = fork ();
if (ret == 0)
{
close (1);
dup (p[1]);
close (p[0]);
execlp ("ls", "ls", "-l", (char *) 0);
}
if (ret > 0)
{
close (0);
dup (p[0]);
//Doubt, Commenting the line below does not work WHy?
close (p[1]);
wait (NULL);
execlp ("wc", "wc", "-l", (char *) 0);
}
}
pipe + fork creates 4 file descriptors, two are inputs
Before the fork you have a single pipe with one input and one output.
After the fork you will have a single pipe with two inputs and two outputs.
If you have two inputs for the pipe (that a proc writes to) and two outputs (that a proc reads from), you need to close the other input or the reader will also have a pipe input which never gets closed.
In your case the parent is the reader, and in addition to the output end of the pipe, it has an open other end, or input end, of the pipe that stuff could, in theory, be written to. As a result, the pipe never sends an eof, because when the child exits the pipe is still open due to the parent's unused fd.
So the parent deadlocks, waiting forever for it to write to itself.
Note that 'dup(p[1])' means you have two file descriptors pointing to the same file. It does not close p[1]; you should do that explicitly. Likewise with 'dup(p[0])'. Note that a file descriptor reading from a pipe only returns zero bytes (EOF) when there are no open write file descriptors for the pipe; until the last write descriptor is closed, the reading process will hang indefinitely. If you dup() the write end, there are two open file descriptors to the write end, and both must be closed before the reading process gets EOF.
You also do not need or want the wait() call in your code. If the ls listing is bigger than a pipe can hold, your processes will deadlock, with the child waiting for ls to complete and ls waiting for the child to get on with reading the data it has written.
When the redundant material is stripped out, the working code becomes:
#include <unistd.h>
int main(void)
{
int p[2];
pid_t ret;
pipe(p);
ret = fork();
if (ret == 0)
{
close(1);
dup(p[1]);
close(p[0]);
close(p[1]);
execlp("ls", "ls", "-l", (char *) 0);
}
else if (ret > 0)
{
close(0);
dup(p[0]);
close(p[0]);
close(p[1]);
execlp("wc", "wc", "-l", (char *) 0);
}
return(-1);
}
On Solaris 10, this compiles without warning with:
Black JL: gcc -Wall -Werror -Wmissing-prototypes -Wstrict-prototypes -o x x.c
Black JL: ./x
77
Black JL:
If the child doesn't close p[1], then that FD is open in two processes -- parent and child. The parent eventually closes it, but the child never does -- so the FD stays open. Therefore any reader of that FD (the child in this case) is going to wait forever just in case more writing it's gonna be done on it... it ain't, but the reader just doesn't KNOW!-)

Resources