On a Unix system, there are two circumstances where getppid will return 1: either the calling process' original parent has exited, or it was directly started by init. Sometimes one might wish to behave differently depending on which of the two it is; for example, the program in this question wants to exit when its parent does, and one way to detect that is to notice that the value returned by getppid is now 1, but init never exits, so if the program was directly started by init it should not exit. You can't tell by calling getppid at the very beginning of main, because your parent might have exited before you ever get a chance to run — if it did the double-fork trick to dissociate from a shell, for instance.
In days of yore, being directly started by init was only a realistic possibility for a tiny handful of processes, /etc/rc and maybe some gettys, but nowadays we have containers and much more powerful "system manager" programs run as init, so it's much more likely to come up.
So the question is, is there any 100% reliable way for a program to tell whether its original parent was init? OS-specific answers are OK, I'm almost certain there's no way to do it within POSIX.
Related
I am looking into various OS designs in the hopes of writing a simple multitasking OS for the DCPU-16. However, everything I read about implementation of preemptive multitasking is centered around interrupts. It sounds like in the era of 16-bit hardware and software, cooperative multitasking was more common, but that requires every program to be written with multitasking in mind.
Is there any way to implement preemptive multitasking on an interruptless architecture? All I can think of is an interpreter which would dynamically switch tasks, but that would have a huge performance hit (possibly on the order of 10-20x+ if it had to parse every operation and didn't let anything run natively, I'm imagining).
Preemptive multitasking is normally implemented by having interrupt routines post status changes/interesting events to a scheduler, which decides which tasks to suspend, and which new tasks to start/continue based on priority. However, other interesting events can occur when a running task makes a call to an OS routine, which may have the same effect.
But all that matters is that some event is noted somewhere, and the scheduler decides who to run. So you can make all such event signalling/scheduling occur only only on OS calls.
You can add egregious calls to the scheduler at "convenient" points in various task application code to make your system switch more often. Whether it just switches, or uses some background information such as elapsed time since the last call is a scheduler detail.
Your system won't be as responsive as one driven by interrupts, but you've already given that up by choosing the CPU you did.
Actually, yes. The most effective method is to simply patch run-times in the loader. Kernel/daemon stuff can have custom patches for better responsiveness. Even better, if you have access to all the source, you can patch in the compiler.
The patch can consist of a distributed scheduler of sorts. Each program can be patched to have a very low-latency timer; on load, it will set the timer, and on each return from the scheduler, it will reset it. A simplistic method would allow code to simply do an
if (timer - start_timer) yield to scheduler;
which doesn't yield too big a performance hit. The main trouble is finding good points to pop them in. In between every function call is a start, and detecting loops and inserting them is primitive but effective if you really need to preempt responsively.
It's not perfect, but it'll work.
The main issue is making sure that the timer return is low latency; that way it is just a comparison and branch. Also, handling exceptions - errors in the code that cause, say, infinite loops - in some way. You can technically use a fairly simple hardware watchdog timer and assert a reset on the CPU without clearing any of the RAM; an in-RAM routine would be where RESET vector points, which would inspect and unwind the stack back to the program call (thus crashing the program but preserving everything else). It's sort of like a brute-force if-all-else-fails crash-the-program. Or you could POTENTIALLY change it to multi-task this way, RESET as an interrupt, but that is much more difficult.
So...yes. It's possible but complicated; using techniques from JIT compilers and dynamic translators (emulators use them).
This is a bit of a muddled explanation, I know, but I am very tired. If it's not clear enough I can come back and clear it up tomorrow.
By the way, asserting reset on a CPU mid-program sounds crazy, but it is a time-honored and proven technique. Early versions of Windows even did it to run compatibility mode on, I think 386's, properly, because there was no way to switch back to 32-bit from 16-bit mode. Other processors and OSes have done it too.
EDIT: So I did some research on what the DCPU is, haha. It's not a real CPU. I have no idea if you can assert reset in Notch's emulator, I would ask him. Handy technique, that is.
I think your assessment is correct. Preemptive multitasking occurs if the scheduler can interrupt (in the non-inflected, dictionary sense) a running task and switch to another autonomously. So there has to be some sort of actor that prompts the scheduler to action. If there are no interrupting devices (in the inflected, technical sense) then there's little you can do in general.
However, rather than switching to a full interpreter, one idea that occurs is just dynamically reprogramming supplied program code. So before entry into a process, the scheduler knows full process state, including what program counter value it's going to enter at. It can then scan forward from there, substituting, say, either the twentieth instruction code or the next jump instruction code that isn't immediately at the program counter with a jump back into the scheduler. When the process returns, the scheduler puts the original instruction back in. If it's a jump (conditional or otherwise) then it also effects the jump appropriately.
Of course, this scheme works only if the program code doesn't dynamically modify itself. And in that case you can preprocess it so that you know in advance where jumps are without a linear search. You could technically allow well-written self-modifying code if it were willing to nominate all addresses that may be modified, allowing you definitely to avoid those in your scheduler's dynamic modifications.
You'd end up sort of running an interpreter, but only for jumps.
another way is to keep to small tasks based on an event queue (like current GUI apps)
this is also cooperative but has the effect of not needing OS calls you just return from the task and then it will go on to the next task
if you then need to continue a task you need to pass the next "function" and a pointer to the data you need to the task queue
Well, it is obvious, let's say we have two processes A and F. F wants to fork A when it has the CPU control (and A is suspended since CPU is on F).
I have Googled however nothing related showed up. Is such a thing possible in Unix environments?
There is definitely no standard and/or portable way to clone a process from the outside but depending on the OS, there are certainly possible ways to divert a process from its task and force it to clone itself or do whatever you want.
I don't think it's a good idea in any way, but it may be possible for process F to attach to A using a debugger interface such as ptrace. Doing something like suspending the target process, saving its state, diverting the process to run fork, then restoring its original state.
It should be noted that your cloning process will probably need to handle some odd cases around threads and the like.
No it's not possible.
The fork() system call makes a copy of the parent, so if you call fork() in the F process, the child will be a copy of F, there's nothing you can do to change this behavior.
The reason this is not possible is because, normally with fork(), there is exactly one difference to start out with between the two processes: the return value of the fork() call itself. Without such a call inside the code of A, there is no way for the processes to have any difference between them, so they would both be doing exactly the same thing, when normally with you want one of the processes to start doing something different.
How exactly do you think what you want to do should work?
No, this would be a huge security hole that would result in the leakage of sensitive information if it were possible.
At best, you could setup a signal handler in the parent process that would fork(2) off a child (maybe exec(2) a pre-configured child process?).
I think you would be better served by looking in to message passing between two processes that have CPU affinity setup, but even then, I think the gains would be nominal (over-optimizing a problem?).
http://www.freebsd.org/cgi/man.cgi?query=cpuset&apropos=0&sektion=0
I understand the differences between fork, vfork, exec, execv, execp. So pls dont rant about it.
My question is about the design of the unix process creation. Why did the designers think of creating 2 seperate calls ( fork and exec ) instead of keeping it one tight call ( spawn ).
Was good API design a reason so that developers had more control over process creation?
Is it because of performance reason, so that we could delay allocating process table and other kernel structures to the child till either copy-on-write or copy-on-access?
The main reason is likely that the separation of the fork() and exec() steps allows arbitrary setup of the child environment to be done using other system calls. For example, you can:
Set up an arbitrary set of open file descriptors;
Alter the signal mask;
Set the current working directory;
Set the process group and/or session;
Set the user, group and supplementary groups;
Set hard and soft resource limits;
...and many more besides. If you were to combine these calls into a single spawn() call, it would have to have a very complex interface, to be able to encode all of these possible changes to the child's environment - and if you ever added a new setting, the interface would need to be changed. On the other hand, separate fork() and exec() steps allow you to use the ordinary system calls (open(), close(), dup(), fcntl(), ...) to manipulate the child's environment prior to the exec(). New functionality (eg. capset()) is easily supported.
fork and exec do completely different things.
fork() - duplicates a process
exec() - replaces a process
There's plenty of reasons to use one without the other. You can fork off child processes that perform tasks on behalf of your controlling parent app e.g., pretty common in the unix world. And you can e.g. setup the preconditions for some other quirky application and then exec it from your launcher application without ever using fork.
From the draft book, xv6 a simple, Unix-like teaching operating system used in MIT course 6.828:
Now it should be clear why it is a good idea that fork and exec are separate calls. Because if they are separate, the shell can fork a child, use open, close, dup in the child to change the standard input and output file descriptors, and then exec. No changes to the program being exec-ed (cat in our example) are required. If fork and exec were combined into a single system call, some other (probably more complex) scheme would be required for the shell to redirect standard input and output, or the program itself would have to understand how to redirect I/O.
Basically as caf says below but providing a good reference. Implementing a simple shell would probably make the reasons very clear.
AFAIK, initially there was only fork(). But performance reasons called for creation of exec() that does not re-create the kernel structures that are going to be immediately overwritten. <-- This is wrong.
There's a performance problem when a process needs co create a child process which is not a copy of itself. A fork() copies process-related kernel data which are going to be immediately replaced by an exec(). Thus vfork() was introduced that does not copy excessive kernel data and any process data; after it, a process is expected to call something exec()-like, and the parent is suspended until the child does so. See 'Bugs' section for description of problems with vfork(), though.
fork() can only be used for creating a child process.Only replication of parent can be done.
While exec() can be used for starting any new process on the system.
I do not see any correlation above two.
is there a way to implement multitasking using setjmp and longjmp functions
You can indeed. There are a couple of ways to accomplish it. The difficult part is initially getting the jmpbufs which point to other stacks. Longjmp is only defined for jmpbuf arguments which were created by setjmp, so there's no way to do this without either using assembly or exploiting undefined behavior. User level threads are inherently not portable, so portability isn't a strong argument for not doing it really.
step 1
You need a place to store the contexts of different threads, so make a queue of jmpbuf stuctures for however many threads you want.
Step 2
You need to malloc a stack for each of these threads.
Step 3
You need to get some jmpbuf contexts which have stack pointers in the memory locations you just allocated. You could inspect the jmpbuf structure on your machine, find out where it stores the stack pointer. Call setjmp and then modify its contents so that the stack pointer is in one of your allocated stacks. Stacks usually grow down, so you probably want your stack pointer somewhere near the highest memory location. If you write a basic C program and use a debugger to disassemble it, and then find instructions it executes when you return from a function, you can find out what the offset ought to be. For example, with system V calling conventions on x86, you'll see that it pops %ebp (the frame pointer) and then calls ret which pops the return address off the stack. So on entry into a function, it pushes the return address and frame pointer. Each push moves the stack pointer down by 4 bytes, so you want the stack pointer to start at the high address of the allocated region, -8 bytes (as if you just called a function to get there). We will fill the 8 bytes next.
The other thing you can do is write some very small (one line) inline assembly to manipulate the stack pointer, and then call setjmp. This is actually more portable, because in many systems the pointers in a jmpbuf are mangled for security, so you can't easily modify them.
I haven't tried it, but you might be able to avoid the asm by just deliberately overflowing the stack by declaring a very large array and thus moving the stack pointer.
Step 4
You need exiting threads to return the system to some safe state. If you don't do this, and one of the threads returns, it will take the address right above your allocated stack as a return address and jump to some garbage location and likely segfault. So first you need a safe place to return to. Get this by calling setjmp in the main thread and storing the jmpbuf in a globally accessible location. Define a function which takes no arguments and just calls longjmp with the saved global jmpbuf. Get the address of that function and copy it to your allocated stacks where you left room for the return address. You can leave the frame pointer empty. Now, when a thread returns, it will go to that function which calls longjmp, and jump right back into the main thread where you called setjmp, every time.
Step 5
Right after the main thread's setjmp, you want to have some code that determines which thread to jump to next, pulling the appropriate jmpbuf off the queue and calling longjmp to go there. When there are no threads left in that queue, the program is done.
Step 6
Write a context switch function which calls setjmp and stores the current state back on the queue, and then longjmp on another jmpbuf from the queue.
Conclusion
That's the basics. As long as threads keep calling context switch, the queue keeps getting repopulated, and different threads run. When a thread returns, if there are any left to run, one is chosen by the main thread, and if none are left, the process terminates. With relatively little code you can have a pretty basic cooperative multitasking setup. There are more things you probably want to do, like implement a cleanup function to free the stack of a dead thread, etc. You can also implement preemption using signals, but that is much more difficult because setjmp doesn't save the floating point register state or the flags registers, which are necessary when the program is interrupted asynchronously.
It may be bending the rules a little, but GNU pth does this. It's possible, but you probably shouldn't try it yourself except as an academic proof-of-concept exercise, use the pth implementation if you want to do it seriously and in a remotely portable fashion -- you'll understand why when you read the pth thread creation code.
(Essentially it uses a signal handler to trick the OS into creating a fresh stack, then longjmp's out of there and keeps the stack around. It works, evidently, but it's sketchy as hell.)
In production code, if your OS supports makecontext/swapcontext, use those instead. If it supports CreateFiber/SwitchToFiber, use those instead. And be aware of the disappointing truth that one of the most compelling use of coroutines -- that is, inverting control by yielding out of event handlers called by foreign code -- is unsafe because the calling module has to be reentrant, and you generally can't prove that. This is why fibers still aren't supported in .NET...
This is a form of what is known as userspace context switching.
It's possible but error-prone, especially if you use the default implementation of setjmp and longjmp. One problem with these functions is that in many operating systems they'll only save a subset of 64-bit registers, rather than the entire context. This is often not enough, e.g. when dealing with system libraries (my experience here is with a custom implementation for amd64/windows, which worked pretty stable all things considered).
That said, if you're not trying to work with complex external codebases or event handlers, and you know what you're doing, and (especially) if you write your own version in assembler that saves more of the current context (if you're using 32-bit windows or linux this might not be necessary, if you use some versions of BSD I imagine it almost definitely is), and you debug it paying careful attention to the disassembly output, then you may be able to achieve what you want.
I did something like this for studies.
https://github.com/Kraego/STM32L476_MiniOS/blob/main/Usercode/Concurrency/scheduler.c
The context/thread switching is done by setjmp/longjmp. The difficult part was to get the allocated stack correct (see allocateStack()) this depends on your platform.
This is just a demonstration how this could work, I would never use this in production.
As was already mentioned by Sean Ogden,
longjmp() is not good for multitasking, as
it can only move the stack upward and can't
jump between different stacks. No go with that.
As mentioned by user414736, you can use getcontext/makecontext/swapcontext
functions, but the problem with those is that
they are not fully in user-space. They actually
call the sigprocmask() syscall because they switch
the signal mask as part of the context switching.
This makes swapcontext() much slower than longjmp(),
and you likely don't want the slow co-routines.
To my knowledge there is no POSIX-standard solution to
this problem, so I compiled my own from different
available sources. You can find the context-manipulating
functions extracted from libtask here:
https://github.com/dosemu2/dosemu2/tree/devel/src/base/lib/mcontext
The functions are:
getmcontext(), setmcontext(), makemcontext() and swapmcontext().
They have the similar semantic to the standard functions with similar names,
but they also mimic the setjmp() semantic in that getmcontext()
returns 1 (instead of 0) when jumped to by setmcontext().
On top of that you can use a port of libpcl, the coroutine library:
https://github.com/dosemu2/dosemu2/tree/devel/src/base/lib/libpcl
With this, it is possible to implement the fast cooperative user-space
threading. It works on linux, on i386 and x86_64 arches.
I am trying to test out my system and wish to emulate a condition, where the child process gets hung. For doing this, I am trying to attach the child process to GDB and putting a break on it. But things don't seem to be going as expected.
Also, in the same vein, how do I know that a spawned child process is not progressing, but is hung?
Use can use SIGSTOP to hang a child process - but that is observably different from the child process going into an infinite loop, or a bad conditional wait - still it may be close enough for testing.
To check a child process has not hung, you have it send heart-beats to the parent (you'll need some kind of communications channel for this - maybe stdin/stdout at a minimum). Then the child has hung if it fails to send a couple of heart-beats messages.
A child process will inherit any pipes created before the fork. You can use this to both "hang" your child and to let it know when to continue. You can have your child process try a blocking read on the pipe, and it will block (i.e. hang) until the parent writes something.
You could also use signals like Douglass mentions. You can let the OS do basic stop/cont or you can implement signal handlers to do something more complex (like entering an infinite loop).
Examples for both of these can be found in the Unix Programming FAQ along with a ton of additional information on process control, signal handling, pipes, etc...
You can try looking in /proc to determine if you are hung. You can read /proc/<child-pid>/stat to get a lot of low-level process information including the current state, the amount of user/kernel time the process has been scheduled, the current stack and instruction pointers, etc... Using a combination of this you can try to determine if the process is hung or not. Check out the proc(5) man page for more info on /proc/<pid>/stat.