So I am looking at my professor's code that he handed out to try and give us an idea of how to implement >, <, | support into our unix shell. I ran his code and was amazed at what actually happened.
if( pid == 0 )
{
close(1); // close
fd = creat( "userlist", 0644 ); // then open
execlp( "who", "who", NULL ); // and run
perror( "execlp" );
exit(1);
}
This created a userlist file in the directory I was currently in, with the "who" data inside that file. I don't see where any connection between fd and execlp is being made. How did execlp manage to put the information into userlist? How did execlp even know userlist existed?
Read Advanced Linux Programming. It has several chapters related to the issue. And we cannot explain all this in a few sentences. See also the standard stream and process wikipages.
First, every system call your program makes should be checked for failure (see syscalls(2) for a list, and read the documentation of each individual system call that you are using). But assume they all succeed. After close(1); the file descriptor 1 (STDOUT_FILENO) is free. So creat("userlist",0644) is likely to re-use it, hence fd is 1; you have redirected your stdout to the newly created userlist file.
Finally, you are calling execlp(3), which will call execve(2). When it succeeds, your entire process image is replaced by the new executable (so it gets a fresh virtual address space), and its stdout is still the userlist file descriptor. In particular (unless execve fails) the perror call is not reached.
So your code is a bit what a shell running who > userlist is doing; it does a redirection of stdout to userlist and runs the who command.
If you are coding a shell, use strace(1) -notably with -f option- to understand what system calls are done. Try also strace -f /bin/sh -c ls to look into the behavior of a shell. Study also the source code of existing free software shells (e.g. bash and sash).
See also this and the references I gave there.
execlp knows nothing. Before the exec, stdout was closed and a file was opened, so the new descriptor is the one corresponding to stdout (open always returns the lowest free descriptor). At that point the process has its "stdout" plugged into the file. Then exec is called, and it replaces the whole address space, but some properties, such as the open descriptors, remain, so now the code of who is executed with a stdout that corresponds to the file. This is the way redirections are managed by shells.
Remember that when you use printf (for example) you never specify what stdout exactly is... That can be a file, a terminal, etc.
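The "lowest free descriptor" rule can be observed directly. The following is only a sketch: open_reuses_closed_slot is a hypothetical name, /dev/null is assumed to exist, and the process is assumed to be single-threaded so that nothing else opens a file in between.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if open() handed back the descriptor
 * number that was just closed, 0 if it did not, -1 if an open() failed. */
int open_reuses_closed_slot(void)
{
    int a = open("/dev/null", O_WRONLY);   /* lowest free number right now */
    if (a == -1)
        return -1;

    int b = open("/dev/null", O_WRONLY);   /* next free number, so b > a */
    if (b == -1) {
        close(a);
        return -1;
    }

    close(a);                              /* a is free again; nothing below it is */

    int c = open("/dev/null", O_WRONLY);   /* should land on the freed slot */
    if (c == -1) {
        close(b);
        return -1;
    }

    int reused = (c == a);
    close(b);
    close(c);
    return reused;
}
```

This is the same mechanism as close(1) followed by creat: the freed slot happens to be number 1, so the new file becomes stdout.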
Basile Starynkevitch correctly explained:
After close(1); the file descriptor 1 (STDOUT_FILENO) is free. So creat("userlist",0644) is likely to re-use it…
This is because, as Jean-Baptiste Yunès wrote, open always returns the lowest free descriptor.
It should be stressed that the professor's code is only likely to work: it fails if file descriptor 0 is already closed, because creat then returns 0 rather than 1 and stdout is left closed.
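One way to avoid depending on which descriptors happen to be free is to open the file normally and then copy it onto descriptor 1 with dup2. A minimal sketch, assuming a POSIX system; redirect_stdout_to is a hypothetical helper name, not something from the professor's handout:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: make descriptor 1 refer to `path`, regardless of
 * which descriptor numbers are currently free.
 * Returns 0 on success, -1 on failure with errno set. */
int redirect_stdout_to(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;

    if (fd != STDOUT_FILENO) {
        /* dup2 closes descriptor 1 if it is open, then makes it a copy of fd */
        if (dup2(fd, STDOUT_FILENO) == -1) {
            int saved = errno;
            close(fd);
            errno = saved;
            return -1;
        }
        close(fd);   /* only descriptor 1 needs to stay open */
    }
    return 0;
}
```

In the child branch this would stand in for the close(1)/creat pair, called just before execlp, with the return value checked.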
I am trying to understand how, for the above command, the shell arranges for output re-direction when the process associated with the shell itself is replaced by that of echo? Would very much appreciate any help.
Regards
Rupam
Output redirection of an external command is achieved by manipulating file descriptors, typically with dup2, a system call that makes a given file descriptor refer to an already open file. In this case, the standard output of the process that executes echo is made to point to the target output file.
The shell does something close to the following steps:
create a copy of the current process by invoking fork(); the subsequent steps happen in the child process. (This step is omitted when the command is invoked with exec, as in your example.)
open test for writing and remember the file descriptor
make file descriptor 1 equal to the descriptor, by calling dup2(fd, 1)
call execlp or similar to replace current process execution with execution of echo
The echo command simply writes to standard output, a fancy name for file descriptor number 1. It doesn't care if that ends up writing to a TTY, to a file, or to another program. The above steps make sure that file descriptor 1 really points to the open file test.
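The steps above can be sketched in C roughly as follows. This is an illustration under assumptions, not what any particular shell actually does: run_redirected is a hypothetical name, and execvp is used in place of execlp so that the arguments can be passed as an array.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with stdout sent to `outfile`,
 * roughly like `cmd args > outfile`.
 * Returns the child's exit status, or -1 on error. */
int run_redirected(char *const argv[], const char *outfile)
{
    assert(argv != NULL && argv[0] != NULL && outfile != NULL);

    pid_t pid = fork();                       /* step 1: copy the process */
    if (pid == -1)
        return -1;

    if (pid == 0) {                           /* child only */
        int fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);  /* step 2 */
        if (fd == -1) {
            perror("open");
            _exit(126);
        }
        if (dup2(fd, STDOUT_FILENO) == -1) {  /* step 3: fd 1 now means outfile */
            perror("dup2");
            _exit(126);
        }
        if (fd != STDOUT_FILENO)
            close(fd);

        execvp(argv[0], argv);                /* step 4: become the command */
        perror("execvp");                     /* reached only if exec failed */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with the argument vector { "echo", "hello", NULL } and the file name test should leave hello in test, while the parent's own stdout is untouched because the redirection happened only in the child.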
The question
What is the difference between Cwd::cwd and Cwd::getcwd in Perl, generally, without regard to any specific platform? Why does Perl have both? What is the intended use, which one should I use in which scenarios? (Example use cases will be appreciated.) Does it matter? (Assuming I don’t mix them.) Does choice of either one affect portability in any way? Which one is more commonly used in modules?
Even if I interpret the manual as saying that, except for corner cases, cwd is `pwd` and getcwd just calls getcwd from unistd.h, what is the actual difference? This works only on POSIX systems, anyway.
I can always read the implementation but that tells me nothing about the meaning of those functions. Implementation details may change, not so defined meaning. (Otherwise a breaking change occurs, which is serious business.)
What does the manual say
Quoting Perl’s Cwd module manpage:
Each of these functions are called without arguments and return the absolute path of the current working directory.
getcwd
my $cwd = getcwd();
Returns the current working directory.
Exposes the POSIX function getcwd(3) or re-implements it if it's not available.
cwd
my $cwd = cwd();
The cwd() is the most natural form for the current architecture. For most systems it is identical to `pwd` (but without the trailing line terminator).
And in the Notes section:
Actually, on Mac OS, the getcwd(), fastgetcwd() and fastcwd() functions are all aliases for the cwd() function, which, on Mac OS, calls `pwd`. Likewise, the abs_path() function is an alias for fast_abs_path()
OK, I know that on Mac OS [1] there is no difference between getcwd() and cwd(), as both actually boil down to `pwd`. But what about other platforms? (I’m especially interested in Debian Linux.)
[1] Classic Mac OS, not OS X. $^O values are MacOS and darwin for Mac OS and OS X, respectively. Thanks, #tobyink and #ikegami.
And a little meta-question: How to avoid asking similar questions for other modules with very similar functions? Is there a universal way of discovering the difference, other than digging through the implementation? (Currently, I think that if the documentation is not clear about intended use and differences, I have to ask someone more experienced or read the implementation myself.)
Generally speaking
I think the idea is that cwd() always resolves to the external, OS-specific way of getting the current working directory. That is, running pwd on Linux, command /c cd on DOS, /usr/bin/fullpath -t in QNX, and so on — all examples are from actual Cwd.pm. The getcwd() is supposed to use the POSIX system call if it is available, and falls back to the cwd() if not.
Why do we have both? In the current implementation I believe exporting just getcwd() would be enough for most systems, but who knows whether the logic of “if the syscall is available, use it, else run cwd()” might fail on some system (e.g. on MorphOS in Perl 5.6.1).
On Linux
On Linux, cwd() will run `/bin/pwd` (it actually executes the binary and reads its output), while getcwd() will issue the getcwd(2) system call.
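The same contrast can be sketched in C. This is only an approximation of what the Perl functions do: the names cwd_via_syscall and cwd_via_pwd are hypothetical, and popen goes through /bin/sh and runs `pwd -P`, which is an assumption made here for illustration rather than the exact command Cwd.pm executes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Roughly the getcwd() route: ask the C library / kernel directly.
 * Returns 0 on success, -1 on failure. */
int cwd_via_syscall(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return getcwd(buf, size) != NULL ? 0 : -1;
}

/* Roughly the cwd() route: start another process and read what it prints.
 * Returns 0 on success, -1 on failure. */
int cwd_via_pwd(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);

    FILE *p = popen("pwd -P", "r");      /* forks and execs a shell */
    if (p == NULL)
        return -1;

    char *line = fgets(buf, (int)size, p);
    int status = pclose(p);              /* waits for the child to finish */
    if (line == NULL || status != 0)
        return -1;

    buf[strcspn(buf, "\n")] = '\0';      /* drop the trailing newline */
    return 0;
}
```

Both functions should fill the buffer with the same path; the second one pays for a fork, an exec and a pipe on every call.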
Actual effect inspected via strace
One can use strace(1) to see that in action:
Using cwd():
$ strace -f perl -MCwd -e 'cwd(); ' 2>&1 | grep execve
execve("/usr/bin/perl", ["perl", "-MCwd", "-e", "cwd(); "], [/* 27 vars */]) = 0
[pid 31276] execve("/bin/pwd", ["/bin/pwd"], [/* 27 vars */] <unfinished ...>
[pid 31276] <... execve resumed> ) = 0
Using getcwd():
$ strace -f perl -MCwd -e 'getcwd(); ' 2>&1 | grep execve
execve("/usr/bin/perl", ["perl", "-MCwd", "-e", "getcwd(); "], [/* 27 vars */]) = 0
Reading Cwd.pm source
You can take a look at the sources (Cwd.pm, e.g. on CPAN) and see that on Linux the cwd() call is mapped to _backtick_pwd which, as the name suggests, calls pwd in backticks.
Here is a snippet from Cwd.pm, with my comments:
unless ($METHOD_MAP{$^O}{cwd} or defined &cwd) {
    ...
    # some logic to find the pwd binary here, $found_pwd_cmd is set to 1 on Linux
    ...
    if( $os eq 'MacOS' || $found_pwd_cmd )
    {
        *cwd = \&_backtick_pwd; # on Linux we actually go here
    }
    else {
        *cwd = \&getcwd;
    }
}
Performance benchmark
Finally, the difference between the two is that cwd(), which runs another binary, must be slower. We can run a rough performance test:
$ time perl -MCwd -e 'for (1..10000) { cwd(); }'
real 0m7.177s
user 0m0.380s
sys 0m1.440s
Now compare it with the system call:
$ time perl -MCwd -e 'for (1..10000) { getcwd(); }'
real 0m0.018s
user 0m0.009s
sys 0m0.008s
Discussion, choice
But as you don't usually query the current working directory too often, both options will work — unless you cannot spawn any more processes for some reason related to ulimit, out of memory situation, etc.
Finally, as for selecting which one to use: for Linux, I would always use getcwd(). I suppose you will need to run your own tests and select which function to use if you are going to write a portable piece of code that will run on some really strange platform (here, of course, Linux, OS X, and Windows are not in the list of strange platforms).
I am attempting to get Phabricator running on Solaris over apache. The website is working, but all of the cli scripts are not. For example, phd.
The first problem is that it is not passing arguments to the underlying manage_daemons.php script that it invokes. Looking at the phd file, this does not surprise me:
$> cat phd
../scripts/daemon/manage_daemons.php
Now, given my default shell is bash, this isn't going to pass-through my arguments. To do this, I have modified the script:
#! /bin/bash
../scripts/daemon/manage_daemons.php $*
This now passes the arguments through, but it now fails to find the transitive scripts it requires via relative paths:
./phd start
Preparing to launch daemons.
NOTE: Logs will appear in '/var/tmp/phd/log/daemons.log'.
Launching daemon "PhabricatorRepositoryPullLocalDaemon".
[2014-05-09 19:29:59] EXCEPTION: (CommandException) Command failed with error #127!
COMMAND
exec ./phd-daemon 'PhabricatorRepositoryPullLocalDaemon' --daemonize --log='/var/tmp/phd/log/daemons.log' --phd='/var/tmp/phd/pid'
STDOUT
(empty)
STDERR
./phd-daemon: line 1: launch_daemon.php: not found
at [/XXX/XXX/libphutil/src/future/exec/ExecFuture.php:398]
#0 ExecFuture::resolvex() called at [/XXX/XXX/phabricator/src/applications/daemon/management/PhabricatorDaemonManagementWorkflow.php:167]
#1 PhabricatorDaemonManagementWorkflow::launchDaemon(PhabricatorRepositoryPullLocalDaemon, Array , false) called at [/XXX/XXX/phabricator/src/applications/daemon/management/PhabricatorDaemonManagementWorkflow.php:246]
#2 PhabricatorDaemonManagementWorkflow::executeStartCommand() called at [/XXX/XXX/phabricator/src/applications/daemon/management/PhabricatorDaemonManagementStartWorkflow.php:18]
#3 PhabricatorDaemonManagementStartWorkflow::execute(Object PhutilArgumentParser) called at [/XXX/XXX/libphutil/src/parser/argument/PhutilArgumentParser.php:396]
#4 PhutilArgumentParser::parseWorkflowsFull(Array of size 9 starting with: { 0 => Object PhabricatorDaemonManagementListWorkflow }) called at [/XXX/XXX/libphutil/src/parser/argument/PhutilArgumentParser.php:292]
#5 PhutilArgumentParser::parseWorkflows(Array of size 9 starting with: { 0 => Object PhabricatorDaemonManagementListWorkflow }) called at [/XXX/XXX/phabricator/scripts/daemon/manage_daemons.php:30]
Note I have obscured my paths with XXX as they give away sensitive information.
Now, obviously I shouldn't be modifying these scripts. This is an indication that some prerequisite is not set up properly.
It's clear to me that Phabricator is making some (bold) assumption about my setup. But I'm not quite sure what...?
These are supposed to be symlinks. For example, if you look at "phd" in the repository on GitHub, you can see that the file type is "symbolic link":
https://github.com/facebook/phabricator/blob/master/bin/phd
Something in your environment is incorrectly turning the symlinks into normal files. I'm not aware of any Git configuration which can cause this, although it's possible there is something. One situation where I've seen this happen is when a working copy was cloned, then copied using something like rsync without appropriate flags to preserve symlinks.
I need some Powershell advice.
I need to install an application's MSP update file on multiple Win08r2 servers. If I run these commands locally, within the target machine's PS window, it does exactly what I want it to:
$command = 'msiexec.exe /p "c:\test\My Application Update 01.msp" REBOOTPROMPT=S /qb!'
invoke-wmimethod -path win32_process -name create -argumentlist $command
The file being executed is located on the target machine
If I remotely connect to the machine and execute the two commands, it opens two x64 msiexec.exe processes and one msiexec.exe *32 process, and just sits there.
If I restart the server, it doesn't show that the update was installed, so I don't think it's a timing thing.
I've tried creating and remotely executing a PS1 file with the two lines, but that seems to do the same thing.
If anyone has advice on getting my MSP update installed remotely, I'd be all ears.
I think I've included all the information I have, but if something is missing, please ask questions, and I'll fill in any blanks.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
My process for this is:
Read a CSV for server name and Administrator password
Create a credential with the password
Create a new session using the machine name and credential
Create a temporary folder to hold my update MSP file
Call a PS1 file that downloads the update file to the target server
>>> Creates a new System.Net.WebClient object
>>> Uses that web client object to download from the source to the location on the target server
Call another PS1 file that applies the patch that was just downloaded –>> This is where I’m having issues.
>>> Set the variable shown above
>>> Execute the file specified in the variable
Close the session to the target server
Move to the next server in the CSV…
If I open a PS window and manually set the variable, then execute it (as shown above in the two lines of code), it works fine. If I create a PS1 file on the target server, containing the same two lines of code, then right click > ‘Run With PowerShell’, it works as expected / desired. If I remotely execute my code in PowerGUI, it returns a block of text that looks like this, then just sits there. When I RDP into the server, the installer never launches. My understanding of the “Return Value” value is that “0” means the command was successful.
PSComputerName : xx.xx.xx.xx
RunspaceId : bf6f4a39-2338-4996-b75b-bjf5ef01ecaa
PSShowComputerName : True
__GENUS : 2
__CLASS : __PARAMETERS
__SUPERCLASS :
__DYNASTY : __PARAMETERS
__RELPATH :
__PROPERTY_COUNT : 2
__DERIVATION : {}
__SERVER :
__NAMESPACE :
__PATH :
ProcessId : 4808
ReturnValue : 0
I even added a line of code between the variable and the execution that creates a text file on the desktop, just to verify I was getting into my ‘executeFile’ file, and that text file does get created. It seems that it’s just not remotely executing my MSP.
Thank you in advance for your assistance!
Catt11.
Here's the strategy I used to embed an msp into a powershell script. It works perfectly for me.
$file = "z:\software\AcrobatUpdate.msp"
$silentArgs = "/passive"
$additionalInstallArgs = ""
Write-Debug "Running msiexec.exe /update $file $silentArgs"
$msiArgs = "/update `"$file`""
$msiArgs = "$msiArgs $silentArgs $additionalInstallArgs"
Start-Process -FilePath msiexec -ArgumentList $msiArgs -Wait
You probably don't need to use the variables if you don't want to, you could hardcode the values. I have this set up as a function to which I pass those arguments, but if this is more of a one-shot deal, it might be easier to hard-code the values.
Hope that helps!
Using Start-Process for an MSP package is not good practice, because some update packages lock down PowerShell libraries; in that case you must use a WMI call.
Let's say I have a command called "enjoy." I'm expecting enjoy to give valid output and an error message. How do I call enjoy such that the valid output goes to one file and the error messages go to another file?
enjoy > log.txt 2> errors.txt
Assuming of course that you've used STDOUT and STDERR properly and you're using a nice shell. If you're using csh, you need to do something more complicated:
(enjoy > log.txt) >& errors.txt
This works because >& redirects both STDOUT and STDERR - but STDOUT has already been redirected. The parentheses make sure that STDOUT is long gone before the data gets anywhere near the overzealous >&.
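For comparison, here is roughly what a shell has to do for `enjoy > log.txt 2> errors.txt`, sketched in C. The helper names point_fd_at and run_split are hypothetical, and the sketch assumes a POSIX system; it is not taken from any real shell's source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: make descriptor `target` refer to `path`.
 * Returns 0 on success, -1 on failure. */
static int point_fd_at(int target, const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;
    if (fd != target) {
        if (dup2(fd, target) == -1) {
            close(fd);
            return -1;
        }
        close(fd);
    }
    return 0;
}

/* Hypothetical helper: run argv[0] with stdout going to out_path and
 * stderr going to err_path. Returns the exit status, or -1 on error. */
int run_split(char *const argv[], const char *out_path, const char *err_path)
{
    assert(argv != NULL && argv[0] != NULL);
    assert(out_path != NULL && err_path != NULL);

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* descriptor 1 is stdout, descriptor 2 is stderr */
        if (point_fd_at(STDOUT_FILENO, out_path) == -1 ||
            point_fd_at(STDERR_FILENO, err_path) == -1) {
            perror("redirect");
            _exit(126);
        }
        execvp(argv[0], argv);
        perror("execvp");        /* lands in err_path, since fd 2 now points there */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The two streams are handled independently, one descriptor each, which is why the sh-style syntax can name them separately with > and 2>.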