My workflow is to send commands from an emacs buffer to an R session in emacs via the ESS package.
a=0;
system("ssh remotehost ls")
a = a+1;
When I run the three lines above in rapid succession (i.e. submit them to the R buffer), the value of a at the end is 0. When I run them slowly, a is 1.
I've only had this issue running an ssh command via system. In all other cases, the commands queue up and all run sequentially.
My colleagues have the exact same issue with their R/vim setup. But we don't have the same issue in RStudio.
Any suggestions here would be great.
ssh consumes anything on stdin while the system() call is running. If you paste the lines one at a time, ssh terminates before you submit a = a+1, so the line is passed to R instead of ssh and a ends up as 1; if you submit the lines in rapid succession, ssh swallows a = a+1 instead. Use system("ssh .. < /dev/null") or system(..., input="") if you don't want terminal input to be eaten by the subprocess.
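For concreteness, here is a hedged sketch of that fix using the same command as in the question; either variant keeps the subprocess's stdin away from the terminal:
a = 0
system("ssh remotehost ls < /dev/null")   # ssh reads EOF from /dev/null instead of the terminal
# or equivalently: system("ssh remotehost ls", input = "")
a = a + 1
a   # now 1 even when the lines are submitted in rapid succession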
Below is a simple example of what I'm trying to accomplish. I'm trying to force an ssh script to not wait for all child processes to exit before returning. The purpose is to launch a daemon process on a remote host via ssh.
test.sh
#!/bin/bash
(
sleep 2
echo "done"
) &
When I run the script on the console it returns immediately, with "done" appearing 2 seconds later.
When I run the script over ssh, the ssh command does not return immediately. It appears to wait until all child processes have terminated before ssh exits.
ssh example
$ ssh mike@127.0.0.1 /home/mike/test.sh
(2 seconds)
done
standard terminal example
$ ./test.sh
$
(2 seconds)
done
How can I make ssh return when the parent/main process has terminated?
EDIT:
I'm aware of the -f option to ssh, which runs the process in the background, but it leaves the ssh process and connection open on the source host. For my purposes this is unsuitable.
ssh mike@127.0.0.1 /home/mike/test.sh
When you run ssh in this fashion, the remote ssh server creates a set of pipes (or socketpairs) which become the standard input, output, and error for the process which you requested it to run, in this case the script process. The ssh server doesn't end the session based on when the script process exits. Instead, it ends the session when it reads an end-of-file indication on the script process's standard output and standard error.
In your case, the script process creates a child process which inherits the script's standard input, output, and error. A pipe (or socketpair) only returns EOF when all possible writers have exited or closed their end of the pipe. As long as the child process is running and has a copy of the standard output/error file descriptors, the ssh server won't read an EOF indication on those descriptors and it won't close the session.
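The same EOF behavior can be reproduced locally without ssh. In this hedged one-liner, the reader (cat) stands in for the ssh server: the outer subshell exits immediately, but cat does not see EOF, and therefore does not return, until the backgrounded child closes its copy of the pipe about two seconds later:
( ( sleep 2; echo "done" ) & ) | cat   # returns after ~2 seconds, when the last writer exits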
You can get around this by redirecting standard input and standard output in the command that you pass to the remote server:
ssh mike@127.0.0.1 '/home/mike/test.sh > /dev/null 2>&1'
(note the quotes are important)
This avoids passing the standard output and standard error created by the ssh server to the script process or the subprocesses that it creates.
Alternately, you could add a redirection to the script:
#!/bin/bash
(
exec > /dev/null 2>&1
sleep 2
echo "done"
) &
This causes the script's child process to close its copies of the original standard output and standard error.
Is there a general way to wait for an executed process that backgrounds in fish (like open "foo")? As far as I can tell, $! (the PID of the last executed child process in bash) is not present in fish, so you can't just wait $!.
1) The fish idiom is cmd1; and cmd2 or if cmd1; cmd2; end.
2) You should find that bash and zsh also don't block if you execute open ARG. That's because open normally backgrounds the program being run and then exits; the shell has no idea that open has put the "real" program in the background. Another example of that behavior is launching vim in GUI mode via vim -g. Add the -W flag on macOS (or -w on Linux) to the open command and -f to the vim command.
The key here is that open won't return the exit status that fish uses to evaluate the and operator until something happens to the opened process. So you get the behavior you're looking for.
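As a hedged sketch of that idiom on macOS (foo.pdf is just an example document), -W keeps open in the foreground until the application quits, so the and branch only runs afterwards:
open -W foo.pdf; and echo "foo.pdf has been closed"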
Working in R 2.14.1, on Windows 7
Using the package parallel in R, I'm trying to take advantage of cores outside of my local machine available on my network, where all remote hosts I am connecting to are identical Windows machines.
The basic form of the command used to make the connection is:
library(parallel)
#assume 8 cores per machine
cl<-makePSOCKcluster(c(rep("localhost", 8), rep("otherhost", 8)))
Of course, trying to debug these things can be pretty tricky, but here is where I'm at with it.
If I specify the manual = TRUE flag as below
cl<-makePSOCKcluster(c(rep("localhost", 8), rep("otherhost", 8)), manual=TRUE)
there are no problems connecting to the remote host, and running a parallel process. The computers have identical setups to the one that I am working on. Yet, when this manual flag is not set, the connection command hangs.
This suggests to me that, since the manual flag bypasses ssh when making the connection to the host, ssh is the problem when manual=FALSE.
It is not guaranteed at the moment that the remote computers have ssh on them. The question is: given that I have all the pertinent Windows login information for my remote hosts, and that I cannot change the settings on the remote computers, how would I connect to cores on remote machines with the parallel package in R without specifying manual = TRUE?
Alternatively, if ssh must be installed for this to happen, let's assume all computers have ssh on them. How would I connect to cores on the remote machines without circumventing ssh?
If you need any more information please let me know, I appreciate the time.
UPDATE 1
8-26-14
Thanks to Steve Weston for his insights. I will provide an update with the exact tools and setup I use to get my system working when it's up and running.
Feel free to comment or post if you have anything else to add as to what may be the best route to go in remote connecting to a windows machine from a windows machine via makePSOCKcluster, where the manual flag is set to FALSE.
When creating a PSOCK cluster with manual=FALSE, the only way to start a worker on a remote machine is with "ssh", "rsh", or something command-line compatible, such as "plink" from PuTTY. The reason is that makePSOCKcluster starts the remote workers using the "system" function to execute commands of the form:
ssh -l user otherhost '/usr/lib/R/bin/Rscript' -e 'parallel:::.slaveRSOCK()' MASTER=myhost PORT=10187 OUT=/dev/null TIMEOUT=2592000 METHODS=TRUE XDR=TRUE
You can confirm this by looking at the source code for the newPSOCKnode function in the file snowSOCK.R from the parallel package.
For this to work, the ssh-compatible command must be available on the local machine and a corresponding ssh daemon must be running on each of the remote machines, otherwise makePSOCKcluster will simply hang. I've found that installing a good, working ssh daemon is the difficult part on Windows.
Unfortunately, manual=TRUE is generally the easiest way to create a PSOCK cluster on multiple Windows machines.
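For reference, here is a minimal sketch of the manual=TRUE workflow (otherhost and the worker count are placeholders): makePSOCKcluster prints the exact Rscript command for each worker, and you paste that command into a console on the remote Windows machine yourself, so no ssh daemon is required there.
library(parallel)
cl <- makePSOCKcluster(rep("otherhost", 8), manual = TRUE)   # prints one worker command per core to run remotely
parSapply(cl, 1:8, function(i) Sys.info()[["nodename"]])     # quick sanity check that the workers respond
stopCluster(cl)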
Hello everyone, I had the same problem and I managed to solve it. I am writing this answer in June 2018; my OS is Windows 10 and my R version is 3.2.2. It is surprising to see this problem still exists after 4 years, and I hope it will be fixed in a future release.
Before you move on, please make sure you can access the server from cmd using ssh. I didn't put a password in my code because I have a private key; you don't need to do that, and you will see the reason later.
Fixing the problem
File directory
Since the function makePSOCKcluster works when the workers are started manually, my first attempt was to set manual=TRUE and look at the output. Here is my result:
machineAddresses <-list(list(host='192.168.1.220',user='jeff'))
cl <- makePSOCKcluster(machineAddresses, manual = TRUE)
> Manually start worker on 192.168.1.220 with
"C:/PROGRA~1/R/R-32~1.2/bin/x64/Rscript" -e
"parallel:::.slaveRSOCK()" MASTER=DESKTOP-U5JA32O PORT=11756
OUT=/dev/null TIMEOUT=2592000 METHODS=TRUE XDR=TRUE
OK, here is the first problem: the Rscript location is incorrect (it should be the location of Rscript on the server). Generally it is under C:\Program Files; on my server it is C:\Program Files\R\R-3.2.2\bin. So we need to correct it by adding an option that tells the code where Rscript is:
machineAddresses <-list(list(host='192.168.1.220',
user='jeff',rscript="C:/Program Files/R/R-3.3.2/bin/Rscript"))
CMD problem
Once you fix the directory problem, you will find that the code still hangs forever. Then we need to check whether we can manually access the server from R; my code is:
system("ssh jeff#192.168.1.220")
> GetConsoleMode on STD_INPUT_HANDLE failed with 6
I honestly don't know what this error means, but we just need to fix it. Inspired by @Steve Weston, I decided to use PuTTY, so I installed it and changed my code to:
machineAddresses <-list(list(host='192.168.1.220',user='jeff',rscript="C:/Program Files/R/R-3.3.2/bin/Rscript",rshcmd="plink -pw qwer"))
The -pw option supplies the password. Because I'm a newbie to PuTTY, I don't know how to make my private key work automatically with PuTTY, so I use the easiest workaround: supply the password directly. The above code is equivalent to the following in cmd:
plink -pw qwer jeff@192.168.1.220 Rscript -e parallel:::.slaveRSOCK() MASTER=DESKTOP-U5JA32O PORT=11063 OUT=/dev/null TIMEOUT=2592000 METHODS=TRUE XDR=TRUE
And this is exactly what we would do if we manually created the workers. For those who are new like me: you need to add the PuTTY directory to PATH in your environment variables in order to run plink. Here is my final code:
machineAddresses <-list(list(host='192.168.1.220',user='jeff',rscript="C:/Program Files/R/R-3.3.2/bin/Rscript",rshcmd="plink -pw qwer"))
cl <- makePSOCKcluster(machineAddresses,manual = F)
I ran it with no problem at all. In summary, the function makePSOCKcluster makes two mistakes:
It assumes a wrong Rscript directory on the server (at the least it should assume the same directory as on my local computer, but it doesn't; I don't know where that strange directory comes from).
It uses the ssh command to start the connection, which does not work from within R. It works well in cmd, but not in R; I don't know the reason.
If you are still not able to use makePSOCKcluster, here is one trick: try to connect to the server from R using the system function first. It can give you an error code that may indicate where the problem is. Here is my debugging code:
system("plink -pw qwer jeff#192.168.1.220 Rscript -e parallel:::.slaveRSOCK() MASTER=DESKTOP-U5JA32O PORT=11063 OUT=/dev/null TIMEOUT=2592000 METHODS=TRUE XDR=TRUE")
I have a problem with the nohup command.
When I run my job, I have a lot of data. The output nohup.out becomes too large and my process slows down. How can I run this command without getting nohup.out?
The nohup command only writes to nohup.out if the output would otherwise go to the terminal. If you have redirected the output of the command somewhere else - including /dev/null - that's where it goes instead.
nohup command >/dev/null 2>&1 # doesn't create nohup.out
Note that the >/dev/null 2>&1 sequence can be abbreviated to just >&/dev/null in most (but not all) shells.
If you're using nohup, that probably means you want to run the command in the background by putting another & on the end of the whole thing:
nohup command >/dev/null 2>&1 & # runs in background, still doesn't create nohup.out
On Linux, running a job with nohup automatically closes its input as well. On other systems, notably BSD and macOS, that is not the case, so when running in the background, you might want to close input manually. While closing input has no effect on the creation or not of nohup.out, it avoids another problem: if a background process tries to read anything from standard input, it will pause, waiting for you to bring it back to the foreground and type something. So the extra-safe version looks like this:
nohup command </dev/null >/dev/null 2>&1 & # completely detached from terminal
Note, however, that this does not prevent the command from accessing the terminal directly, nor does it remove it from your shell's process group. If you want to do the latter, and you are running bash, ksh, or zsh, you can do so by running disown with no argument as the next command. That will mean the background process is no longer associated with a shell "job" and will not have any signals forwarded to it from the shell. (A disowned process gets no signals forwarded to it automatically by its parent shell - but without nohup, it will still receive a HUP signal sent via other means, such as a manual kill command. A nohup'ed process ignores any and all HUP signals, no matter how they are sent.)
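Putting those pieces together, a hedged sketch of a fully detached job in bash, ksh, or zsh (some_command is a placeholder) looks like this:
nohup some_command </dev/null >/dev/null 2>&1 &   # background it with all three standard streams detached
disown                                            # remove it from the shell's job table so no signals are forwarded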
Explanation:
In Unixy systems, every source of input or target of output has a number associated with it called a "file descriptor", or "fd" for short. Every running program ("process") has its own set of these, and when a new process starts up it has three of them already open: "standard input", which is fd 0, is open for the process to read from, while "standard output" (fd 1) and "standard error" (fd 2) are open for it to write to. If you just run a command in a terminal window, then by default, anything you type goes to its standard input, while both its standard output and standard error get sent to that window.
But you can ask the shell to change where any or all of those file descriptors point before launching the command; that's what the redirection (<, <<, >, >>) and pipe (|) operators do.
The pipe is the simplest of these... command1 | command2 arranges for the standard output of command1 to feed directly into the standard input of command2. This is a very handy arrangement that has led to a particular design pattern in UNIX tools (and explains the existence of standard error, which allows a program to send messages to the user even though its output is going into the next program in the pipeline). But you can only pipe standard output to standard input; you can't send any other file descriptors to a pipe without some juggling.
The redirection operators are friendlier in that they let you specify which file descriptor to redirect. So 0<infile reads standard input from the file named infile, while 2>>logfile appends standard error to the end of the file named logfile. If you don't specify a number, then input redirection defaults to fd 0 (< is the same as 0<), while output redirection defaults to fd 1 (> is the same as 1>).
Also, you can combine file descriptors together: 2>&1 means "send standard error wherever standard output is going". That means that you get a single stream of output that includes both standard out and standard error intermixed with no way to separate them anymore, but it also means that you can include standard error in a pipe.
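As a small illustration (some_command and the search pattern are arbitrary), this is what lets error messages travel through a pipe alongside normal output:
some_command 2>&1 | grep -i error   # without 2>&1, stderr would bypass the pipe and go straight to the terminal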
So the sequence >/dev/null 2>&1 means "send standard output to /dev/null" (which is a special device that just throws away whatever you write to it) "and then send standard error to wherever standard output is going" (which we just made sure was /dev/null). Basically, "throw away whatever this command writes to either file descriptor".
When nohup detects that neither its standard error nor output is attached to a terminal, it doesn't bother to create nohup.out, but assumes that the output is already redirected where the user wants it to go.
The /dev/null device works for input, too; if you run a command with </dev/null, then any attempt by that command to read from standard input will instantly encounter end-of-file. Note that the merge syntax won't have the same effect here; it only works to point a file descriptor to another one that's open in the same direction (input or output). The shell will let you do >/dev/null <&1, but that winds up creating a process with an input file descriptor open on an output stream, so instead of just hitting end-of-file, any read attempt will trigger a fatal "invalid file descriptor" error.
nohup some_command > /dev/null 2>&1&
That's all you need to do!
Have you tried redirecting all three I/O streams:
nohup ./yourprogram > foo.out 2> foo.err < /dev/null &
You might want to use the detach program. You use it like nohup but it doesn't produce an output log unless you tell it to. Here is the man page:
NAME
detach - run a command after detaching from the terminal
SYNOPSIS
detach [options] [--] command [args]
Forks a new process, detaches it from the terminal, and executes command with the specified arguments.
OPTIONS
detach recognizes a couple of options, which are discussed below. The
special option -- is used to signal that the rest of the arguments are
the command and args to be passed to it.
-e file
Connect file to the standard error of the command.
-f Run in the foreground (do not fork).
-i file
Connect file to the standard input of the command.
-o file
Connect file to the standard output of the command.
-p file
Write the pid of the detached process to file.
EXAMPLE
detach xterm
Start an xterm that will not be closed when the current shell exits.
AUTHOR
detach was written by Robbert Haarman. See http://inglorion.net/ for
contact information.
Note I have no affiliation with the author of the program. I'm only a satisfied user of the program.
The following command will let you run something in the background without getting nohup.out:
nohup command |tee &
In this way, you will be able to get console output while running the script on the remote server:
sudo bash -c "nohup /opt/viptel/viptel_bin/log.sh $* &> /dev/null" &
Redirecting the output of sudo causes sudo to ask for the password again, so an awkward mechanism is needed to do this variant.
If you have a bash shell on your Mac/Linux machine in front of you, try out the steps below to understand redirection in practice:
Create a two-line script called zz.sh:
#!/bin/bash
echo "Hello. This is a proper command"
junk_errorcommand
The echo command's output goes into the STDOUT filestream (file descriptor 1).
The error command's output goes into the STDERR filestream (file descriptor 2).
Currently, simply executing the script sends both STDOUT and STDERR to the screen.
./zz.sh
Now start with standard redirection:
./zz.sh > zfile.txt
In the above, "echo" (STDOUT) goes into zfile.txt, whereas "error" (STDERR) is displayed on the screen.
The above is the same as:
./zz.sh 1> zfile.txt
Now you can try the opposite and redirect "error" (STDERR) into the file; the STDOUT from the "echo" command goes to the screen.
./zz.sh 2> zfile.txt
Combining the above two, you get:
./zz.sh 1> zfile.txt 2>&1
Explanation:
FIRST, send STDOUT 1 to zfile.txt
THEN, send STDERR 2 to STDOUT 1 itself (by using &1 pointer).
Therefore, both 1 and 2 go into the same file (zfile.txt).
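Because the shell processes redirections from left to right, swapping them changes the result; a hedged illustration with the same script:
./zz.sh 2>&1 1> zfile.txt   # STDERR follows the old STDOUT (the screen); only the "echo" output lands in zfile.txt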
Finally, you can wrap the whole thing in nohup ... & to run it in the background:
nohup ./zz.sh 1> zfile.txt 2>&1 &
You can run the command below:
nohup <your command> > <outputfile> 2>&1 &
e.g.
I have a nohup command inside a script:
./Runjob.sh > sparkConcuurent.out 2>&1
I am using emacs-snapshot with the ssh.el package, following the instructions from the ESS manual.
There are a few ways to open an R session, but this is how I do it:
open emacs
C-x C-f /server:dir/file.R (this puts me in ESS [S] mode)
Type 'plot(1)'
C-c C-n to run
Emacs asks for a starting directory, and I choose /server:dir/
I would like a figure to pop up, but it won't.
This also doesn't work when using ess-remote in shell or tramp mode, but it does work if I set the starting directory to my local desktop.
Any advice much appreciated. My current workaround is to print the file to pdf and then open pdf in DocView mode, but this takes a few extra steps and is slow.
I do it the other way around:
ssh -X some.server.com to connect to a remote server with X11 forwarding.
emacsclient -nw to reconnect to an Emacs session that is already running
plot(cumsum(rnorm(100))) in R as usual
Then the plot window appears on the initial machine I ssh'ed away from.
Edit: As a follow-up to the comment: this works for any Emacs, either emacs or emacs-snapshot. For a long time I used (server-start) in my ~/.emacs, but now I prefer to launch emacs --daemon (just once), after which I can connect to it via emacsclient (which also exists as emacsclient-snapshot). I really like this: it gives me Emacs around R in a persistent session that I can connect to, disconnect from, and reconnect to.
I selected Dirk's answer because he pointed me in the right direction, and especially for lowering the activation energy required to visualize my data, but here I am going to give the details of how I got this working on my desktop.
1) Set up ssh keypairs (I had previously done this; full instructions for Ubuntu here)
mkdir ~/.ssh
chmod 700 ~/.ssh
ssh-keygen -t rsa
ssh-copy-id username@hostname
2) include the following in ~/.ssh/config
Host any_server_nickname
HostName hostname
User username
ForwardX11 yes
3) open emacs on local machine
4) C-x C-f
5) /any_server_nickname:dir/file.R for files in the home directory, or /any_server_nickname:/path/to/file.R for an absolute path
6) plot(1)
7) C-c C-b to evaluate the entire buffer.