R system() doesn't behave like cmd itself - r

I'm doing some bioinformatics analysis in Rstudio, but something strange happens when using system(). I'm also using Windows Subsystem for Linux, so I can run a UNIX executable in my Windows cmd like so:
bash -c "./parasail-master/build/parasail_aligner -a sw_trace_striped_sat -f SSWtemplate.fa -q SSWtest.fa -O EMBOSS -d >OUT.txt"
Don't worry about the specifics: what's important is that I use bash -c to indicate I want to use the UNIX bash, and I'm running the executable parasail_aligner. It all works out, and I get the nice output file "OUT.txt".
Now, since I'm doing my analysis in Rstudio, I want to execute this directly from an R script, like so:
system('bash -c "./parasail-master/build/parasail_aligner -a sw_trace_striped_sat -f SSWtemplate.fa -q SSWtest.fa -O EMBOSS -d >OUTER.txt"')
So: just give it as an argument to system()? But this gives the following error:
input file, query file, and stdin detected; max inputs is 2
This is obviously an error specifically generated by parasail_aligner. The funny thing is: I don't get this error at all from cmd directly, but I do get it when running the command in R using system(). Does anyone have any idea why something like this can happen at all? I would expect system() to just give its argument to cmd, but clearly it doesn't do this... Running the command in a command terminal opened in Rstudio also works fine, it is specifically system() that seems to mess up.
I'm terribly sorry if this question is vague, but I can't give you a simple example which you could use to replicate the error. I've been using system() for a while now and I've never had this kind of problem. I am on Windows and I've found some people online that say you should use shell() instead of system(), but doing so just gives me the same error.
Maybe it has something to do with this "stdin" thing the error mentions and how R/RStudio handles this, I don't know. But parasail seems to think I give it an extra input "stdin": it is true I give an Input File and a Query File (see error message), but I don't know what this "stdin" is.
If anyone has any ideas about what could be behind this strange behaviour of system(), I'm all ears. I understand that helping me is difficult since I can't give a simple example in which the problem occurs, but I hope someone might know what could be the problem anyway
UPDATE (answer?): so, I managed to resolve the issue, like so:
system('bash -c "./parasail-master/build/parasail_aligner -a sw_trace_striped_sat < SSWtemplate.fa -f SSWtest.fa -O EMBOSS -d >OUTER.txt"')
I did some searching about stdin, and (forgive me if what I say sounds amateur, I'm not really familiar with UNIX or command line) found out its "symbol" is <. So you can see in the code above, I changed the way I give in my inputs "SSWtemplate" and "SSWtest", giving one of them using "<", and this solves the problem.
I have no idea why this happens. Especially since it only happens when calling the command from inside RStudio, and not when doing so from cmd. If anyone can clarify this further (i.e. why and how functions like system() and shell() seem to mess with stdin), it would be a big help. Otherwise, I'll just answer this to my own question and leave it at that.

Related

Bash Script function not showing segmentation fault for C program in ZSH [duplicate]

I have a program written in C that fails. When I run it in zsh, the program fails, but it's error output is not displayed in command line (note that the second ➜ was red, indicating a failed execution):
hpsc-hw2-USER on  master
➜ ./sort -a 405500
hpsc-hw2-USER on  master
➜
However, when I run it in bash, the error shows up:
[USER#COMPUTER hpsc-hw2-USER]$ ./sort -a 405500
Segmentation fault (core dumped)
Any idea why? I'm running oh-my-zsh with the spaceship theme if that helps.
but it's error output is not displayed in command line
That's not something that your program writes to the screen. It's the shell that's writing it. So the shell is not 'hiding' it.
I'm not using zsh myself, but since a red arrow indicates that the program was abnormally terminated, I guess you can look at the code for the red arrow and create a custom message. Here is a question about the error codes that might help you: What error code does a process that segfaults return?
I remember that I once made a custom bash prompt that showed the last exit code. Maybe I used this, but I'm not sure: Bash Prompt with Last Exit Code
Here is the documentation for how to customize the prompt for spaceship theme: https://denysdovhan.com/spaceship-prompt/docs/Options.html#exit-code-exit_code
I assume that you need to add exit_code to the SPACESHIP_PROMPT_ORDER section in your .zshrc file.
Any idea why?
You probably have to ask the developers. An equally valid question is: "Why does bash print 'segmentation fault'?" It's just a design choice. Bash does not print the memory address where the segfault occurred. Why is that?
Seems like the developers of oh-my-zsh thought it was enough with a red arrow.
Any ideas how to make zsh output the error message?
The result of the last command is stored in the shell variable ?, so you can print it with something like print $? or echo $?. Like most shells, zsh is incredibly configurable, so you can write a script that e.g. includes the result of the last command in your prompt. Here's a blog post in which the author configures zsh to display non-zero results in the right prompt: Show Exit Code of Last Command in Zsh.
Or why it doesn't do it to begin with?
Shell commands don't normally crash, and if they encounter errors they typically provide more useful output than just the return code on their own. The return code is most useful for scripts, so that you can write a script that handle error conditions.

Run two command line instructions from matlab

I want to run a R script from matlab.
I can run the R code perfectly from cmd using:
cd "C:\Program Files\R\R-3.1.3\bin\x64"
R CMD BATCH "C:\Users\name\Desktop\Code.R"
However in Matlab I am not sure how to run this two instructions.
First, I noticed that I could use:
system('cd "C:\Program Files\R\R-3.1.3\bin\x64"')
to run a commnand line command. However I need to run two. And making:
system('cd "C:\Program Files\R\R-3.1.3\bin\x64"')
system('R CMD BATCH "C:\Users\name\Desktop\Code.R"')
does not work.
I also saw this post about running multiple command line instructions in a single line, but also that did not work.
Anyone knows how to do it?
Your script should generally not care where it’s executed. So you don’t need the cd statement at all:
system('"C:\Program Files\R\R-3.1.3\bin\x64\R.exe" CMD BATCH "C:\Users\name\Desktop\Code.R"')
Be careful thought that the R path might not always be the same … it’s probably safer to find R’s location programmatically. Though how to do that in Matlab on Windows, I don’t know.
Furthermore, I honestly don’t really know why R CMD BATCH even exists but I strongly recommend using RScript instead. It works much nicer for a number of reasons.
The code then becomes:
system('"C:\Program Files\R\R-3.1.3\bin\x64\Rscript.exe" "C:\Users\name\Desktop\Code.R"')
Try to use dos command instead of system.

Off-limits arguments to Rscript

I have written a series of functions to allow easier 'arg-parse'-type commandline argument parsing and check in R. I have run into the issue briefly mentioned here but still don't entirely understand whats going on. It seems that when trying to use "-g" as a flag,
$ Rscript test_args.R -g foo
where test_args.R is simply:
#minimal program for reproducing "-g" arg behaviour
args<-commandArgs(T)
print(args)
I get the error and output:
WARNING: unknown gui 'foo', using X11
[ 1 ] "-g" "foo"
it interprets that as something to change the GUI prior to actually running the script test_args.R (although the printed message shows that the "-g" does actually make it to the actual R program). This isn't mentioned in the Rscript man page, or anywhere else I can find.
Is there any way around this, and if not, is there a list of any other potential off-limits argument flags?
Thanks in advance!
EDIT:
based of of this post, I tried adding either the hash-bang #!/usr/bin/Rscript, but would ideally like to avoid users having to install littler to use the #!/usr/local/bin/r option if possible...

Weird behaviour with zsh PATH

I just encourage a weird problem with zsh today.
My environment is Mac OS X Yosemite, zsh 5.0.5 (x86_64-apple-darwin14.0)
In .zshrc, I have manually set the PATH variable to something like
export PATH="$PATH:~/.composer/vendor/bin"
Try echo $PATH in terminal, the result is as expected (contained ~/.composer/vendor/bin). Then try executing a binary from ~/.composer/vendor/bin, It'll always return me "zsh: command not found" error.
Try switching to bash, echo $PATH is also as expected, have the same result as zsh shell.
Try executing a binary from ~/.composer/vendor/bin, no problem found. Seem the PATH var is acting well on the bash shell.
What's wrong with my zsh shell?
Thanks
Try using $HOME instead of ~. In many situations, shells do not expand ~ when you expect them to and it is usually better to use $HOME. ~ is really only intended to be a short cut for interactive use. (The only case I can recall where ~ was preferred was in a .gitalias, where ~ was expanded and variables were not.)
Type rehash to pick-up $PATH changes.
From the zsh user guide:
The way commands are stored has other consequences. In particular, zsh
won't look for a new command if it already knows where to find one. If
I put a new ls command in /usr/local/bin in the above example, zsh
would continue to use /bin/ls (assuming it had already been found). To
fix this, there is the command rehash, which actually empties the
command hash table, so that finding commands starts again from
scratch. Users of csh may remember having to type rehash quite a lot
with new commands: it's not so bad in zsh, because if no command was
already hashed, or the existing one disappeared, zsh will
automatically scan the path again; furthermore, zsh performs a rehash
of its own accord if $path is altered. So adding a new duplicate
command somewhere towards the head of $path is the main reason for
needing rehash.
EDIT However #WilliamPursell could be onto something with his comment:
note that "composer" != ".composer"

Unexpected nohup behaviour

I have an executable file test, it contains
a="$RANDOM"
echo "$a">>out
Now, if I simply ./test then out contains a random number. But if I nohup ./test & then out is empty. Why?
Transferring comment into an answer
Are you on Ubuntu (or another Linux distro)? Is /bin/sh a link to /bin/dash rather than /bin/bash? If so, when you run it with bash as your shell, then $RANDOM works, but when nohup runs it with /bin/sh (aka dash), you get nothing.
You might be able to fix it by using #!/bin/bash as the shebang line.
Actually I'm on Debian. But anyway I forgot the shebang indeed; now it works perfectly.
Great!
Incidentally, however tempting it is, it's generally a bad idea to call programs test because there is a shell built-in called test (also known as [) that can readily cause confusion if it is run instead of your test program.

Resources