How to run an awk command? - r

I am trying to learn how to calculate polygenic risk scores and I am following a step-by-step tutorial (this one: https://choishingwan.github.io/PRS-Tutorial/plink/). However, I have been stuck trying to figure out how to run this command:
awk 'NR!=1{print $3}' EUR.clumped > EUR.valid.snp
Obviously, this is not something I can run in R directly, but people said it should work through system(). It doesn't. I then tried to run the command in my own Windows command prompt, but it doesn't recognize awk as an internal command.
I then tried to update my command prompt with wsl --install (the only solution I could come up with), but apparently I need administrator permission to do so.

It would be useful if you could mention a couple of things:
Does R give you an error?
I assume you are trying to run it on a Linux system with awk installed?
To check if awk is installed, try running this in your Linux terminal:
which awk
To run your awk command from within R, you should escape the ' characters with a \ in your awk command, and put the entire command within quotes:
system('awk \'NR!=1{print $3}\' EUR.clumped > EUR.valid.snp')
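To see what the tutorial's awk command does independently of R, here is a minimal shell sketch. The sample file content and its rs IDs are made up for illustration (the real EUR.clumped comes from the tutorial's clumping step), and the temporary directory is only there to avoid touching real files. The one assumption carried over from the command itself is that the SNP ID sits in the third whitespace-separated column.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)

# Hypothetical stand-in for EUR.clumped: a header row plus two data rows.
printf '%s\n' \
  'CHR F SNP BP P' \
  '1 1 rs0001 1000 0.001' \
  '2 1 rs0002 2000 0.002' > "$workdir/EUR.clumped"

# NR!=1 skips the header (record 1); print $3 emits the third column.
awk 'NR!=1{print $3}' "$workdir/EUR.clumped" > "$workdir/EUR.valid.snp"

valid_snps=$(cat "$workdir/EUR.valid.snp")
echo "$valid_snps"
```

If this runs in a terminal but the same command fails from system(), the problem is likely the environment R launches the command in (for example, no awk on the PATH on Windows) rather than the awk program itself.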

Related

Command works in powershell but not via system()/system2()/shell()?

I'm looking to run a PowerShell command from R. It works in PowerShell, but I can't get it working in R.
This works in powershell
[guid]::NewGuid()
But none of these work from R
system("[guid]::NewGuid()", intern=TRUE)
system2("[guid]::NewGuid()")
shell("[guid]::NewGuid()")
Any ideas?
I think you need to tell system() that you want to use powershell, not cmd, as the executable.
Assuming powershell is in your PATH variable, try
system('powershell -command "[guid]::NewGuid()"', intern=TRUE)
You can also try omitting intern=TRUE, depending on what kind of output you expect.

R system() doesn't behave like cmd itself

I'm doing some bioinformatics analysis in Rstudio, but something strange happens when using system(). I'm also using Windows Subsystem for Linux, so I can run a UNIX executable in my Windows cmd like so:
bash -c "./parasail-master/build/parasail_aligner -a sw_trace_striped_sat -f SSWtemplate.fa -q SSWtest.fa -O EMBOSS -d >OUT.txt"
Don't worry about the specifics: what's important is that I use bash -c to indicate I want to use the UNIX bash, and I'm running the executable parasail_aligner. It all works out, and I get the nice output file "OUT.txt".
Now, since I'm doing my analysis in Rstudio, I want to execute this directly from an R script, like so:
system('bash -c "./parasail-master/build/parasail_aligner -a sw_trace_striped_sat -f SSWtemplate.fa -q SSWtest.fa -O EMBOSS -d >OUTER.txt"')
So: just give it as an argument to system()? But this gives the following error:
input file, query file, and stdin detected; max inputs is 2
This is obviously an error specifically generated by parasail_aligner. The funny thing is: I don't get this error at all from cmd directly, but I do get it when running the command in R using system(). Does anyone have any idea why something like this can happen at all? I would expect system() to just give its argument to cmd, but clearly it doesn't do this... Running the command in a command terminal opened in Rstudio also works fine, it is specifically system() that seems to mess up.
I'm terribly sorry if this question is vague, but I can't give you a simple example which you could use to replicate the error. I've been using system() for a while now and I've never had this kind of problem. I am on Windows and I've found some people online that say you should use shell() instead of system(), but doing so just gives me the same error.
Maybe it has something to do with this "stdin" thing the error mentions and how R/RStudio handles this, I don't know. But parasail seems to think I give it an extra input "stdin": it is true I give an Input File and a Query File (see error message), but I don't know what this "stdin" is.
If anyone has any ideas about what could be behind this strange behaviour of system(), I'm all ears. I understand that helping me is difficult since I can't give a simple example in which the problem occurs, but I hope someone might know what the problem could be anyway.
UPDATE (answer?): so, I managed to resolve the issue, like so:
system('bash -c "./parasail-master/build/parasail_aligner -a sw_trace_striped_sat < SSWtemplate.fa -f SSWtest.fa -O EMBOSS -d >OUTER.txt"')
I did some searching about stdin, and (forgive me if what I say sounds amateur, I'm not really familiar with UNIX or command line) found out its "symbol" is <. So you can see in the code above, I changed the way I give in my inputs "SSWtemplate" and "SSWtest", giving one of them using "<", and this solves the problem.
I have no idea why this happens, especially since it only happens when calling the command from inside RStudio, and not when doing so from cmd. If anyone can clarify this further (i.e. why and how functions like system() and shell() seem to mess with stdin), it would be a big help. Otherwise, I'll just post this as an answer to my own question and leave it at that.
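One possible explanation, offered as an assumption rather than something confirmed in this thread: a program can check whether its standard input is attached to a terminal, and treat a non-terminal stdin (a pipe, a file, or whatever a parent process such as R hands over) as one more input. The sketch below uses the POSIX test `[ -t 0 ]` inside a hypothetical function `describe_stdin` to show that difference. It does not reproduce parasail_aligner's actual logic.

```shell
# Hypothetical helper: report whether stdin (file descriptor 0) is a terminal.
describe_stdin() {
  if [ -t 0 ]; then
    echo "stdin is a terminal"
  else
    echo "stdin is not a terminal"
  fi
}

# With stdin redirected from /dev/null or fed by a pipe, the terminal check fails.
from_null=$(describe_stdin </dev/null)
from_pipe=$(echo "some data" | describe_stdin)

echo "$from_null"
echo "$from_pipe"
```

Under that assumption, the command in the update would work because one of the two files now occupies stdin, which keeps the total at two inputs.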

`Terminal` vs `system()` in R

I tried running the following in R
system("Message=HelloWoRld;echo $(sed 's/R/r/' <(echo ${Message}))")
but it fails, while
Message=HelloWoRld
echo $(sed 's/R/r/' <(echo ${Message}))
works fine when copy pasted in the terminal. The issue seems related to <(..). When I do either which bash or system("which bash"), I get /bin/bash.
Why does the same command not yield the same output when run via system() as when run directly in the terminal window?
FYI, I am on Mac OS X 10.11.3. Bash is GNU bash, version 3.2.57(1) and R is R version 3.2.3.
system is not a terminal emulator, and it’s not running Bash: on Unix-alikes it passes the command to sh. Your terminal runs Bash. To get the same effect with system, run the command inside Bash. E.g.
system('bash -c \'echo $(date)\'')
What’s more, your current Bash command is quite convoluted and uses unnecessary command invocations; you can achieve the same via the much simpler
sed s/R/r/ <<< $Message
@chepner makes the excellent point that another solution can be used directly in system without the need to pass execution to Bash:
system("Message=HelloWoRld; echo $Message | sed 's/R/r/'")
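A minimal shell sketch of the approaches above, runnable outside R. The pipeline uses only POSIX sh features, while the here-string and the process substitution are Bash extensions and are therefore handed to `bash -c` explicitly. It assumes `bash` is on the PATH; the variable names `portable`, `here_string` and `proc_subst` are just illustrative.

```shell
Message=HelloWoRld

# POSIX-only pipeline: works in a plain sh, no Bash needed.
portable=$(printf '%s\n' "$Message" | sed 's/R/r/')

# Bash-only syntax must run inside bash; export the variable for the child shell.
export Message
here_string=$(bash -c 'sed "s/R/r/" <<< "$Message"')
proc_subst=$(bash -c 'sed "s/R/r/" <(echo "$Message")')

echo "$portable"      # HelloWorld
echo "$here_string"   # HelloWorld
echo "$proc_subst"    # HelloWorld
```

All three produce the same result; the only difference is which shell has to interpret the syntax.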

Rscript in silent mode

I am using Rscript to run an R script but I get a lot of output on my screen. Can I run Rscript in silent mode (meaning without any screen output)?
Several options come to mind:
1. Within R: use sink() to divert output to a file, see help(sink).
2. On the shell: Rscript myscript.R >/dev/null 2>&1
3. Edit the code :)
4. On Linux, use our littler frontend, as it runs in --slave mode by default :)
Option 3 is the most involved but possibly the best. You could use a logging scheme where you print / display in "debug" or "verbose" mode but not otherwise. I often do that, based on a command-line toggle given to the script.
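When silencing a script from the shell, the order of the redirections matters, because they are applied left to right. The sketch below uses a hypothetical function `noisy` as a stand-in for `Rscript myscript.R`, so it runs without R installed; the command substitution plays the role of the screen.

```shell
# Hypothetical stand-in for a script that writes to both streams.
noisy() {
  echo "normal output"
  echo "warning message" >&2
}

# stderr is first pointed at the current stdout, then stdout goes to /dev/null:
# the warning still gets through.
leaky=$(noisy 2>&1 >/dev/null)

# stdout goes to /dev/null first, then stderr follows it: nothing gets through.
silent=$(noisy >/dev/null 2>&1)

echo "leaky:  [$leaky]"    # leaky:  [warning message]
echo "silent: [$silent]"   # silent: []
```

So for fully silent runs, `>/dev/null 2>&1` is the form to use; `2>&1 >/dev/null` hides only the regular output.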
You can redirect the output with
Rscript myscript.R >/dev/null 2>&1 (Linux)
or
Rscript myscript.R >$null (Windows PowerShell)
or use R directly:
R --quiet --vanilla < myscript.R
or
R CMD BATCH myscript.R
(That last version writes the output to a file myscript.Rout)
One more option: if you want to separate the output and the error message into different files, which makes it easier to identify the problems, you can use the command on the shell:
Rscript myscript.R >a.Rout 2>a.Rerr
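This will write the program output to a.Rout and the error messages to a.Rerr. Note that existing a.Rout and a.Rerr files are overwritten; they only need to be removed beforehand if your shell is set up to refuse overwriting files (for example, with the noclobber option).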
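A small shell sketch of the split-stream redirect, with a hypothetical `noisy` function in place of `Rscript myscript.R` and a temporary directory in place of the working directory. It pre-creates a.Rout with stale content to show what `>` does to an existing file under the shell's default settings; with `set -o noclobber` active, the redirect would fail instead.

```shell
# Hypothetical stand-in for a script that writes to both streams.
noisy() {
  echo "normal output"
  echo "warning message" >&2
}

outdir=$(mktemp -d)
echo "stale content" > "$outdir/a.Rout"   # pretend a previous run left this behind

# stdout goes to a.Rout, stderr goes to a.Rerr.
noisy >"$outdir/a.Rout" 2>"$outdir/a.Rerr"

out_content=$(cat "$outdir/a.Rout")
err_content=$(cat "$outdir/a.Rerr")
echo "a.Rout: $out_content"
echo "a.Rerr: $err_content"

rm -r "$outdir"
```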

Unable to get -r parameter working

I've tried in every way imaginable to execute a shell command from command line but it simply doesn't work. What am I doing wrong?
C:\Console2\Console.exe -r runstuff.bat
C:\Console2\Console.exe -d C:\Console2 -r runstuff.bat
C:\Console2\Console.exe -r dir
None of these works. (Win7 x64)
I am playing with that now.
Did you try:
console2 -r "/K runstuff.bat"
The /K is needed to keep the command prompt open after running the script.
The problem I'm having with the "-r" option is that I'm having to type exit twice to leave the window.
If you add the command to the Shell field (Settings... -> Tabs -> Shell), you will not have to type exit twice:
%comspec% /K runstuff.bat
I don't think the "%comspec%" is necessary (could use "cmd" instead), but I got it from an example somewhere on the web years ago. Console2's included help file shows using "cmd".
