I am trying to write an R script, test.R, that can take either a file or a text string directly from a Unix pipe, as in either:
file | test.R
or:
cat Sometext | test.R
I tried to follow answers here and here, but I am clearly missing something. Is it the piping above or my script below that gives me an error like:
me@lnx: cat AAAA | test.R
bash: test.R: command not found
cat: AAAA: No such file or directory
My test script:
#!/usr/bin/env Rscript
input <- file("stdin", "r")
x <- readLines(input)
write(x, "")
UPDATE.
The script:
#!/usr/bin/env Rscript
con <- file("stdin")
open(con, blocking=TRUE)
x <- readLines(con)
x <- somefunction(x) #Do something or nothing with x
write(x,"")
close(con)
Then both cat file | ./test.R and echo AAAA | ./test.R yield the expected output.
I still like r (from the littler package) over Rscript here (but then I am not unbiased in this ...)
edd@rob:~$ (echo "Hello,World";echo "Bye,Bye") | r -e 'X <- readLines(stdin());print(X)' -
Hello,World
Bye,Bye
[1] "Hello,World" "Bye,Bye"
edd@rob:~$
r can also read.csv() directly:
edd@rob:~$ (echo "X,Y"; echo "Hello,World"; echo "Bye,Bye") | r -d -e 'print(X)' -
      X     Y
1 Hello World
2   Bye   Bye
edd@rob:~$
The -d is essentially a predefined 'read stdin into X via read.csv' which I think I borrowed as an idea from rio or another package.
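Edit: Your example works with small changes:
Make it executable: chmod 0755 ex.R
Pipe the text in correctly, i.e. use echo rather than cat (cat treats its argument as a file name)
Use the ./ex.R notation for a file in the current directory, since that directory is normally not on the PATH
I changed it to use print(x)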
Then:
edd@rob:~$ echo AAA | ./ex.R
[1] "AAA"
edd@rob:~$
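The three shell-level fixes above do not depend on R, so they can be tried with any small filter. Here is a minimal sketch, assuming a hypothetical stand-in script ex.sh that simply copies stdin to stdout (the real script would start with #!/usr/bin/env Rscript instead):

```shell
#!/bin/sh
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical stand-in for the R script: a filter that echoes its stdin.
cat > ex.sh <<'EOF'
#!/bin/sh
cat
EOF

# 1. Make it executable; without this, ./ex.sh fails with "Permission denied".
chmod 0755 ex.sh

# 2. echo sends the text itself down the pipe ...
out_echo=$(echo AAA | ./ex.sh)

# ... whereas cat sends the contents of the file with that name.
echo "from a file" > AAA
out_cat=$(cat AAA | ./ex.sh)

# 3. The ./ prefix is needed because the current directory is usually not on PATH.
echo "$out_echo"
echo "$out_cat"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

The first echo prints the piped string and the second prints the file's contents, which is the difference between the two pipelines in the question.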
I generally use R from a terminal (Bash shell). I have only done a few experiments with Rscript, but including the #! line still lets the script be sourced in R, while also letting Rscript run it as an executable file. I have to use chmod to set the executable flag on my test file. Your call to write() should print the same output to the console in R or Rscript, but if I want to save my output to a file I call sink("fileName") to open the connection and sink() to close it. This generally gives me control of the output and how it is rendered. If I call my script "myScript.rs" and make it executable (chmod u+x myScript.rs), I can type ./myScript.rs to run it and see the output on OS X or Linux. To save the script's output, instead of a pipe | you might try redirection: > creates the file and >> appends to it.
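The difference between > and >> can be sketched as follows, using echo as a stand-in for the script and a hypothetical output file name:

```shell
#!/bin/sh
# Hypothetical output file in a scratch directory.
workdir=$(mktemp -d)
log="$workdir/output.txt"

echo "first run"  >  "$log"    # > creates the file (or truncates an existing one)
echo "second run" >> "$log"    # >> appends to the existing file
after_append=$(wc -l < "$log" | tr -d ' ')

echo "third run"  >  "$log"    # > again: the earlier two lines are discarded
after_truncate=$(wc -l < "$log" | tr -d ' ')
last_content=$(cat "$log")

echo "lines after >>: $after_append"
echo "lines after > : $after_truncate"

rm -rf "$workdir"
```

In practice echo would be replaced by something like ./myScript.rs, so that each run either replaces or extends the saved output.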
Related
I'm trying to write the equivalent of this Bash command in Windows:
(tee <<"EOF"
<some-markdown-and-html>
EOF
) | Rscript -e '
input <- file("stdin", "r")
content <- readLines(input)
print(content)
'
I am quite unfamiliar with Windows and thought PowerShell seemed like a good choice; however, I have a strange issue outputting text to stdout. This command works:
(@"
<some-markdown-and-html>
"@) | Rscript -e @"
print(readLines(file('stdin', 'r')))
"@
It has the desired output:
[1] "<some-markdown-and-html>"
However, when I break the R script into multiple lines it outputs nothing:
(@"
<some-markdown-and-html>
"@) | Rscript -e @"
input <- file('stdin', 'r')
content <- readLines(input)
print(content)
"@
How can I print to stdout with a multiline R script? It doesn't need to be PowerShell, just Windows compatible (ideally without installing WSL). Thanks
I have a shell script that calls a Perl script and an R script.
My shell script, R.sh:
#!/bin/bash
./R.pl #calling Perl script
`perl -lane 'print $F[0]' /media/data/abc.cnv > /media/data/abc1.txt`;
#Shell script
Rscript R.r #calling R script
This is my R.pl (head):-
`export path=$PATH:/media/exe_folder/bin`;
print "Enter the path to your input file:";
$base_dir ="/media/exe_folder";
chomp($CEL_dir = <STDIN>);
opendir (DIR, "$CEL_dir") or die "Couldn't open directory $CEL_dir";
$cel_files = "$CEL_dir"."/cel_files.txt";
open(CEL,">$cel_files")|| die "cannot open $file to write";
print CEL "cel_files\n";
for ( grep { /^[\w\d]/ } readdir DIR ){
print CEL "$CEL_dir"."/$_\n";
}close (CEL);
The output of the Perl script is the input for the shell step, and the shell step's output is the input for the R script.
I want to run the shell script by providing the input file name and the output file name, like:
./R.sh home/folder/inputfile.txt home/folder2/output.txt
If the folder contains many files, it should take only the user-specified file and process it.
Is there a way to do this?
I guess this is what you want:
#!/bin/bash
# command line parameters
_input_file=$1
_output_file=$2
# @TODO: not sure if this path is the one you intended...
_script_path=$(dirname "$0")
# sanity checks
if [[ -z "${_input_file}" ]] ||
   [[ -z "${_output_file}" ]]; then
    echo 1>&2 "usage: $0 <input file> <output file>"
    exit 1
fi
if [[ ! -r "${_input_file}" ]]; then
    echo 1>&2 "ERROR: can't find input file '${_input_file}'!"
    exit 1
fi
# process input file
# 1. with Perl script (writes to STDOUT)
# 2. post-process with Perl filter
# 3. run R script (reads from STDIN, writes to STDOUT)
perl "${_script_path}/R.pl" <"${_input_file}" | \
    perl -lane 'print $F[0]' | \
    Rscript "${_script_path}/R.r" >"${_output_file}"
exit 0
Please see the comments in the script for how the called scripts should behave.
NOTE: I don't quite understand why you need to post-process the output of the Perl script with a Perl filter. Why not integrate it directly into the Perl script itself?
BONUS CODE: this is how you would write the main loop in R.pl to act as a proper filter, i.e. reading lines from STDIN and writing the result to STDOUT. You can use the same approach in other languages too, e.g. R.
#!/usr/bin/perl
use strict;
use warnings;
# read lines from STDIN
while (<STDIN>) {
    chomp;
    # add your processing code here that does something with $_, i.e. the line
    # EXAMPLE: upper case first letter in all words on the line
    # (/g applies it to every word; $1 replaces \1, which warns under "use warnings")
    s/\b([[:lower:]])/\U$1/g;
    # write result to STDOUT
    print "$_\n";
}
In a Unix script, is there any way to run R files with arguments?
I know that to run an R file on that system you type R -f "file", but what code do you need in R so that you can type this on Unix instead:
R -f "file" arg1 arg2
Here is an example. Save this code in test.R:
#!/usr/bin/env Rscript
# make this script executable by doing 'chmod +x test.R'
# Store the help text as a string. (Wrapping it in cat() would print it on
# every run and leave `help` set to NULL.)
help <- "
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Help text here
Arguments in this order:
1) firstarg
2) secondarg
3) thirdarg
4) fourtharg
./test.R firstarg secondarg thirdarg fourtharg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
\n"
# Read options from command line
args <- commandArgs(trailingOnly = TRUE)
# Show the help text and exit cleanly if any help flag was given
if (any(c("--help", "-h", "-help", "--h") %in% args)) {
  cat(help, sep = "\n")
  quit(save = "no", status = 0)
}
print(args)
Do chmod +x test.R
Then invoke it using ./test.R a b c d. It should print: [1] "a" "b" "c" "d".
You can access each of the args by doing args[1] to get to a and args[4] to get to d.
The suggestion to use Rscript does seem useful but is possibly not what is being asked. One can also start R from the command line with an input file that gets sourced. The R interpreter can access commandArgs() in that mode as well. This is a minimal "ptest.r" file in my user directory, which is also my default working directory:
ca <- commandArgs()
print(ca)
From a Unix command line I can do:
$ R -f ~/ptest.r --args "test of args"
And R opens, displays the usual startup messages and announces packages loaded by .Rprofile and then:
> ca <- commandArgs()
> print(ca)
[1] "/Library/Frameworks/R.framework/Resources/bin/exec/R"
[2] "-f"
[3] "/Users/davidwinsemius/ptest.r"
[4] "--args"
[5] "test of args"
>
>
And then quits.
So, I'm trying to reproduce the example here
So the first three examples:
echo 'cat(pi^2,"\n")' | r
and
r -e 'cat(pi^2, "\n")'
and
ls -l /boot | awk '!/^total/ {print $5}' | \
r -e 'fsizes <- as.integer(readLines());
print(summary(fsizes)); stem(fsizes)'
work great. The fourth one:
$ cat examples/fsizes.r
#!/usr/bin/env r
fsizes <- as.integer(readLines())
print(summary(fsizes))
stem(fsizes)
How do you run this? Sorry for the dumb question I am no bash guru...
If the file is in examples/fsizes.r, then make it executable:
chmod +x examples/fsizes.r
And then run it with:
./examples/fsizes.r
The script expects input, one integer per line. When you run it, you can enter the numbers line by line and press Ctrl-D to end the input. Or you can create a file with numbers and use input redirection, for example:
./examples/fsizes.r < input.txt
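Redirection and a pipe deliver the same bytes on stdin, so a filter cannot tell them apart. A small sketch, assuming a hypothetical stand-in function summarise (an awk one-liner that sums one integer per line) in place of ./examples/fsizes.r, and a made-up input.txt:

```shell
#!/bin/sh
# Hypothetical stand-in for ./examples/fsizes.r: sums one integer per line of stdin.
summarise() { awk '{ s += $1 } END { print s }'; }

# Made-up input file with one integer per line.
workdir=$(mktemp -d)
printf '%s\n' 10 20 30 > "$workdir/input.txt"

from_redirect=$(summarise < "$workdir/input.txt")        # input redirection
from_pipe=$(cat "$workdir/input.txt" | summarise)        # file contents through a pipe
from_printf=$(printf '10\n20\n30\n' | summarise)         # text generated on the fly

echo "$from_redirect $from_pipe $from_printf"

rm -rf "$workdir"
```

All three invocations feed the same three numbers to the filter, so they produce the same sum.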
If I have an R script:
print("hi")
commandArgs()
And I run it using:
R CMD BATCH --slave --no-timing test.r output.txt
The output will contain:
[1] "hi"
[1] "/Library/Frameworks/R.framework/Resources/bin/exec/x86_64/R"
[2] "-f"
[3] "test.r"
[4] "--restore"
[5] "--save"
[6] "--no-readline"
[7] "--slave"
How can I suppress the line numbers [1]..[7] in the output so that only the output of the script appears?
Use cat instead of print if you want to suppress the line numbers ([1], [2], ...) in the output.
I think you are also going to want to pass command line arguments. The easiest way to do that is probably to create a file with the Rscript shebang.
For example, create a file called args.r:
#!/usr/bin/env Rscript
args <- commandArgs(TRUE)
cat(args, sep = "\n")
Make it executable with chmod +x args.r and then you can run it with ./args.r ARG1 ARG2
FWIW, passing command line parameters with the R CMD BATCH ... syntax is a pain. Here is how you do it: R CMD BATCH "--args ARG1 ARG2" args.r (note the quotes). More discussion here
UPDATE: changed shebang line above from #!/usr/bin/Rscript to #!/usr/bin/env Rscript in response to @mbq's comment (thanks!)
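Why the quotes matter is a shell question rather than an R one: quoting makes the shell hand over --args ARG1 ARG2 as a single word. A minimal sketch, using a hypothetical helper count_args that only reports how many words it received:

```shell
#!/bin/sh
# Hypothetical helper: prints the number of arguments the shell passed to it.
count_args() { echo "$#"; }

# Quoted: "--args ARG1 ARG2" arrives as one word, followed by the file name.
quoted=$(count_args "--args ARG1 ARG2" args.r)

# Unquoted: the shell splits it into three separate words plus the file name.
unquoted=$(count_args --args ARG1 ARG2 args.r)

echo "quoted: $quoted, unquoted: $unquoted"
```

With the quotes, the command receives the options as one word and the script name as the next, which is the form the answer above recommends.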
Yes, mbq is right -- use Rscript, or, if it floats your boat, littler:
$ cat /tmp/tommy.r
#!/usr/bin/r
cat("hello world\n")
print(argv[])
$ /tmp/tommy.r a b c
hello world
[1] "a" "b" "c"
$
You probably want to look at the CRAN packages getopt and optparse for argument parsing as you'd do in other scripting languages.
Use commandArgs(TRUE) and run your script with Rscript.
EDIT: Ok, I've misread your question. David has it right.
Stop Rscript from command-numbering the output from print
By default, print(...) prefixes each line of output with the index of its first element, like this:
print("we get signal")
Produces:
[1] "we get signal"
R lets the user redefine functions like print, so we can simply rebind it to cat:
print = cat
print("we get signal")
Produces:
we get signal
Notice that the index prefix and the double quotes are gone. Note also that cat does not add a trailing newline.
Get more control of print by using R's first-class functions:
my_print <- function(x, ...){
#extra shenanigans for when the wind blows from the east on tuesdays, go here.
cat(x)
}
print = my_print
print("we get signal")
Prints:
we get signal
If you're using print as a poor man's debugger... We're not laughing at you, we're laughing with you.