Makefile with SHELL=/usr/bin/R: handling multilines

I'm playing with R and GNU Make (4.0; the code below won't work with <=3.81) and I'd like to use R instead of a classical shell.
I wrote the following code:
.PHONY: all clean
SHELL = /usr/bin/R
.SHELLFLAGS= --vanilla --no-readline --quiet -e
.ONESHELL:
UCSC=http://hgdownload.cse.ucsc.edu/goldenpath/hg17/database/
all: chr1_gold.txt.gz
	gold <- read.delim(gzfile("$<"))
	head(gold)
chr1_gold.txt.gz:
	download.file("${UCSC}/$@","$@")
clean:
	$(foreach F,chr1_gold.txt.gz,file.remove("$F");)
The target chr1_gold.txt.gz works fine, but the target "all" does not, because its recipe has more than one line:
$ /make-4.0/make
download.file("http://hgdownload.cse.ucsc.edu/goldenpath/hg17/database//chr1_gold.txt.gz","chr1_gold.txt.gz")
> download.file("http://hgdownload.cse.ucsc.edu/goldenpath/hg17/database//chr1_gold.txt.gz","chr1_gold.txt.gz")
trying URL 'http://hgdownload.cse.ucsc.edu/goldenpath/hg17/database//chr1_gold.txt.gz'
Content type 'application/x-gzip' length 45866 bytes (44 Kb)
opened URL
==================================================
downloaded 44 Kb
>
>
gold <- read.delim(gzfile("chr1_gold.txt.gz"))
head(gold)
ARGUMENT 'head(gold)' __ignored__
> gold <- read.delim(gzfile("chr1_gold.txt.gz"));\
Error: unexpected input in "\"
Execution halted
Makefile:9: recipe for target 'all' failed
make: *** [all] Error 1
I tried adding a backslash and a semicolon, but neither works. How can I fix this? Can I tell make to pipe a file to the SHELL instead of passing an argument (-e string)?
EDIT:
with '\'
all: chr1_gold.txt.gz
gold <- read.delim(gzfile("$<")) \
head(gold)
.
read.delim(gzfile("chr1_gold.txt.gz")) \
head(gold)
ARGUMENT 'head(gold)' __ignored__
> gold <- read.delim(gzfile("chr1_gold.txt.gz")) \
Error: unexpected input in "gold <- read.delim(gzfile("chr1_gold.txt.gz")) \"
Execution halted
with ';'
all: chr1_gold.txt.gz
gold <- read.delim(gzfile("$<")) ;
head(gold)
.
gold <- read.delim(gzfile("chr1_gold.txt.gz")) ;
head(gold)
ARGUMENT 'head(gold)' ignored
> gold <- read.delim(gzfile("chr1_gold.txt.gz")) ;
>
>
with ';\'
all: chr1_gold.txt.gz
gold <- read.delim(gzfile("$<")) ;\
head(gold)
.
ARGUMENT 'head(gold)' __ignored__
> gold <- read.delim(gzfile("chr1_gold.txt.gz")) ;\
Error: unexpected input in "\"
Execution halted
Makefile:9: recipe for target 'all' failed

It looks to me like this is a problem with R's -e option: it appears that, unlike the shell's -c option, R's version will accept only a single command and ignores embedded newlines (as you suspected). Unfortunately there's no option in GNU make to have it automatically write a temporary file and send that to the SHELL. The logistics here are somewhat daunting: how would you specify the name of the file in the shell command? Or what if you wanted to pipe via stdin? Etc. It could be done for sure, but requires some careful consideration of the design.
Currently GNU make requires that the interpreter used for SHELL be able to accept a multi-line script provided on the command line. That's just the way it is.
The most straightforward way to work with R that I can think of is to put the recipe into a variable using define/enddef to preserve newlines, then use the new $(file ...) function to write it to a file and call R with the name of that file. You can make this somewhat cleaner with a user-defined variable, but you'll probably have to go back to using /bin/sh as the SHELL.
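The "write it to a file and call the interpreter with that file name" idea can be sketched outside of make as a small shell helper. This is only a sketch under assumptions: the function name run_via_tempfile is hypothetical, it is not something GNU make or R provides, and the interpreter is a parameter (Rscript would be the intended value here).

```shell
#!/bin/sh
# Hypothetical helper (not part of GNU make or R): write a multi-line
# script to a temporary file and run it with the given interpreter.
# usage: run_via_tempfile INTERPRETER SCRIPT_TEXT
run_via_tempfile() {
    interpreter=$1
    script_text=$2
    tmp=$(mktemp) || return 1
    # printf keeps embedded newlines intact in the temporary file.
    printf '%s\n' "$script_text" > "$tmp"
    "$interpreter" "$tmp"
    status=$?
    rm -f "$tmp"
    # Propagate the interpreter's exit status so make can detect failures.
    return "$status"
}
```

It would be called as run_via_tempfile Rscript "$script", where the variable script holds several lines of R. Whether a wrapper built around it can be wired in as make's SHELL directly is untested here.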

I think an alternative is to use "littler"
For example:
.PHONY: all
SHELL = /usr/bin/r
.SHELLFLAGS= -e
.ONESHELL:
.SILENT: all
all:
	x <- rnorm(10)
	cat(sd(x), "\n")


Check syntax of R script without running it

After making changes to an R script, is there a way to check its syntax by running a command, before running the R script itself?
Base R has parse which will parse a script without running it.
parse("myscript.R")
The codetools package, which comes with R, has checkUsage and checkUsagePackage to check single functions and packages respectively.
There is lint() from the lintr package:
lintr::lint("tets.r")
#> tets.r:1:6: style: Place a space before left parenthesis, except in a function call.
#> is.na(((NA))
#> ^
#> tets.r:1:12: error: unexpected end of input
#> is.na(((NA))
#> ^
The tested file contains only this wrong code
is.na(((NA))
You can customise what lint() checks. By default, it is quite noisy about code style (which is the main reason I use it).
From within an R console/REPL session, you can use base R's built-in parse function:
$ R
> parse("hello.r")
expression(cat("Hello, World!\n"))
> parse("broken.r")
Error in parse("broken.r") : broken.r:2:0: unexpected end of input
1: cat("Hello, World!\n"
^
The parse function throws an error in case it fails to parse. A limitation here is that it will only detect parse errors, but not for example references to undefined functions.
Directly from the command line
You can also check parsing directly from the command line by using Rscript's -e option to call parse:
$ Rscript -e 'parse("hello.r")'
expression(cat("Hello, World!\n"))
$ Rscript -e 'parse("broken.r")'
Error in parse("broken.r") : broken.r:2:0: unexpected end of input
1: cat("Hello, World!\n"
^
Execution halted
A nice result of this is that Rscript returns successfully (exit status 0) when the parse works and returns a non-zero exit status when the parse fails. So you can use your shell's || and && operators as usual, or other forms of error detection (set -e).
Custom parsing script
You can also create a custom script that wraps everything nicely:
#!/bin/bash
#
# Rparse: checks whether R scripts parse successfully
#
# usage: Rparse script.r
# usage: Rparse file1.r file2.r file3.r ...
set -e
for file in "$@"
do
    Rscript -e "parse(\"$file\")"
done
done
Place the script in a folder pointed by your $PATH variable and use it to check your R files as follows:
$ Rparse hello.r broken.r
expression(cat("Hello, World!\n"))
Error in parse("broken.r") : broken.r:2:0: unexpected end of input
1: cat("Hello, World!\n"
^
Execution halted
Here I am using bash as a reference language, but nothing prevents building alternatives for other shells, including Windows ".bat" files or even an R script!
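As one such alternative, here is a minimal sketch that reports every file instead of stopping at the first failure, which is what set -e does. The function names check_each and r_parses are hypothetical, and r_parses assumes Rscript is on the PATH; the file name is passed as a trailing argument and read with commandArgs so it does not have to be quoted inside the R expression.

```shell
#!/bin/sh
# Hypothetical helper: run a checker command on every file and report
# each result, instead of stopping at the first failure.
# usage: check_each CHECKER file1 [file2 ...]
check_each() {
    checker=$1
    shift
    failures=0
    for file in "$@"; do
        if "$checker" "$file" >/dev/null 2>&1; then
            echo "ok: $file"
        else
            echo "FAIL: $file"
            failures=$((failures + 1))
        fi
    done
    # Succeed only if every file passed.
    [ "$failures" -eq 0 ]
}

# Checker that asks R to parse one file without running it.
# Assumes Rscript is on the PATH.
r_parses() {
    Rscript -e 'invisible(parse(commandArgs(trailingOnly = TRUE)[1]))' "$1"
}
```

With both functions defined, check_each r_parses hello.r broken.r would print one ok or FAIL line per file and return a non-zero status if any file failed to parse. The parser's own error message is discarded in this sketch, so rerun the single failing file to see it.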
(Other answers already address the question quite nicely, but I wanted to document some additional possibilities in a more complete answer.)

Using rscript for expression with dash

I am using rscript to run some expressions but I'm having an issue with some cases with dashes. A simple example would be:
$ rscript -e '-1'
ERROR: option '-e' requires a non-empty argument
Adding parentheses works (rscript -e (-1)), but I'm not always sure that the expressions will be properly parenthesized.
In the documentation it says
When using -e options be aware of the quoting rules in the shell used
So I tried using different quoting rules for bash, escaping the dashes or using single quotes but it still doesn't work.
$ rscript -e "\-1"
Error: unexpected input in "\"
Execution halted
Is there something I'm missing?
You misunderstand one part here. "Expression" is something R can parse, i.e.:
$ R --slave -e '1+1'
[1] 2
$
What you hit with -1 is a corner case. You can do
$ R --slave -e 'a <- -1; a'
[1] -1
$
or
$ R --slave -e 'print(-1)'
[1] -1
$
For actual argument parsing you want a package like docopt (which I like and use a lot), or getopt (which I used before), or optparse. All are on CRAN.
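If the expressions come from somewhere you do not control, the parenthesization mentioned in the question can be applied mechanically. The sketch below is an assumption-laden illustration: wrap_expr and r_eval are hypothetical names, and r_eval assumes Rscript is on the PATH. It only suits a single expression, since statements separated by ; are not valid inside parentheses in R.

```shell
#!/bin/sh
# Hypothetical helper: wrap an R expression in parentheses so that the
# string handed to -e never starts with a dash.
wrap_expr() {
    printf '(%s)' "$1"
}

# Evaluate a single expression with Rscript, always parenthesized.
# Assumes Rscript is on the PATH.
r_eval() {
    Rscript -e "$(wrap_expr "$1")"
}
```

For example, r_eval -1 hands the string (-1) to -e rather than -1.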

compiling a ICC binary [duplicate]

I am getting the following error running make:
Makefile:168: *** missing separator. Stop.
What is causing this?
As indicated in the online manual, the most common cause for that error is that lines are indented with spaces when make expects tab characters.
Correct
target:
\tcmd
where \t is TAB (U+0009)
Wrong
target:
....cmd
where each . represents a SPACE (U+0020).
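A quick way to hunt for this is to list the lines that begin with a space. This is a rough sketch, with find_space_indented a hypothetical function name; not every space-indented line is an error (indentation is allowed in some places outside recipes), so treat the output as candidates to inspect around the line number make reported.

```shell
#!/bin/sh
# List, with line numbers, the lines of a makefile that begin with a
# space character. grep exits with status 1 when there are none.
# usage: find_space_indented MAKEFILE
find_space_indented() {
    grep -n '^ ' "$1"
}
```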
Just for grins, and in case somebody else runs into a similar error:
I got the infamous "missing separator" error because I had invoked a rule defining a function as
($eval $(call function,args))
rather than
$(eval $(call function,args))
i.e. ($ rather than $(.
This is a syntax error in your Makefile. It's quite hard to be more specific than that, without seeing the file itself, or relevant portion(s) thereof.
For me, the problem was that I had some end-of-line # ... comments embedded within a define ... endef multi-line variable definition. Removing the comments made the problem go away.
My error was on a variable declaration line with a multi-line extension. I had a trailing space after the "\", which made it an invalid line continuation.
MY_VAR = \
val1 \ <-- 0x20 there caused the error.
val2
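Trailing whitespace after a continuation backslash is invisible in most editors, so a search helps. A minimal sketch, where the function name find_broken_continuations is hypothetical:

```shell
#!/bin/sh
# List, with line numbers, lines where a backslash is followed only by
# whitespace before the end of the line (a broken line continuation).
# usage: find_broken_continuations MAKEFILE
find_broken_continuations() {
    grep -n '\\[[:space:]][[:space:]]*$' "$1"
}
```

Lines that end directly in a backslash are valid continuations and are not reported.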
In my case, I was actually missing a tab in between ifeq and the command on the next line. No spaces were there to begin with.
ifeq ($(wildcard $DIR_FILE), )
cd $FOLDER; cp -f $DIR_FILE.tpl $DIR_FILE.xs;
endif
Should have been:
ifeq ($(wildcard $DIR_FILE), )
<tab>cd $FOLDER; cp -f $DIR_FILE.tpl $DIR_FILE.xs;
endif
Note the <tab> is an actual tab character
In my case, the error was caused by the following: I had tried to execute commands globally, i.e. outside of any target.
Update: to run a command globally, it must be properly formed. For example, the command
ln -sf ../../user/curl/$SRC_NAME ./$SRC_NAME
would become:
$(shell ln -sf ../../user/curl/$(SRC_NAME) ./$(SRC_NAME))
In my case, this error was caused by the lack of a mere space. I had this ifeq block in my makefile:
ifeq($(METHOD),opt)
CFLAGS=
endif
which should have been:
ifeq ($(METHOD),opt)
CFLAGS=
endif
with a space after ifeq.
In my case, the same error was caused by a missing colon at the end of a target name, which should have read staging.deploy:. So note that it can be a simple syntax mistake.
I had the missing separator error in Makefiles generated by qmake. I was porting Qt code to a different platform. I had neither QMAKESPEC nor MAKE set. Here's the link where I found the answer:
https://forum.qt.io/topic/3783/missing-separator-error-in-makefile/5
Just to add yet another reason this can show up:
$(eval VALUE)
is not valid and will produce a "missing separator" error.
$(eval IDENTIFIER=VALUE)
is acceptable. This sort of error showed up for me when I had a macro defined with define and tried to do
define SOME_MACRO
... some expression ...
endef
VAR=$(eval $(call SOME_MACRO,arg))
where the macro did not evaluate to an assignment.
I had this because I had no colon after .PHONY.
Not this,
.PHONY install
install:
	install -m0755 bin/ytdl-clean /usr/local/bin
But this (notice the colon)
.PHONY: install
...
The following Makefile code worked:
obj-m = hello.o
all:
	$(MAKE) -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
clean:
	$(MAKE) -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
So apparently, all I needed was the "build-essential" package, then to run autoconf first, which made the Makefile.pre.in, then the ./configure then the make which works perfectly...

making commandargs comma delimited or parsing spaces

I'm trying to run R from the command line using command line arguments. This includes passing in some filepaths as arguments for use inside the script. It all works most of the time, but sometimes the paths have spaces in and R doesn't understand.
I'm running something of the form:
R CMD BATCH --slave "--args inputfile='C:/Work/FolderWith SpaceInName/myinputfile.csv' outputfile='C:/Work/myoutputfile.csv'" RScript.r ROut.txt
And R throws out a file saying
Fatal error: cannot open file 'C:\Work\FolderWith': No such file or directory
So evidently my single quotes aren't enough to tell R to take everything inside the quotes as the argument value. I'm thinking this means I should find a way to delimit my --args using a comma, but I can't find a way to do this. I'm sure it's simple but I've not found anything in the documentation.
The current script is very basic:
ca = commandArgs(trailingOnly=TRUE)
eval(parse(text=ca))
tempdata = read.csv(inputFile)
tempdata$total = apply(tempdata[,4:18], 1, sum)
write.csv(tempdata, outputFile, row.names = FALSE)
In case it's relevant I'm using windows for this, but it seems like it's not a cmd prompt problem.
Using eval(parse()) is probably not the best or most efficient way to parse command line arguments. I recommend using a package like optparse to do the parsing for you. Parsing command line arguments is already a solved problem, so there is no need to reimplement it. I could imagine that this solves your problems. That said, spaces in path names are a bad idea to begin with.
Alternatively, you could take a very simple approach and pass the arguments like this:
R CMD BATCH --slave arg1 arg2
Where you can retrieve them like:
ca = commandArgs(TRUE)
arg1 = ca[2]
arg2 = ca[3]
This avoids the eval(parse()) call, which I think is causing the issues. Finally, you could try to escape the space like this:
R CMD BATCH --slave "C:/spam\ bla"
You could also give Rscript a try, R CMD BATCH seems to be less favored than Rscript.
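To see what quoting does to a path containing a space, the sketch below prints each argument it receives in brackets. It is a Unix shell illustration only (print_args is a hypothetical name, and the Windows cmd prompt has its own quoting rules); the assumption is that a script started with Rscript would see the same split in commandArgs(trailingOnly = TRUE), one element per bracketed item.

```shell
#!/bin/sh
# Print each received argument on its own line, in brackets, to show
# where the shell split the command line.
print_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}
```

Calling print_args 'C:/Work/FolderWith SpaceInName/myinputfile.csv' prints a single bracketed path, while the same path without quotes is split into two arguments at the space.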
As an enhancement of @PaulHimestra's answer, here is how you can use Rscript:
you create a launcher.bat,
echo off
C:
PATH R_PATH;%path%
cd DEMO_PATH
Rscript youscript.R arg1 arg2
exit
with R_PATH being something like C:/Program Files/R/R-version.
There are many similarities with this post:
R command line passing a filename to script in arguments (Windows)
Also this post is very OS related. My answer applies only to Windows.
Probably what you are looking for is RScript.exe instead of R.exe. The former has no problem with spaces: path\to\RScript "My script.r".
One boring thing may be searching or setting the path for RScript and doing this every time one updates R.
Among the convenience scripts I have in my search path, I wrote a little facility to run RScript without bothering with paths. Just in case it may be of interest for someone:
@echo off
setlocal
::Get change to file dir par (-CD must be 1st par)
::================================================
Set CHANGEDIR="F"
If /I %1 EQU -cd (
Set CHANGEDIR="T"
SHIFT
)
::No args given
::=============
If [%1] EQU [] GoTo :USAGE
::Get R path from registry
::========================
:: may check http://code.google.com/p/batchfiles for updates on R reg keys
Call :CHECKSET hklm\software\R-core\R InstallPath
Call :CHECKSET hklm\software\wow6432Node\r-core\r InstallPath
if not defined RINSTALLPATH echo "Error: R not found" & goto:EOF
::Detect filepath when arg not starting with "-"
::==============================================
::Note the space after ARGS down here!!!
Set ARGS=
:LOOP
if [%1]==[] (GoTo :ELOOP)
Set ARGS=%ARGS% %1
::Echo [%ARGS%]
Set THIS=%~1
if [%THIS:~0,1%] NEQ [-] (Set FPATH=%~dp1)
SHIFT
GoTo :LOOP
:ELOOP
::echo %FPATH%
::Run Rscript script, changing to its path if asked
::=================================================
If %CHANGEDIR%=="T" (CD %FPATH%)
Echo "%RINSTALLPATH%\bin\Rscript.exe" %ARGS%
"%RINSTALLPATH%\bin\Rscript.exe" %ARGS%
endlocal
:: ==== Subroutines ====
GoTo :EOF
:USAGE
Echo USAGE:
Echo R [-cd] [RScriptOptions] Script [ScriptArgs]
Echo.
Echo -cd changes to script dir. Must be first par.
Echo To get RScript help on options etc.:
Echo R --help
GoTo :EOF
:CHECKSET
if not defined RINSTALLPATH for /f "tokens=2*" %%a in ('reg query %1 /v %2 2^>NUL') do set RINSTALLPATH=%%~b
GoTo :EOF
The script prints the actual RScript invoking line, before running it.
Note that there is an added argument, -cd, to change automatically to the script directory. In fact it is not easy to guess the script path from inside R (and set it with setwd()), in order to call other scripts or read/write data files placed in the same path (or in a relative one).
This (-cd) might possibly make superfluous your other commandargs, as you may find convenient calling them straight from inside the script.

Unable to get a system variable work for manuals

I have the following system variable in .zshrc
manuals='/usr/share/man/man<1-9>'
I run unsuccessfully
zgrep -c compinit $manuals/zsh*
I get
zsh: no matches found: /usr/share/man/man<1-9>/zsh*
The command should be the same as the following command which works
zgrep -c compinit /usr/share/man/man<1-9>/zsh*
How can you run the above command with a system variable in Zsh?
Try:
$> manuals=/usr/share/man/man<0-9>
$> zgrep -c compinit ${~manuals}/zsh*
The '~' tells zsh to perform expansion of the <0-9> when using the variable. The zsh reference card tells you how to do this and more.
From my investigations, it looks like zsh performs <> substitution before $ substitution. That means when you use the $ variant, it first tries <> substitution (nothing there) then $ substitution (which works), and you're left with the string containing the <> characters.
When you don't use $manuals, it first tries <> substitution and it works. It's a matter of order. The ~ flag defers the <> expansion so that it happens when the variable is substituted, as the last command below shows:
> manuals='/usr/share/man/man<1-9>'
> echo $manuals
/usr/share/man/man<1-9>
> echo /usr/share/man/man<1-9>
/usr/share/man/man1 /usr/share/man/man2 /usr/share/man/man3
/usr/share/man/man4 /usr/share/man/man5 /usr/share/man/man6
/usr/share/man/man7 /usr/share/man/man8
> echo $~manuals
/usr/share/man/man1 /usr/share/man/man2 /usr/share/man/man3
/usr/share/man/man4 /usr/share/man/man5 /usr/share/man/man6
/usr/share/man/man7 /usr/share/man/man8
