littler determine if running as deployed - r

I am pretty excited to have found Jeff and Dirk's application littler to run R functions from the terminal. ¡kudos!
Since then, I have been able to pass my functions to my development team and have them running on other servers.
My question is about its deployment. Before passing it to others, I try it out on my computer and prepare it with RStudio... (also kudos).
I was wondering if there's a command I can run in the script to tell whether the function is being run from the command line or has been executed from within R.
Thanks.

I don’t know whether there’s a littler specific answer. But in general it is impossible (or very hard) in R to determine how the code is run, which was one of the motivations for my work on modules.
The only thing R knows is whether the code is being run in an interactive shell (via interactive()).
With modules, you can test whether module_name() is set, analogous to Python’s __name__:
if (is.null(module_name()) && ! interactive()) {
    # Stand-alone, execute main entry point
}
if (! is.null(module_name())) {
    # Code is being loaded as a module.
}
I’ve written a small wrapper based on this which I’m using to write my command line applications. For instance, a very simple cat-like application would look as follows:
#!/usr/bin/env Rscript
sys = modules::import('sys')
sys$run({
    if (length(sys$args) == 0) {
        message('Usage: ', script_name(), ' filename')
        sys$exit(1)
    }
    input = sys$args[1]
    cat(readLines(input))
})
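Outside of modules, a rough heuristic can combine the few signals R does expose. The sketch below is only that: the name run_context is made up for illustration, and it rests on assumptions that are not part of the answer above, namely that RStudio sets the environment variable RSTUDIO to "1", that Rscript passes a --file= argument visible in commandArgs(), and that littler defines a global argv.

```r
# Heuristic sketch: classify how the current R code was launched.
# All inputs are parameters (with live defaults) so the logic can be
# checked in isolation. The labels returned are illustrative only.
run_context <- function(is_interactive = interactive(),
                        args = commandArgs(trailingOnly = FALSE),
                        rstudio = Sys.getenv("RSTUDIO"),
                        has_argv = exists("argv", envir = globalenv())) {
  # Assumption: RStudio sets RSTUDIO=1 for its R session.
  if (is_interactive && identical(rstudio, "1")) return("rstudio")
  if (is_interactive) return("interactive")
  # Assumption: Rscript (and R --file=...) expose a --file= argument.
  if (any(startsWith(args, "--file="))) return("rscript")
  # Assumption: littler defines `argv` in the global environment.
  if (has_argv) return("littler")
  "other"
}
```

Calling run_context() with no arguments inspects the live session. None of these signals is guaranteed, so treat the result as a hint rather than a fact.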

I am not sure I understand your question. Do you mean something like
edd@max:~$ which r
/usr/local/bin/r
edd@max:~$
You can compare the result of which against the empty string as nothing comes back when you ask for a non-existing program.
edd@max:~$ which s # we know we don't have this
edd@max:~$
You can then use the result of which r to check for, say, the version:
edd@max:~$ `which r` --version
r ('littler') version 0.2.2
git revision 8df31e5 as of Thu Jan 29 17:43:21 2015 -0800
built at 19:48:17 on Jan 29 2015
using GNU R Version 3.1.2 (2014-10-31)
Copyright (C) 2006 - 2014 Jeffrey Horner and Dirk Eddelbuettel
r is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License. For more information about
these matters, see http://www.gnu.org/copyleft/gpl.html.
edd@max:~$
Edit: As you seem confused about interactive() true or false, consider r --help:
edd@max:~$ r --help
Usage: r [options] [-|file]
Launch GNU R to execute the R commands supplied in the specified file, or
from stdin if '-' is used. Suitable for so-called shebang '#!/'-line scripts.
Options:
  -h, --help            Give this help list
      --usage           Give a short usage message
  -V, --version         Show the version number
  -v, --vanilla         Pass the '--vanilla' option to R
  -t, --rtemp           Use per-session temporary directory as R does
  -i, --interactive     Let interactive() return 'true' rather than 'false'
  -q, --quick           Skip autoload / delayed assign of default libraries
  -p, --verbose         Print the value of expressions to the console
  -l, --packages list   Load the R packages from the comma-separated 'list'
  -d, --datastdin       Prepend command to load 'X' as csv from stdin
  -e, --eval expr       Let R evaluate 'expr'
edd@max:~$
and
edd@max:~$ r -e'print(interactive())'
[1] FALSE
edd@max:~$ r -i -e'print(interactive())'
[1] TRUE
edd@max:~$
but that is setting it as opposed to querying it.
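To query the flag rather than set it, a script can branch on interactive() itself. Here is a minimal sketch of that pattern, assuming a hypothetical main() entry point. Under littler the arguments would come from argv rather than commandArgs(), and running with r -i would make the guard skip main().

```r
# Hypothetical entry point. It returns an exit status instead of quitting,
# so it can also be called by hand from an interactive session.
main <- function(args = commandArgs(trailingOnly = TRUE)) {
  if (length(args) == 0) {
    message("Usage: script.r <name>")
    return(invisible(1L))
  }
  cat("Hello", args[1], "\n")
  invisible(0L)
}

# Run automatically only when the file is executed non-interactively;
# sourcing it at an interactive prompt just defines main().
if (!interactive()) {
  status <- main()
}
```

This keeps the same file usable both as a deployed command and as something to source and poke at while developing.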

Related

Use valgrind with `R CMD check`

I want to run valgrind on the tests, examples and vignettes of my package. Various sources suggest that the way to do this should be:
R CMD build my-pkg
R CMD check --use-valgrind my-pkg_0.0.tar.gz
R CMD check seems to run fine, but shows no evidence of valgrind output, even after setting the environment variable VALGRIND_OPTS: --memcheck:leak-check=full. I've found sources that hint that R needs to run interactively for valgrind to show output, but R -d CMD check (or R -d "CMD check") seems to be the wrong format.
R -d "valgrind --tool=memcheck --leak-check=full" --vanilla < my-pkg.Rcheck/my-pkg-Ex.R does work, but only on the example files; I can't see a simple way to run this against my vignettes and testthat tests.
What is the best way to run all relevant scripts through valgrind? For what it's worth, the goal is to integrate this in a GitHub actions script.
Edit Mar 2022: The R CMD check case is actually simpler. Running R CMD check --use-valgrind [other options you may want] will run the tests and examples under valgrind and then append the standard valgrind summary at the end of the examples output (i.e., pkg.Rcheck/pkg-Ex.Rout) and test output (i.e., pkg.Rcheck/tinytest.Rout, as I use tinytest). What puzzles me now is that an error detected by valgrind does not seem to fail the test.
Original answer below the separator.
There is a bit more to this: it helps to ensure that the R build is instrumented for it. See Section 4.3.2 of Writing R Extensions:
On platforms where valgrind is installed you can build a version of R with extra instrumentation to help valgrind detect errors in the use of memory allocated from the R heap. The configure option is --with-valgrind-instrumentation=level, where level is 0, 1 or 2. Level 0 is the default and does not add anything. Level 1 will detect some uses of uninitialised memory and has little impact on speed (compared to level 0). Level 2 will detect many other memory-use bugs but make R much slower when running under valgrind. Using this in conjunction with gctorture can be even more effective (and even slower).
So you probably want to build yourself a Docker-ized version of R to call from your GitHub Action. I think the excellent 'sumo' container by Winston has a valgrind build too, so you could try that. It's huge at over 4 GB:
edd@rob:~$ docker images| grep wch # some whitespace edited out
wch1/r-debug latest a88fabe8ec81 8 days ago 4.49GB
edd@rob:~$
And of course if you test dependent packages you have to get them into the valgrind session too...
Unfortunately, I found the learning curve involved in dockerizing R in GitHub actions, per @dirk-eddelbuettel's suggestion, too steep.
I came up with the hacky approach of adding a file memcheck.R to the root of the package with the contents
devtools::load_all()
devtools::run_examples()
devtools::build_vignettes()
devtools::test()
(remembering to add to .Rbuildignore).
Running R -d "valgrind --tool=memcheck --leak-check=full" --vanilla < memcheck.R then seems to work, albeit with the reporting of some issues that appear to be false positives, or at least issues that are not identified by CRAN's valgrind implementation.
Here's an example of this in action in a GitHub actions script.
(Readers who know what they are doing are invited to suggest shortcomings of this approach in the comments!)
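Since a valgrind error does not seem to fail the run on its own, one option is to scan the captured output for valgrind's "ERROR SUMMARY" lines and fail the CI step explicitly. The helper below is a sketch under assumptions: valgrind_error_count is a made-up name, and it expects summary lines of the usual form "==PID== ERROR SUMMARY: N errors from M contexts".

```r
# Sum the error counts reported on valgrind "ERROR SUMMARY" lines.
# Returns NA if no summary line is present (e.g. valgrind did not run).
valgrind_error_count <- function(lines) {
  hits <- grep("ERROR SUMMARY:", lines, value = TRUE, fixed = TRUE)
  if (length(hits) == 0) return(NA_integer_)
  counts <- sub(".*ERROR SUMMARY: ([0-9,]+) error.*", "\\1", hits)
  # Drop any thousands separators before converting.
  sum(as.integer(gsub(",", "", counts, fixed = TRUE)))
}

# Possible use in a CI step (the file name is only an example):
# n <- valgrind_error_count(readLines("my-pkg.Rcheck/my-pkg-Ex.Rout"))
# if (is.na(n) || n > 0) quit(status = 1)
```

Treating a missing summary as a failure is a design choice: it catches the case where valgrind silently did not run at all.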

How can I print R documentation from a Linux command shell (e.g. bash)?

How can I check documentation for R code from a Linux command shell such as bash? I DO NOT mean an interactive session.
With Perl, I can use perldoc to print out documentation at the command line:
perldoc lib
I was hoping for something simple like that for R. I don't always want to pull up a full interactive R session just to look up some documentation.
There might be other ways, but one that works for me is using the -e flag to execute code on the command line. I also use the --slave flag, which suppresses R's startup messages and the echoing of commands, so that only the help text is printed:
R --slave -e '?function'
I actually created a super small script I call rdoc to act like a simple R version of perldoc:
#!/bin/bash
R --slave -e "?$1"
After installing that in my ~/bin directory (or however you install it in your PATH), it's easy:
rdoc function
If you want to look at documentation of a function from a particular package, prepend the library name followed by two colons. For example, to pull up documentation of the dmrFinder function from the charm package:
rdoc charm::dmrFinder
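The same idea can be written as an R script instead of a bash wrapper, which avoids pasting the argument into R code. This is a rough sketch: parse_topic and show_doc are hypothetical names, and it assumes that printing a help object with help_type = "text" writes the page to the terminal in a non-interactive session.

```r
# Split "pkg::topic" into its parts; a bare "topic" has no package.
parse_topic <- function(x) {
  parts <- strsplit(x, "::", fixed = TRUE)[[1]]
  if (length(parts) == 2) {
    list(package = parts[1], topic = parts[2])
  } else {
    list(package = NULL, topic = x)
  }
}

# Look up and print the help page as plain text.
# do.call passes the values directly, so help() sees character strings.
show_doc <- function(x) {
  p <- parse_topic(x)
  h <- do.call(utils::help,
               list(topic = p$topic, package = p$package, help_type = "text"))
  print(h)
}

# Example call from a script: show_doc(commandArgs(trailingOnly = TRUE)[1])
```

Saved with an Rscript shebang and made executable, this could be called just like the rdoc wrapper above.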

Why (or when) is Rscript (or littler) better than R CMD BATCH?

I am automating some web scraping with R in cron, and sometimes I use R CMD BATCH and sometimes I use Rscript.
To decide which one to use, I mainly consider whether I want the .Rout file or not.
But reading the answers to some questions here on SO (like this or this), it seems that Rscript is preferred to R CMD BATCH.
So my questions are:
1. Besides the fact that the syntax is a little different and R CMD BATCH saves an .Rout file while Rscript does not, what are the main differences between the two of them?
2. When should I prefer one over another? More specifically, in the cron job mentioned above, is one of them preferred?
3. I have not yet used littler; how is it different from both Rscript and R CMD BATCH?
From what I understand:
R CMD BATCH:
  - echo the input statements
  - can not output to stdout
Rscript:
  - does NOT echo
  - output to stdout
  - can be used in one-liner (i.e. with no input file)
littler:
  - all that Rscript does
  - can read commands from stdin (useful for pipelining)
  - faster startup time
  - load the methods package
R CMD BATCH is all we had years ago. It makes i/o very hard and leaves files behind.
Things got better, first with littler and then also with Rscript. Both can be used for 'shebang' lines such as
#!/usr/bin/r
#!/usr/bin/Rscript
and both can be used with packages like getopt and optparse, allowing you to write proper R scripts that can act as commands. I have dozens of them, starting with simple ones like this, which I can call as install.r pkga pkgb pkgc and which will install all three (and their dependencies) for me from the command line without hogging the R prompt:
#!/usr/bin/env r
#
# a simple example to install one or more packages
if (is.null(argv) || length(argv) < 1) {
    cat("Usage: install.r pkg1 [pkg2 pkg3 ...]\n")
    q()
}
## adjust as necessary, see help('download.packages')
repos <- "http://cran.rstudio.com"
## this makes sense on Debian where no packages touch /usr/local
lib.loc <- "/usr/local/lib/R/site-library"
install.packages(argv, lib.loc, repos)
And just like Karl, I have cronjobs calling similar R scripts.
Edit on 2015-11-04: As of last week, littler is now also on CRAN.
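One practical difference when moving a script between the two front ends is where the arguments live: the littler script above reads argv, while Rscript scripts use commandArgs(). A small sketch of a helper that works under both, assuming (as in the script above) that littler provides argv as a global variable. The name script_args is made up for illustration.

```r
# Return the script's arguments as a character vector.
# Prefers littler's global `argv` when it exists, otherwise falls back to
# commandArgs(), which is what Rscript populates.
script_args <- function(env = globalenv()) {
  if (exists("argv", envir = env, inherits = FALSE)) {
    as.character(get("argv", envir = env))  # as.character(NULL) is character(0)
  } else {
    commandArgs(trailingOnly = TRUE)
  }
}
```

With this, the same file can carry either shebang line and the body only ever calls script_args().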

What's the best way to use R scripts on the command line (terminal)?

It's very convenient to have R scripts for doing simple plots from the command line. However, running R from bash scripts is not convenient at all. The ideal might be something like
#!/path/to/R
...
or
#!/usr/bin/env R
...
but I haven't been able to make either of those work.
Another option is keeping the scripts purely in R, e.g. script.R, and invoking it with R --file=script.R or similar. However, occasionally a script will rely on obscure command line switches, at which point part of the code exists outside the script. Example: when sneaking things into R from bash via a local .Rprofile, the desired switches are everything --vanilla implies except --no-init-file.
Another option is a bash script to store the R flags and be painlessly executable, which then calls the R script. The problem is that this means a single program just got split into two files which now have to be kept in sync, transferred to new machines together, etc.
The option I currently despise least is embedding the R in a bash script:
#!/bin/bash
... # usage message to catch bad input without invoking R
... # any bash pre-processing of input
... # etc
R --random-flags <<RSCRIPT
# R code goes here
RSCRIPT
Everything's in a single file. It's executable and easily handles arguments. The problem is that combining bash and R like this pretty much eliminates the possibility of any IDE not failing on one or the other, and makes my heart hurt real bad.
Is there some better way I'm missing?
Content of script.r:
#!/usr/bin/env Rscript
args = commandArgs(trailingOnly = TRUE)
message(sprintf("Hello %s", args[1L]))
The first line is the shebang line. It’s best practice to use /usr/bin/env Rscript instead of hard-coding the path to your R installation. Otherwise you risk your script breaking on other computers.
Next, make it executable (on the command line):
chmod +x script.r
Invocation from command line:
./script.r world
# Hello world
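If the script needs more than one positional argument, the vector returned by commandArgs(trailingOnly = TRUE) can be split into options and positionals with base R alone. The following is a sketch, not a replacement for getopt or optparse: parse_cli is a hypothetical helper, and it only understands the --key=value form.

```r
# Split arguments into --key=value options and everything else.
parse_cli <- function(args) {
  is_opt <- grepl("^--[^=]+=", args)
  opts <- args[is_opt]
  keys <- sub("^--([^=]+)=.*$", "\\1", opts)  # text between "--" and "="
  vals <- sub("^--[^=]+=", "", opts)          # text after the first "="
  list(options = setNames(as.list(vals), keys),
       positional = args[!is_opt])
}

# Example use inside script.r:
# cli <- parse_cli(commandArgs(trailingOnly = TRUE))
# message(sprintf("Hello %s", cli$positional[1L]))
```

Invoked as ./script.r --greeting=Hi world, this would put "Hi" under cli$options$greeting and "world" in cli$positional.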
Try littler. littler provides hash-bang (i.e. script starting with #!/some/path) capability for GNU R, as well as simple command-line and piping use.
Miguel Sanchez's response is the way it should be. The other way to execute Rscript is to use the env command, which runs whichever Rscript comes first on the PATH:
#!/usr/bin/env Rscript
#!/path/to/R won't work because R is itself a script, so execve is unhappy.
I use R --slave -f script
If you are interested in parsing command line arguments to an R script, try Rscript, which is bundled with R as of version 2.5.x
http://stat.ethz.ch/R-manual/R-patched/library/utils/html/Rscript.html
This works,
#!/usr/bin/Rscript
but I don't know what happens if you have more than 1 version of R installed on your machine.
If you do it like this
#!/usr/bin/env Rscript
it tells the interpreter to just use whatever R appears first on your path.
If the program you're using to execute your script needs parameters, you can put them at the end of the #! line:
#!/usr/bin/R --random --switches --f
Not knowing R, I can't test properly, but this seems to work:
axa@artemis:~$ cat r.test
#!/usr/bin/R -q -f
error
axa@artemis:~$ ./r.test
> #!/usr/bin/R -q -f
> error
Error: object "error" not found
Execution halted
axa@artemis:~$
Just a note to add to this post. Later versions of R seem to have buried Rscript somewhat. For R 3.1.2-1 on OSX downloaded Jan 2015 I found Rscript in
/sw/Library/Frameworks/R.framework/Versions/3.1/Resources/bin/Rscript
So, instead of something like #! /sw/bin/Rscript, I needed to use the following at the top of my script.
#! /sw/Library/Frameworks/R.framework/Versions/3.1/Resources/bin/Rscript
Running locate Rscript might help you find it.
You might want to use python's rpy2 module. However, the "right" way to do this is with R CMD BATCH. You can modify this to write to STDOUT, but the default is to write to a .Rout file. See example below:
[ramanujan:~]$cat foo.R
print(rnorm(10))
[ramanujan:~]$R CMD BATCH foo.R
[ramanujan:~]$cat foo.Rout
R version 2.7.2 (2008-08-25)
Copyright (C) 2008 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
[Previously saved workspace restored]
~/.Rprofile loaded.
Welcome at Fri Apr 17 13:33:17 2009
> print(rnorm(10))
[1] 1.5891276 1.1219071 -0.6110963 0.1579430 -0.3104579 1.0072677 -0.1303165 0.6998849 1.9918643 -1.2390156
>
Goodbye at Fri Apr 17 13:33:17 2009
> proc.time()
user system elapsed
0.614 0.050 0.721
Note: you'll want to try out the --vanilla and other options to remove all the startup cruft.
Try smallR for writing quick R scripts in the command line:
http://code.google.com/p/simple-r/
(r command in the directory)
Plotting from the command line using smallR would look like this:
r -p file.txt
The following works for me using MSYS bash on Windows - I don't have R on my Linux box so can't try it there. You need two files - the first one called runr executes R with a file parameter
# this is runr
# following is path to R on my Windows machine
# plus any R params you need
c:/r/bin/r --file=$1
You need to make this executable with chmod +x runr.
Then in your script file:
#!runr
# some R commands
x = 1
x
Note the #! runr line may need to include the full path to runr, depending on how you are using the command, how your PATH variable is set etc.
Not pretty, but it does seem to work!
