Is there a comprehensive list somewhere of what you can pass to R CMD check? I couldn't find one in the manual, which is brutal to read. It would be great if every line of the check output could be run independently, or if I could at least skip some checks, but I don't see a comprehensive list describing how to do this.
Even better would be something I could use like
check_dependencies()
check_executable_files()
check_hidden_files()
For context, I had a note about large file sizes that was hard to debug (it was plotly using its full JS library in my vignettes) and devtools::check() took 3 minutes each time I ran it.
As rawr points out, the "official" answer is to run R CMD check --help, which will give the most accurate results for the version of R you are running.
Still, it is surprisingly difficult to find this info online. It would be nice not to need the terminal to look this up... so I'm filing this answer with some pointers in the right direction for anyone else that manages to Google their way to this Q&A:
This permalink highlights the documentation as of this writing, but it will go stale as the function evolves.
tools/R/check.R is the correct file at HEAD; search for "Usage <- function", where the contents of R CMD check --help are maintained. (In principle, this could also change, but this anchor has remained accurate for the entire 12-year history of the R implementation of R CMD check.)
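For anyone who would rather not switch to a terminal, the same help text can be pulled up from inside an R session. This is only a sketch: it assumes the R launcher lives under R.home("bin"), and the grep pattern is just one illustrative way to list the skip-style flags that your version of R reports.

```r
# Locate the R launcher that belongs to the running session
r_bin <- file.path(R.home("bin"), "R")

# Capture the output of `R CMD check --help` as a character vector
help_text <- system2(r_bin, c("CMD", "check", "--help"),
                     stdout = TRUE, stderr = TRUE)

# Show only the lines that mention a "--no-..." style option
skip_flags <- grep("--no-", help_text, value = TRUE, fixed = TRUE)
writeLines(skip_flags)
```

Whatever flags show up there can then be passed on the command line, or (if you use devtools) through the args argument of devtools::check().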
I have bits of code that I want to show in the examples of a package but neither run (when example(my_fun) is run) nor test (when R CMD check is run) because they're slow enough to annoy users who might unthinkingly run them, and definitely slow enough to annoy the CRAN maintainers.
Writing R Extensions says
You can use \dontrun{} for text that should only be shown, but not run ...
and
Finally, there is \donttest, used (at the beginning of a separate line) to mark code that should be run by example() but not by R CMD check.
Should I nest these, i.e.
\donttest{
\dontrun{first slow example ...}
\dontrun{second slow example ...}
}
? That technically seems to go against the wording in WRE (i.e. it says that \donttest code should be run by example() ...) ?
I could just include them in the examples in a commented-out form or using if (FALSE) { ... } if it came to it ... but that seems ugly.
\dontrun subsumes \donttest: code that is marked with the former will neither be run by example(), nor by R CMD check. I know this because my packages for talking to Azure use \dontrun liberally, for examples that assume you have an Azure account.
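As a rough illustration of the distinction, here is how the two markers might sit side by side in one examples section. The sketch assumes roxygen2-style comments (the question itself talks about raw Rd), and slow_fit() is a hypothetical function invented for the example.

```r
#' Average of random draws (hypothetical example function)
#'
#' @param n Number of draws.
#' @examples
#' # Fast: run by example() and by R CMD check
#' slow_fit(10)
#'
#' \donttest{
#' # Per WRE: run by example(), but not by R CMD check
#' slow_fit(1e4)
#' }
#'
#' \dontrun{
#' # Shown on the help page only, never run by default
#' slow_fit(1e7)
#' }
slow_fit <- function(n) mean(rnorm(n))
```

If a user really does want to run the \dontrun code, example() has a run.dontrun argument, e.g. example(slow_fit, run.dontrun = TRUE), so there should be no need to nest the two markers.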
When I've written stuff in Matlab, I've often greatly appreciated its "Run and Time" functionality: for those who don't know, this runs the file and upon completion outputs not only the run time, but also opens a new window showing the code, and saying how many times each line was run and how long the program spent on each line. For finding bottlenecks in my code, this has been invaluable!
I am not aware of a similar functionality in R -- whether that be an R package, or part of RStudio -- and searching using a well-known search engine has not rectified this.
Is it possible to do a similar thing for R? It would be most appreciated!
It helps to know that the "Run and Time" option in MATLAB is simply a user interface on top of the profile command. In particular, in MATLAB you can do
profile on
% Run some code
profile off; profile viewer % Stops profiling and opens the timing window
I say this is helpful because you can "profile" in a similar way in RStudio, via the "Profile" menu.
Please see this RStudio Support page for in-depth details.
To summarise the above RStudio help page, in essence, one wants to write
profvis({
#CODE
})
(Note that the package profvis may need to be installed.) profvis is built on top of R's own sampling profiler, Rprof; further details on that can be found by typing ?Rprof, and by visiting this related SO question: How to efficiently use Rprof in R?.
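If installing profvis is not an option, a similar (text-only) breakdown is available from base R through Rprof() and summaryRprof(). The following is a minimal sketch; slow_step() and fast_step() are made-up workloads, and the loop size is only chosen so that the sampler has something to record.

```r
# Hypothetical workloads to profile
slow_step <- function() for (i in 1:100) sort(runif(1e5))
fast_step <- function() sum(1:10)

prof_file <- tempfile(fileext = ".out")

Rprof(prof_file)   # start the sampling profiler
slow_step()
fast_step()
Rprof(NULL)        # stop profiling

# Summarise time per function; by.self is time spent in the function itself
prof <- summaryRprof(prof_file)
head(prof$by.self)
```

Unlike MATLAB's viewer this gives per-function rather than per-line timings by default; profvis adds the line-level, interactive view on top of the same data.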
I'm working to define a Docker container which can be spun up in a cloud environment and run some reporting on our firm's database and spin itself down, with as little involvement from our data science team (including myself) as possible.
I'm pretty much done getting everything up and running, with one irritating exception: the reporting is done in R using some code that we've been using for a few years. I'm building on top of Rocker verse, and I'm adding the needs library.
The annoying thing (in this use case) about needs is that when it is first run, it asks the following:
>library('needs')
Should `needs` load itself when it's... needed? (this is recommended)
1: No
2: Yes
Selection:
In a typical interactive setting this is fine, I just type "Yes" and hit enter and I'm good to go. However, when I want the whole environment to build and run once a week on its own, I don't want to have to answer this question. I'd like it to assume Yes.
What I've tried so far includes each of these:
library('needs', quiet=TRUE)
library('needs', quietly=TRUE)
suppressMessages(library('needs', quietly=TRUE))
suppressWarnings(suppressMessages(library('needs', quietly = T)))
suppressPackageStartupMessages(library('needs', quietly=TRUE))
none of which solves the issue. The needs documentation provides for changing this setting later in a programmatic way, but not for defining the setting when first running needs:
Recommended use is to allow the function to autoload when prompted the
first time the package is loaded interactively. To change this setting
later, run needs:::autoload(TRUE) or needs:::autoload(FALSE) to turn
autoloading on or off, respectively.
I've also tried quietly installing needs, also to no avail. Unfortunately, I can't run bash commands in my Dockerfile to respond Yes, or at least I haven't found a way.
I'd like to avoid removing dependencies for needs, as it will involve a LOT of code refactoring.
Any ideas on how to solve this?
Thank you! :]
-Vince
Update
Solution is a bit hacky, but in my Dockerfile I'm doing a vim edit of the file which needs assigns to the sysfile variable:
sysfile <- system.file("extdata", "promptUser", package = "needs")
which for ME was /usr/local/lib/R/site-library/needs/extdata/promptUser. Changing its contents from "1" to "0" solved my problem.
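The same edit could presumably be scripted in R rather than done by hand in vim. This is a sketch only: disable_needs_prompt() is a hypothetical helper name, and it assumes (as described above) that the file holds a single "1" or "0".

```r
# Hypothetical helper: overwrite the promptUser file with "0".
# `path` defaults to where `needs` keeps the file; system.file()
# returns "" if the package is not installed.
disable_needs_prompt <- function(
    path = system.file("extdata", "promptUser", package = "needs")) {
  if (!nzchar(path) || !file.exists(path)) {
    return(invisible(FALSE))  # nothing to change
  }
  writeLines("0", path)
  invisible(TRUE)
}

# Demonstration on a stand-in file, so this runs without `needs` installed
demo_file <- tempfile()
writeLines("1", demo_file)
disable_needs_prompt(demo_file)
readLines(demo_file)
```

In a Dockerfile this could be called once via Rscript, right after the step that installs needs.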
A better solution would probably be to make it so it doesn't ask the question in the first place. You can view the code it runs on package load on github: https://github.com/cran/needs/blob/master/R/needs-package.R
If you set the option it checks for beforehand, then it doesn't need to ask the question in the first place:
options(needs.promptUser = FALSE)
Despite numerous searches, I can't seem to find a clear explanation as to what "Source on Save" means in RStudio.
I have tried ?source and the explanation there isn't clear, either.
As far as I can tell, it seems to run the script when I hit Save, but I don't understand the relevance/significance of it.
In simple terms, what exactly does Source on Save do and why would/should I use it?
This is kind of a shortcut to save and execute your code. You type something, save the script and it will be automatically sourced.
Very useful for short scripts, but very annoying for longer, time-consuming scripts.
So sourcing is basically running each line of your file.
EDIT:
So, thinking of a scenario where this might be useful...
You are developing a function which you will later put into a package. So you write this function in a separate file, but call it from the command line for testing.
Normally, you have to execute the whole function definition again whenever you change something. With "Source on Save", the file is sourced automatically when you save, and you can use Ctrl + 2 to jump into the command line and test the function directly.
Since I started working with R, my datasets have been much bigger. But I remember that when I started coding in Python and vi, I changed my settings to execute the code on save, since those little scripts were done in less than 10 seconds...
So maybe it is just not standard to work with small datasets... But I can still recommend, for development, using only 10% of a normal dataset. It will speed up the graphics creation and a lot of other things as well. Test with the complete dataset every now and then.
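To make the develop-and-test loop described above concrete, here is roughly what "Source on Save" does for you, written out by hand. The file and the add_one() function are made up for the illustration; RStudio simply calls source() on the file each time it is saved.

```r
# A stand-in for the script you are editing in RStudio
script <- tempfile(fileext = ".R")

# First draft of the function (with a bug), then "save"
writeLines("add_one <- function(x) x + 2", script)
source(script)          # what Source on Save triggers
first_try <- add_one(1) # test from the console: gives 3, so there is a bug

# Fix the function in the file and "save" again
writeLines("add_one <- function(x) x + 1", script)
source(script)          # the new definition replaces the old one
second_try <- add_one(1)
```

Without Source on Save you would have to re-run the function definition yourself after every edit before the console sees the change.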
Is there an easy way to have R record all input and output from your R session to disk while you are working with R interactively?
In R.app on Mac OS X I can do a File->Save..., but it isn't much help in recovering the commands I had entered when R crashes.
I have tried using sink(...,split=T), but it doesn't seem to do exactly what I am looking for.
Many of us use ESS / Emacs for this very reason. Saving old sessions with extension '.Rt' even gives you mode-specific commands for re-running parts of your session.
Greg Snow wrote recently on the R-help list (a very valuable resource, SO R people!):
"You may also want to look at ?TeachingDemos::txtStart as an alternative to sink, one advantage is that the commands as well as the output can be included. With a little more work you can also include graphical output into a transcript file."
r-help
Check out the savehistory() command
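A small sketch of how the two base-R pieces mentioned in this thread fit together: sink(..., split = TRUE) captures the output (but not the commands), while savehistory() captures the commands (but not the output). The file names are arbitrary, and savehistory() is guarded because it only works in interactive sessions with history support.

```r
log_file <- tempfile(fileext = ".txt")

# Send output both to the console and to the log file
sink(log_file, split = TRUE)
print(summary(c(1, 2, 3)))
sink()  # stop diverting output

# The log holds the printed output only, not the commands that produced it
logged <- readLines(log_file)

# Commands can be saved separately, when a history is available
if (interactive()) {
  try(savehistory(file.path(tempdir(), "session.Rhistory")))
}
```

For a single transcript that interleaves commands and output, the TeachingDemos::txtStart suggestion quoted above is probably the closer fit.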
I'm not sure yet how to answer an answer, but there is an updated version of Ranke's vim r-plugin called r-plugin2 available here. It seems more user-friendly and robust than the original.
Emacs is good, but for those of us with a vi preference there's the vim-r plugin at:
http://www.uft.uni-bremen.de/chemie/ranke/index.php?page=vim_R_linux
It works brilliantly and has a tiny memory footprint.