Can R CMD check run examples/tests in parallel (on Windows)? - r

R CMD check takes a significant amount of time to complete on one of my packages because there are many examples/tests to run. Perhaps there's a way to run in parallel?
I stumbled upon this post, which seems to have a solution for R CMD INSTALL on Linux (I can't see how it would work on Windows):
http://r.789695.n4.nabble.com/parallel-build-for-package-equivalent-of-make-j8-td921920.html
Is there a solution for parallel R CMD check on Windows?

It's a hack, but you could take the tests out of the tests directory and put them somewhere else that they won't get run automatically (e.g. inst/tests), then use your own, parallelizable, framework (e.g. make run in parallel: http://dannythorpe.com/2008/03/06/parallel-make-in-win32/ may be relevant) to run the tests ... this won't help for examples, though.
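As a rough illustration of the "own framework" idea, here is a minimal sketch that uses R's bundled parallel package instead of make. The names run_one_test and run_tests_parallel are made up for this example, and it assumes the relocated tests are plain .R files that signal failure by throwing an error. A socket cluster is used because fork-based helpers such as mclapply do not run in parallel on Windows.

```r
library(parallel)

# Run one test file in its own environment.
# Returns "ok" on success, otherwise the error message.
run_one_test <- function(path) {
  tryCatch({
    source(path, local = new.env(parent = globalenv()))
    "ok"
  }, error = function(e) conditionMessage(e))
}

# Run every .R file in `test_dir` on a socket cluster (works on Windows).
# `workers` is an assumed default; tune it to the number of cores.
run_tests_parallel <- function(test_dir, workers = 2L) {
  files <- list.files(test_dir, pattern = "\\.[Rr]$", full.names = TRUE)
  cl <- makeCluster(workers)
  on.exit(stopCluster(cl), add = TRUE)
  results <- parLapply(cl, files, run_one_test)
  setNames(unlist(results), basename(files))
}
```

Something like run_tests_parallel("inst/tests", workers = 4L) from the package root would then return one status per file. Each worker is a fresh R process, so the test files have to load the package themselves.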

A completely different approach would be to use the cacheSweave package, which caches the unchanging parts of your code from run to run. If you are tweaking some code but most of it is unchanged, this could save a lot of time. If plots are what's slowing things down, however, cacheSweave won't help much (as explained in the vignette).

Related

Link Project and R Version

I have two different versions of R installed, one which is up to date and which I use for all my regular R coding (needs to be up to date so that I can use various updated and new packages) and one which I use to access OLAP cubes (needs to be the R Client from Microsoft, because this is the only one which supports the olapR package, and which currently uses R version 3.4.3).
Since, in theory, I only have to access the OLAP cube once a month, I "outsourced" this task to a different RStudio project, in which I download and save the required data for all other projects. Hence, all other projects never require the olapR package to be installed and can and will be run in the up to date R version.
Now, ideally I would like to link my R version to my projects, so that I do not have to change my global R version and restart RStudio every time I access the OLAP cube or work on this data retrieval project (and then switch it back). However, I could not find any options in RStudio to achieve this result.
There are a few threads out there describing the same problem, but with no satisfactory answer in my opinion:
https://support.rstudio.com/hc/en-us/community/posts/200657296-Link-Project-and-R-Version
Rstudio project using different version of R
I also tried looking for a different package than olapR but with similar functionality, but could not find anything except X4R, which seems outdated and does not work for me (https://github.com/overcoil/X4R). Sadly, I am also unable to directly access the databases which the OLAP cube uses for its results, so I cannot go "around" it.
I am happy for any help or suggestions you can offer, whether it is a general workaround to link a project to a specific R version or the (less helpful for the community) solution of accessing the OLAP cube in a different way.
Thanks in advance!
Using the answer from MrGumble I created a .bat file that will execute my .R file using the desired R installation. Even though it is not the answer I thought I would get, I think it is an even better solution to the problem.
For all facing a similar issue, here is the .bat file (never created one before, so also had to google how to do it and I guess some might be in the same position):
@echo off
title Getting data for further processing in R
echo Retrieving OLAP data
echo.
"C:\Program Files\Microsoft\R Client\R_SERVER\bin\Rscript.exe" "C:\Users\me\Documents\Projects\!Data\script.R"
echo.
echo Saved data
echo.
pause
Thanks again to MrGumble for his help.
Skip RStudio.
RStudio is really just an editor (albeit a powerful and useful one), which starts an R console for you (and sets up the surrounding PATH variables, library locations, etc.).
If your monthly task only requires you to run the R-script (or a bit of interactive work), you can simply execute your preferred version of R from the command line and have it run your R script. E.g.
C:\Users\me>"C:\Program Files (x64)\Microsoft R\bin\Rscript" myscript.R
You might have to define some PATH variables so that the older R doesn't look for packages in the newer R's libraries, but that depends entirely on your current setup.
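Since the script is launched outside RStudio, it can help to have it confirm which R it is actually running under. The following is only a sketch: assert_r_version is a made-up helper name, and "3.4.3" is taken from the version mentioned in the question.

```r
# Stop early if the script is run by an unexpected R version.
# `required` is supplied by the caller; `actual` defaults to the running R.
assert_r_version <- function(required, actual = getRversion()) {
  if (actual != numeric_version(required)) {
    stop("This script expects R ", required,
         " but is running under R ", as.character(actual),
         call. = FALSE)
  }
  invisible(TRUE)
}

# At the top of the monthly script one might then write:
# assert_r_version("3.4.3")
# print(.libPaths())  # shows which library folders this R will search
```

Printing .libPaths() is a quick way to see whether the older R is picking up the newer R's package library.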

When should I restart R session, GUI or computer?

I use R, RStudio and Rcpp, and I spent over a week debugging some code that was giving errors and warnings in unexpected places, in some cases with sample code taken directly from online sources or package documentation.
I often restart the R session or RStudio if there are obvious problems, and they usually go away.
But this morning it was really bad, to the point where basic R commands would fail and restarting R did nothing. I closed all the RStudio sessions and restarted the machine for good measure (which was unnecessary).
When it came back and I reloaded the sessions, everything seemed to be working.
Even some Rcpp code I had been working on for weeks with outside packages will now compile and run, where it gave gibberish errors before.
I have known for a while that R needs to be restarted once in a while, but I only notice it when basic functions don't run. How can I know earlier?
I am looking for a good general resource or function that can tell me I need to restart because something is not running right. It would be nice if I could also know what to restart:
the R session, the GUI such as RStudio, all sessions and GUIs, or the whole machine.
For as long as I have been dabbling with or actually using R (ie more than two decades), it has always been recommended to start a clean and fresh session.
Which is why I prefer to work on the command line for tests. When you invoke R, or Rscript, or, in my case, r (from littler), you know you get a fresh session free of possible side effects. By keeping these tests to the command line, my main sessions (often multiple instances inside Emacs via ESS, possibly multiple RStudio sessions too) are less affected.
Even RStudio defaults to 'install and restart' when you rebuild a package.
(I will note that a certain development package implies you could cleanly unload a package. That has been debated at length, and I think by now even its authors qualify that claim. I don't really know or care as I don't use it, having had established workflows before it appeared.)
And to add: You almost never need to restart the computer. But a fresh clean process is something to use often. Your computer can create millions of those for you.
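The same "fresh process" habit can be scripted from inside a running session. A minimal sketch, assuming the code to check lives in a standalone script file (run_in_fresh_session is a made-up name):

```r
# Run a script in a brand-new R process and return everything it printed.
# --vanilla skips saved workspaces and profile files, so nothing from the
# current session (objects, options, loaded packages) can leak in.
run_in_fresh_session <- function(script) {
  rscript <- file.path(R.home("bin"), "Rscript")
  system2(rscript, c("--vanilla", shQuote(script)),
          stdout = TRUE, stderr = TRUE)
}
```

If the script behaves in the fresh process but not in the interactive session, the session state is the likely culprit and restarting R should be enough.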

is it possible to run R as a daemon

I have a script in R that is frequently called during the day (by other scripts). I call R in a terminal using
Rscript code.R
I notice it takes a lot of time to load packages and set up R.
Is it possible to run R as a background service which I hit using a port or something?
Yes, look into Rserve, which has been available for over a dozen years for exactly this reason. There are a couple of fairly high-profile applications too.
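A rough sketch of how that could look from R itself, assuming the Rserve and RSclient packages are installed and the server listens on its default port 6311. The wrapper names start_rserve and eval_on_rserve are made up for this example, and nothing here is run until the functions are called.

```r
# Start the server once, e.g. at boot or from a login script.
start_rserve <- function() {
  Rserve::Rserve(args = "--no-save")
}

# Called by each client instead of a fresh `Rscript code.R`.
# `expr` is a quoted call; with lazy = FALSE its value (the call itself)
# is shipped to the server and evaluated there.
eval_on_rserve <- function(expr, host = "localhost", port = 6311L) {
  conn <- RSclient::RS.connect(host = host, port = port)
  on.exit(RSclient::RS.close(conn), add = TRUE)
  RSclient::RS.eval(conn, expr, lazy = FALSE)
}

# Example (needs a running server):
# eval_on_rserve(quote(mean(c(1, 2, 3))))
```

To actually save the package loading time, the packages would need to be loaded in the server's startup configuration rather than per connection; that part is not shown here.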
You can check out this add-in for RStudio. It is not a port-like solution, but maybe it can help you: https://github.com/bnosac/taskscheduleR

Build system for R

I've got a large data analysis project containing dozens of R scripts that depend in complicated ways on each other and so I thought it would be a good idea to formalize all these dependencies and set the project in a build system that runs things in the correct order and re-runs anything that changes or anything that's downstream from things that change.
But even after some hours' worth of googling I haven't found any build systems that are custom-made for R (though there are plenty for more general purposes). I've previously worked with waf to organize data analysis projects in Python and know I could use waf to run R scripts as well. But having to manage a whole Python environment just to run some R scripts seems clunky.
What are other people using to solve this problem?
There is a package called "GRANBase" that is capable of doing something similar to what you're referring to. It's an R package management & build tool, so if your scripts are put into R package(s), you could probably make use of it.
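For reference, the core rule the question describes (re-run a step when its output is missing or older than any of its inputs) can be sketched in a few lines of base R. This is not GRANBase's interface; is_stale and build_step are made-up names, and the timestamp comparison is the same heuristic make uses.

```r
# TRUE if `target` is missing or older than any file in `deps`.
is_stale <- function(target, deps) {
  if (!file.exists(target)) return(TRUE)
  any(file.mtime(deps) > file.mtime(target))
}

# Run `action` (a function that produces `target`) only when needed.
# Returns TRUE if the step was re-run, FALSE if it was skipped.
build_step <- function(target, deps, action) {
  if (is_stale(target, deps)) {
    action()
    TRUE
  } else {
    FALSE
  }
}
```

Calling the steps in dependency order, for example build_step("clean.rds", c("raw.csv", "clean.R"), function() source("clean.R")) with hypothetical file names, re-runs only what is out of date. Working out that order automatically is what a real build tool adds on top.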

Refresh R console without quitting the session?

I usually open the R console all day long, but sometimes I need to clean my history and my workspace's background so that I can test functions or load new data.
I'm wondering whether there is an easier way to use a command line in .Rprofile so that I can refresh the R console without quitting or rebooting my current session.
What I have usually done for this is to q() without saving and then start R again and clean the History. I think somebody here might be able to give me some better suggestions.
Thanks in advance.
As for the history, on Unix-like systems (mine is Debian) this command refreshes it:
loadhistory("")
However, as said in comments, loadhistory seems to be platform-dependent.
Check your ?loadhistory if present on your platform. Mine says:
There are several history mechanisms available for the different
R consoles, which work in similar but not identical ways. There
are separate versions of this help file for Unix and Windows.
The functions described here work on Unix-alikes under the
readline command-line interface but may not otherwise (for
example, in batch use or in an embedded application)
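For the workspace half of the question, a small helper could be placed in .Rprofile. This is only a sketch (clear_workspace is a made-up name): it removes objects, including hidden ones, but it does not detach packages or reset options, so it is not equivalent to a restart.

```r
# Remove every object (including dot-prefixed ones) from `env`,
# then ask R to release the memory. Defaults to the global workspace.
clear_workspace <- function(env = globalenv()) {
  rm(list = ls(envir = env, all.names = TRUE), envir = env)
  invisible(gc())
}
```

Note that calling clear_workspace() on the global environment also removes the helper itself if it was defined there, unless .Rprofile is sourced again.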
