How to replicate the package check time performed on CRAN?

I've been trying to reduce the check time on a package I am submitting to CRAN. On my local machines, check time is somewhere between a minute (i7 CPU) and 2 minutes (i5 CPU). However, CRAN reviewers keep pointing out the check time is over 10 minutes. The only way I could find to reproduce such long check times is by uploading my package to http://win-builder.r-project.org/, where it indeed takes > 600 s to check.
I wish I could reproduce this check time locally so I am not dependent on a remote solution. The only difference I can see between Win builder and my local machine is the OS (Win vs. Linux) and how Win builder seems to be doing multiarch checks (i386 and x64).
I am not sure how to reproduce this locally. I have tried R CMD check with seemingly-relevant switches like --multiarch and --force-multiarch but it doesn't seem to be doing anything differently. I guess I have to install some extra packages like r-cran-i386 or whatever, but I couldn't find anything of the sort in my repositories ("R" can be such a PITA of a search expression) and the instructions on README files like the one on https://cran.r-project.org/bin/linux/ubuntu/ didn't get me far enough.
I am already using --as-cran, and am aware of solutions like this, though I think installing R i386 on a separate VM containing a 32-bit OS defeats the purpose of what I am trying to accomplish.

Related

renv::restore() is nail-bitingly slow

I am trying out the renv package for the first time.
I took an existing project I manage with packrat and removed the .Rprofile and the packrat dir (I was happy starting from scratch).
I added a local work repo using options, and then ran renv::init(). This discovered what looks like a complete list of dependencies (138 CRAN packages, 10 work packages, and one github installed package)
I then copied that folder to another computer, changed RENV_PATHS_SOURCE to something globally available on that system, went to the project directory and started R. It told me the project was out of sync and asked me to run status. I did; it looked fine and reported a bunch of libraries that needed installing. Then I ran renv::restore().
It correctly listed all the dependencies, and then moved on to install them.
And this is really slow.
I sit here waiting, seeing new tarballs being listed as they are fetched, and it takes, at best, a whole minute to fetch each one, but more typically somewhere in the range of 1-2 minutes each.
Which is strange, because they are listed as being downloaded in 0.2-0.7 seconds each. Sniffing the network interfaces confirms this. There are brief bursts of packets coming in with each new tarball listed, which seem like they could match the reported times spent.
R cpu usage fluctuates between 0.0% and 1.0% during all this.
So what is renv doing?
So for a not particularly complicated project that pulls in 180 packages, I'd be waiting 3 hours just to fetch tarballs that should themselves only take 0.5 seconds each? This will become a problem. Personally, I'm holding off migrating from packrat until this is solved or understood.
But what is renv doing or waiting for? There is no network activity, no cpu usage, no iowait, as far as I can tell (looking at top output).

Checking an R-devel-linux error for my CRAN package using my MacBookPro

I have a CRAN package that has inputenc errors when creating vignette PDF outputs in a strict Latin1 locale. The check results have errors for flavor r-devel-linux-x86_64-debian-clang, which uses LANG=en_US.iso885915. I believe I may have fixed the problem (it was a warning on my Mac). However, I feel I should check the problem on Linux as well before submitting bug fixes to CRAN. It was suggested to me that, to check that my package fixes the error on LANG=en_US.iso885915, I should run the following command on Linux:
LANG=en_US.iso88591 R CMD check
I do not have access to Linux and will not for the reasonably foreseeable future. I am trying to figure out how to run this command on my macOS Catalina (version 10.15.3). With little experience running Linux on Mac, I have been doing searches online, with an example forum here. There are some negatives discussed there, including that running Linux on a Mac is usually unnecessary, that installing Linux on the latest MacBooks is restricted, and that virtual machines do not truly represent what Linux can do.
I decided it may be helpful for me to ask here on SO. I do not have too much storage left on my Mac and I would likely delete any Linux installations quickly since I will likely not need them outside of this one issue. I also do not have much experience installing virtual machines and Linux. I also hope to avoid any other risks I may not even be aware of to my computer.
What is the best way (convenient, low risk, quick, low storage requirements) you may know for someone in my position to check this recommended command (LANG=en_US.iso88591 R CMD check) with access to simply my MacBook Pro (Retina, 13-inch, Early 2015)? Thank you for sharing suggestions!

How to install R package binaries on Linux just like on Windows?

When I run install.packages("somepkg") on Linux (Ubuntu mostly), the installation process involves building the R package from source, which can be time consuming. It can also be prone to failure due to missing development-related Linux packages.
Is there a way to install compiled binaries like on Windows? I heard that it can be done, but couldn't find an easy to understand resource. Hope by asking here I will make the answer (if it exists) more googlable.
It depends on whether binaries exist. Which, in turn, depends on which Linux distro and version you are running.
For Ubuntu 18.04 (and later, as they are compatible), you can use the Rutter PPAs, which cover over four thousand CRAN packages. This is described (albeit very briefly) at the top of this README at CRAN.
I also blogged about that (a few times) below my r4/ tag -- and because it didn't really "stick" amplified it again with short video plus slides, see this post. The video runs for about 5 mins during which we install rstan and tidyverse as binaries each with just one command and it takes about a good minute each (depending on bandwidth and disk speed, of course) pulling all dependencies in pre-built and in a fail-safe manner.
If this matches your needs, give it a try and please come to the r-sig-debian list for questions.
If you are on a different Linux flavor, then I am unfortunately less sure whether a comparable service exists.
Edit on 2020-09-17 As this was just upvoted and I was thus reminded of it, you now have better options and can get Linux binaries via install.packages("pkgname"). One way is RSPM, the other is BSPM. I have a first comparison blog post comparing both here (even with animated gif movies ;-)) and should be able to say more about BSPM "soon".
Edit on 2022-08-03 And going beyond RSPM and BSPM is the newer r2u which has been up for a few months and is currently serving around two thousand binaries a day. It is the best approach for binaries on Ubuntu LTS installations (currently: 20.04 and 22.04). See r2u for more including demos.

When should I restart R session, GUI or computer?

I use R, Rstudio and Rcpp and I spent over a week debugging some code, that was just giving errors and warnings in unexpected places, in some cases with direct sample code from online or package documentation.
I often restart the R session or Rstudio if there are obvious problems and they usually go away.
But this morning it was really bad, to the point where basic R commands would fail and restarting R did nothing. I closed all the RStudio sessions and restarted the machine for good measure (which was unnecessary).
When it came back and I re-loaded the sessions everything seems to be working.
Even some Rcpp code I had been working on for weeks with outside packages will now compile and run, where it gave gibberish errors before.
I have known for a while that R needs to be restarted once in a while, but I only know it when basic functions don't run. How can I know earlier?
I am looking for a good general resource or function that can tell me I need to restart because something is not running right. It would be nice if I could also know what to restart: the R session, the GUI such as RStudio, all sessions and GUIs, or the full machine.
For as long as I have been dabbling with or actually using R (ie more than two decades), it has always been recommended to start a clean and fresh session.
Which is why I prefer to work on command-line for tests. When you invoke R, or Rscript, or, in my case, r (from littler) you know you get a fresh session free of possible side-effects. By keeping these tests to the command-line, my main sessions (often multiple instances inside Emacs via ESS, possibly multiple RStudio sessions too) are less affected.
Even RStudio defaults to 'install and restart' when you rebuild a package.
(I will note that a certain development package implies you could cleanly unload a package. That has been debated at length, and I think by now even its authors qualify that claim. I don't really know or care as I don't use it, having had established workflows before it appeared.)
And to add: You almost never need to restart the computer. But a fresh clean process is something to use often. Your computer can create millions of those for you.
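As a rough illustration of the fresh-process habit described above, here is a minimal shell sketch. It assumes Rscript is on the PATH; the helper name run_fresh and the object name my_test_object are made up for this example and are not part of R or littler.

```shell
#!/bin/sh
# Hypothetical helper: evaluate one R expression in a brand-new R process.
# --vanilla skips saved workspaces and profile files, so nothing from an
# earlier session can leak into the run.
run_fresh() {
    if command -v Rscript >/dev/null 2>&1; then
        Rscript --vanilla -e "$1"
    else
        echo "Rscript not found on PATH" >&2
        return 127
    fi
}

# Usage (each call starts and ends its own R process):
#   run_fresh 'my_test_object <- 1'
#   run_fresh 'print(exists("my_test_object"))'
```

The second call does not see the object created by the first, because each invocation is a separate process that exits when the expression has run.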

Create R Windows Binary from .tar.gz linux

This is sort of related to a previous post of mine. I have the need to use the bigmemory library on my 32bit Windows PC to do some ugly matrix calculations. Unfortunately, it appears that the maintainers have temporarily ceased production of Windows binaries. I have Ubuntu on my home PC. I would really like to take the .tar.gz file and build it into a Windows binary that I can actually run at work. I realize there are more efficient ways, like installing RTools on the Windows device. However, our IT keeps our admin rights on lockdown, so I can never edit my PATH enviro variable. Could anyone provide some general guidance for doing this? Are there any tools I need to install on my Ubuntu PC above and beyond R?
I found similar questions, but nothing that thoroughly answered my questions.
Unless the package source is incompatible with current versions of R, you could use the R project's win-builder site to build a Windows binary. Quoting from the linked site, win-builder is a service:
intended for useRs who do not have Windows available for checking and building Windows binary packages.
As a convenience, Hadley Wickham's devtools package includes a utility function, build_win(), that you can use for this purpose. From ?build_win:
Works by building source package, and then uploading to http://win-builder.r-project.org/. Once building is complete you'll receive a link to the built package in the email address listed in the maintainer field. It usually takes around 30 minutes.
Windows has four sets of environment variables (system, user, volatile and process sets). The first three sets are stored in the registry, but the process set is not. So even if the registry is locked down, it is typically still possible to set process environment variables (including the PATH) in a local process, i.e. on a temporary basis, so you might double-check your assumption that you can't modify anything. It's more likely that you can't modify the system variables and registry but can still modify the set in your local process. To check this from the Windows cmd line enter this:
set mytest=123
set mytest
and if the second line shows that mytest has the value 123 then you likely have all the permissions you need.
Furthermore anything you need to set is all handled automatically for you by R.bat in the batchfiles distribution so you don't have to set anything yourself.
Just ensure that Rtools and R are installed into the standard locations (you can tell them to skip the setting of any registry keys during the installation process), ensure R.bat is on your path or in current directory and run:
R.bat CMD INSTALL mypackage.tar.gz
without setting environment variables, registry keys or path.
If that does not work, try Rpathset.bat, also from the batchfiles. It is not automatic like R.bat but, on the other hand, is extremely flexible, since you modify the SET statements in it to whatever you want.
There is a PDF document that comes with the batchfiles which gives more info.
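The idea of a variable that lives only in one process, as with the process set described above, can also be sketched on Linux or macOS. The following is a POSIX shell analog, not Windows cmd syntax, and MYTEST is just an illustrative name.

```shell
#!/bin/sh
# POSIX analog of a process-scoped variable: the assignment prefix applies
# only to the child process started on that line.
unset MYTEST

# The child shell sees MYTEST=123 ...
child_value=$(MYTEST=123 sh -c 'printf "%s" "$MYTEST"')

# ... while the parent shell never had it set.
parent_value=${MYTEST:-unset}

echo "child sees:  $child_value"
echo "parent sees: $parent_value"
```

Nothing is written to any persistent configuration here, which is why this kind of temporary setting usually needs no special permissions.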

Resources