I'm trying to install the doMPI package in R.
Apparently there are no binaries available for the 3.x version?
Do I need to build it from source?
http://cran.r-project.org/web/packages/doMPI/
The goal is to run parallel processing with caret on a Windows machine.
CRAN doesn't build binaries of doMPI for Mac OS X or Windows because it depends on the Rmpi package, and it doesn't build binaries for Rmpi because it depends on MPI libraries which don't come by default on those platforms. Some people have suggested that I declare Rmpi to be a suggested package to work around this issue, but doMPI really does depend on Rmpi, so that always seemed like an odd thing to do. The way I see it, if you're able to build Rmpi from source, you'll have no trouble building doMPI from source.
So yes, you have to build it from source, but the bigger problem is to build Rmpi from source, unless you're using a Linux distribution like Debian that distributes both Rmpi and doMPI as binary deb packages.
But if you just want to run caret in parallel on a Windows machine, the normal solution is to use the doParallel package using a PSOCK cluster. People have trouble with that as well, but at least installation of the packages is easy since there are binary packages available for doParallel on CRAN.
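A minimal sketch of the doParallel approach with a PSOCK cluster, assuming doParallel is installed from CRAN. The worker count of 2 and the toy foreach loop are illustrative choices, not part of the original answer, and the caret call is left commented out because it needs caret and a model package:

```r
library(doParallel)  # also attaches foreach and parallel

# Start a PSOCK (socket) cluster; this cluster type works on Windows.
cl <- makePSOCKcluster(2)  # 2 workers is an arbitrary choice for illustration
registerDoParallel(cl)

# Any foreach loop using %dopar% now runs on the registered workers.
squares <- foreach(i = 1:4, .combine = c) %dopar% i^2
print(squares)

# caret::train() uses foreach internally, so once a backend is registered,
# resampling can run in parallel (trainControl(allowParallel = TRUE) is the default):
# fit <- caret::train(Species ~ ., data = iris, method = "rf")

# Shut the workers down and fall back to sequential execution.
stopCluster(cl)
registerDoSEQ()
```

Stopping the cluster when you are done matters on Windows in particular, since each worker is a separate R process that otherwise keeps running.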
In Debian, there are some compiled R packages in the official repositories. But one could also install an R package from source.
I am interested to know why a user would prefer one method of installation to the other.
It's sometimes preferable to compile the sources on your own server rather than use an existing executable file.
This is because the compiler builds the executable specifically for your machine: it knows which processor you have and can optimise for it, so the result may run faster.
I already provided a somewhat detailed answer in response to this SO question.
As an update, these days you even have lots of packages prebuilt thanks to updated cran2deb initiatives:
On Ubuntu you now have almost all CRAN packages prebuilt via Michael Rutter's 'cran2deb for ubuntu' ppa on Launchpad.
For Debian, Don Armstrong now provides a similar service (also covering BioConductor and OmegaHat) at debian-r.debian.net.
The idea of pre-compiled R packages for Debian/Ubuntu borrows from Windows and MacOS. Those OSes have pre-compiled packages since they typically don't have the standard tools in standard locations for building packages from source (C and Fortran compilers, LaTeX, Perl, etc.).
If there is a new release of a package on CRAN, is the pre-compiled package in the Debian repos automatically updated? I believe you are better off syncing with CRAN. Check out the package ctv to help you manage large collections of R packages ("CRAN views"), both for installing and updating.
You need root privileges to install a pre-compiled package from the OS repos, while any regular user may install any package using install.packages() in R (but if you are the sysadmin, I recommend running sudo R to install CRAN views, so as to make them available system-wide instead of inflating your ~/).
One inconvenience of source packages is that if you fetch many of them, compiling adds to the installation time (depending on your machine). You might gain in performance from compiling, but it is not guaranteed to be noticeable.
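To make the points above concrete, here is a small sketch of where install.packages() puts packages for a regular user, and how ctv could be used for task views. The installation calls are commented out because they need network access, and "MachineLearning" is only an example view name:

```r
# Libraries R searches, in order; install.packages() targets the first one by default.
lib_paths <- .libPaths()
print(lib_paths)

# Per-user library location that R sets at startup (the directory may not exist yet).
user_lib <- Sys.getenv("R_LIBS_USER")
print(user_lib)

# As a regular user (no root needed), install into the per-user library:
# dir.create(user_lib, recursive = TRUE, showWarnings = FALSE)
# install.packages("ctv", lib = user_lib)

# Install or update a whole CRAN task view; "MachineLearning" is just an example.
# ctv::install.views("MachineLearning")
# ctv::update.views("MachineLearning")
```

Running the same calls from a root R session would instead write to the first writable site-wide library, which is what makes the packages available to all users.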
I have some complicated code in R that uses the neuralnet library for some computations.
Unfortunately, I'm new to R and I have less than a week to obtain some results using the existing code, which takes quite a while on the processors I have at my disposal.
My idea is to run the code with Microsoft R Open (MRO), which could accelerate the computations, but I haven't been able to install the neuralnet library via Anaconda (I prefer Anaconda because it's simple and makes it easy to create environments). The installation stays at "Solving environment" forever.
Is there a way to install these libraries so that they are compatible with MRO on Anaconda? Should I give up on using Anaconda for this task?
We have established a simple local CRAN-like repository for R packages. There are many users, all of whom use the same version of Linux.
Is there a way of convincing R to provide pre-compiled Linux packages instead of just source ones? The compilation step takes a considerable amount of time for anyone using our repository. It should be possible to precompile and reuse the same binaries, since we can guarantee that the Linux version is consistent for all users.
How could one hack something like this together?
In the very narrow sense of "all of whom use the same version of Linux" you actually have an option (one that happens to be relatively little known). Create binary packages using e.g.
R CMD INSTALL --build nameOfDirectoryWithSources
As R CMD INSTALL --help says:
--build build binaries of the installed package(s)
These are not like .deb or .rpm packages: no dependency information or the like is added. But they do exactly what you ask for: save on compilation time.
I am not aware of a repository structure one can build out of this, though.
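As a rough sketch of how the build step could be scripted from R: `build_binary` below is a hypothetical helper (not part of R) that wraps the same R CMD INSTALL --build command. It assumes the resulting tarball lands in the current working directory and that users then install that file directly; the tarball name in the last comment is a placeholder.

```r
# Build a binary tarball from a package source directory.
# `build_binary` is a hypothetical helper name; `lib` is the library the
# package gets installed into as a side effect of --build.
build_binary <- function(src_dir, lib = .libPaths()[1]) {
  r_bin <- file.path(R.home("bin"), "R")
  args <- c("CMD", "INSTALL", "--build",
            paste0("--library=", shQuote(lib)),
            shQuote(src_dir))
  status <- system2(r_bin, args)
  if (status != 0) {
    stop("R CMD INSTALL --build failed with status ", status)
  }
  invisible(status)
}

# Usage (paths are placeholders):
# build_binary("nameOfDirectoryWithSources")
# A user on an identical system could then install the resulting file with:
# install.packages("path/to/built-tarball.tar.gz", repos = NULL)
```

Since no dependency information is checked at this step, the dependencies still have to be installed separately on each machine.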
I have a problem with parallel estimation of a random survival forest using the randomForestSRC package. I followed this guide and tried installing it on Mac (Sierra). However, the rfsrc() function still runs on a single thread. Could you please advise what to do in order to achieve parallel execution, as the function takes ages to compute on a larger dataset? I followed the steps described in the tutorial exactly, with no success.
Thanks in advance!
The guide noted in your question is from 2013 and the process for successful OpenMP parallel execution has been significantly streamlined since then. In fact, the binaries available on CRAN for the current build (2.5.1) should run in parallel on Sierra. The source code includes a ready-made configure file that is the result of the autoconf command. Thus, parallel execution is the default behaviour now. If you haven't yet upgraded to the latest build, I would recommend doing so. If the binary build provided by CRAN still does not switch on parallel execution, I would recommend installing GCC using Homebrew or another package manager, and then creating and adjusting a Makevars file as described in the instructions on our GitHub page, so that the CRAN package installation process picks up GCC instead of the default Clang compiler:
https://kogalur.github.io/randomForestSRC/building.html
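Once a parallel-capable build is installed, a quick check might look like the sketch below. It assumes, based on my reading of the package documentation rather than this answer, that the `rf.cores` option sets the number of OpenMP threads and `mc.cores` the R-side cores; the veteran data set and `ntree = 100` are illustrative choices.

```r
library(parallel)

# Number of cores to request; fall back to 1 if detection fails.
n_cores <- detectCores()
if (is.na(n_cores)) n_cores <- 1L

# Assumption: randomForestSRC reads these options to size its parallel execution.
options(rf.cores = n_cores, mc.cores = n_cores)

# Only fit a model if the package is actually installed.
if (requireNamespace("randomForestSRC", quietly = TRUE)) {
  library(randomForestSRC)
  data(veteran, package = "randomForestSRC")
  fit <- rfsrc(Surv(time, status) ~ ., data = veteran, ntree = 100)
  print(fit)
}
```

Watching CPU usage in a system monitor while rfsrc() runs on a larger data set is a simple way to see whether more than one thread is in use.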
I am developing a new R package using Rcpp.
We have reached the point where compile times have become significantly long.
So I was wondering how to compile an R package in parallel.
We develop on Linux, OSX and Windows for maximum compatibility, and so far I have only been able to answer my question for Linux (sudo MAKE="make -j8" R CMD INSTALL package).
Can someone tell me how to do the same thing on a Windows and OSX system?
Thanks
Cedric
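One possible platform-independent variant of the Linux command above is to set the MAKEFLAGS environment variable from within R before installing, since GNU make reads it from the environment. This is a sketch under that assumption and has not been confirmed on Windows or OSX in this thread; `with_parallel_make` is a hypothetical helper name, not part of R.

```r
# Evaluate `code` with MAKEFLAGS set to -j<jobs>, then restore the old value.
# `with_parallel_make` is a hypothetical helper, not an R function.
with_parallel_make <- function(jobs, code) {
  old <- Sys.getenv("MAKEFLAGS", unset = NA)
  Sys.setenv(MAKEFLAGS = sprintf("-j%d", as.integer(jobs)))
  on.exit({
    if (is.na(old)) Sys.unsetenv("MAKEFLAGS") else Sys.setenv(MAKEFLAGS = old)
  }, add = TRUE)
  force(code)  # `code` is evaluated lazily, i.e. only after MAKEFLAGS is set
}

# Usage (the package path is a placeholder):
# with_parallel_make(8,
#   install.packages("path/to/package", repos = NULL, type = "source"))
```

Note that this parallelises compilation within a single package; the `Ncpus` argument of install.packages() is a separate setting that installs several packages in parallel.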