Issue with multithreading on Yosemite with R and Accelerate

My R installation on Mavericks was linked against the Accelerate BLAS library (instead of the standard R-provided one).
After a clean install of Yosemite I tried to link to Accelerate with
ln -sf /System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Versions/Current/libBLAS.dylib libRblas.dylib
(as suggested by Zachary Mayer here).
I think the link works (otherwise I would have got an error when calling the library functions during the benchmark).
However, when I ran the well-known R-benchmark-25 script I did not see any improvement.
On the same laptop I tried the new RRO (Revolution R Open), which includes Intel MKL, and the test went from ~30 s to ~5 s.
Is it possible that the Apple library on Yosemite no longer offers multithreading?
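Before concluding that Accelerate is single-threaded, it may help to confirm which BLAS the session actually loaded and whether a BLAS-heavy call keeps more than one core busy. The following is only a rough sketch: it assumes R >= 3.4.0 (where sessionInfo() reports the BLAS and LAPACK paths), and the matrix size is an arbitrary choice.

```r
# Report which BLAS/LAPACK this R session is using (fields exist in R >= 3.4.0).
si <- sessionInfo()
cat("BLAS:  ", si$BLAS, "\n")
cat("LAPACK:", si$LAPACK, "\n")

# Time a BLAS-heavy operation. With a multithreaded BLAS, "user" time
# (summed over threads) is typically larger than "elapsed" (wall clock).
set.seed(1)
n <- 1000                        # arbitrary size; increase for a clearer signal
A <- matrix(rnorm(n * n), n, n)
timing <- system.time(B <- crossprod(A))
print(timing)

# A ratio well above 1 suggests that several threads were busy.
thread_ratio <- timing[["user.self"]] / max(timing[["elapsed"]], 1e-6)
cat("user/elapsed ratio:", round(thread_ratio, 2), "\n")
```

If the reported BLAS path still points at the R-provided libRblas rather than vecLib, the symlink was probably not picked up.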

Related

Intel MKL FATAL ERROR: Cannot load libmkl_avx2.so or libmkl_def.so in R

I wrote an R package with Rcpp and RcppArmadillo and loaded it onto a supercomputer cluster running Unix. However, it produces the above error when I try to run one of its functions in an R instance on the cluster. Does anyone know how to solve this? The module loaded on the cluster is R-3.5.0.
I don't know the particulars of your HPC system but on "vanilla" machines it is easy to use MKL ... because, as frequently stated before, BLAS and LAPACK are an interface to which MKL adheres.
See for example the blog posts I wrote here about using MKL on Ubuntu
http://dirk.eddelbuettel.com/blog/2018/06/24#019_mkl_soon_in_debian
http://dirk.eddelbuettel.com/blog/2018/04/15#018_mkl_for_debian_ubuntu
and the GitHub repo with the script they reference
https://github.com/eddelbuettel/mkl4deb
So if I had to guess, I'd say that your dynamic linker only sees parts of the MKL library directory. And for what it is worth, I have heard of similar problems from other users on large research systems. Good luck!
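One way to test the guess about the dynamic linker is to look for the libraries named in the error in every directory on LD_LIBRARY_PATH. This is a minimal sketch, not a full diagnosis: find_on_ld_path is a made-up helper name, and it assumes the cluster's MKL module exposes its directories through LD_LIBRARY_PATH (it does not consult ld.so.conf or rpath entries).

```r
# Search every directory on LD_LIBRARY_PATH for the given library file names.
# Returns a named list: for each library, the directories that contain it.
find_on_ld_path <- function(libs, ld_path = Sys.getenv("LD_LIBRARY_PATH")) {
  dirs <- strsplit(ld_path, ":", fixed = TRUE)[[1]]
  dirs <- dirs[nzchar(dirs)]
  lapply(stats::setNames(libs, libs), function(lib) {
    dirs[file.exists(file.path(dirs, lib))]
  })
}

found <- find_on_ld_path(c("libmkl_avx2.so", "libmkl_def.so", "libmkl_rt.so"))
print(found)
```

An empty entry for libmkl_avx2.so or libmkl_def.so next to a non-empty one for libmkl_rt.so would be consistent with the linker seeing only part of the MKL directory.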

Can homebrew R and "standard" R for MacOS from CRAN coexist?

I am running R 3.6.1 on a Mac Mini running Sierra and a MacBook Pro running El Capitan. I normally get all the R packages that I need from CRAN or GitHub and use them without issues, but I am trying to install and use an R package (NicheMapR) that requires a Fortran compiler, and this is giving me issues. Even after installing gfortran, the R package still does not work (the Fortran code seems to be compiled but the package installation fails). The package developer suggested that installing R via Homebrew might solve the problem. On the contrary, my hunch is that it would lead to a world of pain, to quote Walter from The Big Lebowski. My questions are:
What is the advantage of a homebrew version of R for MacOSX over the "regular" version installed from CRAN?
Can the two versions coexist?
Is the homebrew version going to affect the regular one?
Finally: is homebrew going to help or will it simply open a whole new can of worms?
Many thanks in advance.
Yes, installing from homebrew is a recipe for pain. It is specifically recommended against by the official CRAN binary maintainer; see his remarks from March 2016 on r-sig-mac.
Regarding your questions, this can be summarized as:
What is the advantage of a homebrew version of R for MacOSX over the "regular" version installed from CRAN?
Positives: Select your own BLAS and easily work with geospatial tools.
Downsides: Always needing to compile each R package.
Can the two versions coexist?
Yes. The homebrew version installs into a different directory, but watch out for library collisions (see next question). You will also have to deal with symbolic links that determine which version of R is accessible from the console, and you will need to look into using RSwitch to switch between R versions.
Is the homebrew version going to affect the regular one?
Yes, if the library paths overlap. There will be problems with package installation and loading. Make sure to set up different library paths. To do so, please look at the .libPaths() documentation.
Finally: is homebrew going to help or will it simply open a whole new can of worms?
Yes and no. Unless you know what you're doing, opt for the CRAN version of R and its assorted goodies.
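To illustrate the advice about separate library paths, here is a small sketch built on .libPaths(). The helper name separate_lib, the flavour labels and the directory layout are all made up for the example; the only real API used is .libPaths(). The call below uses tempdir() so that nothing is written to the home directory.

```r
# Create a library directory that is unique per R flavour and R version,
# then put it first on the library search path.
# "flavour" is a hypothetical label such as "cran" or "homebrew".
separate_lib <- function(flavour,
                         base_dir = file.path(path.expand("~"), "R-libs")) {
  ver <- paste(R.version$major, sub("\\..*$", "", R.version$minor), sep = ".")
  lib <- file.path(base_dir, paste0(flavour, "-", ver))
  dir.create(lib, recursive = TRUE, showWarnings = FALSE)
  .libPaths(c(lib, .libPaths()))   # prepend so installs go here first
  invisible(lib)
}

lib <- separate_lib("homebrew", base_dir = tempdir())
print(.libPaths()[1])
```

In practice a call like this would go into the Rprofile of each installation, with a different label for each, so that packages compiled by one build are never loaded by the other.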

Parallel execution of randomForestSRC on macOS

I have a problem with parallel estimation of a random survival forest from the randomForestSRC package. I followed this guide and tried installing it on a Mac (Sierra). However, the rfsrc() function still runs on a single thread. Could you please advise what to do in order to achieve parallel execution, as the function takes ages to compute on a larger dataset? I followed the steps described in the tutorial exactly, with no success.
Thanks in advance!
The guide noted in your question is from 2013 and the process for successful OpenMP parallel execution has been significantly streamlined since then. In fact, the binaries available on CRAN for the current build (2.5.1) should run in parallel on Sierra. The source code includes a ready-made configure file that is the result of the autoconf command. Thus, parallel execution is the default behaviour now. If you haven't yet upgraded to the latest build, I would recommend doing so. If the binary build provided by CRAN still does not switch on parallel execution, I would recommend upgrading your compiler to GCC using Homebrew or another package manager, and then appropriately create and massage a Makevars file as given in the instructions on our GitHub page so as to allow the CRAN package installation process to pick up the GCC compiler instead of the default Clang compiler:
https://kogalur.github.io/randomForestSRC/building.html
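Once an OpenMP-enabled build is installed, the number of threads can be set through the rf.cores option described in the package documentation. The snippet below is a sketch of that: leaving one core free and using ntree = 100 are arbitrary choices, and the fit only runs if randomForestSRC is installed.

```r
# Ask randomForestSRC to use all but one core for its OpenMP threads.
n_cores <- parallel::detectCores()
if (is.na(n_cores)) n_cores <- 1L
options(rf.cores = max(1L, n_cores - 1L))
cat("rf.cores set to", getOption("rf.cores"), "\n")

# Fit a small survival forest on the veteran data shipped with the package.
if (requireNamespace("randomForestSRC", quietly = TRUE)) {
  data(veteran, package = "randomForestSRC")
  fit <- randomForestSRC::rfsrc(Surv(time, status) ~ ., data = veteran,
                                ntree = 100)
  print(fit)
}
```

Watching a CPU monitor while a larger fit runs is a simple way to see whether more than one core is actually used.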

Linking Intel's Math Kernel Library (MKL) to R on Windows

Using an alternative BLAS for R has several advantages, see e.g. https://cran.r-project.org/web/packages/gcbd/vignettes/gcbd.pdf.
Microsoft R Open https://mran.revolutionanalytics.com/documents/rro/installation/#sysreq is using Intel's MKL instead of the default Reference BLAS to speed up calculations.
My question is:
What would be the exact steps to link Intel's MKL library manually to R's most recent version on Windows (https://cran.r-project.org/bin/windows/base/)?
UPDATE 20-07-2016:
Here is a very detailed description of how to build an OpenBLAS-based Rblas.dll for 64-bit R for Windows for R ≥ 3.3.0: http://www.avrahamadler.com/r-tips/build-openblas-for-windows-r64/
An easier solution than recompiling R against the Intel MKL libraries on Windows is to:
Install Microsoft R Open from https://mran.microsoft.com/download, which comes with the outdated R version 3.5.3 but also with the Intel MKL multithreaded BLAS libraries.
Install the latest version of R from https://cran.r-project.org/bin/windows/base/, i.e. currently R 3.6.2.
Copy the files libiomp5md.dll, Rblas.dll and Rlapack.dll from C:\Program Files\Microsoft\R Open\R-3.5.3\bin\x64 to C:\Program Files\R\R-3.6.2\bin\x64 (you can back up your existing default non-multithreaded Rblas.dll and Rlapack.dll files first if you like).
Copy the Microsoft R Open libraries/packages MicrosoftR, RevoIOQ, RevoMods, RevoUtils, RevoUtilsMath and doParallel from C:\Program Files\Microsoft\R Open\R-3.5.3\library to your default package directory, e.g. C:\Documents\R\win-library\3.6.
Copy the files Rprofile.site and Renviron.site from the directory C:\Program Files\Microsoft\R Open\R-3.5.3\etc to C:\Program Files\R\R-3.6.2\etc.
Replace line 24 in the file Rprofile.site, options(repos=r), with options(repos="https://cran.rstudio.com") (or your favourite CRAN repository; you can also use "https://cran.revolutionanalytics.com", the MRO repository that has the latest daily builds of all packages). This makes sure that the latest CRAN packages get installed rather than the outdated versions on the mran.microsoft.com mirror, which are frozen at the 15th of April 2019. Also comment out lines 153, 154 and 155 with a #.
Then restart RStudio to check that it works, with a small SVD benchmark on my Intel Core i7-4700HQ 2.4GHz 4 core/8 thread laptop:
getMKLthreads()
4
# Singular Value Decomposition
m <- 10000
n <- 2000
A <- matrix (runif (m*n),m,n)
system.time (S <- svd (A,nu=0,nv=0))
user system elapsed
15.20 0.64 4.17
That same benchmark without Intel MKL installed ran at
user system elapsed
35.11 0.10 35.21
so we get a >8 fold speed increase here!
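For anyone who wants to check the arithmetic, the two ratios of interest can be computed directly from the timings quoted above. The interpretation of (user + system) / elapsed as the average number of busy threads is a rule of thumb, not an exact measurement.

```r
# Timings copied from the benchmark output above (seconds).
mkl <- c(user = 15.20, system = 0.64, elapsed = 4.17)
ref <- c(user = 35.11, system = 0.10, elapsed = 35.21)

# Wall-clock speedup of the MKL run over the reference BLAS run.
speedup <- ref[["elapsed"]] / mkl[["elapsed"]]

# CPU time divided by wall-clock time: roughly how many threads were busy.
busy_threads <- (mkl[["user"]] + mkl[["system"]]) / mkl[["elapsed"]]

cat("speedup:", round(speedup, 1), "\n")            # about 8.4
cat("busy threads:", round(busy_threads, 1), "\n")  # about 3.8
```

A value close to 4 busy threads matches the 4 threads reported by getMKLthreads(), while the speedup above 8 shows that MKL is also faster per thread than the reference BLAS.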
[Screenshot: Microsoft R Open 6.2 with Intel MKL up and running]
Alternatively, if you don't like copying files from MRO to your latest R installation, you can also copy the files from the free Intel MKL installation to your R installation to get multithreaded operation (as outlined in the other answer below):
Install Intel MKL from https://software.intel.com/en-us/mkl/choose-download (free)
Copy all the contents from inside these folders
C:\Program Files (x86)\IntelSWTools\compilers_and_libraries\windows\redist\intel64\mkl
C:\Program Files (x86)\IntelSWTools\compilers_and_libraries\windows\redist\intel64\compiler
to
C:\Program Files\R\R-3.6.1\bin\x64
Inside the destination folder, create 2 copies of mkl_rt.dll and rename one of them Rblas.dll and the other Rlapack.dll replacing the originals and also keeping mkl_rt.dll.
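The renaming step can also be scripted. The function below is a rough sketch of it under a few assumptions: use_mkl_rt is a made-up name, the *_orig.dll backup names are my own choice, and it has to be run from a process that does not have the target DLLs loaded (for example a different R installation, with write access to the folder), since Windows will not overwrite a DLL that is in use.

```r
# Replace Rblas.dll and Rlapack.dll in bin_dir with copies of mkl_rt.dll.
# The originals are kept as Rblas_orig.dll / Rlapack_orig.dll (assumed names),
# and mkl_rt.dll itself is left in place.
use_mkl_rt <- function(bin_dir) {
  mkl <- file.path(bin_dir, "mkl_rt.dll")
  stopifnot(file.exists(mkl))
  for (target in c("Rblas.dll", "Rlapack.dll")) {
    dst <- file.path(bin_dir, target)
    bak <- file.path(bin_dir, sub("\\.dll$", "_orig.dll", target))
    # Back up the original only once, so a second run keeps the true original.
    if (file.exists(dst) && !file.exists(bak)) file.copy(dst, bak)
    file.copy(mkl, dst, overwrite = TRUE)
  }
  invisible(list.files(bin_dir, pattern = "\\.dll$"))
}

# Example call (path taken from the steps above; not run here):
# use_mkl_rt("C:/Program Files/R/R-3.6.1/bin/x64")
```

Keeping the backups makes it easy to go back to the reference BLAS if a package turns out not to work with MKL.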
This will not give you the setMKLthreads() and getMKLthreads() functions for setting the number of Intel MKL threads, though, as these come with the MRO package RevoUtilsMath. For most people the default number of threads, equal to the number of physical cores, will be fine.
Not sure what's up with Microsoft, why they are no longer updating MRO, and why they also dropped Mac OS X support.
I hope that, given that Intel MKL is free now, the R core people will sooner or later provide a precompiled R version that uses the Intel MKL libraries, or possibly detects at run time whether Intel MKL is installed and uses it if so. I think this is important, especially since the easy availability of a good multithreaded BLAS also determines how one would develop packages. If a good multithreaded BLAS were available on all OSes, one would veer towards RcppArmadillo, which falls back on whatever BLAS is installed (but on Windows gives drastically worse timings if Intel MKL is not installed). If not, RcppEigen would be the best option, as it has its own multithreaded matrix algebra, irrespective of the BLAS against which R is compiled.
On Ubuntu btw it's very easy to make R use Intel MKL, without having to recompile R, as outlined here: https://github.com/eddelbuettel/mkl4deb
PS Slight problem is that running setMKLthreads(4) will crash RStudio (that was already a problem in the official MRO 3.5.3 though) but it does work OK in the R console...
I was able to link R 3.6.0 with custom DLLs that you create using the MKL builder. Basically you have to export the same symbols that Rblas.dll and Rlapack.dll do. Start the Compiler 19.0 Update 4 for Intel 64 Visual Studio 2017 environment command prompt:
Get the symbols:
dumpbin /exports Rblas.dll > Rblas_list
dumpbin /exports Rlapack.dll > Rlapack_list_R
Edit both files, deleting the "header" and "footer", and reduce every line with a symbol name (e.g. 248 F7 00138CE0 dgeevx_) to just the name (dgeevx_).
Copy the builder directory to somewhere on your PC and inside it run:
# blas links fine
nmake libintel64 export=..path..\Rblas_list name=Rblas
# save lapack errors in another list
nmake libintel64 export=..path..\Rlapack_list_R name=Rlapack 1> undefined_symbols_list
Edit undefined_symbols_list, keeping only the names in each line, and create a new list with the difference:
findstr /v /g:undefined_symbols_list Rlapack_list_R > Rlapack_list
nmake libintel64 export=..path..\Rlapack_list name=Rlapack
With dumpbin /dependents Rlapack.dll, you can see that they depend on libiomp5md.dll, which you can find inside the redist folder in mkl installation.
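The manual editing of the dumpbin export lists described above could probably be automated. Here is a sketch in R: extract_symbols is a made-up helper, and the regular expression assumes that each export line has the form "ordinal hint RVA name", as in the dgeevx_ example, with an 8-digit hexadecimal RVA. Header and footer lines do not match that pattern and are dropped.

```r
# Keep only the symbol names from the output of `dumpbin /exports`.
# Assumed line format: <ordinal> <hint> <8-hex-digit RVA> <name>
extract_symbols <- function(lines) {
  pat <- "^\\s*[0-9]+\\s+[0-9A-Fa-f]+\\s+[0-9A-Fa-f]{8}\\s+(\\S+).*$"
  hits <- grepl(pat, lines, perl = TRUE)
  sub(pat, "\\1", lines[hits], perl = TRUE)
}

# Small illustrative sample; the second symbol line is invented for the demo.
sample_lines <- c(
  "    ordinal hint RVA      name",
  "",
  "        248   F7 00138CE0 dgeevx_",
  "        249   F8 00139A10 dgehd2_",
  "  Summary"
)
print(extract_symbols(sample_lines))

# Typical use (file names from the steps above; not run here):
# writeLines(extract_symbols(readLines("Rblas_list")), "Rblas_list")
```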
Method 2
This method uses more disk space, but it's simpler. Copy all the contents from inside these folders
C:\Program Files (x86)\IntelSWTools\compilers_and_libraries\windows\redist\intel64\mkl
C:\Program Files (x86)\IntelSWTools\compilers_and_libraries\windows\redist\intel64\compiler
to
C:\Program Files\R\R-3.6.1\bin\x64
Inside the destination folder, create 2 copies of mkl_rt.dll and rename one of them Rblas.dll and the other Rlapack.dll replacing the originals and also keeping mkl_rt.dll.
I just tried this with an R 3.5.1 installation. I installed Microsoft R Open alongside CRAN R, copied libiomp5md.dll, and overwrote Rblas.dll and Rlapack.dll with their MRO MKL counterparts to link MKL to CRAN R on Windows (similar to another answer above, but you need to copy the file libiomp5md.dll as well). This worked out fine, and CRAN R runs as fast as MRO according to the version.compare package on GitHub (https://github.com/andrie/version.compare).
The solution presented by Tom Wenseleers worked for me with the latest R version. Thank you.
I wanted to add something to this discussion, as it's related, and I was unsure how else to share it with the wider community. Please forgive my descriptions; I am an amateur.
This workaround seems to break at least two other R packages, igraph and clusterProfiler. clusterProfiler depends on igraph, so the root cause is igraph. Other igraph-dependent packages are likely affected as well.
I am posting this because I found a simple workaround, and after extensive searching I never found this addressed explicitly on any forum, and this may help someone else.
for reference,
clusterProfiler_4.4.4 and igraph_1.3.4
R version 4.2.1, Platform: x86_64-w64-mingw32/x64 (64-bit)
Windows 10 x64 (build 22000)
Calling library(igraph) after implementing the above workaround yielded the following error in a popup:
rsession-utf8.exe Entry Point Not Found
The procedure entry point quadmath_snprintf could not be located in the dynamic link library C:\Program Files\R\R-4.2.1\library\igraph\libs\x64\igraph.dll.
Pressing OK yields an error message in R:
Error: package or namespace load failed for ‘igraph’ in inDL(x, as.logical(local), as.logical(now), ...):
unable to load shared object 'C:/Program Files/R/R-4.2.1/library/igraph/libs/x64/igraph.dll':
LoadLibrary failure: The specified procedure could not be found.
The workaround: when setting up MKL in R, keep the original Rlapack.dll and Rblas.dll (I simply renamed them, e.g. to Rblas_orig.dll). To use igraph or dependent packages, swap the .dll filenames so that the two original R files have their original names and the MKL files are renamed (e.g. Rblas_mkl.dll). Restart R, and igraph/clusterProfiler load fine.
Unfortunately, this disables MKL until you revert the .dll filenames and restart R. It is frustrating, but it works as long as you don't need igraph and MKL at the same time.
If anyone finds a better solution please let me know.

Compiling R package in parallel on multiple platforms

I am developing a new R package using Rcpp.
We have reached the point where compile times become significantly long.
So I was wondering how to compile an R package in parallel.
We develop on Linux, OSX and Windows for maximum compatibility, and so far I have only been able to answer my question for Linux (sudo MAKE="make -j8" R CMD INSTALL package).
Can someone tell me how to do the same thing on a Windows and OSX system?
thanks
Cedric
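One approach that is often suggested, though not confirmed anywhere in this thread, is to set the MAKEFLAGS environment variable from within R before installing, since R CMD INSTALL drives make on all three platforms. The sketch below assumes that the make in use (GNU make on Linux and OSX, the Rtools make on Windows) honours -j; the package path in the comment is a placeholder.

```r
# Use one make job per available core; fall back to 1 if detection fails.
n_jobs <- parallel::detectCores()
if (is.na(n_jobs)) n_jobs <- 1L

# make reads MAKEFLAGS from the environment, so child processes started by
# install.packages() or R CMD INSTALL should pick up the -j setting.
Sys.setenv(MAKEFLAGS = paste0("-j", n_jobs))
cat("MAKEFLAGS =", Sys.getenv("MAKEFLAGS"), "\n")

# Then install from source as usual (placeholder path; not run here):
# install.packages("path/to/package", repos = NULL, type = "source")
```

The same setting can be made permanent by putting a MAKEFLAGS line into the user's Makevars file instead of calling Sys.setenv() each time.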
