PyCaret methods on GPU/TPU - jupyter-notebook

When I run best_model = compare_models(), there is a huge load on the CPU and memory while my GPU sits unused. How do I run setup() or compare_models() on the GPU?
Is there a built-in method in PyCaret?

Only some models can run on GPU, and they must be properly installed to use GPU. For example, for xgboost, you must install it with pip and have CUDA 10+ installed (or install a GPU xgboost version from anaconda, etc). Here is the list of estimators that can use GPU and their requirements: https://pycaret.readthedocs.io/en/latest/installation.html?highlight=gpu#pycaret-on-gpu
As Yatin said, you need to use use_gpu=True in setup(). Or you can specify it when creating an individual model, like xgboost_gpu = create_model('xgboost', fold=3, tree_method='gpu_hist', gpu_id=0).
For installing CUDA, I like using Anaconda since it makes it easy, e.g. conda install -c anaconda cudatoolkit. It looks like for the non-boosted estimators, you need to install cuML for GPU use.
Also, it looks like PyCaret can't use tune-sklearn with a GPU (see the warnings at the bottom of the tune_model section of the docs).

To use the GPU in PyCaret, simply pass use_gpu=True as a parameter to the setup function.
Example:
model = setup(data, target_variable, use_gpu=True)

Related

How to use parallel make when compiling R packages?

When using the make tool directly, one can use the -j option to build in parallel.
How can I use parallel build when installing an R package using install.packages()? make is invoked by R, not by me, so I can't pass the -j option to it. Setting export MAKE_FLAGS=-j4 before starting R did not work. I am looking to set up parallel build permanently for my R installation.
The options(Ncpus=8) route is one way. In install.packages() you have Ncpus = getOption("Ncpus") and that option is described as
Ncpus: the number of parallel processes to use for a parallel
install of more than one source package. Values greater than
one are supported if the ‘make’ command specified by
‘Sys.getenv("MAKE", "make")’ accepts argument ‘-k -j Ncpus’.
I don't see it listed for update.packages(), but it functions the same way on my Linux machines, so builds generally happen in parallel.
In short, this uses parallel builds of multiple packages, as opposed to make -j ... when just building one package. I tried that route too but found the gains less compelling.
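To make that setting permanent, one option (the file location and core count here are just examples) is an ~/.Rprofile entry:

```r
## In ~/.Rprofile (or Rprofile.site), so every session installs
## multiple source packages in parallel; adjust the count to taste.
options(Ncpus = 8)
```

For parallel make within a single package, the analogous knob is the MAKE environment variable (e.g. MAKE="make -j8" R CMD INSTALL pkg), since R invokes whatever Sys.getenv("MAKE", "make") returns.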

can't run bartMachine parallel

The bartMachine package for R is supposed to rely on parallel processing to reduce computing time, but I can't figure out how to make it work: the package documentation repeatedly says it supports parallel processing, yet there are no instructions on how to enable it, and I can see that only one of my PC's logical cores is working.
I use Ubuntu 16.04.4 and I tried installing bartMachine by compiling from source, as recommended by its GitHub page, though I'm not sure I did everything correctly.
What can I do to finally make bartMachine work in parallel?
Have you tried running set_bart_machine_num_cores(num_cores) in R before running bartMachine? This did the trick for me.
See https://rdrr.io/cran/bartMachine/man/set_bart_machine_num_cores.html
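A minimal sketch of that order of operations (the heap size, core count, and X_train/y_train data below are placeholders, not prescriptions):

```r
## rJava reads java.parameters only when it loads, so set the heap
## size first if you need more than the default.
options(java.parameters = "-Xmx4g")
library(bartMachine)
set_bart_machine_num_cores(4)                 # call before building any model
bm <- bartMachine(X = X_train, y = y_train)   # your own predictors/response
```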

Why can R be linked to a shared BLAS later even if it was built with `--with-blas = lblas`?

The BLAS section in the R Installation and Administration manual says that when R is built from source with the configuration option --without-blas, it builds Netlib's reference BLAS into a standalone shared library at R_HOME/lib/libRblas.so, alongside the standard R shared library R_HOME/lib/libR.so. This makes it easier for users to switch between and benchmark different tuned BLAS implementations in the R environment. The guide suggests that researchers might use a symbolic link to libRblas.so to achieve this, and this article gives more details.
On the contrary, when simply installing a pre-compiled binary version of R, either from a CRAN mirror or from Ubuntu's repository (for Linux users like me), it should in theory be harder to switch between different BLAS implementations without rebuilding R, because a pre-compiled R is configured with --with-blas = (some BLAS library). We can easily check this, either by reading the configuration file at R_HOME/etc/Makeconf or by checking the result of R CMD config BLAS_LIBS. For example, on my machine it returns -lblas, so R was linked against the reference BLAS at build time. As a result, there is no R_HOME/lib/libRblas.so, only R_HOME/lib/libR.so.
However, this R blog says that it is possible to switch between different BLAS implementations even if R was not installed from source. The author tried ATLAS and OpenBLAS from Ubuntu's repository and then used update-alternatives --config to switch between them. It is also possible to configure and install a tuned BLAS from source, add it to the "alternatives" through update-alternatives --install, and later switch between them in the same way. The BLAS library (a symbolic link) in this case is found at /usr/lib/libblas.so.3, which is on both Ubuntu's and R's LD_LIBRARY_PATH. I have tested this and it does work! But I am very surprised at how R achieves it. As I said, R should have been tied to the BLAS library configured at build time, i.e., I would expect all BLAS routines to be integrated into R_HOME/lib/libR.so. So why is it still possible to change BLAS via /usr/lib/libblas.so.3?
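(A short note on why this can work at all: linking with -lblas produces a dynamic link, so libR.so merely records a dependency on the soname libblas.so.3; the runtime loader resolves that soname through the symlink at load time, and whatever library the symlink points to gets used.) The update-alternatives workflow described above looks roughly like this; the OpenBLAS path and the priority value 40 are examples for a self-built library, adjust to your system:

```sh
# Register a self-built BLAS as an alternative for /usr/lib/libblas.so.3
sudo update-alternatives --install /usr/lib/libblas.so.3 libblas.so.3 \
    /opt/OpenBLAS/lib/libopenblas.so 40
# Interactively choose among the registered BLAS implementations
sudo update-alternatives --config libblas.so.3
# Check which library the symlink currently resolves to
readlink -f /usr/lib/libblas.so.3
```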
Thanks if someone can please explain this.

fast install package during development with multiarch

I'm working on a package "xyz" that uses Rcpp with several cpp files.
When I'm only updating the R code, I would like to run R CMD INSTALL xyz on the package directory without having to recompile all the shared libraries that haven't changed. That works fine if I specify the --no-multiarch flag: the source directory src gets populated the first time with the compiled objects, and if the sources don't change they are re-used the next time. With multiarch on, however, R decides to make two copies of src, src-i386 and src-x86_64. It seems to confuse R CMD INSTALL which always re-runs all the compilation. Is there any workaround?
(I'm aware that there are alternative ways, e.g. devtools::load_all, but I'd rather stick to R CMD INSTALL if possible.)
The platform is MacOS 10.7, and I have the latest version of R.
I have a partial answer for you. One really easy speed-up is provided by ccache, which you can enable for all R compilation (e.g. via R CMD whatever, thereby also covering inline, attributes, RStudio use, ...) globally through ~/.R/Makevars:
edd@max:~$ tail -10 .R/Makevars
VER=4.6
CC=ccache gcc-$(VER)
CXX=ccache g++-$(VER)
SHLIB_CXXLD=g++-$(VER)
FC=ccache gfortran
F77=ccache gfortran
MAKE=make -j8
edd@max:~$
It takes care of all caching of compilation units.
Now, that does not explicitly address the --no-multiarch aspect, which I don't play with much, as we are still mostly 'single arch' on Linux. This will change eventually, but hasn't yet. Still, I suspect that by letting the compiler handle the caching, you will get the same net effect.
Other aspects can be controlled too, e.g. ~/.R/check.Renviron can be used to turn certain tests on or off. I tend to keep 'em all on -- better to waste a few seconds here than to get a nastygram from Vienna.

Kernel compilation for Click modular router

I am trying to install the Click modular router in kernel mode. For this I need to patch and compile a custom kernel. I am presently running Ubuntu on kernel 2.6.22.14, and I am trying to compile kernel 2.6.24 from kernel.org.
I patched the downloaded kernel with the Click patch and used the /boot/config file of my present kernel to configure the new kernel via make oldconfig. I then did make modules_install and finally make install.
The kernel compiled fine and boots nicely. However, when I try to insert the Click kernel module, it fails.
I am guessing I need to make some changes in the config file before compiling. Please help.
I think the better way is to try a patchless installation of Click in kernel mode. Patchless installation works on modern kernel versions; for example, you can use Debian 6.0 Squeeze (kernel 2.6.32) or Debian Wheezy (kernel 3.2). I checked, and it works.
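A rough sketch of the patchless route (the configure flags are the ones documented in Click's install notes; the kernel build directory and the config file name are examples for your own setup):

```sh
# Build Click's kernel module against the running kernel's headers
./configure --enable-linuxmodule --with-linux=/lib/modules/$(uname -r)/build
make
sudo make install
# Load the module and install a router configuration
sudo click-install my-router.click
```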
If you get an error like this on Wheezy:
=========================================
Can't find include/linux/skbuff.h in /lib/modules/3.2.0-4-686-pae/build.
Are you sure /lib/modules/3.2.0-4-686-pae/build contains Linux kernel source?
=========================================
You might need to apply that hack: https://github.com/kohler/click/issues/104