I need to limit the number of threads running my neural network using these instructions here: https://github.com/keras-team/keras/issues/4740.
However, I am using keras in R, and I am not sure how to access the TensorFlow implementation used by the keras that I load in R with
library("keras")
I can call library(tensorflow), but doesn't that load a copy of the library unrelated to the one loaded by keras? I cannot find any functionality in R that lets me load the TensorFlow backend associated with keras in RStudio, and I cannot find any links to anyone doing the same.
Can someone suggest a way to do the operations in the link from R, given keras loaded with library("keras")? (In the link, the TensorFlow backend for keras is used to set the number of threads per core.) It would also be good to know how to check which version is loaded into R by keras.
A little background to understand my problem:
My company uses a private server to put our ML models into production using opencpu. The ML models, which were generated using Caret, are usually written into an R package, that does data preprocessing before passing to the model. The R package, along with opencpu, and the R package dependencies are compiled using Docker into a Docker container, that is then deployed onto the server. I don't understand the deployment process, but that's not my job. My job is to come up with the ML models, create the R package and make sure it works (in our test environment) before it goes to production.
Recently I developed a model using Keras/Tensorflow in R, and want to test this model within our test environment (which mimics the production environment). This means that I need to include the Keras/Tensorflow model inside an R package, similar to the Caret version.
I want to know how I can do this without having to install Keras as a dependency. The only point of using Keras is to load the model using the load_model_hdf5 function, with the prediction being done by the base R predict function. Personally I think it is overkill to install such a large package (Keras in R installs Tensorflow, conda and a python environment as well) just to load a model.
This page (https://cran.r-project.org/web/packages/tfdeploy/vignettes/introduction.html) describes methods for deploying Tensorflow models in R, but it only discusses using RStudio Connect, CloudML and TensorFlow Serving (and the latter uses gRPC).
I'm running Keras in R and using Tensorflow-GPU backend. Is it possible to force Keras to run on CPU without re-installing the backend?
Let me give you 2 answers.
Answer #1 (normal answer)
No, unfortunately not. For keras CPU and GPU are 2 different versions, from which you select at install time.
It seems you remember that you selected GPU at install time. I guess you're hoping that you were only setting a minor option, not selecting a version of the program. Unfortunately, you were selecting the version of keras to install.
Answer #2 (ok, maybe you can "trick" keras)
It seems you can use environment variable values to trick keras into thinking that your CPU is your GPU.
This seems like it may have unexpected results, but it seemed to work for these Python users.
I wouldn't worry about the fact that they are using Python. They are just using their language to set environment variables. So you can do the same in R or directly within your OS.
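A minimal sketch of the same idea in R. The linked Python posts are not quoted here, so the variable is an assumption: CUDA_VISIBLE_DEVICES is the one commonly used to hide GPUs from TensorFlow, and setting it to "-1" is assumed to leave only the CPU visible. It has to be set before the backend is initialized, so it goes before library(keras).

```r
# Assumption: CUDA_VISIBLE_DEVICES = "-1" hides every GPU from TensorFlow,
# so the GPU build falls back to the CPU. Set it BEFORE loading keras.
Sys.setenv(CUDA_VISIBLE_DEVICES = "-1")

# Read the value back to confirm what libraries loaded later will see.
visible <- Sys.getenv("CUDA_VISIBLE_DEVICES")
print(visible)

# Load keras only if it is installed, so the snippet runs anywhere.
if (requireNamespace("keras", quietly = TRUE)) {
  library(keras)
}
```

Setting the variable in the shell before starting R should have the same effect, since R only passes the environment on to the backend.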
Imagine that I open two R sessions.
In the first (R1) I loaded the package dplyr.
Now, my question is: is there a way to get the sessionInfo/packages loaded in R1 from R2?
UPDATE:
I am writing an R help system for the Atom editor. Atom does not currently support R's function help, so I am creating it. To find the help for a function, you need to search the packages where that function lives, and the best way to do that is to know which packages are loaded in your current R session. That is my difficulty. One solution is to ignore the loaded packages and search all installed packages, but that is too slow if you have a lot of packages installed.
So in my R script I have a line with this code:
pkg <- .packages() # all packages loaded in this currently session
But when I run this script (R1) from another script (R2), it does not get the packages loaded in the current script R2, but those of script R1.
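To make the trade-off concrete, here is a small sketch in base R of the two search scopes mentioned above. The helper name find_in_attached is hypothetical and only for illustration; it checks the packages attached in the session that runs it, which is exactly why it cannot see another session's packages.

```r
# Packages attached in *this* R session (a character vector).
attached <- .packages()

# Every package installed in the library paths; usually a much longer list.
installed <- .packages(all.available = TRUE)

# Hypothetical helper: which attached packages export an object with this name?
find_in_attached <- function(fn_name, pkgs = .packages()) {
  hits <- vapply(
    pkgs,
    function(p) fn_name %in% getNamespaceExports(p),
    logical(1)
  )
  pkgs[hits]
}

# Look up a function from a package that is attached by default.
print(find_in_attached("median"))
```

Searching only the attached packages keeps the lookup short, at the cost of missing functions from packages that are installed but not attached.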
Use the Services API to interact with Hydrogen
The following details interacting with other packages in atom: http://flight-manual.atom.io/behind-atom/sections/interacting-with-other-packages-via-services/
Hydrogen is an interface to a Jupyter kernel. It maintains the session with the kernel, and it currently has a plugin API which you could use to get the connection information for the backing kernel: https://nteract.gitbooks.io/hydrogen/docs/PluginAPI.html. Using that, you could send your call to .packages().
There is also r-exec, but I believe that's Mac only. In that case, you could get the
Thanks in advance for your input. I am a newbie to ML.
I've developed an R model (using RStudio on my local machine) and want to deploy it on a Hadoop cluster that has RStudio installed. I want to use SparkR to leverage high-performance computing. I just want to understand the role of SparkR here.
Will SparkR enable the R model to run the algorithm within Spark ML on the Hadoop Cluster?
OR
Will SparkR enable only the data processing and still the ML algorithm will run within the context of R on the Hadoop Cluster?
Appreciate your input.
These are general questions, but they actually have a very simple & straightforward answer: no (to both); SparkR will do neither.
From the Overview section of the SparkR docs:
SparkR is an R package that provides a light-weight frontend to use Apache Spark from R.
SparkR cannot even read native R models.
The idea behind using SparkR for ML tasks is that you develop your model specifically in SparkR (and if you try, you'll also discover that it is much more limited in comparison to the plethora of models available in R through the various packages).
Even conveniences like, say, confusionMatrix from the caret package, are not available, since they operate on R dataframes and not on Spark ones (see this question & answer).
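As a rough sketch of what "developing the model in SparkR" looks like, the snippet below fits a GLM with Spark's own ML library on a Spark DataFrame. It assumes the SparkR package is installed, SPARK_HOME points at a Spark installation, and a local master is acceptable; the helper name fit_spark_glm and the use of the built-in faithful dataset are illustrative choices, not part of the original question.

```r
# Hypothetical helper: fit and apply a model entirely inside Spark via SparkR.
fit_spark_glm <- function() {
  # Assumption: a local Spark master is enough for the demonstration.
  SparkR::sparkR.session(master = "local[*]")
  on.exit(SparkR::sparkR.session.stop())

  # Convert a native R data.frame into a Spark DataFrame.
  df <- SparkR::createDataFrame(faithful)

  # The model is trained by Spark ML, not by an R modelling package.
  model <- SparkR::spark.glm(df, waiting ~ eruptions, family = "gaussian")

  # Predictions are a Spark DataFrame too, so they stay in Spark.
  preds <- SparkR::predict(model, df)
  SparkR::showDF(preds, 5)
  invisible(NULL)
}

# Run only where SparkR and a Spark installation are available.
if (requireNamespace("SparkR", quietly = TRUE) && nzchar(Sys.getenv("SPARK_HOME"))) {
  fit_spark_glm()
}
```

Note that nothing here loads a model trained with a regular R package; the data and the fitted model both live on the Spark side.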
I have saved a trained model (a deep net, but I think the question is more general) in H2O. Now I want to load it in another instance of H2O and use it for scoring, but the problem is that the version of H2O used for training (3.10.0.3) is different from the one I started the production cluster with (3.10.0.6). The error message is quite self-explanatory:
ERROR MESSAGE:
Found version 3.10.0.3, but running version 3.10.0.6
Is there a way to migrate the saved model between versions? Or am I stuck with using the same version of H2O for training and scoring?
Yes, you are stuck using the same version for training and scoring. No migration route.
(You can export a model as a POJO, which can be bundled with the version of h2o-genmodel.jar that it needs. But that requires writing Java code to get the data in and results out, which is not ideal if you are using R code for data preparation.)
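For the R side of that workaround, a minimal sketch of the export step might look like this. It assumes the h2o R package is installed and that model is a model object trained in the cluster running the training version; the helper name export_pojo is hypothetical. The scoring itself would still have to be written in Java against the exported files.

```r
# Hypothetical helper: export a trained H2O model as a POJO.
# Assumption: `model` was trained in a running H2O cluster via the h2o package.
export_pojo <- function(model, out_dir = tempdir()) {
  # h2o.download_pojo is expected to write the model's Java source into
  # out_dir and, by default, to fetch the matching h2o-genmodel.jar as well.
  h2o::h2o.download_pojo(model, path = out_dir)
  invisible(out_dir)
}
```

Calling export_pojo(model) from the training cluster keeps the model and its matching h2o-genmodel.jar together, which is what makes the POJO independent of the version of the production cluster.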
This has been discussed on the h2o-stream mailing list before, but I couldn't see a feature request ticket for it, so I just created one: https://0xdata.atlassian.net/browse/PUBDEV-3432