JRI program terminates Java during Rengine creation

I want to call an R script from Java and am trying the JRI approach. However, my JVM gets terminated when creating an Rengine.
I am running one of the examples that ships with the rJava/JRI installation in R.
Code I have tried
Version 1:
`Rengine re=new Rengine(args, true, new TextConsole2());`
Version 2:
`Rengine re=new Rengine(args, false, new TextConsole());`
Version 3:
`Rengine re = new Rengine(new String[] { "--vanilla" }, false, null);`
All three terminate the JVM while executing.
I am using STS 3.9.1 and have set the following variables before running the Java program:
java.library.path - pointing to the rJava/JRI DLLs
R_HOME - pointing to R.exe
PATH - extended with the paths of the rJava DLLs and R.exe
I am using R 3.5.0 and followed all the steps in the Study Trails (R and Java) and Mavlarn (R and Java) tutorials, but I am still facing the same issue.
What could I be doing wrong?
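As a quick sanity check on those paths, a small R sketch (assuming a standard Windows rJava/JRI install; note that R_HOME conventionally points to the R installation directory, not to R.exe itself):
R.home()                                # directory to use for R_HOME
system.file("jri", package = "rJava")   # directory containing jri.dll, for java.library.path
R.home("bin")                           # directory containing R.dll, to add to PATH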


Azure Machine Learning integration of R: Should the 'azureml' module have an attribute 'core'?

I'm having issues with Azure Machine Learning SDK for R: "module 'azureml' has no attribute 'core'"...
For reasons beyond my control, I have to use Azure ML to apply machine learning (my own code, written in R) to data from our data warehouse that lands in blob storage. The modelled output should go back into blob storage so it can be accessed from the data warehouse.
I've written the code in R on my local machine (stored in a git repo). Preferably, I'd find some method to pull my code from git into a pipeline in the azureml environment so that it can be directly run whenever new data is available in the blob storage.
I've embarked on a tutorial-spree and found this seemingly relevant walkthrough: Train and deploy your first model with Azure ML (and this one).
But after trying everything I could think of, I'm stuck on the first steps. After installing all the required packages, modules, and apps (or so I think) and running the following code in RStudio:
library(azuremlsdk)
existing_ws <- get_workspace(name = name,
                             subscription_id = subscription_id,
                             resource_group = resource_group)
I run into an error that I haven't been able to fix:
AttributeError: module 'azureml' has no attribute 'core'
It seems that azureml is supposed to have an attribute "core", but on closer inspection there is indeed no such attribute.
The function get_workspace() is trying to access azureml$core$Workspace$get.
I found that azureml$Workspace does exist, but I can't figure out how to make that work.
Can anyone explain why I'm encountering this error?
Does anyone know of a better tutorial on how to connect my R code to Azure ML's cloud service?
Any pointers in the right direction are much appreciated!
EDITS - still not solved:
After advice from others, I double, triple and quadruple checked the installation.
I updated R and I'm now running:
R.version
platform x86_64-w64-mingw32
arch x86_64
os mingw32
system x86_64, mingw32
status
major 3
minor 6.2
year 2019
month 12
day 12
svn rev 77560
language R
version.string R version 3.6.2 (2019-12-12)
nickname Dark and Stormy Night
I installed Conda with Python 3.6.10.
I installed the azuremlsdk R package (I tried both provided options).
I then realized that there are some inconsistencies between the versions of the azure modules, so I also tried installing it with the INSTALL_opts flag "--no-multiarch":
remotes::install_cran('azuremlsdk', repos = 'http://cran.us.r-project.org', INSTALL_opts = c("--no-multiarch"))
Then, I installed the azureml python sdk.
I had a look at all the versions again (using python -m pip freeze):
azure-common==1.1.24
azure-graphrbac==0.61.1
azure-mgmt-authorization==0.60.0
azure-mgmt-containerregistry==2.8.0
azure-mgmt-keyvault==2.0.0
azure-mgmt-resource==7.0.0
azure-mgmt-storage==7.1.0
azureml==0.2.7
azureml-automl-core==1.0.83.1
azureml-core==1.0.69
azureml-dataprep==1.1.36
azureml-dataprep-native==13.2.0
azureml-pipeline==1.0.69
azureml-pipeline-core==1.0.69
azureml-pipeline-steps==1.0.69
azureml-sdk==1.0.69
azureml-telemetry==1.0.69
azureml-train==1.0.69
azureml-train-automl-client==1.0.83
azureml-train-core==1.0.69
azureml-train-restclients-hyperdrive==1.0.69
Surprised to see all the 1.0.69 versions instead of the 1.0.83 versions, I re-installed the azureml Python SDK using:
azuremlsdk::install_azureml(version = "1.0.83")
This worked, in the sense that all the azureml SDK packages are indeed now at 1.0.83:
azure-common==1.1.24
azure-graphrbac==0.61.1
azure-mgmt-authorization==0.60.0
azure-mgmt-containerregistry==2.8.0
azure-mgmt-keyvault==2.0.0
azure-mgmt-resource==7.0.0
azure-mgmt-storage==7.1.0
azureml==0.2.7
azureml-automl-core==1.0.83.1
azureml-core==1.0.83
azureml-dataprep==1.1.36
azureml-dataprep-native==13.2.0
azureml-pipeline==1.0.83
azureml-pipeline-core==1.0.83
azureml-pipeline-steps==1.0.83
azureml-sdk==1.0.83
azureml-telemetry==1.0.83
azureml-train==1.0.83
azureml-train-automl-client==1.0.83
azureml-train-core==1.0.83
azureml-train-restclients-hyperdrive==1.0.83
But I still get the error about the missing core attribute. I get it both when running:
library(azuremlsdk)
get_current_run()
and also when running:
library(azuremlsdk)
existing_ws <- get_workspace(name = name,
                             subscription_id = subscription_id,
                             resource_group = resource_group)
Note that the first time I run this code after starting up RStudio, I get the error:
Error in py_get_attr_impl(x, name, silent) :
AttributeError: module 'azureml' has no attribute '_base_sdk_common'
And every time after that I get this error:
Error in py_get_attr_impl(x, name, silent) :
AttributeError: module 'azureml' has no attribute 'core'
Any help would be much appreciated!
This issue was introduced by the reticulate 1.14 release, in which reticulate creates a default r-reticulate conda environment. Since Azure ML was installing the Python SDK into an environment named r-azureml, the r-reticulate environment used by reticulate was missing the Python SDK. A fix for this has been merged into master. Please install from GitHub for now if you have reticulate 1.14 and are running into this issue; we will be releasing an update to CRAN shortly.
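A hedged sketch of that interim workaround from R (the repository name is an assumption about where the azuremlsdk sources live):
# Install the patched azuremlsdk from GitHub until the fix reaches CRAN.
# Repository name is an assumption.
remotes::install_github("Azure/azureml-sdk-for-r")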
I seem to have fixed the issue by explicitly installing the Python packages azureml AND azureml.core:
python -m pip install azureml
and then...
python -m pip install azureml.core
I did this for the Conda environment that is called by R (r-reticulate). It's a bit odd not to be able to use the Conda environment 'r-azureml' without R switching back to 'r-reticulate', but at least I no longer get the "module 'azureml' has no attribute 'core'" error.
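Roughly the same fix can be driven from R via reticulate instead of calling pip directly; a minimal sketch, assuming the environment is the r-reticulate one mentioned above:
# Install the missing Python packages into the conda environment reticulate
# actually uses ("r-reticulate"), mirroring the pip commands above.
library(reticulate)
py_install(c("azureml", "azureml.core"), envname = "r-reticulate", pip = TRUE)
py_module_available("azureml.core")   # should now return TRUE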

Using RStudio with self-compiled R

How can I get RStudio to recognize my version of R, which I compiled myself (make install) and installed to
/opt/R/3.4.3/
with a symlink ln -s /opt/R/${R_VERSION}/bin/R /bin/R? When executed from a shell, R works just fine. Only RStudio does not recognize the different path and is still looking at:
/usr/local/lib64/R/bin/exec/R
exact error message:
Feb 3 14:50:18 devbox systemd: Starting RStudio Server...
Feb 3 14:50:18 devbox systemd: Started RStudio Server.
Feb 3 14:50:18 devbox rserver[22411]: ERROR R did not return any output when queried for directory location information; LOGGED FROM: bool rstudio::core::r_util::<unnamed>::detectRLocationsUsingR(const std::string&, rstudio::core::FilePath*, rstudio::core::FilePath*, rstudio::core::config_utils::Variables*, std::string*) /root/rstudio/src/cpp/core/r_util/REnvironmentPosix.cpp:483
Feb 3 14:50:18 devbox rserver[22411]: ERROR system error 71 (Protocol error) [description=Unable to parse version from R, version-info=, r-error=/usr/local/lib64/R/bin/exec/R: error while loading shared libraries: libmkl_gf_lp64.so: cannot open shared object file: No such file or directory|||]; OCCURRED AT: rstudio::core::Error rstudio::core::r_util::rVersion(const rstudio::core::FilePath&, const rstudio::core::FilePath&, const std::string&, std::string*) /root/rstudio/src/cpp/core/r_util/REnvironmentPosix.cpp:784; LOGGED FROM: bool rstudio::core::r_util::detectREnvironment(const rstudio::core::FilePath&, const rstudio::core::FilePath&, const std::string&, std::string*, std::string*, rstudio::core::r_util::EnvironmentVars*, std::string*) /root/rstudio/src/cpp/core/r_util/REnvironmentPosix.cpp:678
I realized (see answer below) that R only works as long as I do not lose the current bash environment. Executing:
source /opt/intel/mkl/bin/mklvars.sh intel64
fixes this. However, I can't get RStudio to execute this before starting up. I played around with ExecStartPre=/opt/intel/mkl/bin/mklvars.sh intel64, but it fails to set up the environment correctly.
On Linux, RStudio Desktop and Open-Source Server use the version of R pointed to by the output of which R. If RStudio is unable to locate R using which R, it will fall back to scanning explicitly for the R script in the /usr/local/bin and /usr/bin directories.
If you want to override which version of R is used then you can set the RSTUDIO_WHICH_R environment variable to the R executable that you want to run against. For example:
export RSTUDIO_WHICH_R=/usr/local/bin/R
See RStudio Support: Using Different Versions of R
I need to manually load
source /opt/intel/mkl/bin/mklvars.sh intel64
into the environment for R to work; otherwise the MKL shared libraries cannot be found, R won't start, and RStudio complains (with a not entirely helpful error message).
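A small diagnostic sketch, run from whichever R session RStudio manages to launch, to check whether the MKL libraries are actually visible (the library path is an assumption based on the /opt/intel/mkl install above):
# Compare the loader path this session sees with the one a shell has after
# sourcing mklvars.sh; the exact .so path below is an assumption.
Sys.getenv("LD_LIBRARY_PATH")
file.exists("/opt/intel/mkl/lib/intel64/libmkl_gf_lp64.so")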

External Scripting and R (Kognitio)

I have created the R script environment (using the command create script environment RSCRIPT command '/usr/local/R/bin/Rscript --vanilla --slave') and tried running an R script, but it fails with the error message below.
ERROR: RS 10 S 332659 R 31A004F LO:Script stderr: external script vfork child: No such file or directory
Is it because of the lines below, which I am using in the script?
mydata <- read.csv(file=file("stdin"), header=TRUE)
if (nrow(mydata) > 0){
I am not sure what it is expecting.
I have one more question to ask:
1) Do we need to install R on our Unix box? If not, does the Kognitio package provide it?
I suspect the problem here is that you have not installed the R environment on ALL the database nodes in your system - it must be installed on every DB node involved in processing (as explained in chapter 10 of the Kognitio Guide which you can download from http://www.kognitio.com/forums/viewtopic.php?t=3) or you will see errors like "external script vfork child: No such file or directory".
You would normally use a remote deployment tool (e.g. HP's RDP) to ensure the installation was identical on all DB nodes. Alternatively, you can leverage the Kognitio wxsync tool to synchronise files across nodes.
Section 10.6 of the Kognitio Guide also explains how to constrain which DB nodes are involved in processing - this is appropriate if your script environment should not run on all nodes for some reason (e.g. it has an expensive per-node/per-core licence). That does not seem appropriate for using R though.

Starting Rserve in debug mode and printing variables from Tableau to R

I can't start Rserve in debug mode.
I wrote these commands in R:
library(Rserve)
Rserve(debug=T, args="RS-enable-control", quote=T, port = 6311)
library(RSclient)
c=RSconnect(host = "localhost", port = 6311)
RSeval(c, "xx<-12")
RSeval(c, "2+6")
RSeval(c, "xx")
RSclose(c)
install.packages("fpc")
I placed Rserve_d.exe in the same directory where R.dll is located. But when I launch it and then launch Tableau with the Rserve connection, I can't see anything in the debug console, just these few lines:
Rserve 1.7-3 () (C)Copyright 2002-2013 Simon Urbanek
$Id$
Loading config file Rserv.cfg
Failed to find config file Rserv.cfg
Rserve: Ok, ready to answer queries.
-create_server(port = 6311, socket = <NULL>, mode = 0, flags = 0x4000)
INFO: adding server 000000000030AEE0 (total 1 servers)
I tried another approach with the command Rserve(TRUE) in R, but I can't see the transactions between R and Tableau in the RStudio console either.
I then wanted to print the value of the variable passed from Tableau to R inside the R-script function, using print(.arg1), but nothing appears in the R console, even though print works fine when I run it directly in the R console.
According to this article, Rserve should be run with the following command to enable debugging:
R CMD Rserve_d
An alternative is to use the write.csv command within the calculated field that calls an R script, as suggested by this FAQ document from Tableau.
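A hedged sketch of that write.csv trick, placed inside the R expression the calculated field sends to Rserve (the output path is an assumption; .arg1 is the first argument Tableau passes):
# Dump the incoming argument to a file so it can be inspected without a
# visible console; adjust the (assumed) path to taste.
write.csv(data.frame(arg1 = .arg1), file = "C:/temp/tableau_arg1.csv", row.names = FALSE)
mean(.arg1)   # whatever value the calculated field is meant to return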
Starting Rserve_d.exe from the command line works. Most likely you have multiple instances of Rserve running, and Tableau is sending requests to one that is not the Rserve_d running in the command line.
Did you try killing all Rserve processes and then starting Rserve_d from the command line?
If you don't want to run from the command line, you can try starting Rserve in-process from RStudio by typing run.Rserve(), then using print() statements in your Tableau calculated fields for the things you want to print.
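A minimal sketch of that in-process approach (port 6311 matches the question; run.Rserve() blocks the session while it serves requests):
# Serve Tableau's requests from the current R session so that print() calls
# in the calculated fields appear in this console.
library(Rserve)
run.Rserve(port = 6311)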
In the R bin directory you have two executables: Rserve for normal execution and Rserve.dbg for debug execution. Use
R CMD Rserve.dbg
My OS is CentOS 7 and I am using the R installation from Anaconda. If your Rserve debug executable has a different name, use that instead.

rbundler build error: "cannot open file 'startup.Rs': No such file or directory"

I'm running into an issue when building the following package: https://github.com/yoni/rbundler
My test attempts to run rbundler's bundle command on a trivial package which has a single dependency. The test passes on my OSX machine, but fails on my x86_64-redhat-linux-gnu Jenkins server. Both machines are running R 2.15.1 with devtools 0.7.1, which includes this bug fix.
The full test output can be found in this gist.
Here's a short summary of error I'm seeing:
Error in file(filename, "r", encoding = encoding) :
cannot open the connection
Calls: local ... eval.parent -> eval -> eval -> eval -> eval -> source -> file
In addition: Warning message:
In file(filename, "r", encoding = encoding) :
cannot open file 'startup.Rs': No such file or directory
Execution halted
The background for this is that I'm trying to build a dependency management system for R. The idea is that an R project should be able to run without using system-wide or user-wide libraries. Rather, the R project will have its own library installed under its root directory.
For my previous Stack Overflow question related to Dependency Management in R, see Dependency management in R
In my case this issue was caused by the environment variable R_TESTS, which was set to startup.Rs.
When you execute another R process from within your tests (in my case it was submitted via OGS qsub), the presence of this environment variable causes issues.
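A hedged sketch of working around that from the parent test process (the child command is only an illustration):
# Clear R_TESTS before spawning a child R process from a test run, so the
# child does not try to source a non-existent startup.Rs.
Sys.unsetenv("R_TESTS")                  # or: Sys.setenv(R_TESTS = "")
system("Rscript -e 'sessionInfo()'")     # example child process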
I can't answer your question directly, but here are two things you can try to get more information about what is happening:
use env to dump the environment variables on your OSX machine and on the Jenkins host
run the process through strace on Linux and dtruss on OSX to trace the system calls
strace/dtruss should reveal the places where the process searches for startup.Rs, and the env output will likely show an environment variable that differs between the systems, accounting for the different outcome.
