Running h2o in R script in Azure machine learning - r

I know how to access packages in R scripts in Azure machine learning by either using the Azure supported ones or by zipping up the packages.
My problem now is that Azure machine learning does not support the h2o package and when I tried using the zipped file - it gave an error.
Has anyone figured out how to use h2o in R in Azure machine learning?

Since there was no reply to my question, I did some research and came up with the following:
H2O cannot be run in a straightforward manner in Azure machine learning embedded R scripts. A workaround is to use an Azure environment created specially for H2O. The options available are:
Spinning up an H2O Artificial Intelligence Virtual Machine solution
Using an H2O application for HDInsight
For more reading, you can go to: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/azure.html
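Once an H2O cluster is running on one of these options, an R session can attach to it as a client instead of starting H2O locally. This is only a minimal sketch: the IP address is a made-up placeholder, and 54321 is H2O's default port, so adjust both to your setup.

```r
library(h2o)

# Placeholder values (assumptions): replace with your cluster's address.
cluster_ip   <- "10.0.0.4"
cluster_port <- 54321

# startH2O = FALSE tells the client to attach to an existing cluster
# rather than trying to launch a local H2O instance.
h2o.init(ip = cluster_ip, port = cluster_port, startH2O = FALSE)

# Print basic information about the cluster we attached to.
h2o.clusterInfo()
```

This is meant for an R session you control (for example on the VM itself), not as a fix for the embedded R script module.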

Related

Notebook instance running R with a GPU

I am new to cloud computing and GCP. I am trying to create a notebook instance running R with a GPU. I got a basic instance with 1 core and 0 GPUs to start and I was able to execute some code which was cool. When I try to create an instance with a GPU I keep getting all sorts of errors about something called live migration, or that there are no resources available, etc. Can someone tell me how to start an R notebook instance with a GPU? It can't be this difficult.
CRAN (the Comprehensive R Archive Network) doesn't support GPUs. However, this link might help you set up a notebook instance running R with a GPU. You need a machine with the Nvidia GPU drivers installed; then install R and JupyterLab. After that, compile the R packages that require it for use with GPUs.

R will not run after latest windows 10 updates

I have updated Windows and now R will not run, and hence neither will RStudio. When I run the R GUI it just freezes and is unresponsive. I have allowed a Chromium exemption in the firewall.
I am on the Windows Insider program and have just updated to
Windows 10 Home, Insider Preview
Evaluation Copy. Build 20190.rs_prerelease.200807-1609
Note that the R GUI freezes and then shuts down on its own, so maybe the problem is the R GUI and not RStudio.
I get the following errors in RStudio.
This site can’t be reached
127.0.0.1 refused to connect.
Try:
Checking the connection
Checking the proxy and the firewall
ERR_CONNECTION_REFUSED
Cannot Connect to R
RStudio can't establish a connection to R. This usually indicates one of the following:
The R session is taking an unusually long time to start, perhaps because of slow operations in startup scripts or slow network drive access.
RStudio is unable to communicate with R over a local network port, possibly because of firewall restrictions or anti-virus software.
Please try the following:
If you've customized R session creation by creating an R profile (e.g. located at ~/.Rprofile), consider temporarily removing it.
If you are using a firewall or antivirus software which guards access to local network ports, add an exclusion for the RStudio and rsession executables.
Run RGui, R.app, or R in a terminal to ensure that R itself starts up correctly.
Further troubleshooting help can be found on our website:
Troubleshooting RStudio Startup
This has been fixed in Windows 10 Insider Preview Build 20201 (released on August 26, 2020 in the Dev channel). The previous two builds were missing 64-bit APIs required by the prebuilt version of R.
Same issue.
Rolling back to the previous version solves the problem.
I think it is about the update of the graphic features of Windows.
Here is what Microsoft said in the build 20190 changelog:
Improved Graphics Settings experience
While this isn’t a new feature all together, we have made significant changes based on customer feedback that will benefit our customers’ Graphics Settings experience. We have made the following improvements:
We’ve updated the Graphics Settings to allow users to specify a default high performance GPU.
We’ve updated the Graphics Settings to allow users to pick a specific GPU on a per application basis.

Difference between using RStudio on a virtual machine and Rstudio on RServer

I am new to R and I am working with a dataset that has more than 5 million observations. So I thought that it would be a good idea to use RStudio on a virtual machine instead of using it on my local machine.
I am reading the documentation about virtual machines and R Server, but it is still not clear to me whether I have to use Microsoft R Server to create a VM and then just install RStudio as I would on my local machine, or whether I can create a generic VM and then install RStudio. Which is the correct way? Why?
If both of these options are possible, which one is the best?
Please help me. Sorry for my confusion.
You can do either. If you are using Azure (which I think you are given that you mention Microsoft R Server), there is also the Data Science VM, which will come preinstalled with RStudio and many other useful programs.
R Server is more for production workloads with R, so unless you are planning that, you could probably stick with the Data Science VM. If you end up choosing this option, you can connect directly to an RStudio instance on the R Server from the Azure portal.

h2o implementation in R

I am learning the h2o package now.
I installed the h2o package from CRAN and couldn't run this code:
## To import small iris data file from H2O's package
irisPath = system.file("extdata", "iris.csv", package="h2o")
iris.hex = h2o.importFile(localH2O, path = irisPath, key = "iris.hex")
I am getting the below error,
Error in h2o.importFile(localH2O, path = irisPath, key = "iris.hex") :
unused argument (key = "iris.hex")
My second question is: are there good resources for learning h2o in R apart from this one?
http://h2o-release.s3.amazonaws.com/h2o/rel-lambert/5/docs-website/Ruser/rtutorial.html
My third question is: how does h2o work, in simple words?
The reason this code no longer works is that there was a breaking API change from H2O 2.0 to H2O 3.0 back in 2015. The docs you have discovered (probably via a Google search) are for a very old version of H2O 2.0. The up-to-date docs can always be found at http://docs.h2o.ai/h2o/latest-stable/h2o-docs/index.html
Answering your error question:
H2O changed a bit from this documentation. Reading the iris file works as follows:
iris.hex = h2o.importFile(path = irisPath, destination_frame = "iris.hex")
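For context, here is the corrected call as a complete script. It is only a sketch and assumes a local H2O 3.x installation with Java available; `h2o.init()` starts or connects to a local instance, and the import no longer takes the `localH2O` connection object from the old tutorial as its first argument.

```r
library(h2o)

# Start (or connect to) a local H2O instance.
h2o.init()

# Path to the iris file bundled with the h2o package.
irisPath <- system.file("extdata", "iris.csv", package = "h2o")

# `destination_frame` is the H2O 3.x argument that replaces `key`.
iris.hex <- h2o.importFile(path = irisPath, destination_frame = "iris.hex")

# Quick checks that the frame exists in the cluster.
dim(iris.hex)
head(iris.hex)
```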
Your second (and third) questions are against SO rules, but below you will find a short list of resources:
H2O training materials: go to the h2o.ai website and then to the general documentation. You can find all the material presented at H2O World 2015 there. There is also a link to H2O University.
Check their blog. There are some gold nuggets in there.
Read the booklets they have available on GBM, GLM, Deep Learning. They contain examples in R and Python.
Kaggle. Search the scripts / kernels for h2o.
As for your third question, read their "Why H2O" pages.
It is a little hard to explain fully here how H2O works. In a nutshell, H2O is an open-source, enterprise-ready machine intelligence engine that is accessible from popular machine learning languages such as R and Python, as well as from the programming languages Java and Scala. Enterprise-ready means that users can distribute execution across multiple machines when the data is extremely large. The Java-based core engine has built-in implementations of multiple algorithms, and every language interface goes through an interpreter to the H2O core engine, which may be a distributed cluster, to build models and score results. There is a lot in between, so I would suggest visiting the link below to learn more about the H2O architecture and how it is executed from the various supported languages:
http://docs.h2o.ai/h2o/latest-stable/h2o-docs/architecture.html
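To make the client/engine split concrete, here is a small sketch of R acting as a front end while the H2O engine holds the data and trains the model. It assumes a local H2O 3.x installation; the frame name `iris_demo` and the settings (`ntrees = 20`, the 80/20 split) are arbitrary choices for illustration.

```r
library(h2o)

# Start (or connect to) the Java-based H2O engine from R.
h2o.init()

# Copy R's built-in iris data frame into the H2O cluster.
iris_h2o <- as.h2o(iris, destination_frame = "iris_demo")

# Split inside the cluster; the data does not come back to R.
splits <- h2o.splitFrame(iris_h2o, ratios = 0.8, seed = 1)
train  <- splits[[1]]
test   <- splits[[2]]

# Train a gradient boosting model in the H2O engine.
predictors <- setdiff(colnames(iris_h2o), "Species")
model <- h2o.gbm(x = predictors, y = "Species",
                 training_frame = train, ntrees = 20, seed = 1)

# Score the held-out rows; the result is again an H2O frame.
preds <- h2o.predict(model, newdata = test)
head(preds)
```

The R objects here (`iris_h2o`, `model`, `preds`) are handles to things that live in the H2O engine, which is why the same workflow can point at a multi-node cluster without changing the modelling code.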
You can find out more about using H2O in R, from installation through to applying the h2o machine learning library, by going through this link.
It also helps you implement h2o machine learning on top of the SparkR framework.
If you want to get an idea of a working h2o prototype starting from the very basics, then follow this link. It gives a basic flavor of a working prototype with a code walk-through (a quick learning tutorial).
Apart from the above points, it also covers the following key points:
How to convert an H2O data frame to an R or Spark data frame and vice versa
What the pros and cons of Spark MLlib and the H2O machine learning library are
What the strengths of h2o are compared to other ML libraries
How to apply ML algorithms to R and Spark data frames, etc.
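As a small illustration of the first point, the R side of the conversion can be sketched with `as.h2o()` and `as.data.frame()`. This assumes a local H2O 3.x installation and covers plain R data frames only; the Spark conversions are not shown here.

```r
library(h2o)
h2o.init()

# R data frame -> H2O frame (the data is uploaded to the H2O cluster).
cars_h2o <- as.h2o(mtcars)
class(cars_h2o)

# H2O frame -> R data frame (the data is pulled back into R memory).
cars_r <- as.data.frame(cars_h2o)
class(cars_r)
```

Pulling a frame back into R copies all of it into local memory, so for large data it is usually better to keep the work on the H2O side and convert only small results.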

Connect Windows version of R to Hadoop

I am trying to connect R to a Hadoop cluster. The cluster has HDFS, MapReduce, Hive, Pig and Sqoop installed on it.
R will be running in a Windows environment. I know that rhdfs, rhadoop and rmr exist for Linux, but I can't find anything for Windows.
Does anyone know of a library to use?
Thank you
Revolution Analytics is trying to make a name for itself in this space. They have a couple of nice packages (some of which are open-source and/or free for non-commercial use) which allow you to interact with Hadoop from R in a Windows environment fluidly.

Resources