How to connect to a remote server from RStudio - r

I need to connect to a Unix server. I went through some sites and believe I may need to use ODBC, but I am not sure how.
Basically, I want some syntax that is equivalent to "RSUBMIT" in SAS. I want all the processing to be done on the remote server, as the "RSUBMIT" command does in SAS. I don't want to download the dataset to my local PC.
Can you kindly tell me if there is any way to achieve the same in RStudio?
I am running RStudio 0.98.1102 and R 3.1.2.
There is a SAS dataset on the server that I need access to.
In SAS I connect to the server the following way:
%let remhost=sasgrid.app.xxxx.com 7554;
signon remhost user=_prompt_ password=_prompt_;
I am trying to build a tree model on the dataset that is present on the server. The server does not have R or RStudio.
Kindly let me know if you need any other information.
Regards,
Rajat

Related

Using 32-bit R from Power BI

I am using Power BI Report Server (64-bit), and I need to access information in a 32-bit ODBC database. I want to use an R script to do this, but I have not been able to find a way to tell Power BI to use my 32-bit R, as it automatically chooses the 64-bit R I have installed.
How can I tell Power BI to use a 32-bit R installation instead of a 64-bit one?
R data sources don't support refresh in Power BI Report Server. https://learn.microsoft.com/en-us/power-bi/report-server/data-sources
For Power BI Desktop you configure the R home directory and it will run whatever you install there.

Workflow for using command line R?

I am used to using R in RStudio. For a new project, I have to use R on the command line, because the data storage and analysis are only allowed to be on a specific server that I connect to using ssh. This server doesn't have rstudio-server to support remote RStudio sessions.
The project involves an extremely large dataset, and some pre-written code to load/format the data that I have been told to run using "source()" before I do anything else. This takes several minutes to run and load the data each time.
What would a good workflow be for something like this? Editing my code in a .r file, saving, then running it would require taking several minutes to load the data each time. But just running R in an interactive session would make it hard to keep track of what I am doing and repeat things if necessary.
Is there some command-line equivalent to RStudio where you can have an interactive session but be editing/saving a file of your code as you go?
Sounds like Jupyter might be your friend here.
The R kernel works great.
You can use it on a remote server either by exposing an open port (and setting up Jupyter login credentials)
or via port forwarding over SSH.
It is a lot like an interactive REPL, except it holds state.
And you can go back and rerun cells.
(Of course, state can be dangerous for reproducibility.)
In RStudio you can launch a console and ssh to your remote servers, even if your servers don't run the expensive RStudio for servers platform. You can then send all commands from RStudio directly into the ssh session with the default shortcut key. This might allow you to keep using RStudio, track what you're doing in the R script, and execute interactively.

Difference between using RStudio on a virtual machine and RStudio on R Server

I am new to R and I am working with a dataset that has more than 5 million observations. So I thought that it would be a good idea to use RStudio on a virtual machine instead of using it on my local machine.
I am reading the documentation about virtual machines and R Server, but it is still not clear to me whether I have to use Microsoft R Server to create a VM and then just install RStudio as I would on my local machine, or whether I can create a generic VM and then install RStudio. Which is the correct way? Why?
If both of these options are possible, which one is the best?
Please help me. Sorry for my confusion.
You can do either. If you are using Azure (which I think you are given that you mention Microsoft R Server), there is also the Data Science VM, which will come preinstalled with RStudio and many other useful programs.
R Server is more for production workloads with R, so unless you are planning that, you could probably stick with the Data Science VM. If you end up choosing this option, you can connect directly to an RStudio instance on the R Server from the Azure portal.

Hive connector in R / Rstudio

Does anybody know if it's possible to interface Hadoop with R / RStudio? If yes, how?
I have some Hive tables and I'd like to access them with R / RStudio and, within 'shiny', build a visual presentation of the data (graphs etc.).
I would appreciate any help (ideas, code examples ...).
Try the package dplyr.hive.spark. The docs are still a bit more geared towards Spark, but I tested it against Hive with the latest HDP sandbox and things went smoothly. If you give it a try, please report any problems.
If you just want to access Hive tables on HDFS, you can use the RJDBC package and a JDBC connection (explained here: https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBC). Then you can use RJDBC just as you would for a relational database, except that a query might launch some map/reduce jobs on your cluster to execute.
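As a rough illustration of the RJDBC route, here is a minimal sketch. It assumes a HiveServer2 instance reachable over JDBC and a Hive JDBC driver jar already downloaded locally. The host name, jar path, credentials, and table name are hypothetical placeholders, and the helper functions hive_jdbc_url and connect_hive are made up for this example, not part of RJDBC.

```r
# Build a HiveServer2 JDBC URL of the form jdbc:hive2://host:port/database.
hive_jdbc_url <- function(host, port = 10000, db = "default") {
  sprintf("jdbc:hive2://%s:%d/%s", host, port, db)
}

# Open a connection through RJDBC.
# driver_jar is the path to the Hive JDBC driver jar (assumption: you have
# obtained one that matches your cluster's Hive version).
connect_hive <- function(host, user, password, driver_jar,
                         port = 10000, db = "default") {
  drv <- RJDBC::JDBC(driverClass = "org.apache.hive.jdbc.HiveDriver",
                     classPath = driver_jar)
  DBI::dbConnect(drv, hive_jdbc_url(host, port, db), user, password)
}

# Example usage (hypothetical host, jar, credentials, and table):
# con <- connect_hive("hive.example.com", "me", "secret",
#                     driver_jar = "/path/to/hive-jdbc-standalone.jar")
# df <- DBI::dbGetQuery(con, "SELECT * FROM my_table LIMIT 100")
# DBI::dbDisconnect(con)
```

The query runs on the cluster and only the result set comes back to R, so keeping a LIMIT or an aggregation in the SQL avoids pulling a whole table into a shiny session.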

Efficient switching between 32bit and 64bit R versions

I am working with large datasets that are available in *.mdb (i.e. Access database) format. I am using the RODBC R package to extract data from the Access database. I found out that I have 32-bit Office installed on my machine, and because of that it seems I can only use 32-bit R to connect to the Access database using RODBC. After reading the data with 32-bit R and then doing some exploratory analysis (plotting data, summary / regression), I ran into memory issues that I did not get while using 64-bit R.
Currently, I am using RStudio to run all my code, and I can change the version of R that I use from Options >> Global Options >> R version:
However, I don't want to switch to 32-bit to read the Access database using RODBC and then go back into the RStudio options to revert to 64-bit for the analysis. Is there an automatic solution that allows me to specify 32-bit or 64-bit? Can we do that using a batch file? If anyone could shed some light, that would be great.
Write the code that extracts the data as one R script. Have that script save the output data that you need for your analysis to an .RData file.
Write the code for your analyses as a second script, to be run in 64-bit R. From that script, using the answer found here, run the extraction script with 32-bit R. Then the next line can read the data in from the .RData file. If needed to allow things to load, use Sys.sleep to have the analysis script wait a few seconds for the load to complete.
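The two-script approach above could be sketched roughly as follows. This assumes a Windows R installation that ships both architectures, with the 32-bit Rscript.exe under bin/i386 (recent R releases on Windows no longer include a 32-bit build). The file names extract_access.R and access_data.RData and both helper functions are hypothetical placeholders for this example.

```r
# Path to the 32-bit Rscript inside an R installation (assumed layout).
rscript32_path <- function(r_home = R.home()) {
  file.path(r_home, "bin", "i386", "Rscript.exe")
}

# Run the extraction script under 32-bit R and return the output file name.
# The extraction script itself is expected to save() its data to `output`.
run_extract_32bit <- function(script = "extract_access.R",
                              output = "access_data.RData",
                              rscript = rscript32_path()) {
  if (!file.exists(rscript)) {
    stop("32-bit Rscript not found at: ", rscript)
  }
  # system2() waits for the child process to exit by default and
  # returns its exit status.
  status <- system2(rscript, args = shQuote(script))
  if (status != 0 || !file.exists(output)) {
    stop("32-bit extraction failed (exit status ", status, ")")
  }
  invisible(output)
}

# Example usage from the 64-bit analysis script:
# load(run_extract_32bit())
```

Because system2() blocks until the 32-bit process has finished, the .RData file should already be complete when load() runs, so a Sys.sleep pause is probably unnecessary in this variant.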
