Running GUI analysis packages from RStudio server - r

RStudio server uses a headless R session and seems to pass all of the I/O operations encoded to save bandwidth. This works for everything except for packages like Rattle or Latticist, which work through their own GUI. Is there a way to use these packages through RStudio server or otherwise access the RStudio server R session to run these packages remotely?
Bonus if there's an efficient way to run these packages remotely without forwarding an X session over SSH.

I'm not sure this is possible over the RStudio interface because of the way these graphical programs work. It's easy enough for RStudio to capture textual input and output for R. Capturing normal graphical output is pretty impressive, but that's done "natively" in R. Even packages like ggplot2 and lattice use the built-in R plotting capabilities -- they do some rendering and data processing on their own and pass the result on to grid, and grid then renders the plots via R built-ins when plot() or print() is called (including implicitly in the REPL for interactive sessions). RCommander, RGL and the like use external libraries (Tcl/Tk, OpenGL), which render their interfaces directly through operating system services and not via R. R doesn't even see the output from these programs -- it only knows that the R wrapper function for these services hasn't returned yet. For local RStudio, this isn't a problem because the output goes directly to the local display, but for RStudio server, there is no display!
Another consideration: assuming R could capture and forward X, that would imply having an X Server (in X, Server is the display/keyboard/etc, Client is the program that needs I/O) running in your browser. Modern JavaScript is pretty amazing at times, but X is a very complicated codebase and very sensitive to latency. Running X over the Internet is much slower than over the local network -- the protocol just wasn't designed for such things and most operations involve far too many roundtrips.
On a more practical side, you can still do most of your work via RStudio and only do the graphical commands via X forwarding:
1. Do everything that doesn't involve an external graphics interface.
2. Save your R session (in the Environment tab or via the command line) as .RData in your project directory. (You can actually do this elsewhere, but it's generally more convenient if your workspace is saved in the working directory.)
3. Log in via SSH with X forwarding and cd to the project directory.
4. Start R -- R will automatically load any existing workspace saved as .RData. (You can disable this behavior with --vanilla.) Depending on the size of your workspace, R may take a few seconds to a few minutes to load.
5. Have fun with Rattle, Latticist, RCommander, RGL, etc! Be ready for massive lag if you're doing this over the Internet and not the local network (see above).
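The save-and-reload part of these steps can be sketched in R roughly as follows. This is only an illustration: the directory and the `results` object are hypothetical stand-ins, and the snippet saves one named object with save() where a real session would more likely call save.image() on the whole workspace.

```r
# --- In the RStudio Server console ---
# Hypothetical project directory; tempdir() is used here only so the
# sketch runs anywhere. In practice this is your real project folder.
project_dir <- tempdir()
workspace_file <- file.path(project_dir, ".RData")

# Hypothetical analysis result that the GUI package should see later.
results <- data.frame(x = 1:3, y = c(2.5, 3.1, 4.8))

# save.image(workspace_file) would save the entire workspace instead.
save(results, file = workspace_file)

# --- In the R session started over SSH with X forwarding ---
# R loads .RData from the working directory automatically at startup;
# load() does the same thing explicitly.
restored <- new.env()
load(workspace_file, envir = restored)
restored_names <- ls(restored)

# From here the GUI package can be started, for example:
# library(rattle); rattle()   # needs a working X display
```

The GUI call is left commented out because it only works when a display is available.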

Related

Downloading R on Linux for multiple clients

I've created a program that runs in R that I plan on distributing to a lot of other people. Currently the R script is run completely automatically and behind the scenes by one .sh script, which is exactly how it is intended to work. I'm trying to make it so there's no need for client intervention. The R script itself loads the packages and installs them if they aren't present, which takes away the task of installing the packages themselves.
Is there a way I can provide a folder within my application's folder (which they already download) that contains Rscript and its dependencies, so the code can use that copy of Rscript to run the R program I have created? The goal is to be able to download it and run it without needing an internet connection to download R, and maybe even the program's required packages if possible.
Any help or ideas is appreciated.
I assume the process you want is called "creating a binary package". Binaries are programs (like EXE files) which can run directly on the target CPU without any interpreter software (like the Python interpreter for Python scripts, or the Java VM for Java applications). I'm not so familiar with packaging R programs, but I found some materials regarding this issue:
1 - Building binary package R
2 - https://seandavi.github.io/post/build-linux-r-binary-packages/
3 - https://support.rstudio.com/hc/en-us/articles/200486508-Building-Testing-and-Distributing-Packages
The second link assumes Linux as the target system. Unlike scripts in interpreted languages, binary files are often OS dependent (Linux, Windows, or Mac). I personally don't know how compatible packages are between Linux systems with different library sets.
Please comment if you find some information misleading, I'll correct the answer.
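For the package half of the question, one possible approach (not taken from the links above) is to ship a pre-populated package library inside the application folder and point R at it before loading anything. A minimal sketch, assuming a hypothetical `library` subfolder and a placeholder package list; it does not bundle R itself, which still has to be present on the client:

```r
# Hypothetical layout: <app_dir>/library holds the bundled packages.
app_dir <- getwd()
bundled_lib <- file.path(app_dir, "library")

# Put the bundled library first on the search path if it exists,
# so packages are found there without any download.
if (dir.exists(bundled_lib)) {
  .libPaths(c(bundled_lib, .libPaths()))
}

# Return the packages from `pkgs` that cannot be loaded from any library.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  unname(pkgs[!installed])
}

# Placeholder list; replace with the packages your script really needs.
required <- c("stats", "utils")
still_missing <- missing_packages(required)

if (length(still_missing) > 0) {
  stop("Bundled library is missing: ", paste(still_missing, collapse = ", "))
}
```

Packages with compiled code have to be built for the client's platform, which is the compatibility concern raised above.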

Can people without R installed run an R Notebook file successfully?

I have an R Notebook that I am building to provide an analysis for somebody, and I am wondering if I should choose another option as I don't know if she will be able to run the Notebook without having R installed.
Is it possible to run an R Notebook as a single entity or must you have R installed in order to do it?
To rerun the notebook, they need R. But the whole point of R Notebooks is that they produce a static document as output. That document (usually in HTML format) can be shared in isolation, and does not require any additional software besides a web browser to be viewed.
A notebook needs R to run. Distributing a notebook without the R dependency is a bit more elaborate, for example installing RStudio Server within a Docker container. In this particular case, the user will need to have Docker installed and know how to start a container. From there on, the user can interact with the code through a web browser.
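As a rough illustration of producing such a shareable HTML file from code, the sketch below writes a tiny stand-in notebook and renders it with rmarkdown::render(). The file name and contents are hypothetical, and the render step is skipped unless the rmarkdown package and pandoc are available. RStudio also writes an .nb.html file next to an R Notebook when it is saved.

```r
# Hypothetical notebook file; in practice this is your existing .Rmd.
notebook <- file.path(tempdir(), "analysis.Rmd")
fence <- strrep("`", 3)  # chunk delimiter, built here to keep the sketch tidy

writeLines(c(
  "---",
  "title: \"Analysis\"",
  "---",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
), notebook)

# Render to a standalone HTML file that only needs a web browser to view.
html_file <- NA_character_
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  html_file <- rmarkdown::render(notebook, output_format = "html_document",
                                 quiet = TRUE)
}
```

If the render step ran, `html_file` holds the path of the HTML document to send to the recipient.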
Another option would be to use the cloud solution that some companies offer. It offers sharing functionality and you don't have to worry about the infrastructure or distribution of your work. There are some free plans that may work for you, but the real power is in premium features.

Workflow for using command line R?

I am used to using R in RStudio. For a new project, I have to use R on the command line, because the data storage and analysis are only allowed to be on a specific server that I connect to using ssh. This server doesn't have rstudio-server to support remote RStudio sessions.
The project involves an extremely large dataset, and some pre-written code to load/format the data that I have been told to run using "source()" before I do anything else. This takes several minutes to run and load the data each time.
What would a good workflow be for something like this? Editing my code in a .r file, saving, then running it would require taking several minutes to load the data each time. But just running R in an interactive session would make it hard to keep track of what I am doing and repeat things if necessary.
Is there some command-line equivalent to RStudio where you can have an interactive session but be editing/saving a file of your code as you go?
Sounds like Jupyter might be your friend here. The R kernel works great.
You can use it on a remote server either by exposing an open port (and setting up Jupyter login credentials) or via port forwarding over SSH.
It is a lot like an interactive REPL, except that it holds state and you can go back and rerun cells. (Of course, state can be dangerous for reproducibility.)
With RStudio, you can launch a console and ssh to your remote server, even if the server doesn't run the expensive RStudio Server platform. You can then send commands from RStudio directly into the SSH session with the default shortcut key. This may let you keep using RStudio, track what you're doing in an R script, and still execute code interactively.
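Whichever front end is used, the point of a stateful session is that the slow load runs once. One way to make an analysis script safe to re-run in the same session is to guard the load step, sketched below. This guard is a suggestion rather than something from the answers above, and the loader script, `big_data`, and `load_count` are hypothetical stand-ins. An explicit environment is used to keep the sketch self-contained; in a real interactive session the global environment would play that role.

```r
# Hypothetical stand-in for the slow, pre-written loading script.
loader <- tempfile(fileext = ".R")
writeLines(c(
  "load_count <- load_count + 1",
  "big_data <- data.frame(id = 1:5, value = (1:5)^2)"
), loader)

# Environment standing in for the long-lived interactive session.
session <- new.env()
assign("load_count", 0, envir = session)

run_analysis <- function(env) {
  # Only pay the loading cost when the data is not already in the session.
  if (!exists("big_data", envir = env, inherits = FALSE)) {
    source(loader, local = env)
  }
  mean(env$big_data$value)
}

first_run <- run_analysis(session)   # loads the data, then analyses it
second_run <- run_analysis(session)  # reuses the data already in memory
loads_performed <- get("load_count", envir = session)
```

Removing `big_data` from the session forces a fresh load on the next run.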

Using RStudio with R backend on cluster via SSH

I have access to (not authority over) a computing cluster which has R installed. Is there a way for me to use R-Studio on my local computer -- but have the code running on the cluster via SSH?
To clarify -- No I don't really have non-SSH access, no I can't install R-Studio (server or desktop) on the cluster.
In line with the hackish options @hrbrmstr mentioned...
If your aim is to run mostly non-interactive code, then you can probably establish an n-node parallel::makePSOCKcluster() on the remote machines and run each of your commands via the parallel package's cluster functions. Similarly, you could use the svSocket package; see this neat demo on YouTube for more details than fit in a reasonable answer.
But, given that you said RStudio, I suspect you are thinking of interactive use, and for that the above would be doable (but painful). Nothing I know of will let you just pretend that the remote machine is the local machine (which is a pity, to be sure). However, you might be able to hack something together with sink() etc. and a server-side and client-side loop, e.g. How to connect two computers using R?.
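A minimal sketch of the makePSOCKcluster() route is shown below. The host list is a placeholder: "localhost" starts the workers on the current machine so the example runs anywhere, whereas real cluster host names would be launched over SSH and need R installed plus SSH access from your machine.

```r
library(parallel)

# Hypothetical host list; replace with the cluster's host names.
hosts <- rep("localhost", 2)
cl <- makePSOCKcluster(hosts)

# Ship a local object to every worker, then compute on the workers.
scale_factor <- 10
clusterExport(cl, "scale_factor", envir = environment())
scaled <- parLapply(cl, 1:4, function(i) i * scale_factor)

# Ask each worker which machine it is running on.
worker_hosts <- clusterEvalQ(cl, Sys.info()[["nodename"]])

# Always shut the workers down when finished.
stopCluster(cl)
```

Results come back to the local session as ordinary R lists, so they can be inspected or plotted locally.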

Is there a way to send dataframes from server to client in R?

I have an R script which I program on my laptop. After I am done, I FTP the R script up to my university cluster and run my code there (in parallel if needed). Most of my functions return data frames that I'd like to plot using ggplot. This is perfectly fine however I'd like to use tikzDevice to create tikz (latex code) for my plot to have the same font and style as my thesis.
The problem:
I can't run tikzDevice on the university cluster because of the lack of LaTeX packages. I also can't install them because I have no sudo access. Essentially, this route is a dead end for me.
Solution:
I can run tikzDevice on my own laptop. Since I am working on my LaTeX document (thesis) on my laptop, it's a seamless \include.
The problem is that the data (as data frames) exist on the university cluster. I COULD save the data frames as text files, download them onto my laptop, and read.table them, but this is going to kill my productivity.
Are there any packages, tools, software, anything that will let me "extract" my data from the university server?
A possible solution is https://gist.github.com/SachaEpskamp/5796467
but I have no idea how to use this.
Note: I also don't know which part of the SE network this could go on.
I've found a workaround solution to this.
For those who are looking to transfer data back and forth between server and client: you can send and receive objects by serializing them.
On the server, you use the saveRDS command, and on the client you use the readRDS command. To pass a URL to readRDS, you must wrap it in gzcon, like the following:
con = gzcon(url("http://path.com/to/your/object/serialized"))
a = readRDS(file = con)
Obviously this depends on some protocol being available on the server (like HTTP).
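The full round trip can be tried locally with the sketch below. The data frame and file path are made up for illustration, and a local file connection stands in for the url() connection, so no web server is needed. The gzcon() wrapper is there because saveRDS() compresses its output with gzip by default.

```r
# --- Server side: serialize the data frame to a file ---
# Hypothetical result; on the cluster this is whatever your functions return.
plot_data <- data.frame(group = c("a", "b", "c"), value = c(1.2, 3.4, 5.6))
rds_path <- file.path(tempdir(), "plot_data.rds")  # hypothetical location
saveRDS(plot_data, file = rds_path)

# --- Client side: read it back through a decompressing connection ---
# Over HTTP this would be gzcon(url("http://...")) as shown above.
con <- gzcon(file(rds_path, "rb"))
received <- readRDS(con)
close(con)
```

After this, `received` is an ordinary data frame on the laptop and can be passed to ggplot and tikzDevice as usual.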
