Code secure protection - r

I wrote an R script on an institutional computer, saved it on an external device (a USB pen drive), and tested it on a computer outside my institution. I ran it with R from the Terminal and from the GUI, but I never saved it on the host computer.
This is because it has to remain secret until it is published. I would simply like to know whether R or Unix itself saved a copy elsewhere while it was running, even though I loaded it directly by giving its path under Volumes.
In other words: does running an R script off a USB drive leave any trace of it on the host computer?
P.S.: I checked the R history for it, but it does not seem to have been saved there.
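One way to extend that check, as a rough sketch only: search the places R commonly writes to (the working directory, where .Rhistory and .RData are normally saved, and the session's tempdir()) for a string that occurs only in the script. The function name find_traces and the marker string are made up for illustration. An empty result does not prove that no trace exists elsewhere, for example in swap space or editor backups.

```r
# Search files under `dirs` (recursively, hidden files included) for a marker string.
find_traces <- function(marker, dirs) {
  files <- list.files(dirs, all.files = TRUE, full.names = TRUE, recursive = TRUE)
  hits <- vapply(files, function(f) {
    # Unreadable or binary files are treated as containing no match
    lines <- tryCatch(
      suppressWarnings(readLines(f, warn = FALSE, skipNul = TRUE)),
      error = function(e) character(0)
    )
    any(grepl(marker, lines, fixed = TRUE, useBytes = TRUE))
  }, logical(1))
  unname(files[hits])
}

# Example call (hypothetical marker; pick a name that appears only in your script):
# find_traces("my_unpublished_function", c(getwd(), tempdir()))
```

A non-empty result lists the files that contain the marker and would deserve a closer look.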

Related

Unable to access internet within "R" on cmd behind proxy

I have been using R on the command line (Bash). I am unable to access the internet from it (I cannot download any packages). I have set the proxy system-wide and tested it with wget, which works. The "install.packages()" command, however, does not.
Following another user's advice, I also tried setting the proxy in my .Rprofile file. That didn't help either. Please advise.
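For reference, a minimal sketch of what setting the proxy from within R (for example in .Rprofile) usually looks like. The proxy address and the helper name set_proxy are placeholders, not values from this thread; the sketch assumes the download method in use honours the http_proxy and https_proxy environment variables, which libcurl and wget normally do.

```r
# Hypothetical proxy address; replace it with your organisation's real one.
proxy_url <- "http://proxy.example.com:8080"

set_proxy <- function(url) {
  # Set the environment variables that libcurl/wget-based downloads consult
  Sys.setenv(http_proxy = url, https_proxy = url)
  invisible(Sys.getenv(c("http_proxy", "https_proxy")))
}

set_proxy(proxy_url)

# Since wget is known to work on this machine, R can also be told to use it:
# install.packages("somepackage", method = "wget")
```

If downloads still fail with the variables set, the cause may lie outside R's proxy configuration.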
I recently ran into the same issue on my work machine. Our Firm uses Cylance as its antivirus software. Cylance was quarantining the file "internet.dll" that R uses to access the Internet. Fortunately, however, it only does so in the 64-bit version of R. For me, there were two solutions:
First, I was able to download packages directly from the 32-bit version of R (outside of RStudio). This works fine. The downloaded packages will run in 64-bit RStudio.
The longer-term solution was to submit an IT service request to release this file from quarantine (that is, to "whitelist a blocked entity"). At my Firm this was promptly done, as there is (obviously) nothing unsafe about this R file.

Point RStudio to remote R instance

I would like to have RStudio running on my local machine (OS X) and the R executable on a remote computer.
I'm aware that I could run RStudio Server on the remote machine and connect to it using a web interface, however I abhor using web interfaces for things like this due to the delays and limited ability to move windows and use shortcuts.
Since RStudio isn't standalone but points to an R executable elsewhere on the local machine (in a location that can be changed), it seems that in theory this pointer could be to a remote location. (Is there some reason this is incorrect?)
How might I accomplish this?

Workflow for using command line R?

I am used to using R in RStudio. For a new project, I have to use R on the command line, because the data storage and analysis are only allowed to be on a specific server that I connect to using ssh. This server doesn't have rstudio-server to support remote RStudio sessions.
The project involves an extremely large dataset, and some pre-written code to load/format the data that I have been told to run using "source()" before I do anything else. This takes several minutes to run and load the data each time.
What would a good workflow be for something like this? Editing my code in a .r file, saving, then running it would require taking several minutes to load the data each time. But just running R in an interactive session would make it hard to keep track of what I am doing and repeat things if necessary.
Is there some command-line equivalent to RStudio where you can have an interactive session but be editing/saving a file of your code as you go?
Sounds like Jupyter might be your friend here. The R kernel works great.
You can use it on a remote server either by exposing an open port (and setting up Jupyter login credentials) or via port forwarding over SSH.
It is a lot like an interactive REPL, except that it holds state, and you can go back and rerun cells. (Of course, state can be dangerous for reproducibility.)
In RStudio, you can launch a console and ssh to your remote server, even if the server does not run the expensive RStudio Server platform. You can then send commands from RStudio directly into the ssh session with the default shortcut key. This lets you keep using RStudio, track what you are doing in an R script, and execute code interactively.

Using RStudio with R backend on cluster via SSH

I have access to (not authority over) a computing cluster which has R installed. Is there a way for me to use RStudio on my local computer but have the code running on the cluster via SSH?
To clarify: no, I don't really have non-SSH access, and no, I can't install RStudio (server or desktop) on the cluster.
In line with the hackish options @hrbrmstr mentioned...
If your aim is to run mostly non-interactive code, then you can probably establish an n-node parallel::makePSOCKcluster() on the remote machines and run each of your commands through the parallel package's functions. Similarly, you could use the svSocket package; see this neat demo on YouTube for more details than fit in a reasonable answer.
But, given that you said RStudio, I suspect you are thinking of interactive use, for which the above would be doable (but painful). Nothing I know of will let you just pretend that the remote machine is the local machine (which is a pity, to be sure). However, you might be able to hack something together with sink() etc. and a server-side and client-side loop, e.g. How to connect two computers using R?.
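A minimal sketch of the non-interactive makePSOCKcluster() route. The helper name run_on_hosts is made up for illustration, and two local workers stand in for the cluster nodes so the example runs anywhere. With real remote hosts, parallel starts the workers over ssh, which assumes passwordless ssh login and R installed on each node.

```r
library(parallel)

# Run `fun` over `xs` on the given hosts and return the results as a list.
run_on_hosts <- function(hosts, xs, fun) {
  cl <- makePSOCKcluster(hosts)  # starts one R worker per entry in `hosts`
  on.exit(stopCluster(cl))       # always shut the workers down afterwards
  parLapply(cl, xs, fun)
}

# Two local workers stand in for remote nodes here. With real machines you
# would pass their host names instead (hypothetical: c("node1", "node2")).
squares <- run_on_hosts(rep("localhost", 2), 1:4, function(i) i^2)
```

Anything the workers need (packages, data) has to be loaded or exported on them explicitly, for example with clusterEvalQ() or clusterExport().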

detect which computer I'm running an R script on

I am looking for an R function to return an identifier of the computer the script is being run on, or at the least to distinguish between one of two known computers.
I have two PCs, both running Windows and RStudio. I use the desktop in the office and the laptop over VPN, typically working on the same projects, always using RStudio.
My scripts and permanent data sets are in a common repository. However, since I/O to that repository is slow, I keep a local directory for temp files.
On the desktop, I have a dedicated drive, and each project lives in its folder 'D:/workspace/this_project/'. On the laptop, the path is 'C:/Users/myself/Documents/workspace/this_project/' or just '~/workspace/this_project/'.
Currently, I keep two setwd() statements at the top of each script, and I just rely on the fact that one of them will fail because of the file structure.
    setwd('~/workspace/this_project') # will fail on the desktop
    setwd('D:/workspace/this_project') # will fail on the laptop
This seems like a bad practice.
I've looked through ?"environment variables" and don't see how to get my computer's name on the network or something else that is persistent and unique to the computer.
The desired solution could modify the desktop's tilde expansion to D:/ on the desktop only, so that a common '~/workspace/' could be used, or it could be a function using_laptop() used like this:
    set_project_wd <- function(folder_nm) {
      if (using_laptop()) setwd(paste0('~/workspace/', folder_nm))
      else setwd(paste0('D:/workspace/', folder_nm))
    }
If you call Sys.info() you can get the details of your machine:
    names(Sys.info())
    [1] "sysname"        "release"        "version"        "nodename"       "machine"        "login"
    [7] "user"           "effective_user"
The entry under nodename is your PC's name.
Then you can do something like:
    set_project_wd <- function(folder_nm) {
      # Compare this machine's name with the laptop's name
      if (Sys.info()[["nodename"]] == "mylaptopname") setwd(paste0('~/workspace/', folder_nm))
      else setwd(paste0('D:/workspace/', folder_nm))
    }
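A slightly more general sketch of the same idea: keep a lookup table from machine name to workspace root, and fail loudly on an unknown machine instead of silently falling back to the desktop path. The machine names and the names workspace_roots and project_dir are placeholders for illustration; check the real names with Sys.info()[["nodename"]] on each PC.

```r
# Hypothetical machine names mapped to their local workspace roots.
workspace_roots <- c(
  mylaptopname  = "~/workspace",
  mydesktopname = "D:/workspace"
)

# Build the project path for the machine the script is running on.
project_dir <- function(folder_nm,
                        node = Sys.info()[["nodename"]],
                        roots = workspace_roots) {
  if (!node %in% names(roots)) {
    stop("Unknown computer: ", node)
  }
  file.path(roots[[node]], folder_nm)
}

# Usage at the top of a script:
# setwd(project_dir("this_project"))
```

Supporting a third machine then only means adding one entry to the table.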
