I have an R script, a Shiny app to be specific, that lives on a network drive. There are multiple computers that need to be able to run this app, and therefore multiple people who may need to run it at the same time.
For the moment, I have gotten around the problem simply by housing multiple duplicate Shiny apps, and giving each computer access to a unique copy. However, as the number of users expands, it is becoming more and more difficult to keep up with.
Is there a way to have multiple computers access the same R script at the same time, and hold open a session for however long they need?
If you go with the R package route, and:
you want your users to know when their package is out of date,
your package code is in a git repo (always a good idea)
your users install the package using devtools::install_git("path/to/package/git/repo")
then you can add an .onLoad() function like the following to your package (documented at ?.onLoad):
.onLoad <- function(libname, pkgname) {
  # Check whether the installed copy is up to date with the remote git repo
  pd <- packageDescription(pkgname)
  out_of_date_message_template <-
    'Your copy of package %s is not up to date.\nUse devtools::install_git("%s") to update this package\n'
  if (identical(pd$RemoteType, "git")) {
    try({
      # get the hash of the remote repo
      out_file <- tempfile()
      on.exit(unlink(out_file))
      failed <- system2("git", sprintf('ls-remote "%s"', pd$RemoteUrl), stdout = out_file)
      if (failed)
        return() # failed to get the git repo hash
      remotes <- readLines(out_file)
      if (!identical(pd$RemoteSha, gsub("\t.*", "", remotes[1])))
        packageStartupMessage(
          sprintf(out_of_date_message_template,
                  pkgname,
                  # double the backslashes so Windows paths paste cleanly into R code
                  gsub("\\\\", "\\\\\\\\", pd$RemoteUrl)))
    })
  }
}
Then, when you push an update to your network git repo, users with out-of-date code will get this message when they call library(my_app):
Your copy of package my_app is not up to date.
Use devtools::install_git("path\\to\\package\\git\\repo") to update this package
Once you have git installed on your local computer, initialize the repo on the network drive like this:
cd my/network/shiny-app
git init
git add .
git commit -m "initial commit"
Then, on each computer:
cd some/place/nice
git clone my/network/shiny-app
cd shiny-app
ls # hey look- all my files are now in this directory!
There are plenty of good git tutorials on the internets for when you decide to update your code and need to pull it to each computer.
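In its simplest form, that update step looks like this on each computer (using the clone location from above):
cd some/place/nice/shiny-app
git pull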
I have a package that I developed to enable my team (and perhaps other interested users) to install and use a particular R package (RQDA) that was archived on CRAN. I have hosted this package on GitHub and am trying to set up GitHub Actions so that I have a CI workflow in place.
Whenever I run R CMD check locally everything is fine, but when I push to GitHub the build fails. This is because, by default, Actions tries to install that same (archived) package. Expectedly, this fails.
So, my question is this: is there a way I can disable the check for a specific package dependency? There are no plans to ever send this package to CRAN, so I am happy to bypass their package policy in this instance.
Two possible ways:
Upload the source for RQDA to a GitHub repo, or another publicly accessible location, and put a Remotes: line in your DESCRIPTION file (see the sketch after this list)
Save the package to cloud storage, e.g. an S3 bucket or Azure storage container, and download it from there as a separate workflow step prior to checking
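For the first option, the relevant DESCRIPTION fields might look roughly like this; the GitHub account name is a placeholder for wherever you host the RQDA source, not its real location:
Suggests:
    RQDA
Remotes:
    yourname/RQDA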
This is how I was able to deal with the problem:
The changes I made were entirely in the workflow file at ./.github/workflows/. One of the steps there installs the R package dependencies for the project:
- name: Install dependencies
  run: |
    remotes::install_deps(dependencies = TRUE)
    remotes::install_cran("rcmdcheck")
  shell: Rscript {0}
The first thing I did was to change the dependencies argument to NA so that only packages listed in Depends and Imports are installed. (The RQDA dependency that was giving me trouble is under Suggests).
There was still an error, but this time it came with some guidance: go to the Check job and set the environment variable _R_CHECK_FORCE_SUGGESTS_ to false.
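Put together, the relevant workflow steps would look roughly like this (the Check step here follows the standard r-lib template and is a sketch, not a verbatim copy):
- name: Install dependencies
  run: |
    remotes::install_deps(dependencies = NA)
    remotes::install_cran("rcmdcheck")
  shell: Rscript {0}

- name: Check
  env:
    _R_CHECK_FORCE_SUGGESTS_: false
  run: rcmdcheck::rcmdcheck(args = "--no-manual", error_on = "error")
  shell: Rscript {0}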
The check now works as expected.
As the title states: is it possible to install multiple git branches of the same package side-by-side in the same R environment? I want to do some benchmarking and it would be easier to compare the two branches in the same session. I think one workaround is to change the package name in the DESCRIPTION file in the new branch, but is there a more clever way to do this with devtools?
Sample code:
devtools::install_github("mkoohafkan/RAStestR", ref = "master")
# overwrites the prior install
devtools::install_github("mkoohafkan/RAStestR", ref = "hdf5r_transition")
In short, no. At least not without an extra layer. Read on.
While git (the protocol, as well as the client) supports branches akin to a virtual filesystem, allowing you to switch easily, R does not.
For any given package, one and only one version can be installed in a given library.
But don't despair, because the file system can be used as a backend, and R can then switch by adjusting the library path. This is all in help(Startup) but it may help to be explicit.
What you can do (and I only mock this up here) is:
mkdir master; cd master; installFromBranch.R master; cd ..
mkdir featureA; cd featureA; installFromBranch.R featureA; cd ..
mkdir featureB; cd featureB; installFromBranch.R featureB; cd ..
and then in R use, say,
.libPaths("master"); library("mypackage")
or if you want a feature
.libPaths("featureA"); library("mypackage")
You can also use R_LIBS_USER=featureA Rscript -e '.....someCommandHere...'
So in short: map the branches to directories into which you install and tell R about those directories.
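If you prefer to stay inside R, here is a minimal sketch of the same idea, assuming the withr package is available; the library directory names are arbitrary:
# install each branch into its own library directory
dir.create("lib-master"); dir.create("lib-feature")
withr::with_libpaths("lib-master",
  devtools::install_github("mkoohafkan/RAStestR", ref = "master"))
withr::with_libpaths("lib-feature",
  devtools::install_github("mkoohafkan/RAStestR", ref = "hdf5r_transition"))
# then pick one library per session (or per .libPaths() switch)
.libPaths("lib-master"); library(RAStestR)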
I am working on a remote server using RStudio. This server has no access to the Internet. I would like to install the package "stringi." I have looked at this stackoverflow article, but whenever I use the command
install.packages("stringi_0.5-5.tar.gz",
                 configure.vars="ICUDT_DIR=/my/directory/for/icudt.zip")
it simply tries to access the Internet, which it cannot do. Up until now I have been using Tools -> Install Packages -> Install from Packaged Archive File. However, due to this error, I can no longer use this method.
How can I install this package?
If you have no internet access on local machines, you can build a distributable source package that includes all the required
ICU data files (for off-line use) by omitting some relevant lines in
the .Rbuildignore file. The following command sequence should do the trick:
wget https://github.com/gagolews/stringi/archive/master.zip -O stringi.zip
unzip stringi.zip
sed -i '/\/icu..\/data/d' stringi-master/.Rbuildignore
R CMD build stringi-master
Assuming the most recent development version is 1.3.1,
a file named stringi_1.3.1.tar.gz is created in the current working directory.
The package can now be installed (the source bundle may be propagated via
scp etc.) by executing:
R CMD INSTALL stringi_1.3.1.tar.gz
or by calling install.packages("stringi_1.3.1.tar.gz", repos=NULL),
from within an R session.
For a Linux machine, the easiest way, from my point of view, is:
Download the release you need from Rexamine in tar.gz format to your local PC. Unlike the version on CRAN, it already contains the icu55\data\ folder.
Move the archive to your target Linux machine without internet access.
Run R CMD INSTALL stringi-1.0-1.tar.gz (in the case of release 1.0-1).
You provided the wrong value of configure.vars: it has to be the directory's name, not the final file name.
Correct your code to the following:
install.packages("stringi_0.5-5.tar.gz",
                 configure.vars="ICUDT_DIR=/my/directory/for/")
Follow the steps below:
Download icudt55l.zip separately, from a server where you have internet access, with:
wget http://www.mini.pw.edu.pl/~gagolews/stringi/icudt55l.zip
Copy the downloaded files to the server where you want to install stringi.
Execute the following command:
R CMD INSTALL --configure-vars='ICUDT_DIR=/tmp/ALL' stringi_1.1.6.tar.gz
Here, icudt55l.zip has been copied to /tmp/ALL.
The suggestion from @gagolews almost worked for me. Here's what actually did the trick with RStudio.
Download the master.zip file that will save as stringi-master.zip.
Unzip the file onto your desktop. The unzipped folder should be stringi-master.
Edit the .Rbuildignore file by removing ^src/icu55/data and ^src/icu61/data or similar lines.
Move the folder from your desktop to the home directory of your server.
Create a New Project in RStudio with ~/stringi-master as the Existing Directory.
From RStudio's menu, select Build and Build Source Package. (You may need to first select Configure Build Tools. For Project build tools choose Package then select OK.)
It should create a tar.gz file, in the following format: stringi_x.x.(x+1).tar.gz. For example, if the current version of stringi is 1.5.3, it will create version 1.5.4. (I received a few warnings that didn't seem to affect the outcome.)
Move the newly created package to your local repository, update the repository index, and install the package.
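Those last steps can be done from R; a minimal sketch, assuming the local repository lives at /srv/local-cran (a made-up path) and holds source packages:
# refresh the PACKAGES index so install.packages() can find the new build
tools::write_PACKAGES("/srv/local-cran/src/contrib", type = "source")
# install from the local repository
install.packages("stringi", repos = "file:///srv/local-cran", type = "source")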
I need to support an R environment on a Windows 7 PC that doesn't have internet access.
I'd like to download (to DVD, eventually) a current version of all ~ 5,000 packages to make available to users of R on this PC.
Is there an FTP script, or another good way, to download all of the zip files for the R packages?
I know the packages get daily updates, but a snapshot from one good day will be enough to get me started.
Presumably you have an installation somewhere that does have internet access. I would just set that installation to download everything. There's an example at http://www.r-bloggers.com/r-package-automated-download/. Start R, and try this:
pkg.list <- available.packages()
download.packages(pkgs = pkg.list[, "Package"], destdir = "E:/MyRPackages")
Once you have these files, copy them to some kind of portable media (thumb drive, hard drive, whatever) or burn a CD / DVD and take that to the standalone machine.
Note: there may be a reason this other machine was not connected to the internet. So be careful! Make sure the virus protection is up to date on the non-connected machine, and that your IT folks won't come down on you like a ton of bricks for transferring data this way.
Next, you need to point the standalone machine at the portable media or the CD / DVD. A simple way to do this is to redefine where R looks for the repository. See e.g. Creating a local R package repository for examples.
In your case, try something like this in R:
update.packages(repos = "complete-path-to-portable-media", type = "source")
Use rsync to create a mirror, then install packages by pointing to your local mirror as the repos argument of install.packages (a sketch follows the rsync command below). There is no need to make the repository publicly available. Specialize the path (e.g., rsync only /bin/windows/contrib/3.0/) to retrieve just the Windows binaries, into a directory you've created with a similar structure (repos/bin/windows/contrib/3.0/), if that's all that needs to be supported.
rsync -rtlzv --delete \
cran.r-project.org::CRAN/bin/windows/contrib/3.0/ \
repos/bin/windows/contrib/3.0/
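Once the mirror exists, installation is just a matter of the repos argument; a minimal sketch in which the local path and package name are placeholders:
# repos points at the top of the local mirror created by rsync above
install.packages("somepackage",
                 repos = "file:///path/to/repos",
                 type = "win.binary")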
I am a novice as far as cloud computing goes, but I get the concept and am pretty good at following instructions. I'd like to do some simulations on my data and each step takes several minutes. Given the hierarchy in my data, it takes several hours for each set. I'd like to speed this up by running it on Amazon's EC2 cloud.
After reading this, I know how to launch an AMI, connect to it via the shell, and launch R at the command prompt.
What I'd like help on is being able to copy data (.rdata files) and a script and just source it at the R command prompt. Then, once all the results are written to new .rdata files, I'd like to copy them back to my local machine.
How do I do this?
I don't know much about R, but I do similar things with other languages. What I suggest would probably give you some ideas.
Set up an FTP server on your local machine.
Create a "startup-script" that you launch with your instance.
Let the startup script download the R files from your local machine, initialize R and do the calculations, then upload the new files back to your machine.
Startup script:
#!/bin/bash
set -e -x
# install curl plus whatever else you need (r-base assumed here)
apt-get update && apt-get install -y curl r-base
# fetch the R script from the FTP server on your local machine
wget -O /mnt/data_old.R ftp://yourlocalmachine:21/r_files
# run the script; have it write its results to /mnt/data_new.R
R CMD BATCH /mnt/data_old.R /mnt/data_old.Rout
# upload the results back to your local machine
curl -T /mnt/data_new.R -u user:pass ftp://yourlocalmachine:21/new_r_files
Start the instance with the startup script:
ec2-run-instances --key KEYPAIR --user-data-file my_start_up_script ami-xxxxxx
First, I'd use Amazon S3 for storing the files, both from your local machine and back from the instance.
As stated before, you can create startup scripts, or even bundle your own customized AMI with all the needed settings and run your instances from it.
So: download the files from a bucket in S3, execute and process, and finally upload the results back to the same (or a different) bucket in S3 (see the sketch below).
Assuming the data is small (how big can scripts be?), S3's cost and usability are very effective.
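A minimal sketch of that flow on the instance, assuming the AWS CLI is installed and using placeholder bucket and file names:
aws s3 cp s3://my-bucket/sim_data.rdata /mnt/sim_data.rdata
aws s3 cp s3://my-bucket/run_sims.R /mnt/run_sims.R
Rscript /mnt/run_sims.R            # the script writes its output to /mnt/results.rdata
aws s3 cp /mnt/results.rdata s3://my-bucket/results.rdata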