R and RStudio: Docker vs Binder

My problem is that I can't use RStudio at my workplace because IT does not support it. I want to use the R and RStudio installed on my personal laptop from my company laptop (using a modern browser, which is behind a firewall). I am considering two options:
Should I build a Docker image for R and RStudio (I see base images are already available)? I am mostly interested in base R plus the dplyr, haven, xporter, and reticulate packages.
Should I use Binder? I am not a technical person and my programming skills are very limited. Can anyone suggest a way?
What exactly is the difference between the Docker option and Binder?
I know I can use RStudio online and get my work done, but with the new paid account I am running out of project hours, and it is sometimes very slow. Thanks in advance.

Here are some examples beyond the modern RStudio MyBinder example:
https://github.com/fomightez/pythonista_skewedf
https://github.com/fomightez/r_phylogenetics_worshop
https://github.com/fomightez/chapter7/tree/master/binder
The modern RStudio MyBinder example has been set as a template on GitHub, so you can use it as the starting point for your own repository.
The first one is for a special use of a package not on conda. And I started that one from square one.
The other two were converted from content by others to aid in making them Binder-ready.
You essentially list everything you need from conda in the environment.yml along with the appropriate channels. If you need special stuff not on conda, you need the other configuration files included there.
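As a rough illustration of what such an environment.yml might contain for the packages named in the question, here is a minimal sketch that renders one as text. The package names (r-base, r-dplyr, r-haven, r-reticulate) and the conda-forge channel are assumptions about how those packages are published on conda; check them before relying on this, and note that xporter is left out because I am not sure of its conda name.

```python
def render_environment_yml(packages, channels=("conda-forge",)):
    """Return the text of a minimal conda environment.yml."""
    lines = ["channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"  - {package}" for package in packages]
    return "\n".join(lines) + "\n"


# Assumed conda-forge names for the R packages mentioned in the question.
R_PACKAGES = ["r-base", "r-dplyr", "r-haven", "r-reticulate"]

if __name__ == "__main__":
    # Paste the printed text into environment.yml at the top of the repo
    # (or in a binder/ folder).
    print(render_environment_yml(R_PACKAGES), end="")
```

The printed text is what you would commit as environment.yml; nothing else about the repository has to change for a first build attempt.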
Getting everything working can take some iterations of adding things, letting the image build, and testing that your libraries are available, although your situation does not sound overly complex.
The Binder launch badges you see are just images linked to a URL that points the MyBinder federation site at your repository. Look at the URL and you should see the pattern, with rstudio at the end of the URL pointing at your repo. The form on the MyBinder.org site can help with this; however, it is most often easier to adapt a working launch badge's code copied from elsewhere. The form isn't set up at this time for making URLs that launch into RStudio.
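To make the URL pattern concrete, here is a small sketch that assembles a launch URL and the matching badge markdown for a GitHub repository. The function names are mine, and the default ref of HEAD and the badge image address are assumptions based on how MyBinder badges commonly look; compare the result against a working badge before using it.

```python
from urllib.parse import quote


def binder_rstudio_url(user, repo, ref="HEAD", host="https://mybinder.org"):
    """Build a MyBinder launch URL that opens RStudio for a GitHub repo."""
    # A ref containing "/" (e.g. a branch named feature/x) must be escaped.
    return (
        f"{host}/v2/gh/{quote(user)}/{quote(repo)}/"
        f"{quote(ref, safe='')}?urlpath=rstudio"
    )


def binder_badge_markdown(user, repo, ref="HEAD"):
    """Return markdown for a launch badge that links to the RStudio URL."""
    url = binder_rstudio_url(user, repo, ref)
    return f"[![Binder](https://mybinder.org/badge_logo.svg)]({url})"


if __name__ == "__main__":
    print(binder_rstudio_url("fomightez", "r_phylogenetics_worshop"))
```

Swapping in your own user name and repository name is all that changes between repositories; the trailing ?urlpath=rstudio part is what sends the session to RStudio instead of the Jupyter interface.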
Download anything useful you create in a running session. Sessions time out after 10 minutes of inactivity, although RStudio usually keeps them active.
Lack of persistence and limited memory, storage, and computing power can be drawbacks. The inherent reproducibility and portability are advantages.
MyBinder.org doesn't work with private repos. If you have code you don't want to share, you can upload it to the temporary session and use the repo only for specifying the environment. You could host a private BinderHub that does allow private git repositories; however, that is probably overkill for your use case and beyond your ability level at this time.
GitHub isn't the only place to host repositories that can be pointed at the MyBinder system. If you go to the MyBinder.org page and click where it says 'GitHub' on the left side of the top line of the form, you can see a list of the sources at which you can host a repository and point the system to build an image and launch a container with that specified image.
Building the image from a source repository takes some minutes the first time. Once the image is built on the service, though, launch typically takes less than 30 seconds. Each time you make a change to the source repo, a new build is necessary. Some changes don't make the new build as long as the initial one, because some optimizing is done to rebuild only what is necessary after a change. Keep in mind there are several members of the federation around the world, and if your traffic gets sent to a member where the built image isn't yet available, it will be built from scratch there first.
The Holepunch project is out there to offer some help for users working in the R ecosystem; however, with the R-Conda system that is now integrated into MyBinder, it is pretty much as easy to do it the way I described. Last I knew, the Holepunch route makes a Dockerfile, which isn't as easy to troubleshoot as the current R-Conda route. Dockerfiles are essentially a last-ditch configuration file that MyBinder can handle, because the other configuration files are much easier and don't require knowing Dockerfile syntax. MyBinder aims to let you take advantage of Docker containers with a specified environment without needing to know anything about Docker.
There is a Binder Help category for posting to get help at the Jupyter Discourse Forum. Some other examples of posts already there may help you troubleshoot.
Notice of a common pitfall
Most of the configuration files for making a repository Binder-ready are simply text and can be edited right in the GitHub browser interface, without needing git or even cloning the repo locally.
Last I knew, there are two exceptions to this. The postBuild and start configuration files have settings that allow them to be run as scripts, and these get altered in a way that they no longer work if you edit them via the GitHub browser interface. (This was my experience when I last tried. Your mileage may vary, or things may have changed now.) To edit those, you have to have git available on a system of your own and pull a working copy from some other source. Then edit it on your machine that has git working, add it to your repo, and push it back up from your local computer.
(If this is a problem, you can post in the Jupyter Discourse Forum Binder help category and you and I could coordinate where I fork and edit those files in your repo to your specifications and then make a pull request to update your source of the fork with those changes.)

If you are using Jupyter notebooks extensively, then it may make sense to use Binder.
But if you simply want to use R and RStudio, then all you need is Docker. A good resource is
https://github.com/rocker-org/rocker
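For orientation, here is a minimal sketch of the kind of command the Rocker documentation describes for starting RStudio Server in a container, assembled in Python so the pieces are labeled. The image name rocker/rstudio, port 8787, and the PASSWORD variable follow the Rocker README as I understand it; the helper name and the placeholder password are my own, so treat the details as assumptions to verify there.

```python
import shlex


def rstudio_docker_command(password, host_port=8787, image="rocker/rstudio"):
    """Return the argument list for starting RStudio Server with Docker."""
    return [
        "docker", "run", "--rm",
        "-p", f"{host_port}:8787",      # host port -> RStudio Server port
        "-e", f"PASSWORD={password}",   # login password for the rstudio user
        image,
    ]


if __name__ == "__main__":
    # Prints a command line you can paste into a terminal.
    print(shlex.join(rstudio_docker_command("choose-a-password")))
```

After the container starts, RStudio should be reachable in a browser at port 8787 of the machine running Docker, logging in as user rstudio with the password you chose. Whether your company laptop can reach that machine through the firewall is a separate question that Docker itself does not solve.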

Related

Logging into github from Jupyter notebook

I was trying to follow this tutorial to do AI image generation on a remote server:
https://youtu.be/tgRiZzwSdXg
And I am getting stuck at this stage:
Apparently it needs me to log into github before it will download the images. Normally this would be trivial but this isn't running in a normal terminal, and I cannot give it any input to log me in. You can see in the attached screen shot with the output right below the code block. Does anyone know how I'm supposed to do this?
I think I've worked out what is going on and how to easily work around it in this case while still using git. (The developer offers a non-git route that I reference in the comment below the OP.) You shouldn't need credentials because this is a public repo. Since you have git working on your machine, you should be good. (I can tell git is working because the first line below the code in your image says it is cloning, instead of saying git isn't recognized.) Try this in Jupyter on your remote machine:
dataset="style_ddim"
#!git clone https://github.com/aitrepreneur/SD-Regularization-Images-Style-Dreambooth-{dataset}.git
!git clone https://github.com/aitrepreneur/SD-Regularization-Images-Style-Dreambooth.git
!mkdir -p regularization_images/{dataset}
!mv -v SD-Regularization-Images-Style-Dreambooth/{dataset}/*.* regularization_images/{dataset}
That version of the steps should work in Jupyter on your remote server without the cell sitting there running and asking you for a username. (You cannot provide one post-execution because of the way the exclamation point sends things to the shell and how Jupyter cells work.)
All I have done is comment out the original line that was getting a specific folder and replaced it with a line cloning the whole repository.
It seems GitHub treats getting a specific directory directly as an 'advanced' ability and wants your GitHub credentials for that, just as it would for cloning a private repository. Cloning an entire public repository doesn't trigger that need.
What I said in my comments about how you would provide your credentials on one line, and the associated security risks, still holds, but none of that is necessary in this case. You shouldn't need credentials for cloning public repositories, so the paradigm of just using !git clone ... without credentials inside Jupyter cells should hold if you need a similar approach to get something from git. In this case we didn't need to adjust any of the subsequent handling of the cloned contents; however, that may not always be the case.
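If you would rather avoid the ! shell lines altogether, the same clone-then-move steps can be sketched in plain Python. This is only an illustration under the same assumption as above (the whole repository is cloned and the dataset is a subfolder of it); the function names are mine.

```python
import shutil
import subprocess
from pathlib import Path


def clone_repo(url, dest):
    """Clone a public repository; fails loudly instead of prompting in a cell."""
    subprocess.run(["git", "clone", url, str(dest)], check=True)


def collect_dataset(repo_dir, dataset, out_root="regularization_images"):
    """Move the files of one dataset folder into out_root/dataset."""
    source = Path(repo_dir) / dataset
    target = Path(out_root) / dataset
    target.mkdir(parents=True, exist_ok=True)  # same as mkdir -p
    moved = []
    for path in sorted(source.glob("*.*")):    # same pattern as mv .../*.*
        if path.is_file():
            shutil.move(str(path), str(target / path.name))
            moved.append(path.name)
    return moved
```

In a notebook cell you would call clone_repo with the repository URL from the code above and a local folder name, then collect_dataset with that folder and "style_ddim".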
Note:
I would have noted earlier that you could just adjust the git clone line if you had included the code along with your image. People trying to help you want it to be easy. They don't want to type out a lot of code that you already had as text and could easily have provided in your post. In general, avoid images or use them only as a minor supplement to show behavior. This and more is touched on in How do I ask a good question?. (People trying to help also don't want to go digging elsewhere for code when only a link to a video is provided. Video is farther from code than a screenshot.)
I will say that your use of an image to show the output was definitely justified; it helped me match what I was seeing with yours and know you had git working. With all the symbols and weird characters there, it would have been hard to get just right in text only. So while it is best to avoid images, there definitely is a case for 'reserved use' along with text descriptions and code.

How to extend docker environment generated by wp-env

I've been using wp-env for a while now for running local WordPress environments for development on my Mac. With the introduction of Monterey, Apple removed PHP from MacOS. There are a couple of ways I can think of to handle this situation. Many people seem to be using Homebrew and MAMP. However, I'd prefer not to have to use Homebrew, both because of past personal experience, but also because going down this path seems to create a whole other mess for how to handle PHP and Composer (see, for example, Using PHPCS with Homebrew On MacOS Monterey).
So, my thought was, maybe I can just start doing development inside of the docker container. The questions then:
how do I extend the wp-env npm module to add things by default to the docker container, without modifying the wp-env source? i.e., does docker have some sort of config I can write that will run wp-env and then add some other stuff to the image? (e.g., npm, git, eslint, etc... so that the docker container itself becomes a development environment).
as I'm actually writing this question, does it even make sense to do it this way? I've found hints that a few people are doing it this way (e.g., a commenter on Using Docker in development the right way talked about his setup where he has vim/tmux/vscode/zsh configuration and shortcuts baked in, and recommends running all services as dockers inside that volume (which he claims is a huge performance increase over host bind mount). Unfortunately, he linked to a git repo that either no longer exists or is at least no longer public.)
While I cannot assist you specifically with wp-env, I would recommend using DDEV: https://ddev.readthedocs.io/en/stable/ It gives you the freedom to choose custom PHP environments, comes with predefined configurations for specific stacks (e.g. Laravel, WordPress, Drupal), and is dead simple to use.
I understand you might like to continue with wp-env but maybe this will help you out.

Creating a publicly distributed Jupyter Notebook from Github repo

I have an online course (Performance Ninja), which I would like to turn into a publicly distributed Jupyter Notebook. The course is hosted on GitHub. There I have the source code for the lab assignments that students need to work on. They need to fix the issue by changing the code and submit (git push) their work. It will be picked up by GitHub Actions and sent to my remote server, which is appropriately configured for performance benchmarking. Thus, I don't rely on the virtualized CI machines offered by GitHub, for example, since they are not suitable for performance measurements.
I want to make a Jupyter Notebook, which will be a view into my GitHub repo. It will provide a nice interface, the ability to focus on the part of the code that matters (the kernel of the benchmark), and a simple way of submitting solutions for automated benchmarking (just hit Shift-Enter).
I was looking at JupyterHub. It should work nicely, but then the issue is that I have to have a public static IP for the JupyterHub server.
Ideally, I would like to be able to trigger Github Actions workflow from the Jupyter Notebook itself. A user (student) would authenticate themselves with Github, change the code and hit Shift-Enter, which will trigger Github Actions (maybe pushing the code to a private branch).
I assume I’m not the first person facing a similar problem. I would like to hear from people with experience, what would be my best option here?

how can i share a Jupyter Notebook?

I am using Julia but didn't really like the IDE (more of a notebook guy). So I used for the first time Jupyter (lab and notebooks).
I started Jupyter from Anaconda and made my notebook. The thing is I want to share it. Like other people can access a link and get to run my code.
I don't really know how GitHub works, but I somehow managed to upload the notebook there. I saw this thing called "Binder" that could run my code on another computer. But I try to put my Github link there and just get an error.
Can someone who has used Jupyter explain it to me?
Ah, I almost forgot, when I google Jupyter Notebook and start one with Julia I can use this Binder Thing. But when I do it on my own I can't.
Here I put the screenshot I made on the Demo of Jupyter+binder so you can see it says to send a binder link
While there are many options, the best and the easiest way is through Jupyter's menu:
File -> Download as -> HTML
You end up with an HTML file containing all code cells and all results (including pictures), which is perhaps the best format for viewing by others.
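The same export can also be scripted through the jupyter nbconvert command-line tool, which is handy when you regenerate the HTML often. This is a minimal sketch that assumes jupyter and nbconvert are installed and on your PATH; the helper names and the notebook file name are hypothetical.

```python
import subprocess


def nbconvert_html_command(notebook_path):
    """Return the command that converts a notebook to a standalone HTML file."""
    return ["jupyter", "nbconvert", "--to", "html", str(notebook_path)]


def export_to_html(notebook_path):
    """Run the conversion; the HTML is written next to the notebook."""
    subprocess.run(nbconvert_html_command(notebook_path), check=True)


if __name__ == "__main__":
    export_to_html("my_notebook.ipynb")  # hypothetical file name
```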
GitHub can natively publish a *.ipynb and show it to users as static HTML; however, I find it not very stable (rendering fails from time to time), and hence I opt for generating the HTML file myself and using e.g. GitHub Pages to host it.
Another interesting option is to share just the *.ipynb file and recommend the open source https://nteract.io/ to people as the viewer.
Yet another option that is sometimes used is to host a JupyterHub on an AWS EC2 instance (a single t2.micro is free for one year within the AWS free tier) and give your collaborators logins and passwords (though this requires quite a bit of configuration work).

Is there a convention for git version control between multiple operating systems in R?

Apologies if this isn't an appropriate question for SO - if not please let me know and I'll delete/move it. I just haven't found any resources on this myself. Anything I google related to "multiple operating systems git" gives me pages for applications that work on multiple OS's like GitHub or Tower.
I currently work regularly between two operating systems: PC at the office, Mac at home. I've been managing this with git by using my master branch for PC/Windows R code, while using an OSXversion branch for Mac R code. This is fine whenever I'm updating Windows- or Mac-specific code on each branch (such as package installation instructions in the comments). Where this gets tricky is general improvements to my code that apply to both Mac and PC. What I've been doing is manually copy-pasting any general improvements between my Mac/PC code or cherry-picking my merges. Is there a better way to be doing this?
It's fine to store code that runs on different operating systems in a Git repository. Simply check out the repository on each of the operating systems you're going to be working on.
The only thing you need to watch for is line endings. Unless you're dealing with files that specifically require native CRLF / LF end-of-line styles, you're best to turn the automatic conversion off.
This can be done with:
git config --global core.autocrlf false
Further notes on autocrlf can be found on the GitHub help page itself.
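With automatic conversion off, git stores whatever line endings your editor wrote, so it can help to check what a file actually contains. The sketch below counts CRLF (Windows-style) and bare LF (Unix/Mac-style) endings in raw bytes and shows one way to normalize to LF; the function names are mine and this is not part of git itself.

```python
def count_line_endings(data: bytes) -> dict:
    """Count Windows (CRLF) and Unix (bare LF) line endings in raw bytes."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf  # every CRLF also contains one LF
    return {"crlf": crlf, "lf": lf}


def to_lf(data: bytes) -> bytes:
    """Normalize CRLF line endings to LF."""
    return data.replace(b"\r\n", b"\n")


if __name__ == "__main__":
    with open("analysis.R", "rb") as handle:  # hypothetical file name
        print(count_line_endings(handle.read()))
```

A file that reports a mix of both styles is the usual sign that it was edited on both machines with different editor settings.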
As for your actual committing, you'll want to be following Git Flow, and simply have a develop branch that bases off of master. From here, you'll want to create individual feature branches. These feature branches can be worked on whilst on either Windows or Mac. The package installation instructions should really go in your README.md file.

Resources