Creating a publicly distributed Jupyter Notebook from Github repo - jupyter-notebook

I have an online course (Performance Ninja), which I would like to turn into a publicly distributed Jupyter Notebook. The course is hosted on Github. There I have the source code for the lab assignments that students need to work on. They need to fix the issue by changing the code and submit (git push) their work. It will be picked up by Github Actions and sent to my remote server, which is appropriately configured for performance benchmarking. Thus, I don't rely on virtualized CI machines offered by Github for example, they are not suitable for performance measurements.
I want to make a Jupyter Notebook, which will be a view into my Github repo. It will provide nice interface, ability to focus on the part of the code that matters (kernel of the benchmark), and have a simple way of submitting solutions for automated benchmarking (just hit Shift-Enter).
I was looking at JupyterHub. It should work nicely, but then the issue is that I have to have a public static IP for the JupyterHub server.
Ideally, I would like to be able to trigger Github Actions workflow from the Jupyter Notebook itself. A user (student) would authenticate themselves with Github, change the code and hit Shift-Enter, which will trigger Github Actions (maybe pushing the code to a private branch).
I assume I’m not the first person facing a similar problem. I would like to hear from people with experience, what would be my best option here?

Related

R and Rstudio Docker vs Binder

My problem is that I can't use R-studio at my work place as the IT does not support it . I want to use R and R-studio that installed on my personnel laptop on my company laptop ( using a modern browser which is behind firewall ) . Some of the options I am thinking of two two things
should I need to build a docker for R and R-studio (I see base images are already available) , I am mostly interested in basic R , Dplyr (haven ,xporter, and Reticulate ) packages .
Should I have to use a binder . I am not technical person and my programming skills are very limited can any one suggest me way .
What exactly are the difference between using Docker option vs Binder ?
I know I can use R-Studio online and get my work done but with the new paid account I am running out of project hours and very slow sometimes . Thanks in advance
Here are some examples beyond the modern RStudio MyBinder example:
https://github.com/fomightez/pythonista_skewedf
https://github.com/fomightez/r_phylogenetics_worshop
https://github.com/fomightez/chapter7/tree/master/binder
The modern RStudio MyBinder example has been set as a template on GitHub so you can use
The first one is for a special use of a package not on conda. And I started that one from square one.
The other two were converted from content by others to aid in making them Binder-ready.
You essentially list everything you need from conda in the environment.yml along with the appropriate channels. If you need special stuff not on conda, you need the other configuration files included there.
Getting everything working can take some iterations on adding things, letting the image get built, and testing your libraries are available. Although you seem to think your situation is not overly complex.
The binder launch badges you see are just images where you modify the URL to point the MyBinder federation site at your repository. Look at the URL and you should see the pattern where you put studio at the end of the URL pointing at your repo. The form at MyBinder.org site can help with this; however, most often it is easier to just adapt a working launch badge's code copied from elsewhere. The form isn't set up at this time for making the URLs for launching to RStudio.
Download anything useful your create in a running session. The sessions timeout after 10 minutes, although RStudio usually keeps them active.
Lack of Persistence and limited memory, storage, & power can be drawbacks. The inherent reproducibility and portability are advantages.
MyBinder.org doesn't work with private repos. If you have code you don't want to share, you can upload it to the temporary session, using the repo for specifying the environment. You could host a private binderhub that does allow the use of private git repositories; however, that is probably overkill for your use case and exceed your ability level at this time.
GitHub isn't the only place to host repositories that can be pointed at the MyBinder system. If you go to the MyBinder.org page and click where it says 'GitHub' on the left side of the top line of the form, you can see a list of the sources at which you can host a repository and point the system to build an image and launch a container with that specified image.
Building the image from a source repository takes some minutes the first time. Once the image is built though on the service, launch is typically less than 30 seconds. Each time you make a change on the source repo, a build is necessary. Some changes don't cause the new build to be as long as the initial one as some optimizing is done to only build what is necessary after a change. Keep in mind there are several members of the federation around the workd and if traffic on the internet gets sent to where the built image isn't yet available, it will be built from scratch again first.
The Holepunch project is out there to offer some help for users working in the R ecosystem; however, with the R-Conda system that is now integrated into MyBinder it is pretty much as easy to do it the way I described. Last I knew, the Holepunch route makes a Dockerfile that isn't as easy to troubleshoot as using the current the R-Conda system route. Dockerfiles are essentially a last ditch configuration file that MyBinder can handle. The reason being the other configuration files are much easier and don't require knowing Dockerfile syntax. MyBinder aims to offer the ability to take advantage of Docker offering containers with a specified environment without users needing to know anything about Docker.
There is a Binder Help category for posting to get help at the Jupyter Discourse Forum. Some other examples of posts already there may help you troubleshoot.
Notice of a common pitfall
Most of the the configuration files for making a repository Binder-ready are simply text and can be edited right in the GitHub browser interface, without need to git or even cloning the repo locally.
Last I knew, there are two exceptions to this. The postBuild and start configuration files have settings that allow them to be run as scripts and these get altered in a way they no longer work if you edit them via the GitHub browser interface. (This was my experience when last I tried. Your mileage may vary or things may have changed now.) To edit those, you have to have git available on a system you have and pull one from some other source. Then edit that on your machine that has git working & add it your repo and push it back up from your local computer.
(If this is a problem, you can post in the Jupyter Discourse Forum Binder help category and you and I could coordinate where I fork and edit those files in your repo to your specifications and then make a pull request to update your source of the fork with those changes.)
If you are using Jupyter notebooks extensively then it may make sense to use Binder
But if you simply want to use R and Rstudio, then all you need is docker. A good resource is
https://github.com/rocker-org/rocker

Can people without R installed run an R Notebook file successfully?

I have an R Notebook that I am building to provide an analysis for somebody, and I am wondering if I should choose another option as I don't know if she will be able to run the Notebook without having R installed.
Is it possible to run an R Notebook as a single entity or must you have R installed in order to do it?
To rerun the notebook they require R. But the whole point of R Notebooks is that they produce a static document as output. That document (usually in HTML format) can be shared in isolation, and does not require any additional software besides a web browser to be viewerd.
Notebook will need R to run. To distribute a notebook without the R dependency will be a bit more elaborate, like installing rstudio server within a docker container. User will, in this particular case, need to have Docker installed and know how to start a container. From there on the user can interact with the code through a web browser.
Another option would be to use the cloud solution that some companies offer. It offers sharing functionality and you don't have to worry about the infrastructure or distribution of your work. There are some free plans that may work for you, but the real power is in premium features.

how can i share a Jupyter Notebook?

I am using Julia but didn't really like the IDE (more of a notebook guy). So I used for the first time Jupyter (lab and notebooks).
I started Jupyter from Anaconda and made my notebook. The thing is I want to share it. Like other people can access a link and get to run my code.
I don't really know how GitHub works, but I somehow managed to upload the notebook there. I saw this thing called "Binder" that could run my code on another computer. But I try to put my Github link there and just get an error.
Can someone that used Jupyter can explain it to me?
Ah, I almost forgot, when I google Jupyter Notebook and start one with Julia I can use this Binder Thing. But when I do it on my own I can't.
Here I put the screenshot I made on the Demo of Jupyter+binder so you can see it says to send a binder link
While there are many options, the best and the easiest way is through Jupyter's menu:
File -> Download as -> HTML
You end up with a HTML containing all code cells and all results (including pictures) which is perhaps the best for viewing by others.
Github can be used to natively publish a *.ipynb and show it to users as a static HTML, however I find it not very stables (rendering keeps failing from time to time) and hence I opt for generating the HTML file yourself and use eg. Github pages for hosting it.
Another interesting option is to share just the *.ipynb file and recommend people Open Source https://nteract.io/ as the viewer.
Yet another option that is sometimes use is to host a JupyterHub on an AWS EC2 instance (a single t2.micro is free for one year within the AWS free tier) and give my collaborators logins and passwords (this though requires quite a bit of configuration work).

Using RStudio to Make Pull Requests in Git

My enterprise has a Git repository. To make changes, I have to make changes in my forked repository and then make a pull request.
I primarily use RStudio, so I have enabled its integration with Git. I can make changes to my forked repository and then push, pull, sync, etc. The problem is that I still have an additional step of logging into GitHub and making a pull request for my forked repository. Is there a way of doing this from RStudio?
I too use RStudio for R development and I do not believe there is a way to do this. The reason is because this is more than just adding code to a branch, you're requesting a management feature to take place which is pulling part of your code into another branch of the code base. RStudio appears to be limited to pulling, syncing and committing. Likely you need to use a separate, more full featured GitHub client.
This could be done via the GitHub API, which could be executed from an R package using the httr or curl package, after which such a package could have an addin for RStudio, which would let you check everything using a nice Shiny app!
Now we only need to look for someone who wants to develop this… Can’t seem to find it (Jan 2022).

What is the easiest way to create a webapp from an interactive Jupyter Notebook?

I have a Jupyter Notebook that plots some data and lets the user interact with it via a slider.
What would be the easiest way to make a web app with a similar functionality? (reusing as much of the code...)
I believe the easiest way is to use voilà.
After installing you just have to run:
voila <path-to-notebook> <options>
And you will have a server running your notebook as a web app, with all the input code omitted.
AppMode is "A Jupyter extension that turns notebooks into web applications".
From the README:
Appmode consist of a server-side and a notebook extension for Jupyter.
Together these two extensions provide the following features:
One can view any notebook in appmode by clicking on the Appmode button in the toolbar. Alternatively one can change the url from
baseurl/notebooks/foo.ipynb to baseurl/apps/foo.ipynb. This also
allows for direct links into appmode.
When a notebook is opened in appmode, all code cells are automatically executed. In order to present a clean UI, all code cells
are hidden and the markdown cells are read-only.
A notebook can be opened multiple times in appmode without interference. This is achieved by creating temporary copies of the
notebook for each active appmode view. Each appmode view has its
dedicated ipython kernel. When an appmode page is closed the kernel is
shutdown and the temporary copy gets removed.
To allow for passing information between notebooks via url parameters, the current url is injected into the variable
jupyter_notebook_url.
To be complete - there exists also https://www.streamlit.io/ .
I still dont understand the exact difference between voila and streamlit.
At the moment I just struggle with the possibility to re-run everything with new parameters... I have bad luck with voila still.
Edit: I see that streamlit requires a raw python, not .ipynb, this fact would mean that this answer is completely wrong, I will search a bit more on streamlit before further action/comment.
Edit2: Voila looks great. However, I found few things that uncover the underlying complexity and thus a troubles that may arise.
callbacks. Widgets work great in jupyter, but since it is not possible to re-run one cell, sometimes the logic must be modified to work in Voila.
interactive java objects need a special treatment, e.g. matplotlib has a cheap solution, but there was nothing for e.g. jsroot
links. It is easy to create (a file and) a download link in jupyter, Voila can also serve a file, but it needs another extra treatment.
After all, I pose myself a question - is it better to learn many tricks and modifications to jupyter or to use some other system? I am going to see if streamlit can give em some answer.
The Jupyter Dashboards Bundlers extension from the Jupyter Incubator is one way to do it while retaining interactivity.
EDIT: While pip installing this package will also install the cms package dependency, like dashboard_bundlers, cms needs to be explicitly enabled/quick-setup as a notebook extension for the dashboard tools to work.
#raphaelts has the right idea and should be the accepted answer. As of Dec 2019, Voila is the most appropriate method to deploy Jupyter notebooks to production as a stand alone webapp. Think internal datascience teams sharing their analytics workload with internal C-Suite teams using SPA stlye Notebooks with all the code hidden and custom GUI/interactions thrown in. Recently discussed on HN
https://news.ycombinator.com/item?id=20160634 and the official announcement from the Jupyter maintainer https://blog.jupyter.org/and-voil%C3%A0-f6a2c08a4a93enter link description here
As mentioned above, voilà is a very powerful tool which hides the input cells from your notebooks and therefore provides a clean interface. In order to deploy your notebook with voilà you need to follow the specific steps of your organization. But if you want to quickly run it on your machine, simply install it with pip install voila. Then you can enter start from the command-line: voila my_notebook.ipynb or use the "Voila" button which should have appeared in your Jupyter notebook.
Note, however, that using voilà is only one part of the story. You also need to build the required interactivity, ie. to set up how to respond to input changes. There are quite a few frameworks for this.
The simplest one is to use the interact function or the observe method from the Ipywidgets library. This is very direct, but things can easily get out of control as you start having more and more widgets and complexity.
There are complete frameworks, some of them mentioned above. E.g. streamlit, dash and holoviz. These are very powerful and suited for larger projects.
But if you want to keep it simple, I also recommend to check out autocalc. It is a very easy-to-use library, which lets you define the dependencies between your widgets/variables and let all the recalculation be triggered automatically. A tutorial can be found here.
Disclaimer: I am the author of the autocalc package.
The easiest way is to use the Mercury framework. You can reuse all your code. To convert the notebook to web app you will need to add the YAML header in the first cell of the notebook (very similar to R Markdown). The widgets are generated based on YAML. The end-user can tweak widgets values and click the Run button to execute the notebook from the top to the bottom. You can easily hide the notebook's code (if you want) by setting the show-code: False in the YAML. The example notebook and corresponding web app are below.
Example of the notebook with YAML header
Example web app generated from notebook with Mercury

Resources