How does JupyterHub work?

I need to build infrastructure so that multiple users can work on the same Jupyter (IPython Notebook) service via separate sessions, so the users can't interrupt each other.
I thought JupyterHub (https://github.com/jupyter/jupyterhub) was there to manage all of this, yet the session still appears to be shared: if I log out in one window, the instance in another window is logged out as well.
Is there a way to manage multiple sessions in Jupyter?

Jupyter doesn't support multiple users editing the same notebook at the same time without data loss, and I don't believe it is meant to. Jupyter is meant to provide a relatively easy-to-configure-and-install Python instance with a consistent set of installed modules and a consistent environment, to minimize problems caused by environmental differences between developer workstations.
It is also meant to lower the barrier to entry for programming in Python and working in data science: it's much easier to talk an analyst into visiting a website than into learning a new programming language.
More to the point of your question: the way Jupyter handles 'sessions' is that (unless configured otherwise) every Jupyter user corresponds to a user on the server that is running Jupyter, and every time you log in to Jupyter you are effectively creating a new login to that server's operating system. It follows that if you log out of Jupyter in one window, you're ending not just that browser's session but also the login to the Jupyter server's operating system, which kills all other open browser windows.

Your question is a bit unclear. JupyterHub is meant to support multiple users across many machines. Of course, if you use the same browser on the same machine, you get logged out too, as the browser is carrying the connection information that gets revoked.

JupyterHub is a web-based multi-user application that provides session and authentication services.
JupyterHub is typically hosted on a Unix/Linux server, and clients access it via the server's IP address and port number. Each client must log in with a user ID and password corresponding to a system user on the server (PAM authentication), after which they land in that user's home directory.
You can build a multi-user infrastructure with JupyterHub, but note that it only provides the multi-user interface and PAM authentication; security hardening, file access permissions, and so on must be configured at the operating-system level, for example with shell scripts.
Normally you start JupyterHub or Jupyter Notebook from the command line, and in the same way you can write a shell script to set up the multi-user environment (see the sketch below).
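As a minimal sketch (assuming the default PAM authenticator; the user name is illustrative):

    # Create a system account per user; JupyterHub's default PAM
    # authenticator validates logins against these accounts.
    sudo adduser analyst1

    # Start JupyterHub on all interfaces, port 8000 (the defaults,
    # shown explicitly). Each authenticated user gets a separate
    # single-user notebook server, so sessions don't collide.
    sudo jupyterhub --ip 0.0.0.0 --port 8000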

Related

How to configure ports used for communication between R and RStudio?

First, I'm not an R/RStudio user at all. I'm a Windows admin tasked with configuring R and RStudio in a multi-user Citrix environment. To identify users across the multiple sessions, we use the Palo Alto Terminal Server agent, which allocates a range of ports to each user and uses those ports to identify that user. That is then used to give each user limited, specific access to resources.
The problem is that the TS Agent also intercepts the localhost connection that is created when you start RStudio (process rsession), and RStudio then cannot connect to R. One possible solution is to control the ports used when this local session is started.
I have done extensive research on the Internet but have been unable to find whether/how you can change the ports that are used. I have found various config files, but none that seem to let me fix a single port or a port range.
Any insights on how to fix the ports for the rsession process so I can better control them? Or, looking at the problem another way: do you know the port range used by R and RStudio when they communicate through rsession? I could simply avoid that range with the TS Agent.
I have only skimmed through the RStudio source code, but it seems that the port is assigned randomly:
https://github.com/rstudio/rstudio/blob/bcc8655ba3676e3155d80296c421362881340a0f/src/node/desktop/src/main/application.ts#L226
However, it also seems like there is a startup parameter --www-port to set the port:
https://github.com/rstudio/rstudio/blob/bcc8655ba3676e3155d80296c421362881340a0f/src/node/desktop/src/main/session-launcher.ts#L592
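If you want to experiment with that parameter, something like this might work (an untested sketch; rsession is normally launched by RStudio itself, and the port number is illustrative):

    # Pin the rsession listener to a known port so the TS Agent
    # port range can be configured to exclude it.
    rsession --www-port 8790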

Launch separate RStudio session on a different port?

I have an RStudio Server running on a port, say, 8787, which is then accessible via a web browser.
The problem is, if my colleague wants to use RStudio, I'll be disconnected, as only one user can use RStudio at a time.
I'm looking for how we can launch another RStudio session on a separate port number, say, 8989.
This should allow at least 2 different users to run 2 separate RStudio sessions on the same server.
To be clear, I'm on the free version of RStudio Server. I'm not sure whether features like multiple sessions on different ports require a paid license.
If it helps, I'm using RHEL7.
Thanks!
You do not need a license for this. Even the free version of RStudio Server will allow you to run one session per user.
So you don't need to try to run multiple servers on multiple ports; just set up a regular Linux user account for your colleague on your server (using e.g. adduser), and they'll be able to log into RStudio and run their own R session.
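For example (the username is illustrative; on RHEL, adduser is an alias for useradd, so set the password separately):

    # Create a regular Linux account; RStudio Server authenticates
    # against system accounts, so your colleague can then log in at
    # the same http://yourserver:8787 with their own credentials.
    sudo adduser colleague
    sudo passwd colleague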

Will Docker suffice for a Shiny app with ~100 connections or do I need Shiny Proxy?

I'm looking for a free and open-source option for serving a Shiny app to ~100 of my students simultaneously. I tried to do this with the open-source Shiny Server and it throttled. Users got a message like
Too Many Users
Sorry, but this application has exceeded its quota of concurrent users. Please try again later.
After searching on that error message, I now know that I can increase the number of concurrent connections, but I'm afraid of bottlenecks due to R's single-threadedness. I'm aware of ShinyProxy and have been experimenting with it, but it seems to add an extra layer of complexity that I don't need.
I've served Shiny apps before with Docker (but not to this large of an audience), so I'm wondering if it will be sufficient.
My question is this: if I don't need authentication (user logins), will Docker suffice for a single page application for ~100 simultaneous connections? Or do I really need Shiny Proxy?
Corollary: how can I test this and ensure that it will work (outside of getting in front of a 100 student class and testing on the fly)?
Do you care if they all share the same underlying R process?
The open-source version of Shiny Server allows you to serve apps, but they all share a single R process. So if your app has a long-running simulation, while one user runs it, it ties up the R process and blocks the other users until it finishes.
I don't know that there is a limit on concurrent connections if you don't mind them sharing the R process as described above. You can try increasing the simple_scheduler setting; see Section 3.1.2 Simple Scheduler in the documentation for shiny-server.conf (typically at /etc/shiny-server/), and the sketch below.
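For example, in /etc/shiny-server/shiny-server.conf (a hedged sketch; the limit of 150 and the directories are illustrative):

    # Allow up to 150 concurrent sessions per app at this location.
    location / {
      site_dir /srv/shiny-server;
      log_dir /var/log/shiny-server;
      simple_scheduler 150;
    }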
If you don't care about them all being at the same URL, you could just use multiple instances of the open-source shiny-server, for instance in docker containers hosted on your machine at different ports.
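A minimal sketch of that approach (assuming the community rocker/shiny image; the host ports and app directory are illustrative):

    # Two independent open-source Shiny Server containers on different
    # host ports; each runs its own R process for the same app.
    docker run -d -p 3838:3838 -v /srv/shinyapps:/srv/shiny-server rocker/shiny
    docker run -d -p 3839:3838 -v /srv/shinyapps:/srv/shiny-server rocker/shiny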
If you want to load-balance between instances of your application (horizontally scaling behind a single URL), you'll need either Shiny Server Pro, ShinyProxy, or a load balancer with sticky sessions. This is because Shiny apps keep their state in memory in an R session: if you send your students to a URL backed by n instances of your app with no stickiness guarantee, then an individual student's actions will not necessarily land on the same instance each time, and the app won't work as you expect.
Shiny Server Pro and ShinyProxy handle this stickiness for you with cookies and headers. Depending on your cloud provider, its load balancer probably supports a browser cookie, which would work as long as you don't need your students to open multiple tabs of your app on different instances.

How do I send keystrokes/interactive GUI to another Win10 machine?

I'm currently using AutoHotKey to create a variety of macros. I have two desktops side-by-side in a private (home) network. It is my desire to have the AHK Run command on PC1 make some sort of call to PC2. Both PCs are running Windows 10 (non-domain), and both use the same login credentials (same account via microsoft.com).
What I've tried: I have tried a few things, such as WMI, WinRM, and schtasks. Each of these options works when dealing with non-interactive scripts, but I am trying to call scripts that (a) open GUI windows or (b) send keystrokes to PC2.
Other requirements:
The solution cannot require the password to be typed at a prompt or provided in the command-line call. The desired effect is that I press a button on my keyboard -> an AHK command triggers -> a script on PC2 is called.
As this network is shared with roommates (and whoever they allow to connect to our wifi), basic security is still a necessity.
This is not a language-specific question - I am looking for the simplest/easiest/cleanest method. Thanks for reading.
Try a remote access app like TeamViewer, which lets you control one PC from another across a network. https://www.teamviewer.com/en/
I have an astronomical observatory in my yard with four computers connected to all the observatory equipment. These four computers are controlled over my home network from one PC in the house.
The remote access app allows you to run an .exe on another computer which in my case is usually a compiled AHK script.
I have a number of tasks that require several PCs. A script running on the main PC starts secondary scripts on the observatory PCs, which in turn send messages back and forth by writing text files to each other's shared folders. The PC receiving a text file performs a specific action based on the message.
Here's a link to the observatory startup procedure. I have a startup script on the main PC which turns on all the observatory equipment, then starts a secondary startup script on each of the observatory PCs to load and position all the software and connect all the cameras and associated equipment.
https://www.youtube.com/watch?v=UN4VoOKOcXo&feature=youtu.be
This just shows how the various scripts running on the observatory PCs load and position all the various app windows. It's not exactly what you may need, but it may give you some ideas about what you can do with remote access software.
Lorence

Amazon EC2 / RStudio: Is there a way to run a job without maintaining a connection?

I have a long-running job that I'd like to run using EC2 + RStudio. I set up the EC2 instance, then opened RStudio Server in my web browser. Throughout the day I need to physically move the laptop that holds the connection and runs the browser, and my job then gets terminated in RStudio even though the instance is still running on the EC2 dashboard.
Is there a way to keep a job running without maintaining an active connection?
Does it have to be started / controlled via RStudio?
If you make your task a "normal" R script, executed via Rscript or littler, then you can run it from the shell and get to
use old-school tools like nohup, batch or at to control running in the background
use tools like screen, tmux or byobu to maintain one or multiple sessions in which you launch the jobs, and connect / disconnect / reconnect at leisure.
RStudio Server works in a similar way but AFAICT limits you to a single session per user per machine -- which makes perfect sense for interactive work but is limiting if you need multiple sessions.
FWIW, I like byobu with tmux a lot for this.
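A minimal sketch of both approaches (the script name long_job.R is illustrative):

    # Fire and forget: the job survives disconnects; output goes to a log.
    nohup Rscript long_job.R > long_job.log 2>&1 &

    # Or keep a reattachable session with tmux:
    tmux new -s job         # start a session, then run: Rscript long_job.R
    #   detach with Ctrl-b d; the job keeps running on the server
    tmux attach -t job      # reconnect later from any connection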
My original concern, that the job needed a live connection, turned out to be incorrect. The error actually came from running out of memory; it just coincided with losing the internet connection.
An instance is started from the AWS dashboard and stopped or terminated there as well. As long as it is still running, it can be accessed from an RStudio tab by copying the public DNS into the address bar and logging in again.
