Is there a way to install and run RStudio on Google Colab?
I am aware that it is possible to run R code on Google Colab, so I was wondering whether there is also a workaround to install and run RStudio.
I use RStudio on Google Colab from time to time. To make the setup easier, I usually copy and paste the following setup script (and use alternative 2).
After creating a new Google Colab Notebook, execute the following commands:
# Add new user called "rstudio" and define password (here "password123")
!sudo useradd -m -s /bin/bash rstudio
!echo rstudio:password123 | chpasswd
# Install R and RStudio Server (Don't forget to update version to latest version)
!apt-get update
!apt-get install -y r-base
!apt-get install -y gdebi-core
!wget https://download2.rstudio.org/server/bionic/amd64/rstudio-server-1.4.1103-amd64.deb
!gdebi -n rstudio-server-1.4.1103-amd64.deb
#===============================================================================
# ALTERNATIVE 1: Use ngrok
#===============================================================================
# Advantage: Runs in the background
# Disadvantage: Not so stable
# (often 429 errors during RStudio usage due to max 20 connections without account)
# Optionally register for a free account, which raises this limit to 40 connections:
# https://ngrok.com/pricing
# Install ngrok (https://ngrok.com/)
!wget -c https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
!unzip -o ngrok-stable-linux-amd64.zip
# Run ngrok to tunnel RStudio app port 8787 to the outside world.
# This command runs in the background.
get_ipython().system_raw('./ngrok http 8787 &')
# Get the public URL where you can access the RStudio app. Copy this URL.
! printf "\n\nClick on the following link: "
! curl -s http://localhost:4040/api/tunnels | python3 -c \
"import sys, json; print(json.load(sys.stdin)['tunnels'][0]['public_url'])"
# ==> To access the RStudio server
# - click on this link and
# - use the username "rstudio" and
# - the password you defined at the first cell ("password123" in this example).
#===============================================================================
# ALTERNATIVE 2 (preferred): Use localtunnel
#===============================================================================
# (see also: https://github.com/naru-T/RstudioServer_on_Colab/blob/master/Rstudio_server.ipynb)
# Advantage: Stable usage of RStudio
# Disadvantage: Does not run in the background (i.e. Colab blocked)
# Install localtunnel
!npm install -g npm
!npm install -g localtunnel
# Run localtunnel to tunnel RStudio app port 8787 to the outside world.
# This command runs in the foreground, so the Colab cell stays blocked while it runs.
!lt --port 8787
# ==> To access the RStudio server
# - click on this link,
# - click button "Click to Continue" on the "friendly reminder" page,
# - use the username "rstudio" and
# - the password you defined at the first cell ("password123" in this example).
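For ALTERNATIVE 1, the one-liner that prints the tunnel URL raises an IndexError if ngrok's tunnel list is still empty when it runs. Below is a slightly more defensive sketch of the same lookup. It assumes only the JSON shape the script above already relies on (a `tunnels` list whose entries carry a `public_url`); the function name and the sample payload are made up for illustration.

```python
import json
from typing import Optional


def first_public_url(payload: str) -> Optional[str]:
    """Return the first tunnel's public_url, or None if no tunnel is listed."""
    data = json.loads(payload)
    tunnels = data.get("tunnels", [])
    if not tunnels:
        return None
    return tunnels[0].get("public_url")


# Hypothetical sample of what http://localhost:4040/api/tunnels might return.
sample = '{"tunnels": [{"public_url": "https://example.ngrok.io"}]}'
print(first_public_url(sample))
print(first_public_url('{"tunnels": []}'))
```

In a Colab cell, the payload could be the text returned by the curl call shown above; a `None` result then simply means the tunnel is not up yet and the lookup should be retried.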
Related
I want to run Dask on EMR using YarnCluster.
I used the bootstrap script below, but I ran its instructions manually in an SSH console.
#!/bin/bash
HELP="Usage: bootstrap-dask [OPTIONS]
Example AWS EMR Bootstrap Action to install and configure Dask and Jupyter
By default it does the following things:
- Installs miniconda
- Installs dask, distributed, dask-yarn, pyarrow, and s3fs. This list can be
extended using the --conda-packages flag below.
- Packages this environment for distribution to the workers.
- Installs and starts a jupyter notebook server running on port 8888. This can
be disabled with the --no-jupyter flag below.
Options:
--jupyter / --no-jupyter Whether to also install and start a Jupyter
Notebook Server. Default is True.
--password, -pw Set the password for the Jupyter Notebook
Server. Default is 'dask-user'.
--conda-packages Extra packages to install from conda.
"
# Parse Inputs. This is specific to this script, and can be ignored
# -----------------------------------------------------------------
# -----------------------------------------------------------------------------
# 1. Check if running on the master node. If not, there's nothing to do.
# -----------------------------------------------------------------------------
grep -q '"isMaster": true' /mnt/var/lib/info/instance.json \
|| { echo "Not running on master node, nothing to do" && exit 0; }
# -----------------------------------------------------------------------------
# 2. Install Miniconda
# -----------------------------------------------------------------------------
echo "Installing Miniconda"
curl https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -o /tmp/miniconda.sh
bash /tmp/miniconda.sh -b -p $HOME/miniconda
rm /tmp/miniconda.sh
echo -e '\nexport PATH=$HOME/miniconda/bin:$PATH' >> $HOME/.bashrc
source $HOME/.bashrc
conda update conda -y
# configure conda environment
#source ~/miniconda/etc/profile.d/conda.sh
#conda activate base
# -----------------------------------------------------------------------------
# 3. Install packages to use in packaged environment
#
# We install a few packages by default, and allow users to extend this list
# with a CLI flag:
#
# - dask-yarn >= 0.7.0, for deploying Dask on YARN.
# - pyarrow for working with hdfs, parquet, ORC, etc...
# - s3fs for access to s3
# - conda-pack for packaging the environment for distribution
# - ensure tornado 5, since tornado 6 doesn't work with jupyter-server-proxy
# -----------------------------------------------------------------------------
echo "Installing base packages"
conda install \
-c conda-forge \
-y \
-q \
dask-yarn \
s3fs \
conda-pack \
tornado
pip3 install pyarrow
# -----------------------------------------------------------------------------
# 4. Package the environment to be distributed to worker nodes
# -----------------------------------------------------------------------------
echo "Packaging environment"
conda pack -q -o $HOME/environment.tar.gz
# -----------------------------------------------------------------------------
# 5. List all packages in the worker environment
# -----------------------------------------------------------------------------
echo "Packages installed in the worker environment:"
conda list
# -----------------------------------------------------------------------------
# 6. Configure Dask
#
# This isn't necessary, but for this particular bootstrap script it will make a
# few things easier:
#
# - Configure the cluster's dashboard link to show the proxied version through
# jupyter-server-proxy. This allows access to the dashboard with only an ssh
# tunnel to the notebook.
#
# - Specify the pre-packaged python environment, so users don't have to
#
# - Set the default deploy-mode to local, so the dashboard proxying works
#
# - Specify the location of the native libhdfs library so pyarrow can find it
# on the workers and the client (if submitting applications).
# ------------------------------------------------------------------------------
echo "Configuring Dask"
mkdir -p $HOME/.config/dask
cat <<EOT >> $HOME/.config/dask/config.yaml
distributed:
  dashboard:
    link: "/proxy/{port}/status"
yarn:
  environment: /home/hadoop/environment.tar.gz
  deploy-mode: local
  worker:
    env:
      ARROW_LIBHDFS_DIR: /usr/lib/hadoop/lib/native/
  client:
    env:
      ARROW_LIBHDFS_DIR: /usr/lib/hadoop/lib/native/
EOT
# Also set ARROW_LIBHDFS_DIR in ~/.bashrc so it's set for the local user
echo -e '\nexport ARROW_LIBHDFS_DIR=/usr/lib/hadoop/lib/native' >> $HOME/.bashrc
# -----------------------------------------------------------------------------
# 8. Install jupyter notebook server and dependencies
#
# We do this after packaging the worker environments to keep the tar.gz as
# small as possible.
#
# We install the following packages:
#
# - notebook: the Jupyter Notebook Server
# - ipywidgets: used to provide an interactive UI for the YarnCluster objects
# - jupyter-server-proxy: used to proxy the dask dashboard through the notebook server
# -----------------------------------------------------------------------------
echo "Installing Jupyter"
conda install \
-c conda-forge \
-y \
-q \
notebook \
ipywidgets \
jupyter-server-proxy \
jupyter
# -----------------------------------------------------------------------------
# 9. List all packages in the client environment
# -----------------------------------------------------------------------------
echo "Packages installed in the client environment:"
conda list
# -----------------------------------------------------------------------------
# 10. Configure Jupyter Notebook
# -----------------------------------------------------------------------------
echo "Configuring Jupyter"
mkdir -p $HOME/.jupyter
JUPYTER_PASSWORD="dask-user"
HASHED_PASSWORD=`python -c "from notebook.auth import passwd; print(passwd('$JUPYTER_PASSWORD'))"`
cat <<EOF >> $HOME/.jupyter/jupyter_notebook_config.py
c.NotebookApp.password = u'$HASHED_PASSWORD'
c.NotebookApp.open_browser = False
c.NotebookApp.ip = '0.0.0.0'
c.NotebookApp.port = 8888
EOF
# -----------------------------------------------------------------------------
# 11. Define a systemd service for the Jupyter Notebook Server
#
# This sets the notebook server up to properly run as a background service.
# -----------------------------------------------------------------------------
echo "Configuring Jupyter Notebook systemd Service"
cat <<EOF > /tmp/jupyter-notebook.service
[Unit]
Description=Jupyter Notebook
[Service]
ExecStart=$HOME/miniconda/bin/jupyter-notebook --allow-root --config=$HOME/.jupyter/jupyter_notebook_config.py
Type=simple
PIDFile=/run/jupyter.pid
WorkingDirectory=$HOME
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
EOF
sudo mv /tmp/jupyter-notebook.service /etc/systemd/system/
sudo systemctl enable jupyter-notebook.service
# -----------------------------------------------------------------------------
# 12. Start the Jupyter Notebook Server
# -----------------------------------------------------------------------------
echo "Starting Jupyter Notebook Server"
sudo systemctl daemon-reload
sudo systemctl restart jupyter-notebook.service
#$HOME/miniconda/bin/jupyter-notebook --allow-root --config=$HOME/.jupyter/jupyter_notebook_config.py
After this, I start Jupyter Notebook with $HOME/miniconda/bin/jupyter-notebook --allow-root --config=$HOME/.jupyter/jupyter_notebook_config.py
Jupyter Notebook starts successfully.
When I run this code in the notebook:
from dask_yarn import YarnCluster
from dask.distributed import Client
# Create a cluster
cluster = YarnCluster()
# Connect to the cluster
client = Client(cluster)
it raises the following error:
AttributeError Traceback (most recent call last)
Input In [3], in <cell line: 1>()
----> 1 client = Client(cluster)
File ~/miniconda/lib/python3.9/site-packages/distributed/client.py:835, in Client.__init__(self, address, loop, timeout, set_as_default, scheduler_file, security, asynchronous, name, heartbeat_interval, serializers, deserializers, extensions, direct_to_workers, connection_limit, **kwargs)
832 elif isinstance(getattr(address, "scheduler_address", None), str):
833 # It's a LocalCluster or LocalCluster-compatible object
834 self.cluster = address
--> 835 status = getattr(self.cluster, "status")
836 if status and status in [Status.closed, Status.closing]:
837 raise RuntimeError(
838 f"Trying to connect to an already closed or closing Cluster {self.cluster}."
839 )
AttributeError: 'YarnCluster' object has no attribute 'status'
Also, when I use LocalCluster instead of YarnCluster, it runs perfectly. I have been stuck on this for days; please help. Also, how do we configure the worker nodes?
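The failure can be reproduced without a cluster. The sketch below mirrors the two checks of `Client.__init__` quoted in the traceback: an object with a string `scheduler_address` is treated as cluster-like, and its `status` attribute is then read without a default. `FakeCluster` and `read_status` are hypothetical stand-ins, not part of dask-yarn or distributed, and the snippet only shows where the AttributeError comes from, not how to repair the installation.

```python
class FakeCluster:
    """Hypothetical stand-in: has scheduler_address but no status attribute."""

    scheduler_address = "tcp://127.0.0.1:8786"


def read_status(address):
    # Same checks as the lines shown in the traceback (client.py:832-835).
    if isinstance(getattr(address, "scheduler_address", None), str):
        return getattr(address, "status")  # no default, so this can raise
    return None


try:
    read_status(FakeCluster())
except AttributeError as exc:
    print(exc)
```

This suggests that the installed `distributed` expects cluster objects to expose `status`, while the installed `YarnCluster` does not, so comparing the versions of the two packages is a reasonable first check.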
I built a Docker Image for an R Shiny App and ran the corresponding container with Docker Toolbox on Windows 10 Home. When trying to open the App with my web browser, only the index is shown. I don't know why the app isn't executed.
The log shows me this:
*** warning - no files are being watched ***
[2019-08-12T15:34:42.688] [INFO] shiny-server - Shiny Server v1.5.12.1 (Node.js v10.15.3)
[2019-08-12T15:34:42.704] [INFO] shiny-server - Using config file "/etc/shiny-server/shiny-server.conf"
[2019-08-12T15:34:43.100] [INFO] shiny-server - Starting listener on http://[::]:3838
I already specified the app host-to-container path by executing the following command which refers to a docker hub image:
docker run --rm -p 3838:3838 -v /C/Docker/App/:/srv/shinyserver/ -v /C/Docker/shinylog:/var/log/shiny-server/ didsh123/ps_app:heatmap
My Docker File looks like the following:
# get shiny server plus tidyverse packages image
FROM rocker/shiny-verse:latest
# system libraries of general use
RUN apt-get update && apt-get install -y \
sudo \
pandoc \
pandoc-citeproc \
libcurl4-gnutls-dev \
libcairo2-dev \
libxt-dev \
libssl-dev \
libssh2-1-dev
##Install R packages that are required--> were already successful
RUN R -e "install.packages(c('shinydashboard','shiny', 'plotly', 'dplyr', 'magrittr'))"
#Heatmap related packages
RUN R -e "install.packages('gpclib', type='source')"
RUN R -e "install.packages('rgeos', type='source')"
RUN R -e "install.packages('rgdal', type='source')"
# copy app to image
COPY ./App /srv/shiny-server/App
# add .conf file to image/container to preserve log file
COPY ./shiny-server.conf /etc/shiny-server/shiny-server.conf
##When run image and create a container, this container will listen on port 3838
EXPOSE 3838
###Avoiding running as root --> run container as user instead
# allow permission
RUN sudo chown -R shiny:shiny /srv/shiny-server
# execute the following as user --> important to give permission before that step
USER shiny
##run app
CMD ["/usr/bin/shiny-server.sh"]
So when I open the Docker IP and the mapped port in the browser, the app should run there, but only the index is displayed. I use the following address:
http://192.168.99.100:3838/App/
I'm glad for every hint or advice. I'm new to Docker, so I'm also happy for detailed explanations.
To use shiny with Docker, I suggest you use the golem package. golem provides a framework for building shiny apps. If you have an app developed according to their framework, the function golem::add_dockerfile() can be used to create Dockerfiles automatically.
If you are not interested in a framework, you can still have a look at the source for add_dockerfile() to see how they manage the deployment. Their strategy is to use shiny::runApp() with the port argument. Therefore, shiny-server is not necessary in this case.
The Dockerfile in golem looks roughly like this
FROM rocker/tidyverse:3.6.1
RUN R -e 'install.packages("shiny")'
COPY app.R /app.R
EXPOSE 3838
CMD R -e 'shiny::runApp("app.R", port = 3838, host = "0.0.0.0")'
This will make the app available on port 3838. Of course, you will have to install any other R packages and system dependencies.
Additional tips
To increase reproducibility, I would suggest you use remotes::install_version() instead of install.packages().
If you are going to deploy several applications with similar dependencies (for example shinydashboard), it makes sense to write your own base image that can be used in place of rocker/tidyverse:3.6.1. This way, your builds will be much quicker.
Minimal Reproducible Example
1. Create an empty directory (can be called anything you want).
2. Inside it, create two things:
   i. A file called Dockerfile
   ii. An empty directory called app
3. Place your shiny app inside the directory called app.
   Your shiny app can be a single app.R file, or, for older shiny apps, ui.R and server.R. Either way is fine (see here for more on that).
   If unsure about any of the above, just copy the files found here.
4. Place this in Dockerfile:
   FROM rocker/shiny:latest
   COPY ./app/* /srv/shiny-server/
   CMD ["/usr/bin/shiny-server"]
5. In the terminal, cd to the root of the empty directory you created in step 1, and build the image with
   docker build -t shinyimage .
6. Run the container with
   docker run -p 3838:3838 shinyimage
7. Finally, visit this url to see your shiny app: http://localhost:3838/
Here's a copy of all of the above in case anything's unclear.
Check the logs for any useful information, and exec into the container to verify that the App content was copied to the correct location.
The way the /App content is mounted looks incorrect.
The content of /App is copied into the image at /srv/shiny-server/App during the build phase, and you are then trying to override the /srv/shiny-server content with the -v option when running the container.
It looks like the copied App data is being overwritten at runtime.
Try without -v /C/Docker/App/:/srv/shinyserver/ or use -v /C/Docker/App/:/srv/shinyserver/App/
docker run --rm -p 3838:3838 -v /C/Docker/shinylog:/var/log/shiny-server/ didsh123/ps_app:heatmap
I'm trying to put my shiny app in a Docker container. My shiny app works totally fine on my local computer, but after dockerizing it, I always get an error message on localhost: The application failed to start. The application exited during initialization.
I have no idea why that happens. I'm new to Docker. How can I find the error logs when I run the Docker image? I need the log to know what goes wrong.
Here is my Dockerfile:
# Install R version 3.6
FROM r-base:3.6.0
# Install Ubuntu packages
RUN apt-get update && apt-get install -y \
sudo \
gdebi-core \
pandoc \
pandoc-citeproc \
libcurl4-gnutls-dev \
libcairo2-dev/unstable \
libxt-dev \
libssl-dev
# Download and install ShinyServer (latest version)
RUN wget --no-verbose https://s3.amazonaws.com/rstudio-shiny-server-os-build/ubuntu-12.04/x86_64/VERSION -O "version.txt" && \
VERSION=$(cat version.txt) && \
wget --no-verbose "https://s3.amazonaws.com/rstudio-shiny-server-os-build/ubuntu-12.04/x86_64/shiny-server-$VERSION-amd64.deb" -O ss-latest.deb && \
gdebi -n ss-latest.deb && \
rm -f version.txt ss-latest.deb
# Install R packages that are required
# TODO: add further package if you need!
RUN R -e "install.packages(c( 'tidyverse', 'ggplot2','shiny','shinydashboard', 'DT', 'plotly', 'RColorBrewer'), repos='http://cran.rstudio.com/')"
# Copy configuration files into the Docker image
COPY shiny-server.conf /etc/shiny-server/shiny-server.conf
COPY /app /srv/shiny-server/
# Make the ShinyApp available at port 80
EXPOSE 80
# Copy further configuration files into the Docker image
COPY shiny-server.sh /usr/bin/shiny-server.sh
CMD ["/usr/bin/shiny-server.sh"]
I built image and ran like below:
docker build -t myshinyapp .
docker run -p 80:80 myshinyapp
Usually the logs for any (live or dead) container can be found by just using:
docker logs full-container-name
or
docker logs CONTAINERID
(replacing CONTAINERID with the actual ID of your container)
As mentioned, this also works for stopped (but not yet removed) containers, which you can list with:
docker container ls -a
or just
docker ps -a
However, sometimes you won't even have a log, since the container was never created at all (which, in my experience, is more likely your case).
This can happen simply because the Docker engine is unable to allocate all of the resources that your service definition requires.
The application failed to start. The application exited during initialization
usually reflects your Docker engine being unable to get the required resources.
The most common cause is simply your host ports:
If another service (dockerized or not) is already using the port that you want to use for your service (in your case, port 80), then Docker will be unable to start your container.
So, in short, the easiest fix for that situation (and the first thing to try whenever you face this kind of issue) is to bind another port on your host (say, 8080) to the port 80 that your service listens on inside the container:
docker run -p 8080:80 myshinyapp
The same principle applies to volumes that cannot be allocated (e.g., trying to bind a read-only volume that doesn't actually exist on the host).
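Before choosing a host port for `-p`, it can help to check whether the port is already taken. The following is a minimal sketch that uses only Python's standard `socket` module; `port_is_free` is a name made up for this example. Note that on most systems binding to ports below 1024 requires elevated privileges, so a `False` result for port 80 can also mean "not permitted" rather than "in use".

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


for candidate in (80, 8080):
    state = "free" if port_is_free(candidate) else "in use or not permitted"
    print(candidate, state)
```

If 80 is reported as unavailable and 8080 as free, the `docker run -p 8080:80 myshinyapp` mapping above is the natural choice.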
As an aside:
Since you're not setting a name for your container, you will need to use the container ID when looking for its logs.
But instead of typing (or copy-pasting) the full container ID (usually something like 1283c66babea, or even longer), you can type just the first few characters, and it will still work as expected:
docker logs 1283c6 or docker logs 1283 or even docker logs 128
(as long as no other container ID starts with 128, of course)
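The shortened-ID trick works because a prefix is enough as long as it identifies exactly one container. The sketch below illustrates that rule in isolation; it is not Docker's actual implementation, and both `resolve_prefix` and the sample IDs are made up for the example.

```python
from typing import Iterable


def resolve_prefix(prefix: str, container_ids: Iterable[str]) -> str:
    """Return the single ID starting with prefix; raise if none or several match."""
    matches = [cid for cid in container_ids if cid.startswith(prefix)]
    if len(matches) != 1:
        raise LookupError(f"{len(matches)} containers match prefix {prefix!r}")
    return matches[0]


ids = ["1283c66babea", "9f0e2d4c7a11"]  # made-up IDs for illustration
print(resolve_prefix("128", ids))  # prints 1283c66babea
```

With a second ID such as `12aa00000000` in the list, the prefix `1` would match two containers and the lookup would fail, which is the situation the note above warns about.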
This Q&A is a response to this comment. The answer to the question in the comment is not trivial, is too big for a comment, and not suitable as an answer to the question in that thread (answering my own question is officially encouraged). If you have a better answer please post it!
The question is: How to install R on Solaris on a VirtualBox virtual machine?
A more up-to-date version is available from csw: r_base. To install, see the example in Getting started where you replace vim with r_base:
pkgadd -d http://get.opencsw.org/now
/opt/csw/bin/pkgutil -U
/opt/csw/bin/pkgutil -a r_base
/opt/csw/bin/pkgutil -y -i r_base
To install a development environment, you might also want:
/opt/csw/bin/pkgutil -y -i gcc4g++
/opt/csw/bin/pkgutil -y -i texlive
Start by downloading and installing Oracle VM VirtualBox.
Then download and unzip the Oracle Solaris 11.1 VirtualBox Template. After you unzip the Oracle template you should see a file called OracleSolaris11_1.ova, that's what you'll open in VirtualBox.
Start VirtualBox, click on File, then Import Appliance, then navigate to and choose the ova file you just extracted. It will take some time to import.
Start the Solaris virtual machine by clicking on the start button on VirtualBox. It will take some time to start up and you'll be prompted to add a root password, user name and user password. You'll then use those details to log in, wait for the system to load, choose gnome to ensure you get a desktop environment, and choose your time zone, keyboard layout and language (mine seems to highlight Chinese as the default choice, so be careful not to click through that one too quickly).
Eventually you'll get a desktop, right-click on the desktop and click open terminal, then in the terminal type (or paste):
sudo wget https://oss.oracle.com/ORD/ord-3.0.1-sol10-x86-64-sunstudio12u3.tar.gz && sudo wget https://oss.oracle.com/ORD/ord-3.0.1-supporting-sol10-x86-64-sunstudio12u3.tar.gz
That will connect to the internet and download two files you need. The next line will unpack those two archives:
sudo tar -xzvf ord-3.0.1-sol10-x86-64-sunstudio12u3.tar.gz && sudo tar -xzvf ord-3.0.1-supporting-sol10-x86-64-sunstudio12u3.tar.gz
And then this next line installs R; watch for the prompts after you run it:
sudo bash install.sh
A lot will flash by in the terminal, concluding with Installation of <ORD> was successful
Now the next bit is where I deviate from the instructions here because I didn't understand them. You'll move all files beginning with lib from the archives that you unpacked into another directory where they are needed by R:
sudo mv lib* /usr/lib/64/R/lib/
That will return nothing in the terminal. Then we can run R simply by typing in the terminal like so
R
And now you should have a regular R session running in the terminal.
I have the following script for some number crunching
#!/bin/bash
sudo apt-get update -y
sudo apt-get upgrade -y
sudo apt-get install -y r-base r-base-dev htop s3cmd p7zip-full
wget https://s3.amazonaws.com/#######/###.7z
7z e ###.7z
sudo R CMD BATCH --slave --no-timing --vanilla "--args 0 1 100 200 500 2" SOME-ROUTINE.R
s3cmd put *.results s3://#########/
on EC2. I upload the script as a file under Launch Instance -> Instance Details -> User Data.
The machine fires up, updates and upgrades, but then it does not execute wget and does not download the file. When I SSH into the instance and run the exact same commands, the process completes without problems.
Any ideas why wget does not work?
Any other alternatives?
It is always a bit of guessing, but here is how I would debug this:
My first suggestion would be to check for special characters in the S3 URL. This might cause the wget call to fail.
Second, I would give an explicit output path to wget with the -O option. While you are editing the command, you can also add -o to output logging information.
Last step is to check your access rights to the S3 bucket. Perhaps you can try to put the file on another webspace to see if the command executes then.
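The second suggestion (an explicit output path plus a log file) can also be sketched in Python, which may be easier to debug than a bare wget call in user data. This is only an illustration using the standard library; `download`, the file names, and the log format are assumptions made for the example, and the demo deliberately uses a local file:// URL so it runs without network access or S3 credentials.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path


def download(url: str, dest: str, log_path: str) -> bool:
    """Fetch url into dest, appending progress and errors to log_path."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"fetching {url} -> {dest}\n")
        try:
            with urllib.request.urlopen(url, timeout=30) as resp, open(dest, "wb") as out:
                shutil.copyfileobj(resp, out)
        except (OSError, ValueError) as exc:  # URLError is a subclass of OSError
            log.write(f"FAILED: {exc!r}\n")
            return False
        log.write("OK\n")
        return True


# Demo against a local file:// URL so the sketch runs without network access.
workdir = Path(tempfile.mkdtemp())
source = workdir / "input.bin"
source.write_bytes(b"example")
ok = download(source.as_uri(), str(workdir / "copy.bin"), str(workdir / "download.log"))
print(ok)
print((workdir / "download.log").read_text())
```

Because every attempt is written to the log file, a failed download during instance start-up leaves a trace that can be inspected after logging in over SSH.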