Building a Docker Image a little more quickly [closed] - r

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 2 years ago.
I plan on using a service to receive and grade student code submissions for a class I'll be teaching next semester.
For each assignment (there are many) a shell script runs to build a Docker image. I upload a zip file to the website, and among all the compressed files, there is this one:
#!/usr/bin/env bash
# these lines install R on the virtual machine
apt-get install -y libxml2-dev libcurl4-openssl-dev libssl-dev
apt-get install -y r-base
# these lines install the packages that are needed by both
# 1. the student code
# 2. the autograding code
# Note that
# a. devtools is for install_github. This is temporary and will be changed once the updates to gradeR have made it to CRAN.
Rscript -e "install.packages('devtools')"
Rscript -e "library(devtools); install_github('tbrown122387/gradeR')"
# These are packages that many students in the class will use
Rscript -e "install.packages('tidyverse')"
Rscript -e "install.packages('stringr')"
The problem though is that this takes about 20 minutes. How do I speed this up? I'm totally new to Docker containers.

First, I'd suggest building a base image containing all of the tools and packages that you think you'll need. There's no need to be picky, because you only need to do this once. That's kind of the whole point of Docker -- portability and reuse.
FROM ubuntu:bionic
RUN apt-get update && apt-get install -y libxml2-dev libcurl4-openssl-dev libssl-dev r-base
RUN Rscript -e "install.packages('tidyverse')"
RUN Rscript -e "install.packages('stringr')"
...
Build that image and tag it as grader:1.0.0 or whatever.
Then, when it's time to grade, just mount the assignments and grading code using the -v, --volume option to docker run. You don't need to alter the container to make files accessible within it.
docker run \
--rm \
-it \
-v /path/to/assignments:/data/assignments \
-v /path/to/autograder:/data/autograder \
grader:1.0.0 \
/bin/bash
If at some point you need to add some packages, you can rebuild it by modifying the original Dockerfile or extend it by using it as the base of your next image:
FROM grader:1.0.0
RUN apt-get update && apt-get install -y the-package-i-forgot
Build it, tag it.
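As a rough sketch of the build-once, mount-at-grading-time workflow described above: the tag and mount points come from this answer, while the helper names and the grade.R entry script are hypothetical placeholders.

```shell
#!/usr/bin/env bash
# Sketch only: function names and grade.R are illustrative assumptions.
IMAGE="grader:1.0.0"   # tag suggested in the answer

# Build the base image only if the tag is not already present locally.
build_base_image() {
  if ! docker image inspect "$IMAGE" >/dev/null 2>&1; then
    docker build -t "$IMAGE" .
  fi
}

# Grade one assignment by mounting host directories into the container.
# $1: host path to the submissions, $2: host path to the autograder code
grade_assignment() {
  docker run --rm \
    -v "$1":/data/assignments \
    -v "$2":/data/autograder \
    "$IMAGE" \
    Rscript /data/autograder/grade.R   # hypothetical entry point
}
```

Calling build_base_image from the directory that holds the Dockerfile pays the slow package installation once; each later grade_assignment call only starts a container from the cached image.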

Use the rocker/tidyverse image from Docker Hub instead of whatever image you're using.
First:
docker pull rocker/tidyverse
Then use it as the base image in your Dockerfile:
FROM rocker/tidyverse

Related

Shiny app does not appear when deployed using Shinyproxy

I'm trying to learn how to deploy a shiny app using Shinyproxy, and I'm using the templated "euler app" (from this repo), but the application does not appear when I navigate to http://localhost:4445. Here's the most similar question I could find, but unfortunately not helpful to my issue: link.
Background
All installations seem fine, and I successfully installed Docker and Java.
The Dockerfile and Docker image work locally, no issues there. The command docker run --rm -p 3838:3838 shiny-euler-app works.
Here is my Dockerfile (copied from the repo):
FROM openanalytics/r-base
MAINTAINER Tobias Verbeke "tobias.verbeke@openanalytics.eu"
# system libraries of general use
RUN apt-get update && apt-get install -y \
sudo \
pandoc \
pandoc-citeproc \
libcurl4-gnutls-dev \
libcairo2-dev \
libxt-dev \
libssl-dev \
libssh2-1-dev \
libssl1.1
# system library dependency for the euler app
RUN apt-get update && apt-get install -y \
libmpfr-dev
# basic shiny functionality
RUN R -e "install.packages(c('shiny', 'rmarkdown'), repos='https://cloud.r-project.org/')"
# install dependencies of the euler app
RUN R -e "install.packages('Rmpfr', repos='https://cloud.r-project.org/')"
# copy the app to the image
RUN mkdir /root/euler
COPY euler /root/euler
COPY Rprofile.site /usr/lib/R/etc/
EXPOSE 3838
CMD ["R", "-e", "shiny::runApp('/root/euler')"]
As well, Shinyproxy works fine with the default openanalytics/shinyproxy-demo Docker image.
Problem
The issue I have is when I try and supply a different Shiny app and its accompanying application.yml. Here is the application.yml file I'm using (I've tried to make it as basic as possible, with no authentication, etc):
proxy:
  title: Standalone Docker Engine
  port: 4445
  authentication: none
  docker:
    url: http://localhost:2375
  specs:
  - id: euler
    display-name: Euler's number
    container-cmd: ["R", "-e", "shiny::runApp('/root/euler')"]
    container-image: shiny-euler-app
Unfortunately, when I run java -jar shinyproxy-2.4.2.jar (in the directory which contains the shinyproxy-2.4.2.jar file and the application.yml file) I get a blank webpage.
For some reason, I am able to access the Shinyproxy webpage, but the Dockerized Shiny app does not appear.
Would really appreciate any helpful suggestions on where/how I could try and solve this issue. Thanks!

Application logs to stdout with Shiny Server and Docker

I have a Docker container running a shiny app (Dockerfile here).
Shiny server logs are output to stdout and application logs are written to /var/log/shiny-server. I'm deploying this container to AWS Fargate and logging applications only display stdout which makes debugging an application when deployed challenging. I'd like to write the application logs to stdout.
I've tried a number of potential solutions:
I've tried the solution provided here, but have had no luck. I added exec xtail /var/log/shiny-server/ as the last line of my shiny-server.sh, but app logs are still not written to stdout.
I noticed that writing application logs to stdout is now the default behavior in rocker/shiny, but as I'm using rocker/verse:3.6.2 (upgraded from 3.6.0 today) along with RUN export ADD=shiny, I don't think this is standard behavior for the rocker/verse:3.6.2 container with Shiny add-on. As a result, I don't get the default behavior out of the box.
This issue on github suggests an alternative method of forcing application logging to stdout by way of an environment variable SHINY_LOG_STDERR=1 set at runtime but I'm not Linux-savvy enough to know where that env variable needs to be set to be effective. I found this documentation from Shiny Server v1.5.13 which gave suggestions in which file to set the environment variable depending on Linux distro; however, the output from my container when I run cat /etc/os-release is:
which doesn't really line up with any of the distributions in the Shiny Server documentation, thus making the documentation unhelpful.
I tried adding the environment variable from the GitHub issue above to the docker run command, i.e.,
docker run --rm -e SHINY_LOG_STDERR=1 -p 3838:3838 [my image]
as well as
docker run --rm -e APPLICATION_LOGS_TO_STDOUT=true -p 3838:3838 [my image]
and am still not getting the logs to stdout.
I must be missing something here. Can someone help me identify how to get application logs written to stdout?
You can add the line ENV SHINY_LOG_STDERR=1 to your Dockerfile (at least, this works with rocker/shiny, not sure about rocker/verse), such as with your Dockerfile:
FROM rocker/verse:3.6.2
## Add shiny capabilities to container
RUN export ADD=shiny && bash /etc/cont-init.d/add
## Install curl and xtail
RUN apt-get update && apt-get install -y \
curl \
xtail
## Add pip3 and other Python packages
RUN sudo apt-get update -y && apt-get install -y python3-pip
RUN pip3 install boto3
## Add R packages
RUN R -e "install.packages(c('shiny', 'tidyverse', 'tidyselect', 'knitr', 'rmarkdown', 'jsonlite', 'odbc', 'dbplyr', 'RMySQL', 'DBI', 'pander', 'sciplot', 'lubridate', 'zoo', 'stringr', 'stringi', 'openxlsx', 'promises', 'future', 'scales', 'ggplot2', 'zip', 'Cairo', 'tinytex', 'reticulate'), repos = 'https://cran.rstudio.com/')"
## Update and install
RUN tlmgr update --self --all
RUN tlmgr install ms
RUN tlmgr install beamer
RUN tlmgr install pgf
#Copy app dir and theme dirs to their respective locations
COPY iarr /srv/shiny-server/iarr
COPY iarr/reports/interim_annual_report/theme/SwCustom /opt/TinyTeX/texmf-dist/tex/latex/beamer/
#Force texlive to find my custom beamer theme
RUN texhash
EXPOSE 3838
## Add shiny-server information
COPY shiny-server.sh /usr/bin/shiny-server.sh
COPY shiny-customized.config /etc/shiny-server/shiny-server.conf
## Add dos2unix to eliminate Win-style line-endings and run
RUN apt-get update -y && apt-get install -y dos2unix
RUN dos2unix /usr/bin/shiny-server.sh && apt-get --purge remove -y dos2unix && rm -rf /var/lib/apt/lists/*
# Send application logs to stderr so they show up in the container's log stream
ENV SHINY_LOG_STDERR=1
RUN ["chmod", "+x", "/usr/bin/shiny-server.sh"]
CMD ["/usr/bin/shiny-server.sh"]
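To confirm that the variable actually made it into the built image, one option is to read it back from the image configuration. This is a minimal sketch, assuming the image was built from a Dockerfile like the one above; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: image_sets_shiny_log_env is an illustrative helper name.
# $1: image name or ID. Succeeds if the image config sets SHINY_LOG_STDERR=1.
image_sets_shiny_log_env() {
  # Print every ENV entry baked into the image, one per line,
  # then look for an exact match.
  docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' "$1" \
    | grep -qx 'SHINY_LOG_STDERR=1'
}
```

If the check passes, docker logs -f on the running container should show the application output, provided Shiny Server honors the variable in this image the way it reportedly does in rocker/shiny.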

How to add plugins via docker compose command option?

I need to download some free plugins from the WordPress website and move those folders to the plugins folder via the command property in docker-compose.
Is there any way to execute a shell script after docker-compose has completed?
> command: apt-get install -y curl curl -SL
https://downloads.wordpress.org/plugin/advanced-custom-fields.5.7.12.zip
or a bash script for this?
The documentation says:
The command can also be a list, in a manner similar to dockerfile:
command: ["bundle", "exec", "thin", "-p", "3000"]
You can:
docker-compose run <your_service> apt-get install -y curl
docker-compose run <your_service> curl -SL https://downloads.wordpress.org/plugin/advanced-custom-fields.5.7.12.zip
You can even do it in 1 command but I'm guessing the installation is one off and you might include that in your image ;)
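Each docker-compose run starts a fresh container, so a tool installed by one run is not available to the next; chaining the steps in a single run avoids that. The sketch below assumes a Debian-based image, the plugin path used by the official WordPress image, and a plugins directory that lives on a mounted volume so the files outlive the temporary container. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: install_plugin and the plugin path are assumptions.
# Compose CLI to use; override with COMPOSE="docker compose" on newer installs.
COMPOSE="${COMPOSE:-docker-compose}"

# $1: service name from docker-compose.yml, $2: plugin zip URL
install_plugin() {
  # Install the tools, download the zip, and unpack it in one container run.
  $COMPOSE run --rm "$1" bash -c \
    "apt-get update && apt-get install -y curl unzip \
     && curl -SL -o /tmp/plugin.zip '$2' \
     && unzip -o /tmp/plugin.zip -d /var/www/html/wp-content/plugins"
}
```

For a one-off installation that every container needs, baking these steps into the image with a Dockerfile is likely the simpler route, as the answer hints.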
Hope this helps.

Run a service automatically in a docker container

I'm setting up a simple image: one that holds Riak (a NoSQL database). The image starts the Riak service with riak start as a CMD. Now, if I run it as a daemon with docker run -d quintenk/riak-dev, it does start the Riak process (I can see that in the logs). However, it closes automatically after a few seconds. If I run it using docker run -i -t quintenk/riak-dev /bin/bash the riak process is not started (UPDATE: see answers for an explanation for this). In fact, no services are running at all. I can start it manually using the terminal, but I would like Riak to start automatically. I figure this behavior would occur for other services as well, Riak is just an example.
So, running/restarting the container should automatically start Riak. What is the correct approach of setting this up?
For reference, here is the Dockerfile with which the image can be created (UPDATE: altered using the chosen answer):
FROM ubuntu:12.04
RUN apt-get update
RUN apt-get install -y openssh-server curl
RUN curl http://apt.basho.com/gpg/basho.apt.key | apt-key add -
RUN bash -c "echo deb http://apt.basho.com precise main > /etc/apt/sources.list.d/basho.list"
RUN apt-get update
RUN apt-get -y install riak
RUN perl -p -i -e 's/(?<=\{http,\s\[\s\{")127\.0\.0\.1/0.0.0.0/g' /etc/riak/app.config
EXPOSE 8098
CMD /bin/riak start && tail -F /var/log/riak/erlang.log.1
EDIT: -f changed to -F in CMD in accordance with sesm's remark
MY OWN ANSWER
After working with Docker for some time I picked up the habit of using supervisord to run my processes. If you would like example code for that, check out https://github.com/Krijger/docker-cookbooks. I use my supervisor image as a base for all my other images. I blogged on using supervisor here.
To keep docker containers running, you need to keep a process active in the foreground.
So you could probably replace that last line in your Dockerfile with
CMD /bin/riak console
Or even
CMD /bin/riak start && tail -F /var/log/riak/erlang.log.1
Note that you can't have multiple lines of CMD statements, only the last one gets run.
Using tail to keep the container alive is a hack. Also note that with the -f option the container will terminate when log rotation happens (this can be avoided by using -F instead).
A better solution is to use supervisor. Take a look at this tutorial about running Riak in a Docker container.
The explanation for:
If I run it using docker run -i -t quintenk/riak-dev /bin/bash the riak process is not started
is as follows. Using CMD in the Dockerfile is actually the same functionality as starting the container using docker run {image} {command}. As Gigablah remarked only the last CMD is used, so the one written in the Dockerfile is overwritten in this case.
By using CMD /bin/riak start && tail -F /var/log/riak/erlang.log.1 in the Dockerfile, you can start the container as a background process using docker run -d {image}, which works like a charm.
"If I run it using docker run -i -t quintenk/riak-dev /bin/bash the riak process is not started"
It sounds like you only want to be able to monitor the log when you attach to the container. My use case is a little different in that I want commands started automatically, but I want to be able to attach to the container and be in a bash shell. I was able to solve both of our problems as follows:
In the image/container, add the commands you want automatically started to the end of the /etc/bash.bashrc file.
In your case just add the line /bin/riak start && tail -F /var/log/riak/erlang.log.1, or put /bin/riak start and tail -F /var/log/riak/erlang.log.1 on separate lines depending on the functionality desired.
Now commit your changes to your container, and run it again with: docker run -i -t quintenk/riak-dev /bin/bash. You'll find the commands you put in the bashrc are already running as you attach.
Because I want a clean way to have the process exit later I make the last command a call to the shell's read which causes that process to block until I later attach to it and hit enter.
arthur@macro:~/docker$ sudo docker run -d -t -i -v /raid:/raid -p 4040:4040 subsonic /bin/bash -c 'service subsonic start && read -p "waiting"'
WARNING: Docker detected local DNS server on resolv.conf. Using default external servers: [8.8.8.8 8.8.4.4]
f27229a260c9
arthur@macro:~/docker$ sudo docker ps
[sudo] password for arthur:
ID IMAGE COMMAND CREATED STATUS PORTS
35f253bdf45a subsonic:latest /bin/bash -c service 2 days ago Up 2 days 4040->4040
arthur@macro:~/docker$ sudo docker attach 35f253bdf45a
arthur@macro:~/docker$ sudo docker ps
ID IMAGE COMMAND CREATED STATUS PORTS
As you can see, the container exits after you attach to it and unblock the read.
You can of course use a more sophisticated script than read -p if you need to do other clean up, such as stopping services and saving logs etc.
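One possible shape for such a script is to block until Docker sends SIGTERM (as docker stop does) and run the cleanup from a trap. This is only a sketch: start_service and stop_service are placeholders for your own commands, and sleep infinity assumes GNU coreutils inside the image.

```shell
#!/usr/bin/env bash
# Sketch only: replace the two placeholder functions with your own commands.
start_service() { service subsonic start; }
stop_service()  { service subsonic stop; }

run_until_stopped() {
  start_service || return 1
  # Park a long sleep in the background; `wait` returns as soon as
  # a trapped signal arrives, so the handler runs promptly.
  sleep infinity &
  sleeper=$!
  # On SIGTERM or SIGINT: clean up, stop the sleeper, and exit cleanly.
  trap "stop_service; kill $sleeper 2>/dev/null; exit 0" TERM INT
  wait "$sleeper"
}
```

Ending the entrypoint script with a call to run_until_stopped keeps the container up without needing an attached terminal, and docker stop then triggers the cleanup instead of waiting for someone to press enter.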
I use a simple trick whenever I start building a new docker container. To keep it alive, I use a ping in the entrypoint script.
So in the Dockerfile, when using debian, for instance, I make sure I can ping.
This is btw, always nice, to check what is accessible from within the container.
...
RUN DEBIAN_FRONTEND=noninteractive apt-get update \
&& apt-get install -y iputils-ping
...
ENTRYPOINT ["entrypoint.sh"]
And in the entrypoint.sh file
#!/bin/bash
...
ping 10.10.0.1 >/dev/null 2>/dev/null
I use this instead of CMD bash, as I always wind up using a startup file.

EC2 startup script gets stuck on wget

I have the following script for some number crunching
#!/bin/bash
sudo apt-get update -y
sudo apt-get upgrade -y
sudo apt-get install -y r-base r-base-dev htop s3cmd p7zip-full
wget https://s3.amazonaws.com/#######/###.7z
7z e ###.7z
sudo R CMD BATCH --slave --no-timing --vanilla "--args 0 1 100 200 500 2" SOME-ROUTINE.R
s3cmd put *.results s3://#########/
on EC2. I upload the script as file at the Launch Instance->Instance Details->User Data
The machine fires up, updates and upgrades, but then it does not execute wget and does not download the file. When I SSH into the instance and run the exact same commands, the process completes without problems.
Any ideas why wget does not work?
Any other alternatives?
It is always a bit of guessing, but here is how I would debug this:
My first suggestion would be to check for special characters in the S3 URL. This might cause the wget call to fail.
Second, I would give an explicit output path to wget with the -O option. While you are editing the command, you can also add -o to output logging information.
Last step is to check your access rights to the S3 bucket. Perhaps you can try to put the file on another webspace to see if the command executes then.
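The first two suggestions can be combined into a small wrapper around the download step. This is a sketch under assumptions: the function name is hypothetical, and the paths passed to it are up to you.

```shell
#!/usr/bin/env bash
# Sketch only: fetch_archive is an illustrative helper name.
# $1: URL (quote it so special characters survive the shell)
# $2: explicit output file, $3: file where wget writes its log
fetch_archive() {
  if ! wget -O "$2" -o "$3" "$1"; then
    # Leave a pointer to the log so the failure can be inspected after boot.
    echo "wget failed for $1, see $3" >&2
    return 1
  fi
}
```

Quoting the URL and using absolute paths for the output and log files removes any dependence on the working directory the startup script happens to run in, and the log file remains available for inspection over SSH if the download fails.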

Resources