Adding DT R packages prevents Docker from running - r

I use Docker to run Shiny app, and I install some R packages in the Dockerfile (here is the portion of the Dockerfile, I omitted some lines, marking it with <...>):
FROM r-base:latest
RUN apt-get update && apt-get install -y -t unstable \
sudo \
gdebi-core \
make \
git \
gcc \
<...>
R -e "install.packages(c('shiny', 'rmarkdown'), repos='https://cran.rstudio.com/')" && \
R -e "install.packages(c('ada','bsplus','caret','ddalpha','diptest','doMC','dplyr','e1071','evtree','fastAdaboost','foreach','GGally','ggplot2','gridExtra','iterators','kernlab','lattice','markdown','MASS','mboost','nnet','optparse','partykit','plyr','pROC','PRROC','randomForest','recipes','reshape2','RSNNS','scales','shinyBS','shinyFiles','shinythemes'))"
This works fine. But if I add one more R package (DT), the container still builds fine (and I can see that the package gets installed properly) but when I try to run the container I get:
Loading required package: shiny
Error in dir.exists(lib) : invalid filename argument
Calls: <Anonymous> ... load_libraries -> get_package -> install.packages -> dir.exists
Execution halted
This error is not informative at all and I can't figure out what possibly can be wrong. I will appreciate any ideas! Thank you.

Related

Docker does not find R despite pre-built layers

I create a docker image to run R scripts on a VM server with no access to the internet.
For the first layer I load R and all libraries
Dockerfile1
FROM r-base
## Needed to access R
ENV R_HOME /usr/lib/R
## install required libraries
RUN apt-get update
RUN apt-get -y install libgdal-dev
## install R-packages
RUN R -e "install.packages('dplyr',dependencies=TRUE, repos='http://cran.rstudio.com/')"
...
and create it
docker build -t mycreate_od/libraries -f Dockerfile1 .
Then I use this library layer to load the R script
Dockerfile2
FROM mycreate_od/libraries
## Create directory
RUN mkdir -p /home/analysis/
## Copy files
COPY my_script_dir /home/analysis/
## Run the script
CMD R -e "source('/home/analysis/my_script_dir/main.R')"
Create the analysis layer
docker build -t mycreate_od/analysis -f vault/Dockerfile2 .
On my master VM, this runs and suceeds, but on the fresh VM I get
docker run mycreate_od/analysis
R docker ERROR: R_HOME ('/usr/lib/R') not found - Recherche Google
From a previous bug search I have set the ENV variable in the Docker (see Dockerfile1),
but it looks like docker installs R on some other place.
Thanks to Dirk advice I managed to get it done using r-apt (see Dockerfile below).
The image get then built and can be run without the R_HOME error.
BTW much faster and with a significantly smaller resulting image.
FROM rocker/r-apt:bionic
RUN apt-get update && \
apt-get -y install libgdal-dev && \
apt-get install -y -qq \
r-cran-dplyr \
r-cran-rjson \
r-cran-data.table \
r-cran-crayon \
r-cran-geosphere \
r-cran-lubridate \
r-cran-sp \
r-cran-R.utils
RUN R -e 'install.packages("tools")'
RUN R -e 'install.packages("rgdal", repos="http://R-Forge.R-project.org")'
This is unfortunately a cargo-boat solution, as I am unable to explain why the previous Dockerfile failed.

Why renv in docker is not installing and loading packages?

This is the first time I'm using Docker. I have tried a few times but this is the first time where I'm containerizing the project. The project generates reports using latex and officer. However, after building the container and running the image, Rstudio shows that none of the packages are installed and loaded.
Dockerfile
FROM rocker/verse:4.0.5
ENV RENV_VERSION 0.15.1
RUN R -e "install.packages('remotes', repos = c(CRAN = 'https://cloud.r-project.org'))"
RUN R -e "remotes::install_github('rstudio/renv#${RENV_VERSION}')"
WORKDIR /REAL
COPY . .
RUN apt-get -y update --fix-missing
#RUN apt-get -y install build-essential \
# libssl-dev \
# libcurl4-openssl-dev \
# libxml2-dev \
# libfontconfig1-dev \
# libcairo2-dev
RUN R -e 'options(renv.consent = TRUE); renv::restore(lockfile = "/REAL/renv.lock"); renv::activate()'
Building Docker image
docker build -t reports/real .
Running docker with rstudio support
docker run --rm -d -p 8787:8787 --name real_report -e ROOT=TRUE -e DISABLE_AUTH=TRUE -v C:/Users/test/sources/REAL_Reports:/home/rstudio/REAL_Reports reports/real:latest
What am I doing wrong and how do I fix it so that all packages that renv has discovered are installed and loaded?
RUN R -e 'options(renv.consent = TRUE); renv::restore(lockfile = "/REAL/renv.lock"); renv::activate()'
You likely need to activate / load your project before calling restore(), or just forego using a project-local library altogether.
https://rstudio.github.io/renv/articles/docker.html may also be useful here.

Sparklyr fails to download Spark from apache in Dockerfile

I am trying to create a dockerfile that builds an image from Rocker/tidyverse and include Spark from sparklyr. Previously, on this post: Unable to install spark with sparklyr in Dockerfile, I was trying to figure out why spark wouldn't download from my dockerfile. After playing with it for the past 5 days I think I have found the reason but have no idea how to fix it.
Here is my Dockerfile:
# start with the most up-to-date tidyverse image as the base image
FROM rocker/tidyverse:latest
# install openjdk 8 (Java)
RUN apt-get update \
&& apt-get install -y openjdk-8-jdk
# Install devtools
RUN Rscript -e 'install.packages("devtools")'
# Install sparklyr
RUN Rscript -e 'devtools::install_version("sparklyr", version = "1.5.2", dependencies = TRUE)'
# Install spark
RUN Rscript -e 'sparklyr::spark_install(version = "3.0.0", hadoop_version = "3.2")'
RUN mv /root/spark /opt/ && \
chown -R rstudio:rstudio /opt/spark/ && \
ln -s /opt/spark/ /home/rstudio/
RUN apt-get install unixodbc unixodbc-dev --install-suggests
RUN apt-get install odbc-postgresql
RUN install2.r --error --deps TRUE DBI
RUN install2.r --error --deps TRUE RPostgres
RUN install2.r --error --deps TRUE dbplyr
It has no problem downloading everything up until this line:
RUN Rscript -e 'sparklyr::spark_install(version = "3.0.0", hadoop_version = "3.2")'
Which then gives me the error:
Step 5/11 : RUN Rscript -e 'sparklyr::spark_install(version = "3.0.0", hadoop_version = "3.2")'
---> Running in 739775db8f12
Error in download.file(installInfo$packageRemotePath, destfile = installInfo$packageLocalPath, :
download from 'https://archive.apache.org/dist/spark/spark-3.0.0/spark-3.0.0-bin-hadoop3.2.tgz' failed
Calls: <Anonymous>
Execution halted
ERROR: Service 'rocker_sparklyr' failed to build : The command '/bin/sh -c Rscript -e 'sparklyr::spark_install(version = "3.0.0", hadoop_version = "3.2")'' returned a non-zero code: 1
After doing some research I thought that it was a timeout error, in which case I ran beforehand:
RUN Rscript -e 'options(timeout=600)'
This did not increase the time it took to error out again. I installed everything onto my personal machine through Rstudio and it installed with no problems. I think the problem is specific to docker in that it isn't able to download from https://archive.apache.org/dist/spark/spark-3.0.0/spark-3.0.0-bin-hadoop3.2.tgz
I have found very little documentation on this problem and am relying heavily on this post to figure it out. Thank you in advance to anyone with this knowledge for reaching out.
download the version yourself and then use this function to install
sparklyr::spark_install_tar(tarfile ="~/spark/spark-3.0.1-bin-hadoop3.2.tgz")

cannot install devtools in r shiny docker

I'm trying to build a docker image for my shiny app. Below is my dockerfile. When I build my images, everything else seems fine, except I got error message Error in library(devtools) : there is no package called ‘devtools’ Execution halted. I also tried devtools::install_github('nik01010/dashboardthemes') with no success. I have non clue why? What could go wrong? Do anyone know what is wrong with my dockerfile? Thanks a lot.
# Install R version 3.6
FROM r-base:3.6.0
# Install Ubuntu packages
RUN apt-get update && apt-get install -y \
sudo \
gdebi-core \
pandoc \
pandoc-citeproc \
libcurl4-gnutls-dev \
libcairo2-dev/unstable \
libxt-dev \
libssl-dev
# Download and install ShinyServer (latest version)
RUN wget --no-verbose https://s3.amazonaws.com/rstudio-shiny-server-os-build/ubuntu-12.04/x86_64/VERSION -O "version.txt" && \
VERSION=$(cat version.txt) && \
wget --no-verbose "https://s3.amazonaws.com/rstudio-shiny-server-os-build/ubuntu-12.04/x86_64/shiny-server-$VERSION-amd64.deb" -O ss-latest.deb && \
gdebi -n ss-latest.deb && \
rm -f version.txt ss-latest.deb
# Install R packages that are required
RUN R -e "install.packages(c('devtools', 'shiny','shinythemes','shinydashboard','shinyWidgets','shinyjs', 'tidyverse', 'dplyr', 'ggplot2','rlang','DT','lubridate', 'plotly', 'leaflet', 'mapview', 'tigris', 'rgdal', 'visNetwork', 'wordcloud2', 'arules'), repos='http://cran.rstudio.com/')"
RUN R -e "library(devtools)"
RUN R -e "install_github('nik01010/dashboardthemes')"
# Copy configuration files into the Docker image
COPY shiny-server.conf /etc/shiny-server/shiny-server.conf
COPY /app /srv/shiny-server/
# Make the ShinyApp available at port 80
EXPOSE 80
# Copy further configuration files into the Docker image
COPY shiny-server.sh /usr/bin/shiny-server.sh
CMD ["/usr/bin/shiny-server.sh"]
There are a few approaches you might try.
Easiest:
Use remotes::install_github instead of devtools. remotes has many fewer dependencies if you don't need the other functionality.
Second Easiest:
Use rocker/tidyverse image from Docker Hub instead of baseR image.
docker pull rocker/tidyverse
Change line 2:
FROM rocker/verse
Hardest:
Otherwise, you will need to figure out which dependencies you need to install inside your docker image before you can install devtools. It will likely be obvious if you try to install it interactively.
Make sure the container is running
Get the container name using docker ps
Start a shell with docker exec -it <container name> /bin/bash
Start R and try to install devtools interactively

adding R package for installation from Github in Dockerfile

How must I add installation instruction for a R package using github with my Dockerfile.
The usual command in R environment is:
devtools::install_github("smach/rmiscutils")
But no success so far. Tried to add github repo to installation instructions:
RUN install2.r --error \
-r 'http://cran.rstudio.com' \
-r 'http://github.com/smach/rmiscutils'
But I get an error:
error in download. Status was '404 Not found'
Maybe using vanilla R call but can't figure the command.
Any Hint?
Try this:
RUN installGithub.r smach/rmiscutils \
&& rm -rf /tmp/downloaded_packages/
A good practice is to remove all downloaded packages to reduce image size.
Here added command when installing from Bioconductor (in case someone ends here searching for help)
RUN install2.r -r http://bioconductor.org/packages/3.0/bioc --deps TRUE \
phyloseq \
&& rm -rf /tmp/downloaded_packages/

Resources