Installed packages are not available inside container image - r

The Dockerfile below doesn't install some of the packages ("tidyverse", "odbc") in the Docker image. Building the image shows no error, but when I later run the DAG (which runs my R script), it fails with an error saying that the packages ("tidyverse", "odbc") could not be found.
Why is that? 'http://cran.rstudio.com' should contain both "tidyverse" and "odbc".
Dockerfile:
FROM apache/airflow:1.10.12-python3.8
USER root
RUN apt update -y && apt install -y vim
RUN pip install --upgrade pip
RUN apt-get install -y r-base
RUN echo "r <- getOption('repos'); r['CRAN'] <- 'http://cran.rstudio.com'; options(repos = r);" > ~/.Rprofile
RUN Rscript -e "install.packages('DBI')"
RUN Rscript -e "install.packages('data.table')"
RUN Rscript -e "install.packages('dplyr')"
RUN Rscript -e "install.packages('dbplyr')"
RUN Rscript -e "install.packages('magrittr')"
RUN Rscript -e "install.packages('furrr')"
RUN Rscript -e "install.packages('lubridate')"
RUN Rscript -e "install.packages('future')"
RUN Rscript -e "install.packages('jsonlite')"
RUN Rscript -e "install.packages('odbc')"
RUN Rscript -e "install.packages('tidyverse')"
USER airflow
# Copy files
COPY . ./
# Install dependencies
RUN pip install -r requirements.txt
# Python path
ENV PYTHONPATH "${PYTHONPATH}:/opt/airflow"
RUN airflow initdb
Since I'm struggling with only some packages, the R file can be as simple as:
library("dplyr")
library("tidyverse")
print("text")
The error after running the R file in the DAG:
INFO - Error in library("tidyverse") : there is no package called ‘tidyverse’.
So there is "dplyr", but there isn't "tidyverse".
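A possible explanation, offered as an assumption since the build log is not shown: install.packages() only issues a warning when a package fails to build, so Rscript still exits with status 0 and docker build carries on. odbc and several tidyverse dependencies may compile against system libraries that r-base alone does not provide, while dplyr needs none of them. A minimal sketch that installs some commonly required headers and makes the build stop if either package cannot be loaded (the -dev package list is a guess, not a verified list for this image):

```dockerfile
# Fragment for the Dockerfile above, placed after r-base is installed (still as root).
# Assumption: these are the missing headers; check the build log for the real ones.
RUN apt-get update && apt-get install -y \
        unixodbc-dev \
        libcurl4-openssl-dev \
        libssl-dev \
        libxml2-dev

# library() raises an error if a package is absent, so Rscript exits non-zero
# and docker build fails at this step instead of at DAG run time.
RUN Rscript -e "install.packages(c('odbc', 'tidyverse'), repos = 'https://cloud.r-project.org'); library(odbc); library(tidyverse)"
```

If the build now stops here, the output of this step should name the library or dependency that is actually missing.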

Related

Install R package without biocmanager

I am using the Bioconductor image to install R packages. The problem I am facing is that I can't install a specific version of a package.
I have the following Dockerfile:
FROM bioconductor/bioconductor_docker:bioc2020
RUN apt-get update \
&& apt-get install -y python3-pip python3-dev \
&& cd /usr/local/bin \
&& ln -s /usr/bin/python3 python \
&& pip3 install --upgrade pip
RUN Rscript -e "BiocManager::install('ggplot2')"
RUN Rscript -e "BiocManager::install('DESeq2')"
RUN Rscript -e "BiocManager::install('RColorBrewer')"
RUN Rscript -e "BiocManager::install('ggrepel')"
RUN Rscript -e "BiocManager::install('factoextra')"
RUN Rscript -e "BiocManager::install('FactoMineR')"
RUN Rscript -e "BiocManager::install('apeglm')"
The installation of DESeq2 failed because the current locfit package requires R 4.1.0 or later. I want to install a previous version of locfit, but it seems that I can't, because even with the following command:
RUN Rscript -e "install.packages('locfit', version='1.5-9.4')"
it actually uses BiocManager.
Any help will be useful!
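For context, install.packages() has no version argument, so the command above still asks for the current locfit release. One way to pin a version is remotes::install_version(). A minimal sketch, assuming remotes is not already present in the image and that version 1.5-9.4 is still available in the CRAN archive:

```dockerfile
FROM bioconductor/bioconductor_docker:bioc2020

# Assumption: remotes may not be preinstalled in this image, so install it first.
RUN Rscript -e "install.packages('remotes', repos = 'https://cloud.r-project.org')"

# Pin locfit to the archived release before installing DESeq2.
RUN Rscript -e "remotes::install_version('locfit', version = '1.5-9.4', repos = 'https://cloud.r-project.org')"

# update = FALSE keeps BiocManager from replacing the pinned locfit.
RUN Rscript -e "BiocManager::install('DESeq2', update = FALSE, ask = FALSE)"
```

Installing locfit before DESeq2 matters here: the dependency is then already satisfied when DESeq2 is installed.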

Sparklyr fails to download Spark from apache in Dockerfile

I am trying to create a Dockerfile that builds an image from rocker/tidyverse and includes Spark via sparklyr. Previously, in this post: Unable to install spark with sparklyr in Dockerfile, I was trying to figure out why Spark wouldn't download from my Dockerfile. After playing with it for the past 5 days I think I have found the reason but have no idea how to fix it.
Here is my Dockerfile:
# start with the most up-to-date tidyverse image as the base image
FROM rocker/tidyverse:latest
# install openjdk 8 (Java)
RUN apt-get update \
&& apt-get install -y openjdk-8-jdk
# Install devtools
RUN Rscript -e 'install.packages("devtools")'
# Install sparklyr
RUN Rscript -e 'devtools::install_version("sparklyr", version = "1.5.2", dependencies = TRUE)'
# Install spark
RUN Rscript -e 'sparklyr::spark_install(version = "3.0.0", hadoop_version = "3.2")'
RUN mv /root/spark /opt/ && \
chown -R rstudio:rstudio /opt/spark/ && \
ln -s /opt/spark/ /home/rstudio/
RUN apt-get install -y unixodbc unixodbc-dev --install-suggests
RUN apt-get install -y odbc-postgresql
RUN install2.r --error --deps TRUE DBI
RUN install2.r --error --deps TRUE RPostgres
RUN install2.r --error --deps TRUE dbplyr
It has no problem downloading everything up until this line:
RUN Rscript -e 'sparklyr::spark_install(version = "3.0.0", hadoop_version = "3.2")'
Which then gives me the error:
Step 5/11 : RUN Rscript -e 'sparklyr::spark_install(version = "3.0.0", hadoop_version = "3.2")'
---> Running in 739775db8f12
Error in download.file(installInfo$packageRemotePath, destfile = installInfo$packageLocalPath, :
download from 'https://archive.apache.org/dist/spark/spark-3.0.0/spark-3.0.0-bin-hadoop3.2.tgz' failed
Calls: <Anonymous>
Execution halted
ERROR: Service 'rocker_sparklyr' failed to build : The command '/bin/sh -c Rscript -e 'sparklyr::spark_install(version = "3.0.0", hadoop_version = "3.2")'' returned a non-zero code: 1
After doing some research I thought that it was a timeout error, in which case I ran beforehand:
RUN Rscript -e 'options(timeout=600)'
This did not increase the time it took to error out again. I installed everything onto my personal machine through Rstudio and it installed with no problems. I think the problem is specific to docker in that it isn't able to download from https://archive.apache.org/dist/spark/spark-3.0.0/spark-3.0.0-bin-hadoop3.2.tgz
I have found very little documentation on this problem and am relying heavily on this post to figure it out. Thank you in advance to anyone with this knowledge for reaching out.
Download the version yourself and then use this function to install it:
sparklyr::spark_install_tar(tarfile ="~/spark/spark-3.0.1-bin-hadoop3.2.tgz")
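In Dockerfile terms, that suggestion could look like the sketch below. It also shows an alternative for the timeout attempt: each RUN starts a fresh R session, so options(timeout=600) set in its own RUN has no effect on the later download. Both options are untested assumptions; the tarball name and its location in the build context are hypothetical, and only one of the two options is needed.

```dockerfile
# Option 1 (assumption: the tarball was downloaded by hand into the build context).
COPY spark-3.0.0-bin-hadoop3.2.tgz /tmp/
RUN Rscript -e 'sparklyr::spark_install_tar(tarfile = "/tmp/spark-3.0.0-bin-hadoop3.2.tgz")'

# Option 2: set the timeout in the same R process that performs the download.
# Each RUN starts a fresh R session, so options() set in an earlier RUN are lost.
RUN Rscript -e 'options(timeout = 600); sparklyr::spark_install(version = "3.0.0", hadoop_version = "3.2")'
```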

Minimizing the size of docker image R shiny app

Good afternoon, everybody.
I deployed a Shiny app in Docker.
I need Bioconductor packages; I found a way to install them and the app works properly.
But I have now been told that the image is very large,
probably because of the two layers (one for the app and one for the Bioconductor packages).
I have also read that I should remove the second Docker layer and try to install Bioconductor on the rocker/shiny image, but I do not know how to do that. I attach the Dockerfile.
Does anybody have an idea how to make the image smaller?
# Base image https://hub.docker.com/u/rocker/
FROM rocker/shiny:latest
# system libraries of general use
## install debian packages
RUN apt-get update -qq && \
apt-get upgrade -y && \
apt-get -y --no-install-recommends install \
libxml2-dev \
libcairo2-dev \
libsqlite3-dev \
libmariadbd-dev \
libpq-dev \
libssh2-1-dev \
unixodbc-dev \
libcurl4-openssl-dev \
libssl-dev \
coinor-libcbc-dev coinor-libclp-dev libglpk-dev && \
apt-get clean
# Docker inheritance
FROM bioconductor/bioconductor_docker:RELEASE_3_12
RUN apt-get update
RUN R -e 'BiocManager::install(ask = FALSE)' && R -e 'BiocManager::install(c("rtracklayer", \
"GenomicAlignments", "Biostrings", "SummarizedExperiment", "Rsamtools"), ask = FALSE)'
# copy necessary files
## app folder
COPY ./folder ./app
# install renv & restore packages
RUN Rscript -e 'install.packages("renv")'
RUN Rscript -e 'install.packages("devtools")'
RUN Rscript -e 'install.packages("shiny")'
RUN Rscript -e 'install.packages("shinyBS")'
RUN Rscript -e 'install.packages("ggvis")'
RUN Rscript -e 'install.packages("shinycssloaders")'
RUN Rscript -e 'install.packages("shinyWidgets")'
RUN Rscript -e 'install.packages("plotly")'
RUN Rscript -e 'install.packages("RSQLite")'
RUN Rscript -e 'install.packages("knitr")'
RUN Rscript -e 'install.packages("knitcitations")'
RUN Rscript -e 'install.packages("Matrix")'
RUN Rscript -e 'install.packages("plotly")'
RUN Rscript -e 'install.packages("igraph")'
RUN Rscript -e 'install.packages("ggthemes")'
RUN Rscript -e 'install.packages("evaluate")'
RUN Rscript -e 'install.packages("psych")'
RUN Rscript -e 'install.packages("kableExtra")'
RUN Rscript -e 'install.packages("ggjoy")'
RUN Rscript -e 'install.packages("gtools")'
RUN Rscript -e 'install.packages("gridExtra")'
RUN Rscript -e 'install.packages("ggrepel")'
RUN Rscript -e 'install.packages("data.table")'
RUN Rscript -e 'install.packages("stringr")'
RUN Rscript -e 'install.packages("rmarkdown")'
RUN Rscript -e 'install.packages("shinyjqui")'
RUN Rscript -e 'install.packages("V8")'
RUN Rscript -e 'devtools::install_github("ThomasSiegmund/D3TableFilter")'
RUN Rscript -e 'devtools::install_github("leonawicz/apputils")'
RUN Rscript -e 'devtools::install_github("dirkschumacher/ompr")'
RUN Rscript -e 'devtools::install_github("dirkschumacher/ompr.roi")'
RUN Rscript -e 'install.packages("shinydashboard")'
RUN Rscript -e 'install.packages("dplyr")'
RUN Rscript -e 'install.packages("shinyjs")'
RUN Rscript -e 'install.packages("DT")'
RUN Rscript -e 'install.packages("rhandsontable")'
RUN Rscript -e 'renv::consent(provided = TRUE)'
RUN Rscript -e 'renv::restore()'
# expose port
EXPOSE 8080
# run app on container start
CMD ["R", "-e", "shiny::runApp('/app', host = '0.0.0.0', port = 8080)"]
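With two FROM lines, the final image is built from the last one, so here everything runs on bioconductor/bioconductor_docker and the rocker/shiny stage is discarded. A sketch of the alternative the question mentions, installing Bioconductor packages on top of rocker/shiny in a single stage with fewer layers. The system library and CRAN package lists are abbreviated guesses, and whether the result is actually smaller would need to be checked with docker images:

```dockerfile
FROM rocker/shiny:latest

# System libraries and cleanup in the same layer, so the apt lists are not stored in the image.
# Assumption: this short list is illustrative; the Bioconductor packages may need more headers.
RUN apt-get update -qq \
    && apt-get install -y --no-install-recommends \
        libxml2-dev libcurl4-openssl-dev libssl-dev \
    && rm -rf /var/lib/apt/lists/*

# CRAN packages in one layer; install2.r is assumed to be available, as in other rocker images.
RUN install2.r --error BiocManager shinyBS shinyWidgets plotly DT

# Bioconductor packages; ask = FALSE is an argument of install(), not of c().
RUN Rscript -e 'BiocManager::install(c("rtracklayer", "GenomicAlignments", "Biostrings", "SummarizedExperiment", "Rsamtools"), ask = FALSE)'

COPY ./folder /app
EXPOSE 8080
CMD ["R", "-e", "shiny::runApp('/app', host = '0.0.0.0', port = 8080)"]
```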

Deploy shinyapp in Docker Error in shinyAppDir(x)

I have the following Dockerfile:
# Base image https://hub.docker.com/u/rocker/
FROM rocker/shiny:latest
# system libraries of general use
## install debian packages
RUN apt-get update -qq && apt-get -y --no-install-recommends install \
libxml2-dev \
libcairo2-dev \
libsqlite3-dev \
libmariadbd-dev \
libpq-dev \
libssh2-1-dev \
unixodbc-dev \
libcurl4-openssl-dev \
libssl-dev \
coinor-libcbc-dev coinor-libclp-dev libglpk-dev
## update system libraries
RUN apt-get update && \
apt-get upgrade -y && \
apt-get clean
# copy necessary files
## app folder
COPY ./app ./app
# Docker inheritance
FROM bioconductor/bioconductor_docker:RELEASE_3_12
RUN apt-get update
RUN R -e 'BiocManager::install(ask = FALSE)' && R -e 'BiocManager::install(c("rtracklayer", \
"GenomicAlignments", "Biostrings", "SummarizedExperiment", "Rsamtools"), ask = FALSE)'
# install renv & restore packages
RUN Rscript -e 'install.packages("renv")'
RUN Rscript -e 'install.packages("devtools")'
RUN Rscript -e 'install.packages("shiny")'
RUN Rscript -e 'install.packages("shinyBS")'
RUN Rscript -e 'install.packages("ggvis")'
RUN Rscript -e 'install.packages("shinydashboardPlus")'
RUN Rscript -e 'install.packages("shinycssloaders")'
RUN Rscript -e 'install.packages("shinyWidgets")'
RUN Rscript -e 'install.packages("plotly")'
RUN Rscript -e 'install.packages("RSQLite")'
RUN Rscript -e 'install.packages("forecast", dependencies = TRUE)'
RUN Rscript -e 'install.packages("tsutils")'
RUN Rscript -e 'install.packages("readxl")'
RUN Rscript -e 'install.packages("tidyverse")'
RUN Rscript -e 'install.packages("knitr")'
RUN Rscript -e 'install.packages("knitcitations")'
RUN Rscript -e 'install.packages("nycflights13")'
RUN Rscript -e 'install.packages("Matrix")'
RUN Rscript -e 'install.packages("plotly")'
RUN Rscript -e 'install.packages("igraph")'
RUN Rscript -e 'install.packages("ggthemes")'
RUN Rscript -e 'install.packages("evaluate")'
RUN Rscript -e 'install.packages("psych")'
RUN Rscript -e 'install.packages("kableExtra")'
RUN Rscript -e 'install.packages("ggjoy")'
RUN Rscript -e 'install.packages("gtools")'
RUN Rscript -e 'install.packages("gridExtra")'
RUN Rscript -e 'install.packages("cowplot")'
RUN Rscript -e 'install.packages("ggrepel")'
RUN Rscript -e 'install.packages("data.table")'
RUN Rscript -e 'install.packages("stringr")'
RUN Rscript -e 'install.packages("rmarkdown")'
RUN Rscript -e 'install.packages("shinyjqui")'
#RUN Rscript -e 'install.packages("BiocManager")'
#RUN R -e 'BiocManager::install("rtracklayer")'
#RUN R -e 'BiocManager::install("GenomicAlignments")'
#RUN R -e 'BiocManager::install("Biostrings")'
#RUN R -e 'BiocManager::install("SummarizedExperiment")'
#RUN R -e 'BiocManager::install("Rsamtools")'
RUN Rscript -e 'install.packages("V8")'
RUN Rscript -e 'devtools::install_github("ThomasSiegmund/D3TableFilter")'
RUN Rscript -e 'devtools::install_github("leonawicz/apputils")'
RUN Rscript -e 'devtools::install_github("Marlin-Na/trewjb")'
RUN Rscript -e 'devtools::install_github("dirkschumacher/ompr")'
RUN Rscript -e 'devtools::install_github("dirkschumacher/ompr.roi")'
RUN Rscript -e 'install.packages("ROI.plugin.glpk")'
RUN Rscript -e 'install.packages("shinydashboard")'
RUN Rscript -e 'install.packages("dplyr")'
RUN Rscript -e 'install.packages("dashboardthemes")'
RUN Rscript -e 'install.packages("shinyjs")'
RUN Rscript -e 'install.packages("magrittr")'
RUN Rscript -e 'install.packages("DT")'
RUN Rscript -e 'install.packages("rhandsontable")'
RUN Rscript -e 'renv::consent(provided = TRUE)'
RUN Rscript -e 'renv::restore()'
# expose port
EXPOSE 3838
# run app on container start
CMD ["R", "-e", "shiny::runApp('/app', host = '0.0.0.0', port = 3838)"]
When I build the image without the Bioconductor packages, the app runs.
But when I build it with the Dockerfile as shown, I receive the following error:
Bioconductor version 3.12 (BiocManager 1.30.10), ?BiocManager::install for help
shiny::runApp('/app', host = '0.0.0.0', port = 3838)
Error in shinyAppDir(x) : No Shiny application exists at the path "/app"
Calls: ... as.shiny.appobj -> as.shiny.appobj.character -> shinyAppDir
Execution halted
I have looked at some solutions but I do not understand them. Please, I need your help.
Check out the docs on multistage builds.
You have a COPY statement, and right after that a FROM statement. After that last statement you no longer have access to whatever was in the previous stage. You can copy files from one stage to the next if needed with --from=stagename, where you named the stage with FROM somerepo/someimage AS stagename.
In this case it means that everything you do in the first stage is never used or available again.
Normally this is used something like this:
FROM baseimage as build
RUN install some dependencies
COPY some sourcefiles
RUN install stuff eg mvn install
FROM runImage as run
COPY --from=build somepath/executable somepath/executable
CMD ["do", "something"]
This way you keep the final image small, removing everything you need at build time, but not at runtime.
In your case you have to rethink this a bit and see what you need at which stage.
I do not know the specifics of your app, but the way you have it set up now, you start with a base image, install Debian packages, update the system libraries, and then start all over with an "empty" new base image. You probably want to use just one image and go from there, with just one FROM instruction. Then, if you want to optimize it for production, you can always consider a multistage build if that is worth it in your case.
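As a sketch of the smallest change that addresses the "No Shiny application exists" error, the COPY can be moved after the final FROM so the app files land in the image that is actually run. The package steps are abbreviated and the shiny install line is a placeholder for the full list in the question:

```dockerfile
FROM bioconductor/bioconductor_docker:RELEASE_3_12

# Abbreviated: the remaining install steps from the question go here.
RUN Rscript -e 'install.packages("shiny")'

# COPY placed after the final FROM, so /app exists in the image that is run.
COPY ./app /app

EXPOSE 3838
CMD ["R", "-e", "shiny::runApp('/app', host = '0.0.0.0', port = 3838)"]
```

An absolute destination such as /app avoids depending on the base image's working directory, which is where a relative ./app would otherwise be resolved.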

Dockerfile not installing ggmap

I'm getting errors when I'm trying to run a project through a docker container. The image fails and says that ggmap was not installed, despite it being called in the Dockerfile.
Here's a link to my repository: https://github.com/TedHaley/tree_value.git
This is what my dockerfile looks like:
FROM rocker/tidyverse
RUN Rscript -e "install.packages('devtools')"
RUN Rscript -e "install.packages('ezknitr')"
RUN Rscript -e "install.packages('lubridate')"
RUN Rscript -e "install.packages('dplyr')"
RUN Rscript -e "install.packages('readr')"
RUN Rscript -e "install.packages('ggplot2')"
RUN Rscript -e "install.packages('rgdal')"
RUN Rscript -e "install.packages('broom')"
RUN Rscript -e "install.packages('maptools')"
RUN Rscript -e "install.packages('gpclib')"
RUN Rscript -e "install.packages('packrat')"
RUN Rscript -e "install.packages('MASS')"
RUN Rscript -e "install.packages('scales')"
RUN Rscript -e "install.packages('stringr')"
RUN Rscript -e "install.packages('hexbin')"
RUN Rscript -e "install.packages('reshape2')"
RUN Rscript -e "install.packages('ggmap', repos = 'http://cran.us.r-project.org')"
It'd be a huge help if anyone has any ideas as to why ggmap is not installed correctly.
Your image is based on rocker/tidyverse, itself based on rocker/rstudio, based on rocker/-base.
None of them have ggmap installed.
See for instance achubaty/r-spatial-devel, which does in
## install R spatial packages && cleanup
RUN xvfb-run -a install.r \
geoR \
ggmap \
Try starting from an image where you can test that ggmap is present.
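Another way to surface the problem at build time, sketched under assumptions: install2.r with --error turns a failed installation into a build failure, so the log shows why ggmap did not install. The two -dev packages are a guess at what ggmap's dependencies (for example the jpeg and png packages) might need and are not taken from the question's build log:

```dockerfile
FROM rocker/tidyverse

# Assumption: these headers may be needed by ggmap's dependencies; check the
# build log for the library that is actually reported missing.
RUN apt-get update && apt-get install -y --no-install-recommends \
        libjpeg-dev libpng-dev \
    && rm -rf /var/lib/apt/lists/*

# --error makes a failed install stop the build instead of only printing a warning.
RUN install2.r --error ggmap
```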
