Update SSL certificates inside Dockerfile - R

I have the following Dockerfile:
FROM rocker/tidyverse:3.5.2
RUN apt-get update
# System dependencies for R packages
RUN apt-get install -y \
git \
make \
curl \
libcurl4-openssl-dev \
libssl-dev \
pandoc \
libxml2-dev \
unixodbc \
libsodium-dev \
tzdata
# Clean up package installations
RUN apt-get clean
# ODBC system dependencies
RUN apt-get install -y gnupg apt-transport-https
RUN curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
RUN curl https://packages.microsoft.com/config/debian/9/prod.list > /etc/apt/sources.list.d/mssql-release.list
RUN apt-get update
RUN ACCEPT_EULA=Y apt-get install msodbcsql17 -y
# Install renv (package management)
ENV RENV_VERSION 0.11.0
RUN R -e "install.packages('remotes', repos = c(CRAN = 'https://cloud.r-project.org'))"
RUN R -e "remotes::install_github('rstudio/renv#${RENV_VERSION}')"
# Specify USER for rstudio session
ENV USER rstudio
COPY ./renv.lock /renv/tmp/renv.lock
WORKDIR /renv/tmp
RUN R -e 'renv::consent(provided = TRUE)'
RUN R -e "renv::restore()"
WORKDIR /home/$USER
I use this image to recreate environments for R scripting purposes. This was working for a number of months up until the end of September when I started getting:
Error in curl::curl_fetch_memory(url, handle = handle) :
SSL certificate problem: certificate has expired
This occurred when using a GET request to query a website. How do I update my certificates now, and in the future, so that they do not expire? I do not want to use the config(ssl_verifypeer = FALSE) workaround.

This happened to me too. Any chance you are working on a macOS or Linux-based machine? It seems to be a bug:
https://security.stackexchange.com/questions/232445/https-connection-to-specific-sites-fail-with-curl-on-macos
ca-certificates Mac OS X
SSL Certificates - OS X Mavericks
Instead of adjusting the certificates as suggested there, you can simply tell R not to verify the peer. Just add the following line at the beginning of your code:
httr::set_config(config(ssl_verifypeer = FALSE, ssl_verifyhost = FALSE))
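If you would rather fix the certificates themselves, as the question asks: the end-of-September timing matches the 30 September 2021 expiry of Let's Encrypt's DST Root CA X3 root, which broke TLS validation on images carrying stale CA bundles. A minimal sketch, assuming a Debian-based image such as rocker/tidyverse whose apt sources still serve updates, is to refresh the bundle at build time:
# Refresh the system CA bundle so curl/openssl validate against current roots
RUN apt-get update \
&& apt-get install -y --reinstall ca-certificates \
&& update-ca-certificates
Going forward, rebuilding the image on a schedule (for example in CI) is the usual way to pick up new ca-certificates releases before roots expire; a pinned base as old as rocker/tidyverse:3.5.2 (Debian 9) will otherwise keep aging.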

Related

Copying R Libraries to Docker Image from Local Libraries

I created an renv lockfile for my Shiny app. As you know, when you build from a Dockerfile, it has to install every package and its dependencies, and if there are many packages this takes a long time.
I have already installed the packages on my local machine, and the app will run on a remote machine. Is there any way to copy all the libraries instead of restoring them with renv? My local machine runs macOS and the remote machine runs Ubuntu, so copying the libraries might not work: the R packages likely have different dependencies on macOS and Ubuntu.
Which is the best way to install R packages in a Dockerfile? Should I wait through a long Docker build every time?
Dockerfile
FROM openanalytics/r-base
# system libraries of general use
RUN apt-get update && apt-get install -y \
sudo \
nano \
automake \
pandoc \
pandoc-citeproc \
libcurl4-openssl-dev \
libcairo2-dev \
libxt-dev \
libssl-dev \
libssh2-1-dev \
libssl1.1 \
libmpfr-dev \
# RPostgres
libpq-dev \
# rvest
libxml2-dev
# Install Packages
# RUN R -e "install.packages(c('janitor','DBI','RPostgres','httr', 'rvest','xml2', 'shiny', 'rmarkdown', 'tidyverse', 'lubridate','shinyWidgets', 'shinyjs', 'shinycssloaders', 'openxlsx', 'DT'), repos='https://cloud.r-project.org/')"
# copy the app to the image
RUN mkdir /root/appfile
COPY appfile /root/appfile
# renv restore
RUN Rscript -e 'install.packages("renv")'
RUN Rscript -e "renv::restore(lockfile = '/root/appfile/renv.lock')"
EXPOSE 3838
CMD ["R", "-e", "shiny::runApp('/root/appfile')"]
Thanks.
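Compiled libraries generally do not transfer between macOS and Ubuntu, as you suspect, so copying them is not the way. What does cut build time considerably is having renv install prebuilt Linux binaries instead of compiling everything from source. A sketch, assuming an Ubuntu 20.04 ("focal") base image; the repository URL must match the image's actual release:
# Point renv at Posit Package Manager's binary repo for this Ubuntu release,
# so renv::restore() downloads prebuilt binaries instead of compiling
ENV RENV_CONFIG_REPOS_OVERRIDE=https://packagemanager.posit.co/cran/__linux__/focal/latest
RUN Rscript -e "renv::restore(lockfile = '/root/appfile/renv.lock')"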

Why is renv in Docker not installing and loading packages?

This is the first time I'm using Docker; I have tried it a few times before, but this is the first time I'm containerizing a project. The project generates reports using LaTeX and officer. However, after building the container and running the image, RStudio shows that none of the packages are installed and loaded.
Dockerfile
FROM rocker/verse:4.0.5
ENV RENV_VERSION 0.15.1
RUN R -e "install.packages('remotes', repos = c(CRAN = 'https://cloud.r-project.org'))"
RUN R -e "remotes::install_github('rstudio/renv#${RENV_VERSION}')"
WORKDIR /REAL
COPY . .
RUN apt-get -y update --fix-missing
#RUN apt-get -y install build-essential \
# libssl-dev \
# libcurl4-openssl-dev \
# libxml2-dev \
# libfontconfig1-dev \
# libcairo2-dev
RUN R -e 'options(renv.consent = TRUE); renv::restore(lockfile = "/REAL/renv.lock"); renv::activate()'
Building Docker image
docker build -t reports/real .
Running docker with rstudio support
docker run --rm -d -p 8787:8787 --name real_report -e ROOT=TRUE -e DISABLE_AUTH=TRUE -v C:/Users/test/sources/REAL_Reports:/home/rstudio/REAL_Reports reports/real:latest
What am I doing wrong, and how do I fix it so that all the packages renv has discovered are installed and loaded?
RUN R -e 'options(renv.consent = TRUE); renv::restore(lockfile = "/REAL/renv.lock"); renv::activate()'
You likely need to activate / load your project before calling restore(), or just forgo using a project-local library altogether.
https://rstudio.github.io/renv/articles/docker.html may also be useful here.
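A sketch of the restore step along the lines of that vignette, assuming renv is already installed as above and the lockfile sits at /REAL/renv.lock: pin the project library to a fixed path and restore before any session starts, instead of calling activate() after restore():
WORKDIR /REAL
COPY . .
# Pin the project library to a known path inside the image, so the packages
# restored here are the ones a session that loads this project will see
ENV RENV_PATHS_LIBRARY /REAL/renv/library
RUN R -e 'options(renv.consent = TRUE); renv::restore(lockfile = "/REAL/renv.lock")'
Note as well that the docker run above mounts C:/Users/test/sources/REAL_Reports over /home/rstudio/REAL_Reports, so the RStudio session is likely opening a project directory that the restore in /REAL never touched; that alone would make the packages appear uninstalled.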

Azure container - simple R plumber deployment question?

I am deploying a simple plumber API in an Azure container; it executes a simple SQL query and returns the results. Here is my Dockerfile.
# start from the rocker/r-ver:4.0.2 image
# minor change to one branch to test deployment
FROM rocker/r-ver:4.0.2
# install the linux libraries needed for plumber
RUN apt-get update -qq && apt-get install -y \
libssl-dev \
libcurl4-gnutls-dev \
libsodium-dev \
unixodbc-dev \
r-cran-rodbc \
curl \
libxml2-dev
# libiodbc2-dev
# We need to update Dockerfile to install ODBC Driver 17 for SQL Server for
# Ubuntu 20.04 or different version
# here is link: https://learn.microsoft.com/en-us/sql/connect/odbc/linux-mac/installing-the-microsoft-odbc-driver-for-sql-server?view=sql-server-ver15#ubuntu17
RUN curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
RUN curl https://packages.microsoft.com/config/ubuntu/20.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
RUN apt-get update
RUN ACCEPT_EULA=Y apt-get install -y msodbcsql17
RUN ACCEPT_EULA=Y apt-get install -y mssql-tools
RUN echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bash_profile
RUN echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc
# RUN source /etc/bash.bashrc
RUN mkdir -p ~/application
# copy everything from the current directory into the container
COPY "/" "application/"
# I probably do not even have to set up WORKDIR
WORKDIR "application/"
# open port 80 to traffic
EXPOSE 80
# install plumber
RUN R -e "install.packages('plumber')"
RUN R -e "install.packages('renv')"
RUN R -e "install.packages('RODBC')"
RUN R -e 'renv::restore()'
# when the container starts, start the main.R script
ENTRYPOINT ["Rscript", "execute_plumber.R"]
The API works fine. However, when I connect to the container, run R, and execute library(RODBC), I get an error that the library is not found. This is puzzling, since it is already installed in the Dockerfile, but maybe it has something to do with renv. I proceed with install.packages('RODBC'), which produces an error that I can fix by running apt-get install unixodbc-dev and apt-get install r-cran-rodbc; those lines are again already present in the Dockerfile. Then, when I try to connect to SQL Server, I get an error about the driver, and again I can fix it by re-executing lines from the Dockerfile (the ones related to ODBC Driver 17 for SQL Server).
So I am puzzled as to why I need to re-execute steps that are already in the Dockerfile. Or maybe when I click "Connect" in Azure, it actually connects me to the host and not to the container?
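One way to test the host-versus-container theory, assuming you have shell access to the Docker host (the container name below is hypothetical; substitute your own), is to exec into the running container and check what the build actually installed:
docker exec -it my-plumber-api /bin/bash
# inside the container: list the ODBC drivers registered at build time
odbcinst -q -d
# and check whether RODBC is on the library path R sees
Rscript -e '"RODBC" %in% rownames(installed.packages())'
Separately, renv::restore() sets up a project-local library, so a package installed with a plain install.packages('RODBC') during the build can land in the system library and be invisible once the renv project is loaded; that would produce exactly the "library not found" symptom even though the Dockerfile installed it.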

How to install R packages in parallel via Docker?

I am installing several R packages from CRAN via a Dockerfile. Below is my Dockerfile:
FROM r-base:4.0.2
RUN apt-get update \
&& apt-get install -y --auto-remove \
build-essential \
libcurl4-openssl-dev \
libpq-dev \
libssl-dev \
libxml2-dev \
&& R -e "system.time(install.packages(c('shiny', 'rmarkdown', 'Hmisc', 'rjson', 'caret','DBI', 'RPostgres','curl', 'httr', 'xml2', 'aws.s3'), repos='https://cloud.r-project.org/'))"
RUN mkdir /shinyapp
COPY . /shinyapp
EXPOSE 5000
CMD ["R", "-e", "shiny::runApp('/shinyapp/src/shiny', port = 5000, host = '0.0.0.0')"]
The Docker build process takes too much time (25 to 30 minutes). Below are the execution-time details after the build completes:
user system elapsed
1306.268 232.438 1361.374
Is there any way to optimize the above Dockerfile? Any way to install the packages in parallel?
Note: I have also tried rocker/r-base, but had no luck with installation speed there either.
‘pak’ performs package download and installation in parallel.
Unfortunately the current CRAN version of ‘pak’ (0.1.2.1) is arguably broken: it has tons of dependencies. By contrast, the development version on GitHub has no external dependencies, as it should. So we need to install that one.
So you could change your Dockerfile as follows:
…
&& Rscript -e "install.packages('pak', repos = 'https://r-lib.github.io/p/pak/dev/'); pak::pkg_install(c('shiny', 'rmarkdown', 'Hmisc', 'rjson', 'caret','DBI', 'RPostgres','curl', 'httr', 'xml2', 'aws.s3'))"
…
But, frankly, that’s quite unreadable. A better approach would be to use ARG or ENV to supply the packages to be installed (this is regardless of whether we use ‘pak’ to install packages):
FROM r-base:4.0.2
ARG PKGS="shiny, rmarkdown, Hmisc, rjson, caret, DBI, RPostgres, curl, httr, xml2, aws.s3"
RUN apt-get update \
&& apt-get install -y --auto-remove \
build-essential \
libcurl4-openssl-dev \
libpq-dev \
libssl-dev \
libxml2-dev
RUN Rscript -e 'install.packages("pak", repos = "https://r-lib.github.io/p/pak/dev/")' \
&& echo "$PKGS" \
| Rscript -e 'pak::pkg_install(strsplit(readLines("stdin"), ", ?")[[1L]])'
RUN mkdir /shinyapp
COPY . /shinyapp
EXPOSE 5000
CMD ["Rscript", "-e", "shiny::runApp('/shinyapp/src/shiny', port = 5000, host = '0.0.0.0')"]
Also note that R shouldn’t be invoked via the R binary for scripted use — that’s what Rscript is for. Amongst other things it handles stdout better.
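As a usage note, ARG also lets you change the package list at build time without editing the Dockerfile:
docker build --build-arg PKGS="shiny, rmarkdown, httr" -t shinyapp .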

Shiny Docker Unable to Connect to API/Internet

I've set up a Shiny app using the Docker Shiny image, and all works well until I attempt to call an API using the httr package. As far as I can tell, I'm unable to access the internet from the app: I get a timeout message from multiple sites, even though the same sites work from a web browser. What setting do I need to change?
From the RStudio Console within the App:
> library(httr)
> r <- GET("http://httpbin.org/get")
Error in curl::curl_fetch_memory(url, handle = handle) :
Timeout was reached: [httpbin.org] Connection timed out after 10000 milliseconds
Here is my Dockerfile:
FROM rocker/tidyverse
RUN apt-get update
RUN apt-get install -y libpq-dev
RUN R -e 'install.packages("RPostgres")'
RUN R -e 'install.packages("DT")'
I just needed to add the following libraries to the Dockerfile and it worked:
RUN apt-get install -y libpq-dev \
libssl-dev \
libcurl4-openssl-dev \
curl \
sudo \
pandoc \
pandoc-citeproc \
libcairo2-dev \
libxt-dev \
libssh2-1-dev
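A small follow-up refinement, sketching common Dockerfile practice rather than anything httr-specific: run apt-get update and the install in the same layer, so a cached update layer cannot leave the install step working from a stale package index:
RUN apt-get update && apt-get install -y \
libpq-dev \
libssl-dev \
libcurl4-openssl-dev \
curl \
sudo \
pandoc \
pandoc-citeproc \
libcairo2-dev \
libxt-dev \
libssh2-1-dev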
