default path in a singularity container - recipe

I'm very new to the container space and after following some tutorials I'm trying to get my own singularity container up and running.
My recipe is as follows:
BootStrap: debootstrap
OSVersion: trusty
MirrorURL: http://us.archive.ubuntu.com/ubuntu/
%post
#install strelka2.9.2 - these commands get run during the container build stage
apt-get -y --force-yes install wget bzip2 python-dev
wget https://github.com/Illumina/strelka/releases/download/v2.9.2/strelka-2.9.2.centos6_x86_64.tar.bz2
tar xvjf strelka-2.9.2.centos6_x86_64.tar.bz2
%environment
#What to put here to find the strelka-2.9.2.centos6_x86_64/bin/ folder?
I'm trying to figure out how to add the downloaded binaries to the executable path. I expected the files downloaded in the %post section to show up in /home/ or somewhere similar inside the container, but I can't seem to find them when I shell in with singularity shell myImage.simg.

By default, the PATH used in the container is the PATH of the environment you run it from. An easy way to ensure the path is what you want is to set PATH=/path/to/strelka/bin:$PATH under %environment.
A simple definition file you can use to quickly play around with:
Bootstrap: docker
From: debian:buster-slim
%environment
PATH=/some/weird/path/bin:$PATH
%runscript
echo "PATH is: $PATH"

Related

How do I find where a file is located inside a container?

I'm trying to give special permissions to a file located inside a container, but I'm getting a "No such file or directory" error message.
The Dockerfile basically runs an R script that generates an output.pptx file located inside an output folder created inside the container.
I want to send that output to an S3 bucket, but for some reason it isn't finding the file inside the container.
# Make the output directory
RUN mkdir output
# Process main file
CMD ["RScript", "Script.R"]
# install AWS CLI
RUN curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
RUN unzip awscliv2.zip
RUN ./aws/install -i /usr/local/bin/aws -b /usr/local/bin
# run AWS CLI to push output file to s3 folder
RUN chmod 775 ./output/*.pptx
RUN aws s3 cp ./output/*.pptx s3://bucket
Could this be related to the path I'm using for the file?
(Edited to fix a word-swap brain-fart in the first version.)
I get the idea that there is a misunderstanding of how the image should be used. That is, a Dockerfile creates an image, and the CMD is not actually run when building the image.
Up front:
an image is really just a tarball with filesystems; multiple "layers" are there to indicate the layers of the build process (which can be squashed); an image has no "running" component, and no processes are active in an image; and
a container is an image that is in a running state. It might be the CMD you specify, or it might be something else (e.g., docker run -it --rm myimage /bin/bash to run a bash shell with the container as the filesystem/environment). When the running command finishes and exits, the container is stopped.
Typically, you create an image "once" (security updates and such notwithstanding), and then run it as needed (either manually or via some trigger, e.g., cron or CI triggers). That might look like
docker run --rm myimage # using the default `CMD`
docker run --rm myimage R Script.R # specifying the command manually
with a couple of assumptions:
the image has R installed and within the PATH ... though you could specify the full /path/to/bin/R instead; and
the default working directory (dockerfile's WORKDIR directive) contains Script.R ... or you can specify the full internal path to Script.R
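A minimal Dockerfile sketch of those two assumptions (the base image and paths are illustrative, not your exact setup):
# base image with R and Rscript already on the PATH
FROM rocker/r-base
# default working directory used by `docker run`
WORKDIR /home/project
COPY Script.R /home/project/Script.R
# default command; overridden by anything passed after the image name
CMD ["Rscript", "Script.R"]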
With this workflow, I suggest a few changes:
from the Dockerfile, remove the lines after # run AWS CLI;
add to Script.R steps to copy the file to your S3 instance, either using the awscli you installed in the image, or by using the aws.s3 R package (which might preclude the need to install the awscli);
I don't use AWS S3, but I suspect that you need credentials to be able to access the bucket. There are many ways of dealing with images and "secrets" like S3 credentials; the most naïve approaches involve hard-coding the credentials into the container, which is a security risk. Others involve Docker secrets or environment variables, for instance:
docker run -it --rm -e "S3TOKEN=asldkfjlaskdf"
though even that might be intercepted by neighboring users on the docker host.
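For the second suggestion, a sketch of what the end of Script.R could look like with the aws.s3 package (the bucket name and output path are placeholders, and credentials are expected in environment variables rather than hard-coded):
# at the end of Script.R
library(aws.s3)
# AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_DEFAULT_REGION are read from the environment
put_object(
  file   = "output/output.pptx",   # file produced earlier in the script
  object = "output.pptx",          # key inside the bucket
  bucket = "my-bucket"             # placeholder bucket name
)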

copy libpq.5.dylib to /usr/lib/libpq.5.dylib

I can't load packages in R because the file libpq.5.dylib is not in /usr/lib/libpq.5.dylib. It is in /usr/local/Cellar/libpq/13.0/lib/libpq.5.dylib
I tried this line: sudo ln -s /usr/local/Cellar/libpq/13.0/lib/libpq.5.dylib /usr/lib/libpq.5.dylib but I get this response: ln: /usr/lib/libpq.5.dylib: Operation not permitted
What can I do to get the file into /usr/lib/libpq.5.dylib without causing issues? This solution suggests that I may face problems down the line, so I'm not sure what to do.
You really don't want it in /usr/lib. Apple declared that as off-limits, and on newer macOS versions it lives on a read-only volume. Unless you're willing to go into recovery mode and manually tamper with the volume (and possibly repeat that on future OS updates), this is not the way to go.
Instead, let's address the core issue:
Dynamic libraries on macOS embed their own install path inside the binary, and the linker copies that into binaries linking against them. This information can be changed with install_name_tool (see man install_name_tool).
1. Examine the install name of the dylib:
otool -l /usr/local/Cellar/libpq/13.0/lib/libpq.5.dylib | fgrep -A2 LC_ID_DYLIB
If the printed path already points to the dylib itself (or a path that is symlinked to it), use this path as [new_path] below, and skip step 2.
2. If the dylib's install name does not point back to itself, run this:
sudo install_name_tool -id /usr/local/Cellar/libpq/13.0/lib/libpq.5.dylib /usr/local/Cellar/libpq/13.0/lib/libpq.5.dylib
And use /usr/local/Cellar/libpq/13.0/lib/libpq.5.dylib for [new_path] below.
3. For binaries that link against the dylib, run:
sudo install_name_tool -change /usr/lib/libpq.5.dylib [new_path] [path_to_binary]
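To check that the change took effect, you can list the binary's linked libraries again (same placeholders as above):
otool -L [path_to_binary] | fgrep libpq
# should now show [new_path] instead of /usr/lib/libpq.5.dylib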
I had the same issue building a container through Docker for API use: RPostgres was installed, but the library couldn't load, with the same error message.
Since I had installed Postgres on my machine, the problem was worked around locally, which is why I never saw the message there. Here's how I solved it in my Dockerfile, verified on a machine with nothing R-related installed:
RUN apt-get update && apt-get install libpq5 -y
So executing apt-get update && apt-get install libpq5 -y on your terminal should do the trick. Light and efficient.
In my case, it tried to load libpq.5.dylib through the symlink /opt/homebrew/opt/postgresql/lib/libpq.5.dylib but could not find the file, so you need to update the symlink:
# TODO: get this from the error, after "Library not loaded:"
SYMLINK_PATH="/opt/homebrew/opt/postgresql/lib/libpq.5.dylib"
# TODO: find this in your machine. The version maybe different than mine
DESTINATION_PATH="/opt/homebrew/opt/postgresql/lib/postgresql@14/libpq.5.dylib"
sudo mv $SYMLINK_PATH $SYMLINK_PATH.old
sudo ln -s $DESTINATION_PATH $SYMLINK_PATH
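A quick sanity check after recreating the symlink (same variables as above):
ls -l $SYMLINK_PATH
# should now point at the location in $DESTINATION_PATH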

Rscript not finding installed packages in container

I am trying to schedule an R script to run inside a container. I have a Dockerfile like this:
# Install R version 3.5
FROM rocker/tidyverse:3.5.1
USER root
# Install Ubuntu packages
RUN apt-get update && apt-get install -y \
sudo \
gdebi-core \
pandoc \
pandoc-citeproc \
libcurl4-gnutls-dev \
libcairo2-dev \
libxt-dev \
libssl-dev \
xtail \
wget \
cron
# Install R packrat, which we'll then use to install the other packages
RUN R -e 'install.packages("packrat", repos="http://cran.rstudio.com", dependencies=TRUE);'
# copy packrat files
COPY packrat/ /home/project/packrat/
# copy .Rprofile so that it knows where to look for packages
COPY .Rprofile /home/project/
RUN R -e 'packrat::restore(project="/home/project");'
# Copy DB query script into the Docker image
COPY 002_query_db_for_kpis.R /home/project/002_query_db_for_kpis.R
# copy crontab for db query
COPY db_query_cronjob /etc/crontabs/db_query_cronjob
# give execution rights
RUN chmod 644 /etc/crontabs/db_query_cronjob
# run the job
RUN crontab /etc/crontabs/db_query_cronjob
# start cron in the foreground
CMD ["cron", "-f"]
It builds ok and then the cron job fails silently. When I investigate with:
docker exec -it 19338f50b4ed Rscript /home/project/002_query_db_for_kpis.R
The output I get is:
Error in library(zoo) : there is no package called ‘zoo’
Execution halted
Now, the first part of the scripts looks like:
#!/usr/local/bin/env Rscript --default-packages=zoo,RcppRoll,lubridate,broom,magrittr,tidyverse,rlang,RPostgres,DBI
library(zoo)
...
So, clearly it's not finding the packages. They are in there though. That was the whole point of packrat and copying the .Rprofile, and it seemed to work because if I run a shell inside the container while it's running I can find them in:
root@d2b4f6e7eade:/usr/local/lib/R/site-library#
and all the packrat files seem to be in the right place as well... could it be that the .Rprofile file isn't being seen because it starts with a '.'? Can I change that?
UPDATE
If I don't use packrat but install packages normally, it works. Digging around inside the container's files, I can see that /usr/local/lib/R/site-library doesn't have the needed packages in it, whereas /home/project/packrat/src does. So, it must have to do with Rscript looking in the wrong place. I thought the .Rprofile in /home/project would solve that, but it doesn't... maybe there's something else I didn't copy over? Although I've got the script running now, it's not ideal, since those packages might be different versions (hence why I want to use packrat), so if anyone can figure out how to get it to work with packrat I'll mark that answer as correct.
A couple of things to try based on the problem and update:
Have you ignored your packrat/lib* and packrat/src/ directories in .dockerignore? I am worried you are copying over all the built packages, so restore() thinks the packages have already been built in your container (see the .dockerignore sketch below).
Does your root container user have executable privs on the packrat.lock file? If not, that would obviously prevent restore() from running.
Change the Docker install user to the rocker image's default "rstudio" user, and copy just the packrat.lock and packrat.opts files:
USER rstudio
COPY --chown=rstudio:rstudio packrat/packrat.* /home/project/packrat/
A good reference for these options: https://rviews.rstudio.com/2018/01/18/package-management-for-reproducible-r-code/
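For the first point, a minimal .dockerignore sketch (assuming packrat/ sits at the root of the build context), so that only the packrat metadata gets copied and restore() rebuilds the libraries inside the container:
# .dockerignore
packrat/lib*/
packrat/src/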

How do you create a fake install of a debian package for use in testing?

I have a package that previously only targeted RPM-based distros, for which I am now building .deb packages for Debian-based distros.
The aim is to simulate a test installation from user-space that is isolated from the system you are building on. It may be multi-user and you do not want to require root access just to build the software. Many of our tests simulate the installation directory structure already. This is for the next step up to simulate an actual installation using packages built.
For the RPM packages I was able to create test installations using:
WSDIR=/where/I/want/my/tests/to/run
rpmdb --initdb --dbpath "$WSDIR"/rpmdb
rpm --relocate /opt="$WSDIR"/opt --dbpath $WSDIR/rpmdb -i <package>.rpm
The equivalent in the Debian world is something like:
dpkg --force-not-root --admindir=$WSDIR/dpkg --root=$WSDIR/install --install "$DEB"
However, I am stuck on the equivalent of the rpmdb --initdb step.
Note that I can just unpack the archive using:
dpkg-deb -x "$DEB" $WSDIR/install
But I would prefer to be closer to how a real package is installed.
Also I don't think this will run preinstall and postinstall scripts.
Similar questions have suggested using debootstrap to create a chroot environment, but this creates a complete new installation. As well as being overkill, it is too slow for an automated test. I intend to use this for quick tests of the installation package prior to further testing in actual test environments.
My experiments so far:
(cd $WSDIR/dpkg && mkdir alternatives info parts triggers updates)
cp /var/lib/dpkg/status $WSDIR/dpkg/status
have at best resulted in:
dpkg: error: unable to access dpkg status area: No such file or directory
which does not clearly indicate what is wrong.
So how do you create a dpkg admin directory?
Cross posted as https://superuser.com/questions/1271145/how-do-you-create-a-dpkg-admin-directory
Update 24/11/2017
I've tried copying the dpkg dir from an environment created by cowdancer (which uses debootstrap under the hood), and copying the real one from /var/lib/dpkg, but I still get the same error message, so perhaps the error (and/or the --admindir option) doesn't mean quite what I think it means.
Note that:
sudo dpkg --force-not-root --root=$WSDIR/install --admindir=/var/lib/dpkg --install "$DEB"
does work. So it is something to do with the admin dir.
I've also retitled the question, as "How do you create a dpkg admin directory" is an interesting question but the answer is not necessarily the solution to my problem.
The minimal way to create a dpkg database is something like this:
$ mkdir -p db/{updates,info}
$ touch db/{status,diversions,statoverride}
If you want to use that as non-root, currently the best way is to use fakeroot.
$ mkdir -p fsys
$ PATH=/sbin:/usr/sbin:$PATH fakeroot dpkg --log=/dev/null --admindir=db --instdir=fsys -i pkg.deb
But take into account that passing --root after --admindir or --instdir will reset those paths, which I think is the problem you have been having here.
Also, using sudo and --force-not-root together does not make much sense? :) And it is definitely less confined than just using fakeroot. In the near future it will be possible to run dpkg fully unprivileged in some local tree.
I eventually found an answer for this. Thanks to Guillem Jover for some of this.
Pasting a copy of it here:
mkdir fake
mkdir fake/install
mkdir -p fake/dpkg/info
mkdir -p fake/dpkg/updates
touch fake/dpkg/status
PATH=/sbin:/usr/sbin:$PATH fakeroot dpkg --force-script-chrootless --log=`pwd`/fake/dpkg.log --root=`pwd`/fake --instdir `pwd`/fake --admindir=`pwd`/fake/dpkg --install *.deb
Some points to note:
--force-not-root is not enough. fakeroot is required.
ldconfig and start-stop-daemon must be on the path.
(hence PATH=/sbin:/usr/sbin:$PATH)
The log file needs to be relocated from the default /var/log/dpkg.log
The order of arguments is significant. If used, --root must come before --instdir and --admindir.
The admindir is supposed to have the installation dir as a prefix.
If the package contains any pre or post installation scripts (preinst,postinst) then --force-script-chrootless is required as these scripts are normally run via chroot() which gives operation not permitted when attempted under fakeroot.
For a quick test of trivial dependencies, you can directly install on the system using 'dpkg -i' then 'dpkg -P' and 'apt-get autoremove' to purge the package and clean the dependencies.
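Spelled out, that quick system-level round trip looks something like this (package name is a placeholder; note it modifies the real system, unlike the fakeroot approach above):
sudo dpkg -i mypackage_1.0_amd64.deb   # install the freshly built package
sudo dpkg -P mypackage                 # purge it again
sudo apt-get autoremove                # clean up any dependencies that were pulled in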
Another more secure but slower solution could be to use the autopkgtest package:
https://people.debian.org/~mpitt/autopkgtest/README.package-tests.html

Install nginx dynamic module using docker compose

Usually in nginx to compile a third part module you should use this command:
./configure --add-module=path/to/your/new/module/directory
Then using:
make
And finally:
make install
But using Docker I can't go into the nginx path and run these commands. How could I add the "configure" command to my docker-compose.yml file?
EDIT:
I've tried to create a simple Dockerfile like this:
FROM nginx
RUN ./configure --add-module=./module/
make && \
make install
And including it into my docker-compose.yml.
And it gave me this error:
/bin/sh: 1: ./configure: not found The command '/bin/sh -c ./configure --add-module=./module/' returned a non-zero code: 127
I've also tried to use "configure" instead of "./configure", but same result. I don't know how to set configure command.
I am not sure I understand the question correctly, but I think configure, make and make install should be done as part of docker build using the RUN directive (in your Dockerfile). docker-compose will simply run the resultant image (probably in tandem with other docker images).
Sample Dockerfile (not verified, may contain errors!):
FROM centos:latest
COPY nginx /root/nginx
WORKDIR /root/nginx
RUN ./configure && make && make install
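If that Dockerfile lives next to your docker-compose.yml, a minimal compose sketch that builds and runs the image could look like this (service name and port mapping are just examples):
version: "3"
services:
  nginx:
    build: .        # directory containing the Dockerfile above
    ports:
      - "80:80"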
