"execvp error on file singularity" when using Singularity with a PBS script (MPI)

I am trying to start a Singularity container from a PBS script. Here are my PBS script and my .def file:
#!/bin/bash
# ph.sh
export MPI_DIR=/opt/mpich
module load singularity/3.7.5
mpirun -n $num_cores -hostfile ./hostlist singularity exec --bind "MPI_DIR" ./bind.sif /usr/local/bin/PHengLEIv3d0-5720-tianhe > cfd.log
Bootstrap: docker
From: centos:7
%files
/usr/local/bin/PHengLEIv3d0-5720-tianhe /usr/local/bin/PHengLEIv3d0-5720-tianhe
%environment
export PATH="$MPI_DIR/bin:$PATH"
export LD_LIBRARY_PATH="$MPI_DIR/lib:$LD_LIBRARY_PATH"
%post
export DEBIAN_FRONTEND=noninteractive
yum update -y && yum install -y gcc-c++ && yum install -y gcc-gfortran
The error is:
[proxy:0:0#phdev1] HYDU_create_process (utils/launch/launch.c:74): execvp error on file singularity (No such file or directory)
Running this shell script directly works fine:
export MPI_DIR="/opt/mpich"
mpirun -n 1 singularity exec --bind "$MPI_DIR" bind.sif /usr/local/bin/PHengLEIv3d0-5720-tianhe
but submitting the job through PBS fails:
qsub -N Pro325_Job281 -W sandbox=PRIVATE -q workq -l nodes=1:ppn=1 ph_ys144.sh
I think the job fails because PBS cannot find the singularity executable, so I tried both relative and absolute paths, but that failed too. I searched for information, but there is very little about starting Singularity with PBS; almost everything covers Slurm. I don't know whether this has anything to do with my use of bind mode, which I chose because I want to create a lightweight image.
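One way to narrow this down is to have the job script itself report whether singularity is visible in the batch environment. The sketch below assumes that `module load` may not have taken effect in the non-interactive PBS shell; `require_cmd` is a hypothetical helper name, not part of PBS or Singularity.

```shell
#!/bin/bash
# require_cmd NAME
# Prints what the shell would run for NAME (normally its absolute path),
# or reports the node name and PATH and returns 127 if nothing is found.
require_cmd() {
    local path
    if ! path=$(command -v "$1"); then
        echo "ERROR: '$1' not found on $(uname -n); PATH=$PATH" >&2
        return 127
    fi
    echo "$path"
}

# Assumed usage inside the PBS script, after the module load line:
#   SINGULARITY_BIN=$(require_cmd singularity) || exit 1
#   mpirun -n "$num_cores" -hostfile ./hostlist "$SINGULARITY_BIN" exec ...
```

If the helper fails inside the job but works in an interactive shell, the batch environment is the difference. Note that mpirun starts the process on the hosts in the hostfile, so the resolved path also has to exist on those nodes.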


podman mounted volume issue

Bottom line: output from the container is not appearing in the mounted local directory.
I have read the documentation for bind mounts and had success with them on another project.
My Dockerfile:
FROM ubuntu:18.04
RUN apt-get update && apt-get install -yq build-essential autoconf libnetcdf-dev libxml2-dev libproj-dev valgrind wget unzip git nano
# pulls ADMB from github and unzips in folder ADMBcode
RUN mkdir /ADMBcode
RUN wget https://github.com/admb-project/admb/releases/download/admb-12.2/admb-12.2-linux.zip
RUN mv admb-12.2-linux.zip /ADMBcode
RUN unzip ADMBcode/admb-12.2-linux.zip -d /ADMBcode
# pulls hydra repo from github into folder HYDRA
RUN mkdir /HYDRA
RUN git clone https://github.com/NOAA-EDAB/hydra_sim.git /HYDRA
# compiles and runs model
WORKDIR /HYDRA
RUN /ADMBcode/admb-12.2/admb hydra_sim.tpl
RUN ./hydra_sim
# create dir for output and move output
#RUN mkdir -p /HYDRA/output/diagnostics
#RUN mkdir /HYDRA/output/indices
# moves output to folder to be mounted
RUN mv *.out /HYDRA/output/diagnostics
RUN mv *.txt /HYDRA/output/indices
I build the image
podman build -t hydra .
and run the container using the following :
podman run --rm --name hydra --mount "type=bind,src=/path_on_local_machine/test,dst=/HYDRA/output" hydra
I have a test folder on my local machine, but the output does not appear in it.
I have entered the container
podman run -it hydra
and checked that the output is there.
I have done this before for another model and everything behaved. I am not sure why this one does not.
Any ideas what I am doing wrong?
Thanks
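A likely explanation, offered as a guess since I cannot run the image: every RUN step executes at build time, so the model output is baked into the image under /HYDRA/output, and nothing in the Dockerfile runs the model when the container starts. At run time the bind mount is laid over /HYDRA/output and hides whatever the image holds there, so the host folder stays empty. One option is to run the model and move the files when the container starts. The sketch below shows the move step; `collect_outputs` and the paths in the usage comment are hypothetical.

```shell
#!/bin/bash
# collect_outputs SRC DST
# Moves *.out files from SRC into DST/diagnostics and *.txt files into
# DST/indices, creating both target directories if they are missing.
collect_outputs() {
    local src="$1" dst="$2"
    mkdir -p "$dst/diagnostics" "$dst/indices"
    find "$src" -maxdepth 1 -type f -name '*.out' -exec mv {} "$dst/diagnostics/" \;
    find "$src" -maxdepth 1 -type f -name '*.txt' -exec mv {} "$dst/indices/" \;
}

# Assumed usage in a start script run by the container (not at build time):
#   cd /HYDRA && ./hydra_sim && collect_outputs /HYDRA /HYDRA/output
```

With this approach the two RUN mv lines and the RUN ./hydra_sim line would move out of the Dockerfile and into the start script, so the files are written after the mount is in place.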

SageMaker fails when trying to add Lifecycle Configuration for keeping custom environments persistent after restart

I want to create a Miniconda environment in SageMaker on AWS and make it available as a kernel in Jupyter after I restart the session, but the SageMaker notebook instance keeps failing.
I followed the instructions found here:
https://aws.amazon.com/premiumsupport/knowledge-center/sagemaker-lifecycle-script-timeout/
Basically, it says:
"Create a custom, persistent Conda installation on the notebook instance's Amazon Elastic Block Store (Amazon EBS) volume: Run the on-create script in the terminal of an existing notebook instance. This script uses Miniconda to create a separate Conda installation on the EBS volume (/home/ec2-user/SageMaker/). Then, run the on-start script as a lifecycle configuration to make the custom environment available as a kernel in Jupyter. This method is recommended for more technical users, and it is a better long-term solution."
I ran this on-create.sh script in the Jupyter terminal:
on-create.sh:
#!/bin/bash
set -e
sudo -u ec2-user -i <<'EOF'
unset SUDO_UID
# Install a separate conda installation via Miniconda
WORKING_DIR=/home/ec2-user/SageMaker/custom-environments
mkdir -p "$WORKING_DIR"
wget https://repo.anaconda.com/miniconda/Miniconda3-4.6.14-Linux-x86_64.sh -O "$WORKING_DIR/miniconda.sh"
bash "$WORKING_DIR/miniconda.sh" -b -u -p "$WORKING_DIR/miniconda"
rm -rf "$WORKING_DIR/miniconda.sh"
# Create a custom conda environment
source "$WORKING_DIR/miniconda/bin/activate"
KERNEL_NAME="conda-test-env"
PYTHON="3.6"
conda create --yes --name "$KERNEL_NAME" python="$PYTHON"
conda activate "$KERNEL_NAME"
pip install --quiet ipykernel
# Customize these lines as necessary to install the required packages
conda install --yes numpy
pip install --quiet boto3
EOF
and it creates the "conda-test-env" environment as expected.
Then I added on-start.sh as the lifecycle configuration:
#!/bin/bash
set -e
sudo -u ec2-user -i <<'EOF'
unset SUDO_UID
source "/home/ec2-user/SageMaker/custom-environments/miniconda/bin/activate"
conda activate conda-test-env
python -m ipykernel install --user --name "conda-test-env" --display-name "conda-test-env"
# Optionally, uncomment these lines to disable SageMaker-provided Conda functionality.
# echo "c.EnvironmentKernelSpecManager.use_conda_directly = False" >> /home/ec2-user/.jupyter/jupyter_notebook_config.py
# rm /home/ec2-user/.condarc
EOF
Then I updated the instance with the new configuration, and when I start my notebook instance, it fails after a few minutes.
I'd appreciate any help.
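Because the on-start script uses `set -e`, any failing line aborts it without much explanation. A small guard can make the cause visible in the lifecycle configuration logs. This is only a diagnostic sketch: `check_conda_env` is a hypothetical helper, and it assumes the environment sits in the default `envs/` directory under the Miniconda prefix, which may not hold if a `.condarc` redirects environment locations.

```shell
#!/bin/bash
# check_conda_env CONDA_DIR ENV_NAME
# Returns 1 if the activate script is missing, 2 if the environment
# directory is missing, and 0 if both are present.
check_conda_env() {
    local conda_dir="$1" env_name="$2"
    if [ ! -f "$conda_dir/bin/activate" ]; then
        echo "missing activate script: $conda_dir/bin/activate" >&2
        return 1
    fi
    if [ ! -d "$conda_dir/envs/$env_name" ]; then
        echo "missing environment directory: $conda_dir/envs/$env_name" >&2
        return 2
    fi
    return 0
}

# Assumed usage near the top of on-start.sh, before the source line:
#   check_conda_env /home/ec2-user/SageMaker/custom-environments/miniconda conda-test-env || exit 1
```

If the second check fails even though on-create.sh reported success, running `conda env list` in the Jupyter terminal should show where the environment was actually created.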

Application logs to stdout with Shiny Server and Docker

I have a Docker container running a shiny app (Dockerfile here).
Shiny server logs are output to stdout and application logs are written to /var/log/shiny-server. I'm deploying this container to AWS Fargate and logging applications only display stdout which makes debugging an application when deployed challenging. I'd like to write the application logs to stdout.
I've tried a number of potential solutions:
I've tried the solution provided here, but have had no luck. I added exec xtail /var/log/shiny-server/ as the last line of my shiny-server.sh, but app logs are still not written to stdout.
I noticed that writing application logs to stdout is now the default behavior in rocker/shiny, but as I'm using rocker/verse:3.6.2 (upgraded from 3.6.0 today) along with RUN export ADD=shiny, I don't think this is standard behavior for the rocker/verse:3.6.2 container with Shiny add-on. As a result, I don't get the default behavior out of the box.
This issue on github suggests an alternative method of forcing application logging to stdout by way of an environment variable SHINY_LOG_STDERR=1 set at runtime but I'm not Linux-savvy enough to know where that env variable needs to be set to be effective. I found this documentation from Shiny Server v1.5.13 which gave suggestions in which file to set the environment variable depending on Linux distro; however, the output from my container when I run cat /etc/os-release is:
which doesn't really line up with any of the distributions in the Shiny Server documentation, thus making the documentation unhelpful.
I tried adding the environment variable from the GitHub issue above to the docker run command, i.e.,
docker run --rm -e SHINY_LOG_STDERR=1 -p 3838:3838 [my image]
as well as
docker run --rm -e APPLICATION_LOGS_TO_STDOUT=true -p 3838:3838 [my image]
and am still not getting the logs to stdout.
I must be missing something here. Can someone help me identify how to get application logs written to stdout?
You can add the line ENV SHINY_LOG_STDERR=1 to your Dockerfile (at least, this works with rocker/shiny; I am not sure about rocker/verse). Applied to your Dockerfile, it would look like this:
FROM rocker/verse:3.6.2
## Add shiny capabilities to container
RUN export ADD=shiny && bash /etc/cont-init.d/add
## Install curl and xtail
RUN apt-get update && apt-get install -y \
curl \
xtail
## Add pip3 and other Python packages
RUN sudo apt-get update -y && apt-get install -y python3-pip
RUN pip3 install boto3
## Add R packages
RUN R -e "install.packages(c('shiny', 'tidyverse', 'tidyselect', 'knitr', 'rmarkdown', 'jsonlite', 'odbc', 'dbplyr', 'RMySQL', 'DBI', 'pander', 'sciplot', 'lubridate', 'zoo', 'stringr', 'stringi', 'openxlsx', 'promises', 'future', 'scales', 'ggplot2', 'zip', 'Cairo', 'tinytex', 'reticulate'), repos = 'https://cran.rstudio.com/')"
## Update and install
RUN tlmgr update --self --all
RUN tlmgr install ms
RUN tlmgr install beamer
RUN tlmgr install pgf
#Copy app dir and theme dirs to their respective locations
COPY iarr /srv/shiny-server/iarr
COPY iarr/reports/interim_annual_report/theme/SwCustom /opt/TinyTeX/texmf-dist/tex/latex/beamer/
#Force texlive to find my custom beamer theme
RUN texhash
EXPOSE 3838
## Add shiny-server information
COPY shiny-server.sh /usr/bin/shiny-server.sh
COPY shiny-customized.config /etc/shiny-server/shiny-server.conf
## Add dos2unix to eliminate Win-style line-endings and run
RUN apt-get update -y && apt-get install -y dos2unix
RUN dos2unix /usr/bin/shiny-server.sh && apt-get --purge remove -y dos2unix && rm -rf /var/lib/apt/lists/*
# Enable Logging from stdout
ENV SHINY_LOG_STDERR=1
RUN ["chmod", "+x", "/usr/bin/shiny-server.sh"]
CMD ["/usr/bin/shiny-server.sh"]
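On the xtail attempt from the question: one possible reason it had no effect, assuming shiny-server.sh already contains an `exec shiny-server` line before it (the script is not shown, so this is a guess), is that `exec` replaces the shell process, so any line placed after it never runs. The small demonstration below uses plain echo commands rather than the real server.

```shell
#!/bin/sh
# `exec` replaces the current shell with the given command, so the
# second echo inside the child shell is never reached.
out=$(sh -c 'exec echo "server started"; echo "log tail started"')
echo "$out"

# Assumed ordering for a start script (shown as comments, not run here):
#   xtail /var/log/shiny-server/ &    # follow the app logs in the background
#   exec shiny-server 2>&1            # then hand the process over to the server
```

Running it prints only the first message. If the same thing is happening in your script, starting xtail in the background before the final exec line should let both run.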

Docker and Analytics Install

I have a Dockerfile called quasar.dockerfile. I built the image and everything completed successfully.
#quasar.dockerfile
FROM java:8
WORKDIR /app
ADD docker/quasar-config.json quasar-config.json
RUN apt-get update && \
apt-get install -y wget && \
wget https://github.com/quasar-analytics/quasar/releases/download/v2.3.3-SNAPSHOT-2121-web/web_2.11-2.3.3-SNAPSHOT-one-jar.jar
EXPOSE 8080
CMD java -jar web_2.11-2.2.3-SNAPSHOT-one-jar.jar -c /app/quasar-config.json
I then tried running the container, and I get an error saying that it is unable to access the jarfile.
[test]$ docker build -f docker/quasar.dockerfile -t quasar_fdw_test/quasar .
Sending build context to Docker daemon 1.851 MB
Successfully built a7d4bc6c906f
[test]$ docker run -d --name quasar_fdw_test-quasar --link quasar_fdw_test-mongodb:mongodb quasar_fdw_test/quasar
6af2f58bf446560507bdf4a2db8ba138de9ed94a408492144e7fdf6c1fe05118
[test]$ docker ps -l
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
6af2f58bf446 quasar_fdw_test/quasar "/bin/sh -c 'java -ja" 5 seconds ago Exited (1) 4 seconds ago quasar_fdw_test-quasar
[test]$ docker logs 6af2f58bf446
Error: Unable to access jarfile web_2.11-2.2.3-SNAPSHOT-one-jar.jar
How come the process keeps getting killed? Seems like it has to do with being unable to run the jarfile but the build needed to access that file and happened successfully. Is this a linking issue?
Try using the full path in the Dockerfile:
CMD java -jar /web_2.11-2.2.3-SNAPSHOT-one-jar.jar -c /app/quasar-config.json
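One more thing worth checking in the Dockerfile from the question: the file name in the wget URL and the file name passed to java -jar do not look identical. The sketch below compares the two strings as they appear in the question; the variable names are only for illustration.

```shell
#!/bin/sh
# File name that wget saves: the last path component of the download URL.
url="https://github.com/quasar-analytics/quasar/releases/download/v2.3.3-SNAPSHOT-2121-web/web_2.11-2.3.3-SNAPSHOT-one-jar.jar"
downloaded=$(basename "$url")

# File name that the CMD line asks java to run.
cmd_jar="web_2.11-2.2.3-SNAPSHOT-one-jar.jar"

if [ "$downloaded" = "$cmd_jar" ]; then
    echo "names match"
else
    echo "mismatch: downloaded '$downloaded' but CMD runs '$cmd_jar'"
fi
```

The download is version 2.3.3 while CMD refers to 2.2.3, which by itself would explain the "Unable to access jarfile" error. Also, since WORKDIR is /app when wget runs, the jar should end up under /app, so an absolute path would most likely be /app/web_2.11-2.3.3-SNAPSHOT-one-jar.jar, if I read the Dockerfile correctly.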

EC2 startup script gets stuck on wget

I have the following script for some number crunching:
#!/bin/bash
sudo apt-get update -y
sudo apt-get upgrade -y
sudo apt-get install -y r-base r-base-dev htop s3cmd p7zip-full
wget https://s3.amazonaws.com/#######/###.7z
7z e ###.7z
sudo R CMD BATCH --slave --no-timing --vanilla "--args 0 1 100 200 500 2" SOME-ROUTINE.R
s3cmd put *.results s3://#########/
on EC2. I upload the script as a file under Launch Instance -> Instance Details -> User Data.
The machine fires up, updates and upgrades, but then it does not execute wget and does not download the file. When I SSH into the instance and run the exact same commands, the process completes without problems.
Any ideas why wget does not work?
Are there any alternatives?
It is always a bit of guessing, but here is how I would debug this:
My first suggestion would be to check for special characters in the S3 URL. This might cause the wget call to fail.
Second, I would give an explicit output path to wget with the -O option. While you are editing the command, you can also add -o to output logging information.
Last step is to check your access rights to the S3 bucket. Perhaps you can try to put the file on another webspace to see if the command executes then.
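To make these checks easier, here is a sketch of a wrapper that records each command's output and exit status. User data runs as root at boot, without your interactive shell's environment or working directory, so nothing is visible unless it is logged. `run_logged`, `LOG_FILE`, and the /tmp paths are names made up for this example; the S3 URL stays elided as in the question.

```shell
#!/bin/bash
# Log file for the debugging output (hypothetical location).
LOG_FILE="${LOG_FILE:-/tmp/user-data-debug.log}"

# run_logged CMD [ARGS...]
# Appends the command line, its stdout and stderr, and its exit status
# to LOG_FILE, then returns the command's own exit status.
run_logged() {
    local status
    echo "+ $*" >>"$LOG_FILE"
    "$@" >>"$LOG_FILE" 2>&1
    status=$?
    echo "exit status: $status" >>"$LOG_FILE"
    return "$status"
}

# Record the working directory the script actually runs in.
run_logged pwd

# Assumed usage for the download step: -O sets the output file and
# -o sends wget's own log to a file.
#   run_logged wget -O /tmp/data.7z -o /tmp/wget.log "https://s3.amazonaws.com/#######/###.7z"
```

After the instance boots, the log file shows which step failed and with what status. On Ubuntu images, cloud-init also writes the script output to /var/log/cloud-init-output.log, which is worth reading first.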
