mkdir and mount in initramfs - mount

I am writing an initramfs, executed in busybox, in which I mount a partition using those commands:
/bin/busybox mount -n -t proc proc /proc
mount -n -t devtmpfs devtmpfs /dev
mount -n -t sysfs sysfs /sys
mount -n -t tmpfs inittemp /mnt
mkdir /mnt/saved
mount -n -t "${rootfstype}" -o "${rootflags}" ${device} /mnt/saved
But when the system starts up, I have this error:
mount: mounting /dev/mmcblk0p2 on /mnt/saved failed: No such file or directory
I know that when the device is not found there is usually a message like Device does not exist, so I think the problem comes from the directory /mnt/saved not being created correctly yet.
I tried adding an ls -l /mnt after the mkdir to check that the directory is correctly created, but most of the time, if I do so, the error disappears. So I thought the problem might be a synchronization problem (of the tmpfs, weird!), and I tried some other things, like creating a dummy file in the directory to force a kind of synchronization. This works, but it is a dirty workaround, and I want to find the real cause of the problem to build a clean solution.

While I was writing my question, I finally found the solution myself… I am posting it anyway, just in case somebody else is stuck like me.
Actually, busybox's mount does not print a device-specific message when it cannot find the device; it always shows No such file or directory.
My problem actually came from the root device, which was not ready yet and therefore not present in /dev yet. To make it work correctly, I simply added this line before the mount:
while ${rootwait} && ! [ -b "${device}" ]; do sleep 1; done
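If you want to avoid hanging forever when the device never shows up, the same idea can be bounded with a timeout; a small sketch (the 10-second limit is an arbitrary choice of mine):
# Wait up to ~10 seconds for the root block device to appear (sketch).
i=0
while ${rootwait} && ! [ -b "${device}" ] && [ "$i" -lt 10 ]; do
    sleep 1
    i=$((i + 1))
done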

Related

Configuring a docker container to use host UID and generate files on the host system - Preferably at runtime

I am currently working on a research tool that is supposed to be containerized using Docker so that it can hopefully be run on as many different systems as possible. This works fine for the most part, but we have run into a permission problem because of the workflow: the tool takes an input file (which we mount into the container), evaluates it using R scripts, and is then supposed to generate a report on the input file exactly where the file was taken from on the host system.
The latter part is problematic because, at least in our university context, the internal container user lacks write permissions in the (non-root) user home folders we are currently taking our testing data from. This would obviously also be bad in a production context, as we don't know how a potential user's system is set up, which is why we are trying to dynamically and temporarily match the container user's permissions to those of the host user.
I have found different solutions that involve passing the UID/GID to the docker daemon when building the container in some way or another:
docker build --build-arg USER_ID=$(id -u ${USER}) --build-arg GROUP_ID=$(id -g ${USER}) -t IMAGE .
I also changed the Dockerfile accordingly, following a tutorial that suggested replacing the internal www-data user:
[...Package installation steps that are supposed to be run as root...]
ARG USER_ID
ARG GROUP_ID
RUN if [ ${USER_ID:-0} -ne 0 ] && [ ${GROUP_ID:-0} -ne 0 ]; then \
userdel -f www-data &&\
if getent group www-data ; then groupdel www-data; fi &&\
groupadd -g ${GROUP_ID} www-data &&\
useradd -l -u ${USER_ID} -g www-data www-data &&\
install -d -m 0755 -o www-data -g www-data /work/ &&\
chown --changes --silent --no-dereference --recursive \
--from=33:33 ${USER_ID}:${GROUP_ID} \
/work \
;fi
USER www-data
WORKDIR /work
RUN mkdir files
COPY data/ /opt/MTB/data/
COPY helpers/ /opt/MTB/helpers/
COPY src/www/ /opt/MTB/www/
COPY tmp/ /opt/MTB/tmp/
COPY example_data/ /opt/MTB/example_data/
COPY src/ /opt/MTB/src/
EXPOSE 8080
ENTRYPOINT ["/opt/MTB/src/starter_s_c.sh"]
The entrypoint script starter_s_c.sh is a small shell script that feeds the trailing argument to the corresponding R script as an input file - the R script writes the report.
This works, but requires the container to be built again for every new user. What we are looking for is a solution that handles the dynamic permission setting at runtime, so that we only have to build the container once and can use it with many different user configurations.
I have found this, but I am not entirely sure how to implement it, as it would replace our entrypoint script, and I am not sure how to integrate that solution into our project.
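From what I understand, such runtime solutions boil down to a root entrypoint that re-maps the container user's UID/GID to values passed in at docker run time and then drops privileges. A rough sketch of that pattern (not our actual code; gosu, the HOST_UID/HOST_GID variables and the re-mapped www-data user are illustrative assumptions):
#!/bin/sh
# Hypothetical runtime entrypoint sketch, expecting the IDs to be passed in via
#   docker run -e HOST_UID=$(id -u) -e HOST_GID=$(id -g) ...
set -e
if [ -n "${HOST_UID:-}" ] && [ -n "${HOST_GID:-}" ]; then
    # Re-map the internal user and group to the host IDs at container start.
    groupmod -o -g "${HOST_GID}" www-data
    usermod -o -u "${HOST_UID}" www-data
    chown -R www-data:www-data /work
fi
# Drop privileges and hand over to the real entrypoint (assumes gosu is installed).
exec gosu www-data /opt/MTB/src/starter_s_c.sh "$@"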
Here is our current entrypoint script which already needs the permissions to be set so localmaster.r can generate the report in the host directory:
#!/bin/sh
file="$1"
cd $(dirname $0)/..
if [ $# -eq 0 ]; then
echo '.libPaths(c("~/lib/R/library", .libPaths())); library(shiny); library(shinyjs); runApp("src")' | R --vanilla
else
echo "Rscript --vanilla /opt/MTB/src/localmaster.r "$file""
Rscript --vanilla /opt/MTB/src/localmaster.r "$file"
fi
(If no arguments are given, it starts a shiny app, just to avoid confusion)
Any help or tips would be much appreciated! Thank you.

How to create and mount file for read and write as a HFS+ filesystem in linux?

I am trying to mount a file that will act as a read/write HFS+ filesystem. I am using an Arch Linux based distro, so I installed hfsprogs and hfsutils. On Debian-based distros, hfsprogs should be enough.
I created an 8G file like this:
dd if=/dev/zero of=test.img bs=1024 count=0 seek=$[1000*8000]
Then I did the formatting:
mkfs.hfsplus -v TestImg test.img
After that when I try to mount the file I get:
mkdir /tmp/sun
sudo mount -t hfsplus -o loop,rw,offset=0 test.img /tmp/sun
mount: /tmp/sun: mount failed: Operation not permitted
Parted shows that the offset is OK:
sudo parted -m test.img unit B print
1:0B:8191999999B:8192000000B:hfs+::;
I also tried to use fdisk on the file to create a Sun partition table, but that did not help either. Can you please help me with creating a read/write HFS+ filesystem in a file?
I was using the loop device incorrectly.
The correct steps are:
Create file
dd if=/dev/zero of=test.img bs=100MB count=10 seek=$[10*8]
Create a block device mapped to that file:
losetup -fP test.img
At this point the block device /dev/loop0 is created.
Create filesystem:
mkfs.hfsplus test.img
Mount it to your folder:
mount -o rw,loop /dev/loop0 /tmp/loop_test
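Putting the steps together, the whole sequence is roughly the following (a sketch; the mount point is just an example, and losetup --show prints whichever loop device was actually allocated):
# End-to-end sketch of the steps above.
dd if=/dev/zero of=test.img bs=100MB count=10 seek=$[10*8]   # create the image file
LOOPDEV=$(sudo losetup -fP --show test.img)                  # attach it, e.g. /dev/loop0
mkfs.hfsplus test.img                                        # create the HFS+ filesystem
mkdir -p /tmp/loop_test                                      # create the mount point
sudo mount -o rw "$LOOPDEV" /tmp/loop_test                   # mount it read/write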

mount -a does not mount Fstab file

I have the following in my /etc/fstab file:
proc /proc proc defaults 0 0
/dev/mmcblk0p1 /boot vfat defaults 0 2
/dev/mmcblk0p2 / ext4 defaults,noatime 0 1
sv-01:/mnt/UEF/home/user/Videos/complete /home/user/Videos nfs defaults,noauto,user 0 0
and when I issue the command sudo mount -a -v, I get the following output
mount: proc already mounted on /proc
mount: /dev/mmcblk0p1 already mounted on /boot
nothing was mounted
but when I copy and paste from the fstab entry above and issue the command below, the folder mounts perfectly.
sudo mount sv-01:/mnt/UEF/home/user/Videos/complete /home/user/Videos
What could possibly be causing this?
You specified the noauto option for sv-01:/mnt/UEF/home/user/Videos/complete.
From the mount manual:
mount -a [-t type] [-O optlist]
(usually given in a bootscript) causes all filesystems mentioned in
fstab (of the proper type and/or having or not having the proper
options) to be mounted as indicated, except for those whose line
contains the noauto keyword. Adding the -F option will make mount
fork, so that the filesystems are mounted simultaneously.
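Because the entry also has the user option, it can still be mounted on demand by giving mount just the mount point from that fstab line; and if it should be picked up by mount -a, dropping noauto from the options is enough. For example:
# Mount the fstab entry on demand by naming only its mount point:
mount /home/user/Videos

# To have "mount -a" (and boot) mount it automatically, remove "noauto":
# sv-01:/mnt/UEF/home/user/Videos/complete /home/user/Videos nfs defaults,user 0 0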

Run multiple instances of RStudio in a web browser

I have RStudio server installed on a remote aws server (ubuntu) and want to run several projects at the same time (one of which takes lots of time to finish). On Windows there is a simple GUI solution like 'Open Project in New Window'. Is there something similar for rstudio server?
It is a simple question, but I failed to find a solution, except for this related question for Macs, which offers
Run multiple rstudio sessions using projects
but how?
While running batch scripts is certainly a good option, it's not the only solution. Sometimes you may still want interactive use in different sessions rather than having to do everything as batch scripts.
Nothing stops you from running multiple instances of RStudio Server on your Ubuntu server on different ports. (I find this particularly easy to do by launching RStudio through docker, as outlined here.) Because an instance will keep running even when you close the browser window, you can easily launch several instances and switch between them. You'll just have to log in again when you switch.
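For example, two independent instances on different ports can be started roughly like this (a sketch using the rocker/rstudio image; ports, container names and the password are arbitrary):
# Two RStudio Server instances on different host ports (sketch).
docker run -d -p 8787:8787 -e PASSWORD=mypassword --name rstudio1 rocker/rstudio
docker run -d -p 8788:8787 -e PASSWORD=mypassword --name rstudio2 rocker/rstudio
# Then browse to http://<server>:8787 and http://<server>:8788 respectively.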
Unfortunately, RStudio Server still prevents you from having multiple instances open in the browser at the same time (see the help forum). This is not a big issue, as you just have to log in again, but you can work around it by using different browsers.
EDIT: Multiple instances are fine, as long as they are not in the same browser, with the same browser user AND on the same IP address; e.g. a session on 127.0.0.1 and another on 0.0.0.0 would be fine. More importantly, the instances keep running even if they are not 'open', so this really isn't a problem. The only thing to note is that you would have to log back in to access the instance.
As for projects, you can switch between projects using the 'projects' button at the top right, but while this preserves your other sessions, I do not think it actually supports simultaneous code execution. You need multiple instances of the R environment running to actually do that.
UPDATE 2020: Okay, it's now 2020 and there are lots of ways to do this.
For running scripts or functions in a new R environment, check out:
the callr package
The RStudio jobs panel
Run new R sessions or scripts from one or more terminal sessions in the RStudio terminal panel
Log out and log in to the RStudio server as a different user (this requires multiple users to be set up in the container; obviously not a good workflow for a single user, but it is worth noting that many different users can access the same RStudio Server instance without a problem).
Of course, spinning up multiple docker sessions on different ports is still a good option as well. Note that many of the ways listed above still do not allow you to restart the main R session, which prevents you from reloading installed packages, switching between projects, etc., which is clearly not ideal. I think it would be fantastic if switching between projects in an RStudio (server) session would allow jobs in the previously active project to keep running in the background, but I have no idea if that's in the cards for the open source version.
Often you don't need several instances of RStudio - in this case, just save your code in a .R file and launch it from the Ubuntu command prompt (maybe using screen):
Rscript script.R
That will launch a separate R session which will do the work without freezing your RStudio. You can pass arguments too, for example:
# script.R -
args <- commandArgs(trailingOnly = TRUE)
if (length(args) == 0) {
start = '2015-08-01'
} else {
start = args[1]
}
console -
Rscript script.R 2015-11-01
I think you need RStudio Server Pro to be able to log in with multiple users/sessions.
You can see the comparison table below for reference.
https://www.rstudio.com/products/rstudio-server-pro/
Installing another instance of rstudio server is less than ideal.
Linux server admins, fear not. You just need root access or a kind admin.
Create a group to use: groupadd Rwarrior
Create an additional user with the same home directory as your primary RStudio login:
useradd -d /home/user1 user2
Add primary and new user into Rwarrior group:
gpasswd -a user2 Rwarrior
gpasswd -a user1 Rwarrior
Take care of the permissions for your primary home directory:
cd /home
chown -R user1:Rwarrior /home/user1
chmod -R 770 /home/user1
chmod g+s /home/user1
Set password for the new user:
passwd user2
Open a new browser window in incognito/private browsing mode and log in to RStudio with the new user you created. Enjoy.
I run multiple RStudio servers by isolating them in Singularity instances. Download the Singularity image with the command singularity pull shub://nickjer/singularity-rstudio
I use two scripts:
run-rserver.sh:
Find a free port
#!/bin/env bash
set -ue
thisdir="$(dirname "${BASH_SOURCE[0]}")"
# Return 0 if the port $1 is free, else return 1
is_port_free(){
port="$1"
set +e
netstat -an |
grep --color=none "^tcp.*LISTEN\s*$" | \
awk '{gsub("^.*:","",$4);print $4}' | \
grep -q "^$port\$"
r="$?"
set -e
if [ "$r" = 0 ]; then return 1; else return 0; fi
}
# Find a free port
find_free_port(){
local lower_port="$1"
local upper_port="$2"
for ((port=lower_port; port <= upper_port; port++)); do
if is_port_free "$port"; then r=free; else r=used; fi
if [ "$r" = "used" -a "$port" = "$upper_port" ]; then
echo "Ports $lower_port to $upper_port are all in use" >&2
exit 1
fi
if [ "$r" = "free" ]; then break; fi
done
echo $port
}
port=$(find_free_port 8080 8200)
echo "Access RStudio Server on http://localhost:$port" >&2
"$thisdir/cexec" \
rserver \
--www-address 127.0.0.1 \
--www-port $port
cexec:
Create a dedicated config directory for each instance
Create a dedicated temporary directory for each instance
Use the singularity instance mechanism to prevent forked R sessions from being adopted by PID 1 and staying around after the rserver has shut down. Instead, they become children of the Singularity instance and are killed when that instance shuts down.
Map the current directory to the directory /data inside the container and set that as the home folder (this step might not be necessary if you don't care about reproducible paths on every machine)
#!/usr/bin/env bash
# Execute a command in the container
set -ue
if [ "${1-}" = "--help" ]; then
cat <<'EOF'
Usage: cexec command [args...]
Execute `command` in the container. This script starts the Singularity
container and executes the given command therein. The project root is mapped
to the folder `/data` inside the container. Moreover, a temporary directory
is provided at `/tmp` that is removed after the end of the script.
EOF
exit 0
fi
thisdir="$(dirname "${BASH_SOURCE[0]}")"
container="rserver_200403.sif"
# Create a temporary directory
tmpdir="$(mktemp -d -t cexec-XXXXXXXX)"
# We delete this directory afterwards, so it's important that $tmpdir
# really has the path to an empty, temporary dir, and nothing else!
# (for example empty string or home dir)
if [[ ! "$tmpdir" || ! -d "$tmpdir" ]]; then
echo "Error: Could not create temp dir $tmpdir"
exit 1
fi
# check if temp dir is empty (this might be superfluous, see
# https://codereview.stackexchange.com/questions/238439)
tmpcontent="$(ls -A "$tmpdir")"
if [ ! -z "$tmpcontent" ]; then
echo "Error: Temp dir '$tmpdir' is not empty"
exit 1
fi
# Start Singularity instance
instancename="$(basename "$tmpdir")"
# Maybe also superfluous (like above)
rundir="$(readlink -f "$thisdir/.run/$instancename")"
if [ -e "$rundir" ]; then
echo "Error: Runtime directory '$rundir' exists already!" >&2
exit 1
fi
mkdir -p "$rundir"
singularity instance start \
--contain \
-W "$tmpdir" \
-H "$thisdir:/data" \
-B "$rundir:/data/.rstudio" \
-B "$thisdir/.rstudio/monitored/user-settings:/data/.rstudio/monitored/user-settings" \
"$container" \
"$instancename"
# Delete the temporary directory after the end of the script
trap "singularity instance stop '$instancename'; rm -rf '$tmpdir'; rm -rf '$rundir'" EXIT
singularity exec \
--pwd "/data" \
"instance://$instancename" \
"$#"

inotify and rsync on large number of files

I am using inotify to watch a directory and sync files between servers using rsync. Syncing works perfectly, and memory usage is mostly not an issue. However, recently a large number of files were added (350k) and this has impacted performance, specifically on CPU. Now when rsync runs, CPU usage spikes to 90%/100% and rsync takes a long time to complete; there are 650k files being watched/synced.
Is there any way to speed up rsync and only rsync the directory that has been changed? Or alternatively to set up multiple inotifywaits on separate directories. Script being used is below.
UPDATE: I have added the --update flag and usage seems mostly unchanged
#! /bin/bash
EVENTS="CREATE,DELETE,MODIFY,MOVED_FROM,MOVED_TO"
inotifywait -e "$EVENTS" -m -r --format '%:e %f' /var/www/ --exclude '/var/www/.*cache.*' | (
WAITING="";
while true; do
LINE="";
read -t 1 LINE;
if test -z "$LINE"; then
if test ! -z "$WAITING"; then
echo "CHANGE";
WAITING="";
rsync --update -alvzr --exclude '*cache*' --exclude '*.git*' /var/www/* root@secondwebserver:/var/www/
fi;
else
WAITING=1;
fi;
done)
I ended up removing the compression option (z) and upping the WAITING var to 10 (seconds). This seems to have helped, rsync still spikes CPU load but it is shorter lived. Credit goes to an answer on unix stackexchange
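Concretely, that amounts to something like the following two changes in the script above (a sketch; it is the read timeout that batches events before a sync is triggered):
# Collect events for 10 seconds instead of 1 before triggering a sync:
read -t 10 LINE;
# ...and drop the -z (compression) flag from the rsync call:
rsync --update -alvr --exclude '*cache*' --exclude '*.git*' /var/www/* root@secondwebserver:/var/www/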
You're using rsync to synchronize the root directory of a large tree, so I'm not surprised at the performance loss.
One possible solution is to only synchronize the changed files/directories, instead of the whole root directory.
For instance, file1, file2 and file3 lie under from/dir. When changes are made to these 3 files, use
rsync --update -alvzr from/dir/file1 from/dir/file2 from/dir/file3 to/dir
rather than
rsync --update -alvzr from/dir/* to/dir
But this has a potential pitfall: rsync won't create directories automatically if the target folders don't exist. However, you can use ssh to execute a remote command and create the directories yourself.
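For example (a sketch; the changed/dir path is illustrative): create the remote directory over SSH first, or let rsync's --relative (-R) option recreate the path for you:
# Create the remote directory first, then sync only the changed directory:
ssh root@secondwebserver "mkdir -p /var/www/changed/dir"
rsync --update -alvr /var/www/changed/dir/ root@secondwebserver:/var/www/changed/dir/

# Alternatively, --relative (-R) recreates the path after the /./ marker on the target:
rsync --update -alvrR /var/www/./changed/dir/ root@secondwebserver:/var/www/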
You may need to set up SSH public-key authentication as well, but judging from the rsync command line you pasted, I assume you've already done this.
reference:
rsync - create all missing parent directories?
rsync: how can I configure it to create target directory on server?
How to use SSH to run a shell script on a remote machine?
SSH error when executing a remote command: "stdin: is not a tty"
