ec2-automate-backup purge not working - ec2-api-tools

We are running the script below and it doesn't delete anything older than one day. What are we missing?
We got the script from github.com/colinbjohnson/aws-missing-tools/tree/master/ec2-automate-backup
ec2-automate-backup -r "us-west-2" -s tag -t "Backup,Values=true" -k 1 -p -h > /data/scripts/ec2-automate-backup.log
Snapshots taken by ec2-automate-backup will be eligible for purging after the following date (the purge after date given in seconds from epoch): 1458239434.
Tagging Snapshot snap-b9fffbe6 with the following Tags: Key=CreatedBy,Value=ec2-automate-backup Key=InitiatingHost,Value='ip-10-220-5-100' Key=PurgeAfterFE,Value=1458239434 Key=PurgeAllow,Value=true
Tagging Snapshot snap-8c457dc9 with the following Tags: Key=CreatedBy,Value=ec2-automate-backup Key=InitiatingHost,Value='ip-10-220-5-100' Key=PurgeAfterFE,Value=1458239434 Key=PurgeAllow,Value=true

The cron job entry below is working great for me. It first takes snapshots, then creates tags, and finally purges the old ones; here you go (every day at 0:00 am):
0 0 * * * /path/to/script/ec2-automate-backup.sh -r "<your-region>" -s tag -t "Backup,Values=true" -k 15 -p -h >> /path/to/log/ec2-automate-backup.log 2>&1
where:
-r - the region that contains the EBS volumes for which you wish to have a snapshot created.
-s - the selection method by which EBS volumes will be selected. Currently supported selection methods are "volumeid" and "tag." The selection method "volumeid" identifies EBS volumes for which a snapshot should be taken whereas the selection method "tag" identifies EBS volumes for which a snapshot should be taken by a filter that utilizes a Key and Value pair.
-t - the "tag" parameter is required if the "method" of selecting EBS volumes for snapshot is by tag (-s tag). The format for tag is key,Values=$desired_values (example: Backup,Values=true) and the correct way to run ec2-automate-backup in this manner is ec2-automate-backup -s tag -t "Backup,Values=true". (You have to tag with "Backup=true" all the volumes that you want backed up.)
-k - the period after which a snapshot can be purged. For example, running "ec2-automate-backup.sh -v "vol-6d6a0527 vol-636a0112" -k 31" would allow snapshots to be removed after 31 days. purge_after_days creates two tags for each volume that was backed up - a PurgeAllow tag which is set to PurgeAllow=true and a PurgeAfter tag which is set to the present day (in UTC) + the value provided by -k.
-p - the -p flag will purge (meaning delete) all snapshots that were created more than "purge after days" ago. ec2-automate-backup looks at two tags to determine which snapshots should be deleted - the PurgeAllow and PurgeAfter tags. The tags must be set as follows: PurgeAllow=true and PurgeAfter=YYYY-MM-DD, where YYYY-MM-DD must be before the present date. (A quick way to inspect these tags on your snapshots is sketched after this list.)
-h - tags each snapshot with an "InitiatingHost" tag to specify which host ran the script.
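To inspect the purge tags the script relies on, you can query your snapshots with the AWS CLI. This is only a sketch: it assumes the modern AWS CLI rather than the legacy ec2-api-tools, and the region and epoch value are taken from the question above.
# List snapshots the script considers purge-eligible, with their tags
aws ec2 describe-snapshots --region us-west-2 \
  --filters Name=tag:PurgeAllow,Values=true \
  --query 'Snapshots[*].{Id:SnapshotId,Tags:Tags}' --output table
# Convert a PurgeAfterFE epoch value (as in the log above) to a readable date
date -d @1458239434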

Related

AWS Code Deploy - Script at specified location: scripts/validate_service.sh failed with exit code 1

My deployments fail on last step Validate Service with error message:
The overall deployment failed because too many individual instances failed deployment, too few healthy instances are available for deployment, or some instances in your deployment group are experiencing problems.
Events log
No lines are selected.
My validate_service.sh contains:
#!/bin/bash
# verify we can access our webpage successfully
curl -v --silent localhost:80 2>&1 | grep Welcome
Can someone advise what I should change?
The script's return value is what matters. Yours looks good to me; I just added a couple of seconds of waiting until the application starts up.
In case you use bash with a pipeline of commands, you had better add set -o pipefail so that the whole pipeline fails when one of the commands fails.
Check out my script:
#!/bin/bash
sleep 5
curl http://localhost:3009 | grep Welcome
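If a fixed sleep turns out to be too fragile, a small retry loop is an alternative. This is only a sketch: the URL mirrors the example above, and the retry count and delay are assumptions about your application's startup time.
#!/bin/bash
# Retry the health check a few times before giving up, so the hook does not
# fail just because the application is still starting up.
for attempt in $(seq 1 10); do
  if curl --silent http://localhost:3009 | grep -q Welcome; then
    exit 0    # healthy: CodeDeploy treats exit code 0 as success
  fi
  sleep 3     # wait before the next attempt
done
exit 1        # still unhealthy after all retries: fail the ValidateService hook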

Dynamic exclusion list in lsyncd

Our cloud platform is powered by OpenNebula, so we have two instances of the frontend in "cold swap". We use the lsyncd daemon to keep the instances' datastores synced, but there is a catch: we don't want to sync VM images that have the extension .bak, because another script moves all the .bak files to other storage on a schedule. The sync script's logic is: find all the .bak files in /var/lib/one/datastores/, then create exclude.lst, and then start lsyncd. That seems OK until we take a look at the datastores:
oneadmin@nola:~/cluster$ dir /var/lib/one/datastores/1/
006e099c57061d87d4b8f78ec7199221
008a10fa0764c9ac8d6fb9206c9b69bd
069299977f2fea243a837efed271182f
0a73a9adf74d92b4f175abcb578cabac
0b1cc002e370e1acd880cf781df0a6fb
0b470b182ac6d554774a3615ce87e292
0c0d98d1e0aabc23ef548ddb564c578d
0c3fad9c92a8efc7e13a73d8ae85caa3
..and so on.
We solved it with this monstrous function:
function create_exclude {
  # Dump all images as XML, flatten them to "ID;NAME;SOURCE" lines, keep only
  # the .bak images stored under /var/lib, and write the 8th path component of
  # each image's source (the VM folder) to exclude.lst
  oneimage list -x | \
    xmlstarlet sel -t -m "IMAGE_POOL/IMAGE" -v "ID" -o ";" -v "NAME" -o ";" -v "SOURCE" -o ":" | \
    sed s/:/'\n'/g | \
    awk -F";" '/.bak;\/var\/lib/ {print $3}' | \
    cut -d / -f8 > /var/lib/one/cluster/exclude.lst
}
The result is a list containing the IDs of VMs that have .bak images inside, so we can exclude the whole VM folder from syncing. That's not quite what we wanted, since the original image then also stays unsynced, but that could be solved by restarting the lsyncd script at the moment the other script moves all the .bak files to other storage.
Now we get to the topic of the question.
It works until a new .bak is created. There is no way to add a new entry to exclude.lst "on the go" other than stopping lsyncd and restarting the script that re-creates exclude.lst. But there is also no way to detect the moment a new .bak is created, short of yet another script that monitors for it periodically (one possible watcher is sketched at the end of this question).
I believe a less complicated solution exists. It depends on OpenNebula of course, particularly on the way the /datastores/ folder stores VMs.
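For reference, a minimal sketch of such a watcher using inotify. This is not from the original thread: it assumes inotify-tools is installed, reuses the create_exclude function above, and assumes lsyncd is managed via systemd.
#!/bin/bash
# Watch the datastores for newly created .bak images and rebuild the
# exclusion list each time one appears (requires inotify-tools).
inotifywait -m -r -e create --format '%f' /var/lib/one/datastores/ | \
while read -r file; do
  case "$file" in
    *.bak)
      create_exclude               # rebuild /var/lib/one/cluster/exclude.lst (see function above)
      systemctl restart lsyncd     # assumed restart command; adjust to how lsyncd runs here
      ;;
  esac
done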
Glad to know you are using OpenNebula to run your cloud :) Have you tried to use our Community Forum for support? I'm sure the rest of the Community will be happy to give a hand!
Cheers!

How does zsh store history? History file format

I am not sure if I fully understand how zsh stores its history. Example line:
: 1458291931:0;ls -l
I guess we have here:
timestamp: 1458291931
command: ls -l
but what does this mystical 0 in between mean?
This is the so-called extended history format, which is enabled by the EXTENDED_HISTORY shell option. The second number (the "mystical 0") is the duration of the command. "0" either means that the command finished quickly or - depending on your settings - that the duration is not saved. If either of the shell options INC_APPEND_HISTORY or SHARE_HISTORY is enabled (you can check this with setopt | grep -E '^(incappend|share)history$'), then zsh will write the history entry to the history file immediately after confirming the command. The duration will be saved as "0" in that case.
If you want to make use of the duration metric while still saving the history to file during shell sessions, you can set the option INC_APPEND_HISTORY_TIME, in which case zsh will wait for command completion before writing the entry. Obviously this will otherwise behave like INC_APPEND_HISTORY.
Note: only one of the options INC_APPEND_HISTORY, INC_APPEND_HISTORY_TIME, and SHARE_HISTORY should be active.
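For reference, a minimal sketch of the corresponding ~/.zshrc settings if you want durations recorded while still writing entries during the session (the option names are the ones mentioned in the answer above):
setopt EXTENDED_HISTORY            # store ": <start>:<elapsed>;<command>" lines
setopt INC_APPEND_HISTORY_TIME     # append each entry after the command finishes, so the duration is real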

Run multiple instances of RStudio in a web browser

I have RStudio Server installed on a remote AWS server (Ubuntu) and want to run several projects at the same time (one of which takes a long time to finish). On Windows there is a simple GUI solution like 'Open Project in New Window'. Is there something similar for RStudio Server?
It's a simple question, but I failed to find a solution except this related question for Macs, which offers:
Run multiple rstudio sessions using projects
but how?
While running batch scripts is certainly a good option, it's not the only solution. Sometimes you may still want interactive use in different sessions rather than having to do everything as batch scripts.
Nothing stops you from running multiple instances of RStudio Server on your Ubuntu server on different ports. (I find this particularly easy to do by launching RStudio through Docker, as outlined here.) Because an instance will keep running even when you close the browser window, you can easily launch several instances and switch between them. You'll just have to log in again when you switch.
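For example, a minimal sketch using the rocker/rstudio image (the image name, host ports, and password handling are assumptions about your setup):
# Two independent RStudio Server instances on different host ports
docker run -d --name rstudio1 -p 8787:8787 -e PASSWORD=choose_a_password rocker/rstudio
docker run -d --name rstudio2 -p 8788:8787 -e PASSWORD=choose_a_password rocker/rstudio
# Then open http://<server>:8787 and http://<server>:8788 in the browser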
Unfortunately, RStudio Server still prevents you from having multiple instances open in the browser at the same time (see the help forum). This is not a big issue as you just have to log in again, but you can work around it by using different browsers.
EDIT: Multiple instances are fine, as long as they are not on the same browser, same browser-user AND on the same IP address. e.g. a session on 127.0.0.1 and another on 0.0.0.0 would be fine. More importantly, the instances keep on running even if they are not 'open', so this really isn't a problem. The only thing to note about this is you would have to log back in to access the instance.
As for projects, you'll see you can switch between projects using the 'projects' button at the top right, but while this will preserve your other sessions, I do not think it actually supports simultaneous code execution. You need multiple instances of the R environment running to actually do that.
UPDATE 2020: Okay, it's now 2020 and there are lots of ways to do this.
For running scripts or functions in a new R environment, check out:
the callr package
The RStudio jobs panel
Run new R sessions or scripts from one or more terminal sessions in the RStudio terminal panel
Log out and log in to the RStudio Server as a different user (this requires multiple users to be set up in the container; obviously not a good workflow for a single user, but just noting that many different users can access the same RStudio Server instance no problem).
Of course, spinning up multiple docker sessions on different ports is still a good option as well. Note that many of the ways listed above still do not allow you to restart the main R session, which prevents you from reloading installed packages, switching between projects, etc, which is clearly not ideal. I think it would be fantastic if switching between projects in an RStudio (server) session would allow jobs in the previously active project to keep running in the background, but have no idea if that's in the cards for the open source version.
Often you don't need several instances of RStudio - in this case just save your code in a .R file and launch it from the Ubuntu command prompt (maybe using screen; a screen/nohup sketch follows the example below):
Rscript script.R
That will launch a separate R session which will do the work without freezing your Rstudio. You can pass arguments too, for example
# script.R -
args <- commandArgs(trailingOnly = TRUE)
if (length(args) == 0) {
  start <- '2015-08-01'
} else {
  start <- args[1]
}
console -
Rscript script.R 2015-11-01
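As mentioned, screen (or nohup) keeps such a job running after you disconnect from the server. A quick sketch (the session name and log file are arbitrary):
# Run the script in a detached screen session that survives SSH disconnects
screen -dmS myjob Rscript script.R 2015-11-01
# Or use nohup and redirect output to a log file
nohup Rscript script.R 2015-11-01 > script.log 2>&1 &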
I think you need RStudio Server Pro to be able to log in with multiple users/sessions.
You can see the comparison table below for reference.
https://www.rstudio.com/products/rstudio-server-pro/
Installing another instance of rstudio server is less than ideal.
Linux server admins, fear not. You just need root access or a kind admin.
Create a group to use: groupadd Rwarrior
Create an additional user with the same home directory as your primary RStudio login:
useradd -d /home/user1 user2
Add primary and new user into Rwarrior group:
gpasswd -a user2 Rwarrior
gpasswd -a user1 Rwarrior
Take care of the permissions for your primary home directory:
cd /home
chown -R user1:Rwarrior /home/user1
chmod -R 770 /home/user1
chmod g+s /home/user1
Set password for the new user:
passwd user2
Open a new browser window in incognito/private browsing mode and login to Rstudio with the new user you created. Enjoy.
I run multiple RStudio servers by isolating them in Singularity instances. Download the Singularity image with the command singularity pull shub://nickjer/singularity-rstudio
I use two scripts:
run-rserver.sh:
Find a free port
#!/usr/bin/env bash
set -ue
thisdir="$(dirname "${BASH_SOURCE[0]}")"
# Return 0 if the port $1 is free, else return 1
is_port_free(){
    port="$1"
    set +e
    netstat -an |
        grep --color=none "^tcp.*LISTEN\s*$" | \
        awk '{gsub("^.*:","",$4);print $4}' | \
        grep -q "^$port\$"
    r="$?"
    set -e
    if [ "$r" = 0 ]; then return 1; else return 0; fi
}
# Find a free port in the range $1..$2
find_free_port(){
    local lower_port="$1"
    local upper_port="$2"
    for ((port=lower_port; port <= upper_port; port++)); do
        if is_port_free "$port"; then r=free; else r=used; fi
        if [ "$r" = "used" -a "$port" = "$upper_port" ]; then
            echo "Ports $lower_port to $upper_port are all in use" >&2
            exit 1
        fi
        if [ "$r" = "free" ]; then break; fi
    done
    echo $port
}
port=$(find_free_port 8080 8200)
echo "Access RStudio Server on http://localhost:$port" >&2
"$thisdir/cexec" \
    rserver \
    --www-address 127.0.0.1 \
    --www-port $port
cexec:
Create a dedicated config directory for each instance
Create a dedicated temporary directory for each instance
Use the singularity instance mechanism to prevent forked R sessions from being adopted by PID 1 and staying around after rserver has shut down. Instead, they become children of the Singularity instance and are killed when that shuts down.
Map the current directory to the directory /data inside the container and set that as the home folder (this step might not be necessary if you don't care about reproducible paths on every machine)
#!/usr/bin/env bash
# Execute a command in the container
set -ue
if [ "${1-}" = "--help" ]; then
    cat <<'EOF'
Usage: cexec command [args...]
Execute `command` in the container. This script starts the Singularity
container and executes the given command therein. The project root is mapped
to the folder `/data` inside the container. Moreover, a temporary directory
is provided at `/tmp` that is removed after the end of the script.
EOF
    exit 0
fi
thisdir="$(dirname "${BASH_SOURCE[0]}")"
container="rserver_200403.sif"
# Create a temporary directory
tmpdir="$(mktemp -d -t cexec-XXXXXXXX)"
# We delete this directory afterwards, so it's important that $tmpdir
# really has the path to an empty, temporary dir, and nothing else!
# (for example empty string or home dir)
if [[ ! "$tmpdir" || ! -d "$tmpdir" ]]; then
    echo "Error: Could not create temp dir $tmpdir"
    exit 1
fi
# Check if the temp dir is empty (this might be superfluous, see
# https://codereview.stackexchange.com/questions/238439)
tmpcontent="$(ls -A "$tmpdir")"
if [ ! -z "$tmpcontent" ]; then
    echo "Error: Temp dir '$tmpdir' is not empty"
    exit 1
fi
# Start the Singularity instance
instancename="$(basename "$tmpdir")"
# Maybe also superfluous (like above)
rundir="$(readlink -f "$thisdir/.run/$instancename")"
if [ -e "$rundir" ]; then
    echo "Error: Runtime directory '$rundir' exists already!" >&2
    exit 1
fi
mkdir -p "$rundir"
singularity instance start \
    --contain \
    -W "$tmpdir" \
    -H "$thisdir:/data" \
    -B "$rundir:/data/.rstudio" \
    -B "$thisdir/.rstudio/monitored/user-settings:/data/.rstudio/monitored/user-settings" \
    "$container" \
    "$instancename"
# Stop the instance and delete the temporary directories after the end of the script
trap "singularity instance stop '$instancename'; rm -rf '$tmpdir'; rm -rf '$rundir'" EXIT
singularity exec \
    --pwd "/data" \
    "instance://$instancename" \
    "$@"

Crontab - Run in directory

I would like to set a job to run daily in the root crontab. But I would like it to execute from a particular directory so it can find all the files it needs, since the application has a bunch of relative paths.
Anyway, can I tell cron to run the job from a particular directory?
All jobs are executed by a shell, so start that shell snippet with a command to change the directory.
cd /path/to/directory && ./bin/myapp
Concerning the use of && instead of ;: normally it doesn't make a difference, but if the cd command fails (e.g. because the directory doesn't exist) with && the application isn't executed, whereas with ; it's executed (but not in the intended directory).
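So a complete crontab entry might look like this (the schedule, paths, and log file are placeholders):
# Run daily at 03:00 from the application's own directory
0 3 * * * cd /path/to/directory && ./bin/myapp >> /var/log/myapp.log 2>&1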
Reading man 5 crontab should tell you that there's a HOME variable set which can be redefined in the file. It becomes your working directory. You can set PATH for the command(s) too. Of course this affects all the cron schedule lines.
E.g.:
Several environment variables are set up automatically by the cron(8)
daemon. SHELL is set to /bin/sh, and LOGNAME and HOME are set from
the /etc/passwd line of the crontab's owner. HOME and SHELL can be
overridden by settings in the crontab; LOGNAME can not.
(Note: the LOGNAME variable is sometimes called USER on BSD systems
and is also automatically set).
Depending on your cron of course, but mine also has MAILTO, MAILFROM, CONTENT_TYPE, CRON_TZ, RANDOM_DELAY, and MLS_LEVEL.
So for your hypothetical app I'd recommend a file named /etc/cron.d/hypothetical containing:
# Runs hypothetical app at 00:01Z in its local path for reading its config or something.
SHELL=/bin/sh
HOME=/where/the/app/is
PATH=/where/the/app/is:/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
CRON_TZ=UTC
1 0 * * * theappuser hypothetical --with arguments
For example, with docker-compose relying on the docker-compose.yml in the current working directory:
SHELL=/bin/sh
HOME=/path/to/composed-app
5 5 * * * root docker-compose restart -t 10 service-name
