Compress and encrypt multiple files individually while preserving the file structure - encryption

I have several directories that look like this:
dir1/
|_foo.txt
|_bar.txt
dir2/
|_qux.txt
|_bar.txt
For each of these directories I want to compress the files inside it into some encrypted format,
then copy the structure to a new location (somewhere online). In the end I hope to get something like this in the new location:
dir1/
|_foo.rar
|_bar.rar
dir2/
|_qux.rar
|_bar.rar
Is there a simple Unix way to do it?
P.S. I was looking at rar for the encrypted and compressed files, but if there is anything better let me know.
EDIT: In case I wasn't clear, this is for backup purposes.

Here is how I would do it.
First, create this helper script and put it somewhere. I put mine in /Users/martin/1temp/stackoverflow/26960080/encrypt-and-compress.sh. Don't forget to make it executable with chmod +x, and don't forget to update the paths to make them match your system. Note the file temp-password-for-batch-encryption.txt which is a simple text file with one line with the password used for encryption. Without this file you have to manually enter the password for each file encrypted, which quickly becomes a bore. Be careful with who has access to read that file though.
#!/bin/bash
# The relative path of the file to encrypt, passed as a parameter by find
file_to_encrypt="$1"
# The directory where the mirrored, encrypted directory structure shall end up
dest_dir="$HOME/1temp/stackoverflow/26960080/result-dir"
# Relative path to the dir of the file to encrypt, used to create the
# same directory structure elsewhere
dir_of_file_to_encrypt="$(dirname "${file_to_encrypt}")"
# In what dir to put the result file, used to be able to create that
# dir before we encrypt the file
dest_dir_of_file_to_encrypt="${dest_dir}/${dir_of_file_to_encrypt}"
# Path to where the encrypted file shall be put
dest_file="${dest_dir}/${file_to_encrypt}"
# To not have to input the password manually each time, put it in this
# file temporarily (make sure not to allow anyone else to access this
# file...)
password_file="$HOME/1temp/stackoverflow/26960080/temp-password-for-batch-encryption.txt"
# Make sure the dest dir exists
mkdir -p "${dest_dir_of_file_to_encrypt}"
echo "About to encrypt ${file_to_encrypt} and putting it in ${dest_file} using password in ${password_file}"
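# Encrypt the file and put it in the dest dir
# --batch: Never prompt; GnuPG 2.x only honors --passphrase-fd in batch mode
#          (GnuPG 2.1+ may additionally need --pinentry-mode loopback)
# --symmetric: Use simple, so-called symmetric (password-based) encryption
# --cipher-algo: Select encryption algorithm
# --passphrase-fd 0: Read the passphrase from stdin, which makes "password from a file" work
# --yes: Overwrite any existing files
cat "${password_file}" | gpg --batch --symmetric --cipher-algo AES256 --passphrase-fd 0 --yes --output "${dest_file}" "${file_to_encrypt}"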
Then, cd into the root of the directory structure you want to encrypt.
cd /Users/martin/1temp/stackoverflow/26960080/input-dir/
Then run the script for each file using find, like this:
find . -type f -exec /Users/martin/1temp/stackoverflow/26960080/encrypt-and-compress.sh {} \;
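This will mirror the input-dir tree to result-dir. By default gpg also compresses the data before encrypting it, so no separate compression step is needed. To decrypt a file, use: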
gpg --decrypt /Users/martin/1temp/stackoverflow/26960080/result-dir/./dir1/foo.txt
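Since this is for backups, you will probably want to restore a whole tree at some point, not just one file. Here is a minimal sketch of that, mirroring the encryption script above. The function name decrypt_tree and its three arguments are my own invention, and the gpg options assume GnuPG 2.1 or later (older versions do not know --pinentry-mode). It also assumes no file names contain newlines.

```shell
# Hypothetical helper: decrypt every file under SRC into the same
# relative location under DST, reading the passphrase from a file.
# DST and PASSWORD_FILE must be absolute paths, because we cd into SRC.
decrypt_tree() (
  src=$1 dst=$2 password_file=$3
  cd "$src" || exit 1
  find . -type f | while IFS= read -r f; do
    # Recreate the directory of the file before writing into it
    mkdir -p "$dst/$(dirname "$f")"
    # </dev/null keeps gpg from consuming the file list on stdin
    gpg --batch --yes --pinentry-mode loopback \
        --passphrase-file "$password_file" \
        --output "$dst/$f" --decrypt "$f" < /dev/null
  done
)
```

You would call it like decrypt_tree /path/to/result-dir /path/to/restored /path/to/password-file.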
To mirror the resulting encrypted directory structure, I suggest you use rsync. The specifics depend a lot on the setup you have or want, but it's pretty easy to google.
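As a starting point, a sketch of the rsync step could look like this. The function name mirror_encrypted_tree and the destination in the usage line are placeholders, not something from my setup; the only options used are -a (archive mode, recurse and keep times and permissions) and -v (verbose).

```shell
# Hypothetical wrapper: copy the contents of SRC to DEST with rsync.
# DEST can be a local path or a remote one such as user@host:path.
mirror_encrypted_tree() (
  src=$1 dest=$2
  # A trailing slash on the source copies its contents,
  # not the directory itself, so make sure there is exactly one.
  rsync -av "${src%/}/" "$dest"
)
```

For example: mirror_encrypted_tree "$HOME/1temp/stackoverflow/26960080/result-dir" backup@example.com:backups/ (the host is made up). Adding --dry-run to the rsync call first shows what would be transferred without copying anything.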
Good luck!

Related

Encrypt zip file in folder with PGP

I have been testing with PGP command line to create a batch file that encrypts all zip files in a folder
So far I have tried
GPG -e -r username c:\foldername*.zip
But when I run the bat file nothing happens.
Do I need to add the path to where PGP is installed?
I would also like to delete the zip once it's encrypted. A previous batch file used -SDEL; will this work here?
Thanks
gpg doesn't seem to work with file wildcards. You should use a for loop instead; see this answer for examples: How to do something to each file in a directory with a batch script
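The shape of such a loop, sketched here for a Unix-like shell (for instance Git Bash or WSL on Windows) rather than as a .bat file, is shown below. The function name encrypt_zips is made up, and it assumes gpg writes its output next to the input as name.zip.gpg, which is the default for -e without --output. The original zip is only deleted when gpg reports success.

```shell
# Hypothetical helper: encrypt every .zip in DIR for RECIPIENT,
# then delete each original only if its encryption succeeded.
encrypt_zips() (
  recipient=$1 dir=$2
  for f in "$dir"/*.zip; do
    # If the glob matched nothing, "$f" is the literal pattern; skip it
    [ -e "$f" ] || continue
    gpg --batch --yes -e -r "$recipient" "$f" && rm -- "$f"
  done
)
```

A batch-file for loop follows the same pattern: one gpg call per file, followed by a delete that depends on the exit code.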

Can I use HDFS mv on encrypted folders

I need to load hive partitions from staging folders. Currently we copy and delete. Can I use mv?
I am told that I can not use mv if the folders are EAR (Encryption At Rest). How to tell if a folder is EAR'ed?
I'm assuming the feature you are using for encryption at rest is HDFS transparent encryption (see cloudera 5.14 docs).
There is a command to get all the zones configured for encryption, listZones, but that command requires admin privileges. However, if you just need to check the encryption status of one file at a time, you should be able to run getFileEncryptionInfo without these privileges.
For example:
hdfs crypto -getFileEncryptionInfo -path /path/to/my/file
As for whether you can move files, it looks like the answer to that is no. From the "Rename and Trash considerations" section of the transparent encryption documentation:
HDFS restricts file and directory renames across encryption zone boundaries. This includes renaming an encrypted file / directory into an unencrypted directory (e.g., hdfs dfs mv /zone/encryptedFile /home/bob), renaming an unencrypted file or directory into an encryption zone (e.g., hdfs dfs mv /home/bob/unEncryptedFile /zone), and renaming between two different encryption zones (e.g., hdfs dfs mv /home/alice/zone1/foo /home/alice/zone2).
and
A rename is only allowed if the source and destination paths are in the same encryption zone, or both paths are unencrypted (not in any encryption zone).
So it looks like using cp and rm is your best bet.
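A minimal sketch of that copy-then-delete step, with the delete guarded so that it only runs when the copy succeeded, might look like this. The function name and the paths in the usage line are placeholders; the only commands used are hdfs dfs -cp and hdfs dfs -rm -r.

```shell
# Hypothetical helper: "move" SRC to DST across encryption zones
# by copying first and removing the source only after a successful copy.
hdfs_copy_then_delete() (
  src=$1 dst=$2
  hdfs dfs -cp "$src" "$dst" || exit 1
  hdfs dfs -rm -r "$src"
)
```

For example: hdfs_copy_then_delete /staging/part1 /warehouse/part1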

Creating multiple directories for users

I am using gatekeeper for access to pages on server.
This is done by creating directories with an index file in them. This then directs whoever inputted the password to a specific page.
I would like to be able to produce lots of directories, either with short random names or with names assigned from, say, a database, as creating many of them manually is not practical.
Can someone tell me how to generate lots of directories on the fly?
It would be even better if users could create their own directory, but that's probably a separate question.
Thanks
If you have bash (shell) access on your server, you can execute a simple bash script to create directories with a file in each.
for f in foo/bar{00..50}; do mkdir -p "$f" && touch "$f/index.txt"; done
Replace:
foo/bar with your directory
50 with the number of directories
index.txt with the name of the file
If you want to additionally write text to each file, then do this instead
for f in foo/bar{00..50}; do mkdir -p "$f" && printf "text\n goes\n here" > "$f/index.txt"; done
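The question also asks for random names or names taken from a list, which the loops above don't cover. Here is a rough sketch of both, assuming mktemp is available. The function names, the eight-character name length and the one-name-per-line list file are my own choices; names in the list are used as-is, so it should come from a trusted source.

```shell
# Hypothetical helper: create COUNT directories with random names
# under BASE, each containing an empty index.txt. Prints the new paths.
make_random_dirs() (
  base=$1 count=$2
  mkdir -p "$base" || exit 1
  i=0
  while [ "$i" -lt "$count" ]; do
    # mktemp -d replaces the X's with random characters and creates the directory
    dir=$(mktemp -d "$base/XXXXXXXX") || exit 1
    # mktemp makes the directory private; open it up so a web server can read it
    chmod 755 "$dir"
    touch "$dir/index.txt"
    printf '%s\n' "$dir"
    i=$((i + 1))
  done
)

# Hypothetical helper: create one directory per line of LIST under BASE.
make_named_dirs() (
  base=$1 list=$2
  while IFS= read -r name; do
    [ -n "$name" ] || continue      # skip empty lines
    mkdir -p "$base/$name" && touch "$base/$name/index.txt"
  done < "$list"
)
```

For example, make_random_dirs foo 50 creates fifty randomly named directories under foo, and make_named_dirs foo names.txt creates one per line of names.txt (which could be an export from your database).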

shell built in pwd versus /bin/pwd

I would like to know the code implementation of built-in pwd and /bin/pwd especially when the directory path is a symbolic link.
Example:
hita@hita-laptop:/home$ ls -l
lrwxrwxrwx 1 root root 31 2010-06-13 15:35 my_shell -> /home/hita/shell_new/hita_shell
hita@hita-laptop:/home$ cd my_shell
hita@hita-laptop:/home/my_shell$ pwd <SHELL BUILT-IN PWD>
/home/my_shell
hita@hita-laptop:/home/my_shell$ /bin/pwd
/home/hita/shell_new/hita_shell
The output is different in both the cases. Any clue?
thanks
The kernel maintains a current directory (by inode) and when you need the current directory, it determines its name by walking up the directory tree (using ..) to find the names of all the path components. This is the 'real' or sometimes called 'physical' working directory. There is a library function getcwd(3) which does this for you; on more-recent Linux systems this is actually a system call, which helps with getting a consistent view should the parent directories be in the process of being renamed.
Some shells, notably bash, maintain an environment variable PWD to keep track of where you are, and if you changed directory through a symbolic link, this environment variable will show that. They call this the 'logical' path.
/bin/pwd shows the result of getcwd(3), i.e. the real path; if you give it -L it will tell you the value of PWD (unless it's rubbish, then you get the real path). (GNU's version of /bin/pwd does more work than this to deal with complexities of parent directories without read permission and very long path names.)
Bash's built-in pwd shows you the 'logical' path with whatever symlinks you used to get there, even if it's now rubbish (i.e. deleted or renamed since you used it). The default of the built-in pwd can be changed with set -o physical (on) or set +o physical (off is plus!). The default prompt (containing the current directory) follows the option too.
# make a directory with a symlink alias
cd /tmp
mkdir real
ln -s real sym
cd sym
pwd # will say sym
pwd -L # will say sym
pwd -P # will say real
/bin/pwd # will say real
/bin/pwd -L # will say sym
/bin/pwd -P # will say real
rm /tmp/sym
pwd # says sym, though link no longer exists
/bin/pwd -L # will say real!
rmdir /tmp/real
pwd # says sym, though no directory exists
/bin/pwd # says error, as there isn't one
For what it's worth, my opinion is that all the 'logical' business just adds to the confusion; the old way was the better way. It's true that symbolic links can be confusing, but this makes things more confusing, because file operations that open .. don't do the same thing as directory changes that use .., as in this rather nasty example:
mkdir -p /tmp/dir/subdir
ln -s /tmp/dir/subdir /tmp/a
cd /tmp/a
ls .. # shows contents of /tmp/dir
(cd .. ; ls) # shows contents of /tmp
To avoid all this, you can put the following in your ~/.bashrc
set -o physical
Hope that helps!
Kind regards,
J.
PS The above is pretty specific to Linux and GNU bash; other shells and systems are similar but different.
The shell's builtin pwd has the advantage of being able to remember how you accessed the symlinked directory, so it shows you that information. The standalone utility just knows what your actual working directory is, not how you changed to that directory, so it reports the real path.
Personally, I dislike shells that do what you're describing because it shows a reality different than that which standalone tools will see. For example, how a builtin tool parses a relative path will differ from how a standalone tool parses a relative path.
The shell keeps track, in its own memory, of what your current directory is by concatenating it with whatever you cd to (and eliminating . and .. entries). It does this so that symbolic links don't mess up "cd ..". The /bin/pwd implementation instead walks the directory tree upwards, looking in each parent for the name that matches the inode it came from.
The built-in pwd shows symbolic links by default, but won't if you give it the -P option.
In contrast, the /bin/pwd command doesn't show symbolic links by default, but will if given the -L option.
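To make the "concatenate and eliminate . and .. entries" step concrete, here is a small sketch of that purely textual bookkeeping. It is only an illustration of the idea, not bash's actual implementation, and the function name logical_cd is made up. Note that it never touches the file system, which is exactly why it can't notice symlinks.

```shell
# Hypothetical illustration: compute the new logical directory from the
# current logical directory CWD and a cd TARGET, using text only.
logical_cd() (
  cwd=$1 target=$2
  case $target in
    /*) path=$target ;;          # absolute target replaces the current path
    *)  path=$cwd/$target ;;     # relative target is appended to it
  esac
  result=
  set -f                         # no globbing while splitting the path
  IFS=/
  for part in $path; do
    case $part in
      ''|.) ;;                          # empty and "." components change nothing
      ..)   result=${result%/*} ;;      # ".." drops the last component
      *)    result=$result/$part ;;
    esac
  done
  printf '%s\n' "${result:-/}"
)

logical_cd /home/my_shell ..     # prints /home
```

With the example from the question, going up from /home/my_shell gives /home, whereas the real parent of the directory behind the symlink is /home/hita/shell_new.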

Add last n lines of files to tar/zip

I need to regularly send a collection of log files that can grow quite large, so I would like to send only the last n lines of each of the files.
for example:
/usr/local/data_store1/file.txt (500 lines)
/usr/local/data_store2/file.txt (800 lines)
Given a file with a list of needed files named files.txt, I would like to create an archive (tar or zip) with the last 100 lines of each of those files.
I can do this by creating a separate directory structure with the tail-ed files, but that seems like a waste of resources when there's probably some piping magic that can happen to accomplish it. Full directory structure also must be preserved since files can have the same names in different directories.
I would like the solution to be a shell script if possible, but perl (without added modules) is also acceptable (this is for Solaris machines that don't have ruby/python/etc.. installed on them.)
You could try
tail -n 10 your_file.txt | while IFS= read -r line; do zip /tmp/a.zip "$line"; done
where a.zip is the zip file and 10 is n or
tail -n 10 your_file.txt | xargs tar -czvf test.tar.gz --
for tar.gz
You are focusing on a specific implementation instead of looking at the bigger picture.
If the final goal is to have an exact copy of the files on the target machine while minimizing the amount of data transferred, what you should use is rsync, which automatically sends only the parts of the files that have changed and can also compress while sending and decompress while receiving.
Running rsync doesn't need any more daemons on the target machine than the standard sshd one, and to set up automatic transfers without passwords you just need to use public key authentication.
There is no piping magic for that; you will have to create the folder structure you want and zip that.
mkdir tmp
for i in /usr/local/*/file.txt; do
    # ${i:1} strips the leading slash so the path nests under tmp/
    mkdir -p "$(dirname "tmp/${i:1}")"
    tail -n 100 "$i" > "tmp/${i:1}"
done
zip -r zipfile tmp/*
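If the files come from the files.txt list mentioned in the question rather than from a glob, the same staging idea can be sketched like this. The function name archive_tails and its arguments are made up; it assumes one path per line in the list, uses tar instead of zip, and stages the truncated copies in a temporary directory that is removed afterwards.

```shell
# Hypothetical helper: put the last N lines of every file named in LIST
# into the tar archive ARCHIVE, preserving each file's full path.
archive_tails() (
  list=$1 n=$2 archive=$3
  stage=$(mktemp -d) || exit 1
  while IFS= read -r f; do
    [ -f "$f" ] || continue                  # skip missing files
    mkdir -p "$stage/$(dirname "$f")"
    tail -n "$n" "$f" > "$stage/$f"
  done < "$list"
  # -C makes the member names relative to the staging directory;
  # pass an absolute ARCHIVE path to be safe when combined with -C.
  tar -cf "$archive" -C "$stage" .
  status=$?
  rm -rf "$stage"
  exit "$status"
)
```

For example: archive_tails /path/to/files.txt 100 /tmp/logs.tar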
Use logrotate.
Have a look inside /etc/logrotate.d for examples.
Why not put your log files in SCM?
Your receiver creates a repository on his machine from where he retrieves the files by checking them out.
You send the files just by committing them. Only the diff will be transmitted.

Resources