'rsync' - Create automatically folders and subfolders without using 'mkdir' when sending files on remote - rsync

I need to rsync specific files from local on specific locations with a specific nomenclature on the remote machine.
As I need to create subfolders on the remote in order to put the file, I have first used mkdir (like below).
rsync --protect-args -i -avuz --rsync-path="mkdir -p 'FROM/LOCAL/TO/REMOTE/PATH/TO/NEW/FILE' && rsync" --stats --progress "/FROM/LOCAL/PATH/TO/FILE/file.txt" user#IP:"FROM/LOCAL/TO/REMOTE/PATH/TO/NEW/FILE/new_file.txt"
However, as 'mkdir' is disliked (banned) on the remote machine, I would like to know if there's any way to rsync a file at a specific location on the remote without using 'mkdir' (for which the folders can change frequently)

Related

File Transfer to Hadoop HDFS from remote linux server

I need to transfer the Files from remote Linux server to directly HDFS.
I have keytab placed on remote server , after kinit command its activated however i cannot browse the HDFS folders. I know from edge nodes i can directly copy files to HDFS however i need to skip the edge node and directly transfer the files to HDFS.
how can we achieve this.
Let's assume a couple of things first. You have one machine on which the external hard drive is mounted (named DISK) and one cluster of machines with an ssh access to the master (we denote by master in the command line the user#hostname part of the master machine). You run the script on the machine with the drive. The data on the drive consists of multiple directories with multiple files in each (like a 100); the numbers don't matter, it's just to justify the loops. The path to the data will be stored in the ${DIR} variable (on Linux, it would be /media/DISK and on Mac OS X /Volumes/DISK). Here is what the script looks like:
DIR=/Volumes/DISK;
for d in $(ls ${DIR}/);
do
for f in $(ls ${DIR}/${d}/);
do
cat ${DIR}/${d}/${f} | ssh master "hadoop fs -put - /path/on/hdfs/${d}/${f}";
done;
done;
Note that we go over each file and we copy it into a specific file because the HDFS API for put requires that "when source is stdin, destination must be a file."
Unfortunately, it takes forever. When I came back the next morning, it only did a fifth of the data (100GB) and was still running... Basically taking 20 minutes per directory! I ended up going forward with the solution of copying the data temporarily on one of the machines and then copying it locally to HDFS. For space reason, I did it one folder at a time and then deleting the temporarily folder immediately after. Here is what the script looks like:
DIR=/Volumes/DISK;
PTH=/path/on/one/machine/of/the/cluster;
for d in $(ls ${DIR}/);
do
scp -r -q ${DIR}/${d} master:${PTH}/
ssh master "hadoop fs -copyFromLocal ${PTH}/${d} /path/on/hdfs/";
ssh master "rm -rf ${PTH}/${d}";
done;
Hope it helps!

rsync folder from local system to server does not work

I'm trying to copy my webfolder 'depot' from my local machine to my server on Digital Ocean.
For that I run this command in the terminal:
rsync -anv ./Sites/depot root#my-server-ip:/sites
But when I ssh into my server and cd into the sites folder, the 'depot' folder is not there.
Am I doing something wrong?
You have set the -n flag, which results in a "dry run" (you get to see which files would be copied/deleted without actually doing any damage).
To do the actual copy you need to omit the -n flag:
rsync -av ./Sites/depot root#my-server-ip:/sites/depot
Also be careful about how you specify paths for rsync - normally you need a trailing /:
rsync -av ./Sites/depot/ root#my-server-ip:/sites/depot/
otherwise you can end up with copies of directories inside of directories (e.g. sites/depot/depot).
See man rsync for full details.

scp or sftp copy multiple files with single command

I'd like to copy files from/to remote server in different directories.
For example, I want to run these 4 commands at once.
scp remote:A/1.txt local:A/1.txt
scp remote:A/2.txt local:A/2.txt
scp remote:B/1.txt local:B/1.txt
scp remote:C/1.txt local:C/1.txt
What is the easiest way to do that?
Copy multiple files from remote to local:
$ scp your_username#remote.edu:/some/remote/directory/\{a,b,c\} ./
Copy multiple files from local to remote:
$ scp foo.txt bar.txt your_username#remotehost.edu:~
$ scp {foo,bar}.txt your_username#remotehost.edu:~
$ scp *.txt your_username#remotehost.edu:~
Copy multiple files from remote to remote:
$ scp your_username#remote1.edu:/some/remote/directory/foobar.txt \
your_username#remote2.edu:/some/remote/directory/
Source: http://www.hypexr.org/linux_scp_help.php
From local to server:
scp file1.txt file2.sh username#ip.of.server.copyto:~/pathtoupload
From server to local (up to OpenSSH v9.0):
scp -T username#ip.of.server.copyfrom:"file1.txt file2.txt" "~/yourpathtocopy"
From server to local (OpenSSH v9.0+):
scp -OT username#ip.of.server.copyfrom:"file1.txt file2.txt" "~/yourpathtocopy"
From man 1 scp:
-O Use the legacy SCP protocol for file transfers instead of the SFTP protocol. Forcing the use of the
SCP protocol may be necessary for servers that do not implement SFTP, for backwards-compatibility for
particular filename wildcard patterns and for expanding paths with a ‘~’ prefix for older SFTP
servers.
HISTORY
Since OpenSSH 9.0, scp has used the SFTP protocol for transfers by default.
You can copy whole directories with using -r switch so if you can isolate your files into own directory, you can copy everything at once.
scp -r ./dir-with-files user#remote-server:upload-path
scp -r user#remote-server:path-to-dir-with-files download-path
so for instance
scp -r root#192.168.1.100:/var/log ~/backup-logs
Or if there is just few of them, you can use:
scp 1.txt 2.txt 3.log user#remote-server:upload-path
As Jiri mentioned, you can use scp -r user#host:/some/remote/path /some/local/path to copy files recursively. This assumes that there's a single directory containing all of the files you want to transfer (and nothing else).
However, SFTP provides an alternative if you want to transfer files from multiple different directories, and the destinations are not identical:
sftp user#host << EOF
get /some/remote/path1/file1 /some/local/path1/file1
get /some/remote/path2/file2 /some/local/path2/file2
get /some/remote/path3/file3 /some/local/path3/file3
EOF
This uses the "here doc" syntax to define a sequence of SFTP input commands. As an alternative, you could put the SFTP commands into a text file and execute sftp user#host -b batchFile.txt
The answers with {file1,file2,file3} works only with bash (on remote or locally)
The real way is :
scp user#remote:'/path1/file1 /path2/file2 /path3/file3' /localPath
After playing with scp for a while I have found the most robust solution:
(Beware of the single and double quotation marks)
Local to remote:
scp -r "FILE1" "FILE2" HOST:'"DIR"'
Remote to local:
scp -r HOST:'"FILE1" "FILE2"' "DIR"
Notice that whatever after "HOST:" will be sent to the remote and parsed there. So we must make sure they are not processed by the local shell. That is why single quotation marks come in. The double quotation marks are used to handle spaces in the file names.
If files are all in the same directory, we can use * to match them all, such as
scp -r "DIR_IN"/*.txt HOST:'"DIR"'
scp -r HOST:'"DIR_IN"/*.txt' "DIR"
Compared to using the "{}" syntax which is supported only by some shells, this one is universal
The simplest way is
local$ scp remote:{A/1,A/2,B/3,C/4}.txt ./
So {.. } list can include directories (A,B and C here are directories; "1.txt" and "2.txt" are file names in those directories).
Although it would copy all these four files into one local directory - not sure if that's what you wanted.
In the above case you will end up remote files A/1.txt, A/2.txt, B/3.txt and C/4.txt copied over to a single local directory, with file names ./1.txt, ./2.txt, ./3.txt and ./4.txt
Problem: Copying multiple directories from remote server to local machine using a single SCP command and retaining each directory as it is in the remote server.
Solution: SCP can do this easily. This solves the annoying problem of entering password multiple times when using SCP with multiple folders. Consequently, this also saves a lot of time!
e.g.
# copies folders t1, t2, t3 from `test` to your local working directory
# note that there shouldn't be any space in between the folder names;
# we also escape the braces.
# please note the dot at the end of the SCP command
~$ cd ~/working/directory
~$ scp -r username#contact.server.de:/work/datasets/images/test/\{t1,t2,t3\} .
PS: Motivated by this great answer: scp or sftp copy multiple files with single command
Based on the comments, this also works fine in Git Bash on Windows
You can do this way:
scp hostname#serverNameOrServerIp:/path/to/files/\\{file1,file2,file3\\}.fileExtension ./
This will download all the listed filenames to whatever local directory you're on.
Make sure not to put spaces between each filename only use a comma ,.
Copy multiple directories:
scp -r dir1 dir2 dir3 admin#127.0.0.1:~/
Is more simple without using scp:
tar cf - file1 ... file_n | ssh user#server 'tar xf -'
This also let you do some things like compress the stream (-C) or (since OpenSSH v7.3) -J to jump any times through one (or more) proxy servers.
Avoid using passwords by coping your public key to ~/.ssh/authorized_keys (on server) with ssh-copy-id (on client).
Posted also here (with more details) and here.
scp remote:"[A-C]/[12].txt" local:
NOTE: I apologize in advance for answering only a portion of the above question. However, I found these commands to be useful for my current unix needs.
Uploading specific files from a local machine to a remote machine:
~/Desktop/dump_files$ scp file1.txt file2.txt lab1.cpp etc.ext your-user-id#remotemachine.edu:Folder1/DestinationFolderForFiles/
Uploading an entire directory from a local machine to a remote machine:
~$ scp -r Desktop/dump_files your-user-id#remotemachine.edu:Folder1/DestinationFolderForFiles/
Downloading an entire directory from a remote machine to a local machine:
~/Desktop$ scp -r your-user-id#remote.host.edu:Public/web/ Desktop/
In my case, I am restricted to only using the sftp command.
So, I had to use a batchfile with sftp. I created a script such as the following. This assumes you are working in the /tmp directory, and you want to put the files in the destdir_on_remote_system on the remote system. This also only works with a noninteractive login. You need to set up public/private keys so you can login without entering a password. Change as needed.
#!/bin/bash
cd /tmp
# start script with list of files to transfer
ls -1 fileset1* > batchfile1
ls -1 fileset2* >> batchfile1
sed -i -e 's/^/put /' batchfile1
echo "cd destdir_on_remote_system" > batchfile
cat batchfile1 >> batchfile
rm batchfile1
sftp -b batchfile user#host
In the specific case where all the files have the same extension but with different suffix (say number of log file) you use the following:
scp user_name#ip.of.remote.machine:/some/log/folder/some_log_file.* ./
This will copy all files named some_log_file from the given folder within the remote, i.e.- some_log_file.1 , some_log_file.2, some_log_file.3 ....
In my case there were too many files with non related names.
I ended up using,
$ for i in $(ssh remote 'ls ~/dir'); do scp remote:~/dir/$i ./$i; done
1.txt 100% 322KB 1.2MB/s 00:00
2.txt 100% 33KB 460.7KB/s 00:00
3.txt 100% 61KB 572.1KB/s 00:00
$
scp uses ssh for data transfer with the same authentication and provides the same security as ssh.
A best practise here is to implement "SSH KEYS AND PUBLIC KEY AUTHENTICATION". With this, you can write your scripts without worring about authentication. Simple as that.
See WHAT IS SSH-KEYGEN
serverHomeDir='/home/somepath/ftp/'
backupDirAbsolutePath=${serverHomeDir}'_sqldump_'
backupDbName1='2021-08-27-03-56-somesite-latin2.sql'
backupDbName2='2021-08-27-03-56-somesite-latin1.sql'
backupDbName3='2021-08-27-03-56-somesite-utf8.sql'
backupDbName4='2021-08-27-03-56-somesite-utf8mb4.sql'
scp -i ~/.ssh/id_rsa.pub user#server.domain.com:${backupDirAbsolutePath}/"{$backupDbName1,$backupDbName2,$backupDbName3,$backupDbName4}" .
. - at the end will download the files to current dir
-i ~/.ssh/id_rsa.pub - assuming that you established ssh to your server with .pub key
scp -r root#ip-address:/root/dir/ C:\Users\your-name\Downloads\
the -r will let you download all the files inside the dir directory of your remote server

How can I configure rsync to create target directory on remote server?

I would like to rsync from local computer to server. On a directory that does not exist, and I want rsync to create that directory on the server first.
How can I do that?
If you have more than the last leaf directory to be created, you can either run a separate ssh ... mkdir -p first, or use the --rsync-path trick as explained here :
rsync -a --rsync-path="mkdir -p /tmp/x/y/z/ && rsync" $source user#remote:/tmp/x/y/z/
Or use the --relative option as suggested by Tony. In that case, you only specify the root of the destination, which must exist, and not the directory structure of the source, which will be created:
rsync -a --relative /new/x/y/z/ user#remote:/pre_existing/dir/
This way, you will end up with /pre_existing/dir/new/x/y/z/
And if you want to have "y/z/" created, but not inside "new/x/", you can add ./ where you want --relativeto begin:
rsync -a --relative /new/x/./y/z/ user#remote:/pre_existing/dir/
would create /pre_existing/dir/y/z/.
From the rsync manual page (man rsync):
--mkpath create the destination's path component
--mkpath was added in rsync 3.2.3 (6 Aug 2020).
Assuming you are using ssh to connect rsync, what about to send a ssh command before:
ssh user#server mkdir -p existingdir/newdir
if it already exists, nothing happens
The -R, --relative option will do this.
For example: if you want to backup /var/named/chroot and create the same directory structure on the remote server then -R will do just that.
this worked for me:
rsync /dev/null node:existing-dir/new-dir/
I do get this message :
skipping non-regular file "null"
but I don't have to worry about having an empty directory hanging around.
I don't think you can do it with one rsync command, but you can 'pre-create' the extra directory first like this:
rsync --recursive emptydir/ destination/newdir
where 'emptydir' is a local empty directory (which you might have to create as a temporary directory first).
It's a bit of a hack, but it works for me.
cheers
Chris
This answer uses bits of other answers, but hopefully it'll be a bit clearer as to the circumstances. You never specified what you were rsyncing - a single directory entry or multiple files.
So let's assume you are moving a source directory entry across, and not just moving the files contained in it.
Let's say you have a directory locally called data/myappdata/ and you have a load of subdirectories underneath this.
You have data/ on your target machine but no data/myappdata/ - this is easy enough:
rsync -rvv /path/to/data/myappdata/ user#host:/remote/path/to/data/myappdata
You can even use a different name for the remote directory:
rsync -rvv --recursive /path/to/data/myappdata user#host:/remote/path/to/data/newdirname
If you're just moving some files and not moving the directory entry that contains them then you would do:
rsync -rvv /path/to/data/myappdata/*.txt user#host:/remote/path/to/data/myappdata/
and it will create the myappdata directory for you on the remote machine to place your files in. Again, the data/ directory must exist on the remote machine.
Incidentally, my use of -rvv flag is to get doubly verbose output so it is clear about what it does, as well as the necessary recursive behaviour.
Just to show you what I get when using rsync (3.0.9 on Ubuntu 12.04)
$ rsync -rvv *.txt user#remote.machine:/tmp/newdir/
opening connection using: ssh -l user remote.machine rsync --server -vvre.iLsf . /tmp/newdir/
user#remote.machine's password:
sending incremental file list
created directory /tmp/newdir
delta-transmission enabled
bar.txt
foo.txt
total: matches=0 hash_hits=0 false_alarms=0 data=0
Hope this clears this up a little bit.
eg:
from: /xxx/a/b/c/d/e/1.html
to: user#remote:/pre_existing/dir/b/c/d/e/1.html
rsync:
cd /xxx/a/ && rsync -auvR b/c/d/e/ user#remote:/pre_existing/dir/
rsync source.pdf user1#192.168.56.100:~/not-created/target.pdf
If the target file is fully specified, the directory ~/not-created is not created.
rsync source.pdf user1#192.168.56.100:~/will-be-created/
But the target is specified with only a directory, the directory ~/will-be-created is created. / must be followed to let rsync know will-be-created is a directory.
use rsync twice~
1: tranfer a temp file, make sure remote relative directories has been created.
tempfile=/Users/temp/Dir0/Dir1/Dir2/temp.txt
# Dir0/Dir1/Dir2/ is directory that wanted.
rsync -aq /Users/temp/ rsync://remote
2: then you can specify the remote directory for transfer files/directory
tempfile|dir=/Users/XX/data|/Users/XX/data/
rsync -avc /Users/XX/data rsync://remote/Dir0/Dir1/Dir2
# Tips: [SRC] with/without '/' is different
This creates the dir tree /usr/local/bin in the destination and then syncs all containing files and folders recursively:
rsync --archive --include="/usr" --include="/usr/local" --include="/usr/local/bin" --include="/usr/local/bin/**" --exclude="*" user#remote:/ /home/user
Compared to mkdir -p, the dir tree even has the same perms as the source.
If you are using a version or rsync that doesn't have 'mkpath', then --files-from can help. Suppose you need to create 'mysubdir' in the target directory
Create 'filelist.txt' to contain
mysubdir/dummy
mkdir -p source_dir/mysubdir/
touch source_dir/mysubdir/dummy
rsync --files-from='filelist.txt' source_dir target_dir
rsync will copy mysubdir/dummy to target_dir, creating mysubdir in the process. Tested with rsync 3.1.3 on Raspberry Pi OS (debian).

Using local settings through SSH

Is it possible to have an SSH session use all your local configuration files (.bash_profile, .vimrc, etc..) on login? That way you would have the same configuration for, say, editing files in vim in the remote session.
I just came across two alternatives to just doing a git clone of your dotfiles. I take no credit for either of these and can't say I've used either extensively so I don't know if there are pitfalls to either of these.
sshrc
sshrc is a tool (actually just a big bash function) that copies over local rc-files without permanently writing them to the remove user's $HOME - the idea being that might be a shared admin account that other people use. Appears to be customizable for different remote hosts as well.
.ssh/config and LocalCommand
This blog post suggests a way to automatically run a command when you login to a remote host. It tars and pipes a set of files to the remote, then un-tars them on the remote's $HOME:
Your local ~/.ssh/config would look like this:
Host *
PermitLocalCommand yes
LocalCommand tar c -C${HOME} .bashrc .bash_profile .exports .aliases .inputrc .vimrc .screenrc \
| ssh -o PermitLocalCommand=no %n "tar mx -C${HOME}"
You could modify the above to only run the command on certain hosts (instead of the * wildcard) or customize for different hosts as well. There might be a fair amount of duplication per host with this method - although you could package the whole tar c ... | ssh .. "tar mx .." into a script maybe.
Note the above looks like it clobbers the same files on the remote when you connect, so use with caution.
Use a dotfiles.git repo
What I do is keep all my config files in a dotfiles.git on a central server.
You can set it up so that when you ssh into a remote machine, you automatically pull the latest version of the dotfiles. I do something like this:
ssh myhost
cd ~/dotfiles
git pull --rebase
cd ~
ln -sf dotfiles/$username/linux/.* .
Note:
To put that in a shell script, you can automate the process of executing commands on a remote machine by piping to ssh.
The "$username" is there so that you can share your config files with other people you're working with.
The "ln -sf" creates symbolic links to all your dotfiles, overwriting any local ones, such that ~/.emacs is linked to the version controlled file ~/dotfiles/$username/.emacs.
The use of a "linux" subdirectory is just to allow for configuration changes across platforms. I also have a mac directory under dotfiles/$username/mac. Most of the files in the /mac directory are symlinked from the linux directory as it's very similar, but there are some exceptions.
Finally, note that you can make this even more sophisticated with hostnames and the like rather than just a generic 'linux'. With a dotfiles.git, you can also raid dotfiles from your friends, which is awesome -- everyone has their own set of little tricks and hacks.
No, because it's not SSH using your config files, but the remote shell.
I suggest keeping your config files in Subversion or some other VCS. Here's how I do it.
Well, no, because as Andy Lester says, the remote machine is the one doing the work, and it has no access back to your local machine to get .vimrc ...
On the other hand, you could use sshfs to mount the remote file system locally and edit the files locally. This doesn't require you to install anything on the remote machine. Not sure how efficient it is, maybe not great for editing big files over slow links.
Or Komodo IDE has a neat "Open >> Remote File" option which lets you edit files on remote
machines by scping them back and forth automatically.
I do this kind of things every day. I have about 15 bash rc files and .vimrc, a few vim plugin scripts, .screenrc and some other rc files. I have a sync script (written in bash) which uses the cool rsync command to sync all these files to remote servers. Every time I update some files on my main server, I would call the script to sync them to remote servers.
Setting up a svn/git/hg repository on the main server also works for me but my remote servers need to be repeatedly reinstalled for testing. So I find it's more convenient to use rsync.
A few years ago I also used the rdist tool which can also meet the requirement for most of the time. But now I prefer rsync as it supports incremental sync which is very efficient.
ssh can be configured to pass certain environment variables through to the other (remote side). And since most shells will check some environment variables for additional settings to apply, you can hack that into applying some local settings remotely. But its a bit complicated and most administrators turn off the ssh environment variable pass-through in the sshd config anyways.
You could always just copy the files to the machine before connecting with ssh:
#!/bin/bash
scp ~/.bash_profile ~/.vimrc user#host:
ssh user#host
This works best if you are using keys to login and no one else logs in as that user.
Here's a simple bash script I've used for this purpose. It syncs over some folders I like to have copied over using rsync and then adds the ~/bin folder to the remote machines .bashrc if it's not there already. It works best if you have have copied your ssh keys to each server. I use this approach instead of a "dotfiles repo" as lots of the servers I connect to don't have git on them.
So to use it, you'd do something like this:
./bin_sync_to_machine.sh server1
bin_sync_to_machine.sh
function show_help()
{
echo ""
echo "usage: SERVER {SERVER2 SERVER3 etc...}"
echo ""
exit
}
if [ "$1" == "help" ]
then
show_help
fi
if [ -z "$1" ]
then
show_help
fi
# Sync ~/bin and some dot files to remote server using rsync
for SERVER in $*; do
rsync -avrz --progress ~/bin/ -e ssh $SERVER:~/bin
rsync -avrz --progress ~/.vim/ -e ssh $SERVER:~/.vim
rsync -avrz --progress ~/.vimrc -e ssh $SERVER:~/.vimrc
rsync -avrz --progress ~/.aliases $SERVER:~/.aliases
rsync -avrz --progress ~/.aliases $SERVER:~/.bash_aliases
# Ensure remote server has ~/bin in the path
ssh $SERVER '~/bin/path_add_to_path.sh'
done
path_add_to_path.sh
pathadd() {
if [ -d "$1" ] && [[ ":$PATH:" != *":$1:"* ]]; then
PATH="${PATH:+"$PATH:"}$1"
fi
}
# Add to current path if running in a shell
pathadd ~/bin
# Add to ~/.bashrc
if ! grep -q PATH:~/bin ~/.bashrc; then
echo "PATH=\$PATH:~/bin" >> ~/.bashrc
fi
if ! grep -q source ~/.aliases ~/.bashrc; then
echo "source ~/.aliases" >> ~/.bashrc
fi
I wrote an extremely simple tool for this that will allow you to natively transport your .vimrc file whenever you ssh, by using SSHd built-in config options in a non-standard way.
No additional svn,scp,copy/paste, etc required.
It is simple, lightweight, and works by default on all server configurations I have tested so far.
https://github.com/gWOLF3/viSSHous
I think that https://github.com/fsquillace/kyrat does what you need.
I wrote it long time ago before sshrc was born and it has more benefits compared to sshrc:
It does not require dependencies on xxd for both hosts (which can be unavailable on remote host)
Kyrat uses a more efficient encoding algorithm
It is just ~20 lines of code (really easy to understand!)
No need of root access or any installations to the remote host
For instance:
$> echo "alias q=exit" > ~/.config/kyrat/bashrc
$> kyrat myuser#myserver.com
myserver.com $> q
exit

Resources