I am trying to copy files to a destination ignoring the directory structure. Here is how my files are stored:
/data/csv/1/history_1971-02-09.csv
/data/csv/1/history_1971-02-10.csv
/data/csv/2/history_1971-02-09.csv
/data/csv/2/history_1971-02-10.csv
...
I want to transfer all the .csv files in the same remote folder. The folder "csv" and all the subfolders can contain up to 1,000,000 files.
I am able to transfer and put all the csv files in the same remote folder with the command:
rsync -azvv --include-from=/tmp/transfer_list.txt --exclude=* /data/export/csv/*/ /tmp/rsync/
This works well for a limited number of files. The problem appears when there are 500,000+ files: rsync checks every file against the exclude pattern before transferring, like this:
[sender] hiding file history_1971-02-09_18h40m33s.csv because of pattern *
[sender] hiding file history_-02-09_18h59m26s.csv because of pattern *
[sender] hiding file history_1971-02-09_18h56m23s.csv because of pattern *
....
Which takes forever to complete...
So my question is: is there any way to do what I'm trying to do without using the "--exclude" option?
My limitations:
I have to use rsync
I have to transfer in batches of max 15,000 files (contained in the transfer_list.txt file)
I cannot change the structure of the source folder
I cannot store the data any other way because it's third-party software
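One possible direction, offered as a sketch rather than a confirmed fix: rsync's --files-from option reads an explicit list of paths, so no include/exclude pattern has to be evaluated against every file. It implies --relative, and --no-relative turns that off again so the files land flat in the destination. This assumes transfer_list.txt can be written as paths relative to the source directory (for example 1/history_1971-02-09.csv), which is not the same format as an include-pattern list. The function name flat_copy is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: copy the files named in a list into one flat directory.
# $1 = source root, $2 = list of paths relative to $1 (one per line), $3 = destination
flat_copy() {
    # --files-from implies --relative; --no-relative drops the subdirectory
    # part so every file lands directly in the destination.
    rsync -az --files-from="$2" --no-relative "$1" "$3"
}

# Example call (paths from the question; the list format is an assumption):
# flat_copy /data/csv/ /tmp/transfer_list.txt /tmp/rsync/
```

Note that two listed files with the same base name in different subfolders would end up under the same name in the flat destination, so this only fits if the names are unique across subfolders.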
Related
I'm running two processes on AIX. Process one is generating several files, process two does backups from all files that are in a backup directory.
Process one will copy or move the files into the backup directory. Since process two is always running in the background there is the risk of it starting a backup of a file that is still in the process of being copied or moved and therefore incomplete. How can I avoid this problem?
Process one should create its files in another directory on the same disk (filesystem) and, once a file is complete, move it into the final (backup) directory. A move within one filesystem is an atomic rename, so process two will only ever find complete files.
Edit: on AIX, /usr/bin/istat helps to make sure that two directories (or files) are on the same disk/partition/device, e.g.
for Dir in /home /home/zsiga /tmp;
do /usr/bin/istat "$Dir" | grep device;
done
Result:
Inode 2 on device 10/8 Directory
Inode 33 on device 10/8 Directory
Inode 2 on device 10/7 Directory
The first two are on the same disk/partition/device (10/8); the last one is on another device (10/7)
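The staging-then-move idea could look roughly like this. It is a minimal sketch: the function name publish and the .staging directory are invented for the example, and it assumes the staging directory (created next to the backup directory) really is on the same filesystem, which is exactly what the istat check is for.

```shell
#!/bin/sh
# Hypothetical helper: publish a file into the backup directory atomically.
# $1 = file to publish, $2 = backup directory
publish() {
    src=$1
    backup_dir=$2
    # Staging directory next to the backup directory; assumed to be on the
    # same filesystem as the backup directory.
    staging_dir="$(dirname "$backup_dir")/.staging"
    mkdir -p "$staging_dir" "$backup_dir" || return 1
    tmp="$staging_dir/$(basename "$src").$$"
    # The slow part: the copy goes to staging, out of sight of process two.
    cp "$src" "$tmp" || return 1
    # The fast part: within one filesystem mv is a rename, so the file
    # appears in the backup directory complete or not at all.
    mv "$tmp" "$backup_dir/$(basename "$src")"
}

# Example call (paths are placeholders):
# publish /work/out/report.dat /work/backup
```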
I am trying to move about 1000 files that all begin with "simulation." into one directory entitled "simulations." I am using a remote server, and the files are currently in my home directory on the server. I need to move them to a separate directory because I need to, ultimately, append all the "simulation." files into one file. Is there a way to either append only the files in my home directory that begin with "simulation." or move only these files into a new directory?
Thank you.
Assuming you can change directories to the desired path on the remote server... and the simulations are located in /currentPath ... then....
cd desiredPath
mkdir simulations
mv /currentPath/simulation.* simulations
(The pattern simulation.* keeps the new simulations directory itself from matching. To further answer your question: if you wanted to append all the files together, you could type cat simulation.* > allSimulations.txt)
I have a large folder of files that needs to be transferred to a remote site. This folder is currently 10GB total, but contains lots of much smaller files.
Rather than copying the entire 10GB each time, we'd like to massively reduce the data transfer size to be only the files that are new or changed. We plan to do this like so:
SOURCE_DIR is the folder that has all the files and is up-to-date.
COMPARE_DIR is a directory "clone" of the folder at the remote end. It is basically all the files up to the last time files were transferred.
TRANSFER_DIR is an empty folder into which (we hope) ROBOCOPY can place the files that are new or changed in SOURCE_DIR compared with COMPARE_DIR.
An example:
SOURCE_DIR has 4 files: 1.txt, 2.txt, 3.txt, 4.txt
COMPARE_DIR has 3 of those files: 1.txt, 2.txt, 3.txt
The ROBOCOPY command would compare SOURCE_DIR with COMPARE_DIR, see that 4.txt isn't in COMPARE_DIR, and copy it into TRANSFER_DIR.
TRANSFER_DIR then contains only 4.txt, which we can copy up to the remote end and place in the folder there, making it the same as our SOURCE_DIR at this end.
This can be done with rsync using the --compare-dest=DIR argument, but as this is Windows, I'd rather not have to install rsync unless I need to.
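To make the intended comparison concrete, here is a small POSIX shell sketch of the same logic. It is neither ROBOCOPY nor rsync, just an illustration of "copy what is missing or different"; the function name collect_changes is hypothetical, and it assumes no file name contains a newline.

```shell
#!/bin/sh
# Hypothetical helper: copy files that are new or changed in SOURCE_DIR,
# relative to COMPARE_DIR, into TRANSFER_DIR (keeping relative paths).
# $1 = SOURCE_DIR, $2 = COMPARE_DIR, $3 = TRANSFER_DIR
collect_changes() {
    source_dir=$1
    compare_dir=$2
    transfer_dir=$3
    # List regular files relative to SOURCE_DIR.
    (cd "$source_dir" && find . -type f) | while IFS= read -r rel; do
        # cmp -s returns non-zero when the files differ or the copy in
        # COMPARE_DIR does not exist.
        if ! cmp -s "$source_dir/$rel" "$compare_dir/$rel" 2>/dev/null; then
            mkdir -p "$transfer_dir/$(dirname "$rel")"
            cp -p "$source_dir/$rel" "$transfer_dir/$rel"
        fi
    done
}

# Example call (directory names as used in the question):
# collect_changes SOURCE_DIR COMPARE_DIR TRANSFER_DIR
```

With the four-file example above, only 4.txt would end up in TRANSFER_DIR.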
I am in the process of doing a remote-to-local copy using rsync, and the file list is picked up from a txt file that looks like this:
#FILE_PATH FILENAME
/a/b/c test1.txt
/a/x/y test2.txt
/a/v/w test1.txt
The FILE_PATH is the same on the remote and local servers. The problem is that I need to copy the files to a staging area on the local server first and then move them to the FILE_PATH, to ensure integrity.
If I simply copy all the files to the staging area, test1.txt will get overwritten. So I thought of combining FILE_PATH and FILENAME to make each name unique. However, I cannot create a file named /a/b/c/test1.txt in my staging area.
So I thought of replacing / with a special character that Unix supports in file names.
I tried - _ : and ., but got conflicts with all of them, e.g.:
-a-b-c-test1.txt
How can I copy files into the same staging directory when their names are the same but they are destined for different directories?
Your thoughts, please.
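One way to avoid such conflicts, sketched here under assumptions, is to escape the separator instead of swapping it for a character that may already occur in names. Below, % is escaped as %25 first and / as %2F, so the staged name can always be converted back. The function names encode_path and decode_path are made up for this example, and the scheme assumes the encoded name stays within the filesystem's file name length limit.

```shell
#!/bin/sh
# Hypothetical helpers: turn a full path into a single, reversible file name.
# "%" is escaped first so that the encoding can be undone unambiguously.
encode_path() {
    printf '%s\n' "$1" | sed -e 's/%/%25/g' -e 's|/|%2F|g'
}
decode_path() {
    printf '%s\n' "$1" | sed -e 's|%2F|/|g' -e 's/%25/%/g'
}

# Example: both test1.txt files get distinct names in one staging directory.
# encode_path /a/b/c/test1.txt    # prints %2Fa%2Fb%2Fc%2Ftest1.txt
# encode_path /a/v/w/test1.txt    # prints %2Fa%2Fv%2Fw%2Ftest1.txt
```

After the integrity check, decode_path applied to the staged name gives back the original FILE_PATH/FILENAME to move the file to.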
I am attempting to use rsync to copy files, but I do not want to copy hidden files and folders, and there is one ordinary file I want excluded from the transfer. I believe I am eliminating the hidden folders with the --exclude=".*/" option, and I believe I am excluding that file with the --exclude file path option. If I eliminate the --exclude file path option, I don't get any errors, but that file is copied, which I do not want. If I eliminate the --exclude=".*/" option, the hidden files are copied, which I do not want either. What am I doing wrong?
mbp:~ username $ rsync —-exclude /Users/username/work/java/textsearch/settings/search_config.properties --exclude=".*/" -avz /Users/username/work/java/ root#remote.local:/usr/local/java/ -n
building file list ... rsync: link_stat "/Users/username/?\#200\#224-exclude" failed: No such file or directory (2)
done
sent 9560 bytes received 20 bytes 6386.67 bytes/sec
total size is 17461760 speedup is 1822.73
rsync error: some files could not be transferred (code 23) at /SourceCache/rsync/rsync-42/rsync/main.c(992) [sender=2.6.9]
1) What is /Users/username/?\#200\#224-exclude and why is rsync looking for it?
2) How do I get rsync to copy everything except the hidden folders/files and the specified file?
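If this is an exact copy of the command line, the "--" in front of the first exclude does not use the correct characters: the first one is an em dash (the \#200\#224 bytes in the error message), not a hyphen. Delete it and type two plain hyphens. As it stands, rsync doesn't recognize the option and instead treats "—-exclude" as a source file name, which it looks for in the current directory (/Users/username).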
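For the second question, a command along these lines might do it. This is a sketch, not something tested against rsync 2.6.9 on that machine: the function name copy_visible is invented, and it assumes the file to skip sits at textsearch/settings/search_config.properties below the source directory, as the path in the question suggests.

```shell
#!/bin/sh
# Hypothetical helper: copy a tree, skipping dot-files, dot-directories and
# one named file. $1 = source dir (with trailing slash), $2 = destination;
# any further arguments (for example -n for a dry run) are passed to rsync.
copy_visible() {
    src=$1
    dst=$2
    shift 2
    # '.*' matches any file or directory whose name starts with a dot.
    # A leading "/" anchors the second pattern at the top of the transfer,
    # i.e. relative to the source directory, not to the filesystem root.
    rsync -avz "$@" \
        --exclude='.*' \
        --exclude='/textsearch/settings/search_config.properties' \
        "$src" "$dst"
}

# Example call (dry run; <destination> stands for the remote target
# given in the question):
# copy_visible /Users/username/work/java/ <destination> -n
```

The key difference from the original command is that the exclude path is written relative to the source directory rather than as an absolute filesystem path.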