Unix file naming conventions

I am in the process of doing a remote-to-local copy using rsync, and the file list is picked up from a txt file which looks like the one below:
#FILE_PATH FILENAME
/a/b/c test1.txt
/a/x/y test2.txt
/a/v/w test1.txt
The FILE_PATH is the same on the remote and local servers. The problem is that I need to copy the files to a staging area on the local server first and then move them to the FILE_PATH, to ensure integrity.
If I simply copy all the files to the staging area, test1.txt will get overwritten. So I thought I could combine the FILE_PATH and FILENAME so that each name becomes unique, but I cannot create the file as /a/b/c/test1.txt in my staging area.
So I thought of replacing / with some special character that Unix supports.
I tried -, _, : and ., but I got conflicts with all of them, e.g.:
-a-b-c-test1.txt
How can I copy files with identical names into the same staging directory when they are ultimately destined for different directories?
Your thoughts, please.
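A note that may explain the conflicts: -, _, : and . can all legitimately appear in Unix directory and file names, so any single-character substitute for / can collide; the only characters that cannot occur inside a path component are / itself and the NUL byte. A reversible percent-style encoding of the slash avoids the ambiguity. Below is a rough sketch, assuming the list file is called filelist.txt, the staging root is /staging and the remote host is remotehost (all three are placeholder names):

# read "path filename" pairs, skip the #FILE_PATH header, and build a
# flat, collision-free staging name by escaping "%" and then "/"
while read -r dir file; do
    case "$dir" in "#"*) continue ;; esac
    staged=$(printf '%s/%s' "$dir" "$file" | sed -e 's|%|%25|g' -e 's|/|%2F|g')
    rsync -a "remotehost:${dir}/${file}" "/staging/${staged}"
done < filelist.txt

# once the staged copies are verified, decode each name back into the real path
for staged in /staging/*; do
    dest=$(basename "$staged" | sed -e 's|%2F|/|g' -e 's|%25|%|g')
    mv "$staged" "$dest"
done

Alternatively, if the staging area is allowed to contain subdirectories, rsync -R (--relative) can recreate the full FILE_PATH underneath /staging, and the name collision disappears without any renaming.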

Related

WinSCP script to synchronize directories, but exclude several subdirectories

I need to write a script that synchronizes local files with a remote machine.
My file structure is:
ProjectFolder/
    .git/
    input/
    output/
    classes/
    main.py
    readme.md
I need to synchronize everything, but:
completely ignore the .git folder
ignore the files in the input and output folders, but copy the folders themselves
So far my code is:
open sftp://me:password@server -hostkey="XXXXXXXX"
option batch abort
option confirm off
synchronize remote "C:\Users\MYNAME\Documents\MY FOLDER\Python Projects\ProjectFolder" "/home/MYNAME/py_proj/ProjectFolder" -filemask="|C:\Users\MYNAME\Documents\MY FOLDER\Python Projects\ProjectFolder\.git"
close
exit
First question: it doesn't seem to work.
Second question: how do I add masks for the input and output folders when there are spaces in the file paths?
Thanks to all in advance.
Masks for directories have to end with a slash.
To exclude files in a specific folder, use something like */folder/*
-filemask="|.git/;*/input/*;*/output/*"
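Putting the corrected mask into the full command from the question gives something like the line below (a sketch assembled from the paths above; because the mask now contains only relative patterns, the spaces in the local path are already handled by the existing quotes around the directory arguments):

synchronize remote "C:\Users\MYNAME\Documents\MY FOLDER\Python Projects\ProjectFolder" "/home/MYNAME/py_proj/ProjectFolder" -filemask="|.git/;*/input/*;*/output/*"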

Manipulating multiple files with same name

I am trying to move about 1000 files that all begin with "simulation." into one directory entitled "simulations." I am using a remote server, and the files are currently in my home directory on the server. I need to move them to a separate directory because I need to, ultimately, append all the "simulation." files into one file. Is there a way to either append only the files in my home directory that begin with "simulation." or move only these files into a new directory?
Thank you.
Assuming you can change directories to the desired path on the remote server... and the simulations are located in /currentPath ... then....
cd desiredPath
mkdir simulations
mv /currentPath/simulation* simulations
(To further answer your question: if you wanted to append all the files together, you could type cat simulation* > allSimulations.txt.)

Rsync: Transfer large number of files while ignoring directory structure

I am trying to copy files to a destination ignoring the directory structure. Here is how my files are stored:
/data/csv/1/history_1971-02-09.csv
/data/csv/1/history_1971-02-10.csv
/data/csv/2/history_1971-02-09.csv
/data/csv/2/history_1971-02-10.csv
...
I want to transfer all the .csv files into the same remote folder. The folder "csv" and all its subfolders can contain up to 1,000,000 files.
I am able to transfer and put all the csv files in the same remote folder with the command:
rsync -azvv --include-from=/tmp/transfer_list.txt --exclude=* /data/export/csv/*/ /tmp/rsync/
This works well for a limited number of files. The problem appears when there are 500,000+ files: rsync goes over each file to check it against the exclude pattern before transferring, like this:
[sender] hiding file history_1971-02-09_18h40m33s.csv because of pattern *
[sender] hiding file history_-02-09_18h59m26s.csv because of pattern *
[sender] hiding file history_1971-02-09_18h56m23s.csv because of pattern *
....
Which takes forever to complete...
So my question is: is there any way to do what I'm trying to do without using the "--exclude" option?
My limitations:
I have to use rsync
I have to transfer in batches of max 15,000 files (contained in the transfer_list.txt file)
I cannot change the structure of the source folder
I cannot store the data any other way because it's third-party software
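Given those limitations, one option that stays within rsync is --files-from, which restricts the transfer to exactly the listed paths instead of walking the whole tree and testing every file against the exclude pattern. It implies --relative, which can be switched off so that only the file names (not the 1/, 2/, ... subfolders) are reproduced at the destination. This is only a sketch and assumes transfer_list.txt is rewritten to contain paths relative to the source directory (for example 1/history_1971-02-09.csv) rather than include patterns:

# transfer only the listed files, without scanning the rest of the tree;
# --no-relative drops the leading subfolders so everything lands in one directory
rsync -az --no-relative --files-from=/tmp/transfer_list.txt /data/export/csv/ /tmp/rsync/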

how to mass change folder names in amazon s3 with a script

I uploaded a bunch of images but accidentally named the folders with spaces. Now you can't access them because obviously URLs can't have spaces.
I've downloaded the AWS CLI and was wondering how to change the folder names. I've looked at the documentation, but I'm still having trouble and hoping someone can help.
I've tried the below command without any success:
aws s3 mv "s3://mybucketname/firstfolder/second folder with spaces/" s3://mybucketname/firstfolder/secondfolderwithspaces/ --recursive
How do I change the name of "second folder with spaces" to "secondfolderwithspaces"?
Also, is there a way I can iterate through these folders? Something like
for folder in s3:/bucketname/firstfolder:
aws s3 mv "folder with spaces" folderwithspaces --recursive
I'd do it via a python script using the boto SDK:
import boto
conn = boto.s3.connect_to_region('ap-southeast-2')
bucket = conn.get_bucket('bucket-name')
for k in bucket.list():
    if ' ' in k.key:
        bucket.copy_key(k.key.replace(' ', '+'), bucket.name, k.key)
        bucket.delete_key(k.key)
The script loops through each object, copies it to a new key (which is like a filename, but it includes the full path including the directory name), then deletes the old object. It fully executes within Amazon S3 -- the contents of the objects are not downloaded/uploaded.
Modify the replace command to suit your needs.
URLs can have spaces. You have to encode them. Space character becomes "%20".
If you have Chrome or Firefox, open the developer tools console and type
encodeURI("second folder with spaces")
It prints
second%20folder%20with%20spaces
For the mass rename, it can't be done in the same way you traditionally would on an OS (Linux/Windows/Mac). On S3 you cannot rename files, you have to copy them. So you have to download their content, delete them, and upload new files.
Amazon S3 boto: How do you rename a file in a bucket?

7zip command line - recursively unzipping all zipped files

I have 7zip on my desktop computer, at: c:\program files\7-zip\7z.exe. I also have 2 mapped drives that have similar structures, drive X and drive Y. Below is an example of the source.
Drive X
X:\Sourcefolder\Folder1\file1.zip
X:\Sourcefolder\Folder1\file2.zip
X:\Sourcefolder\Folder2\file3.zip
X:\Sourcefolder\Folder3\file4.zip
The destination folder structure would be as follows:
Drive Y
Y:\DestFolder\Thing1\Folder1\[file1.zip contents and subfolders]
Y:\DestFolder\Thing1\Folder1\[file2.zip contents and subfolders]
Y:\DestFolder\Thing1\Folder2\[file3.zip contents and subfolders]
Y:\DestFolder\Thing1\Folder3\[file4.zip contents and subfolders]
The folders on Drive Y (DestFolder\Thing1\Folder1, 2 & 3) are already created. Some of them may already have other files & subfolders in them.
I can run the following command line and unzip the contents:
for /R %i IN (*.zip) DO "c:\program files\7-zip\7z.exe" x "%i" -o"Y:\DestFolder\Thing1\Folder1"
However, what happens is that on my mapped drive, I see a NEW structure out there that is exactly what I had in the second set of quotes in the command line, even if those folders already existed. Thus far, I've only tested it on empty folders, as I am concerned it might corrupt any existing files that would be there. I can navigate between them, and can cut & paste the files into the correct folder.
Why is 7zip "creating" the duplicate folder structure to extract the files into? I know the -o switch allows you to specify a destination directory, but the help file doesn't say whether it creates the directory or what happens if it already exists.
Should I be using another command line parameter to extract these zip files into the proper folders? Thanks!
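One way to get each archive into its matching destination folder is to compute the output directory per zip instead of hard-coding Folder1, which is why everything in the command above ends up under the one path given after -o (the -o switch creates the directory, including missing parents, if it does not already exist, and otherwise extracts into the existing one). A rough sketch as a .cmd batch file, using the drive letters and folder names from the question; note the doubled %%i required inside a batch file, and -aos, which skips files that already exist rather than overwriting them:

@echo off
setlocal enabledelayedexpansion
set "SRC=X:\Sourcefolder"
set "DST=Y:\DestFolder\Thing1"
rem walk every zip under SRC and extract it into the matching folder under DST
for /R "%SRC%" %%i in (*.zip) do (
    set "zipdir=%%~dpi"
    rem strip the source prefix, leaving e.g. "Folder1\"
    set "rel=!zipdir:%SRC%\=!"
    rem drop the trailing backslash so the quoted -o path is parsed cleanly
    set "rel=!rel:~0,-1!"
    rem -aos skips files that already exist instead of overwriting them
    "c:\program files\7-zip\7z.exe" x "%%i" -o"%DST%\!rel!" -aos
)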
