exclude read-only files with rsync

While using rsync I would like to filter the files based on read/write attribute and potentially on timestamp. The manual does not mention that this would be possible. Well, is it?
In my shell I can do:
dir *.[CHch](w)
to list all writable C sources, so I hoped that:
rsync -avzL --filter="+ */" --filter='+ *.[CHch](w)' --filter='- *' remote_host:dev ~/
might work, but apparently it does not.
Any ideas?

As of version 3.0.8, rsync doesn't support filtering on anything other than filename.
Your best bet is probably to use find to generate the list of files to sync and then feed that list to rsync's --files-from option; find has nearly every option you could ever want for differentiating files.
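For example, a minimal sketch along these lines, assuming GNU find on the remote host and reusing remote_host:dev from the question (the -mtime -7 cutoff and the ~/dev/ destination are just placeholders), would pull only owner-writable C sources modified in the last week:
ssh remote_host 'find dev -name "*.[CHch]" -perm -u+w -mtime -7 -printf "%P\n"' | rsync -avzL --files-from=- remote_host:dev/ ~/dev/
The %P format prints each path relative to dev, which is what --files-from expects relative to the source directory.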

Related

synchronization over http: rsync versus normal upload

I'm running file synchronization over HTTP. Both sides implement rsync. When synchronizing, for uploading I have two choices:
use a simple POST request if:
the file to be uploaded doesn't exist on the remote side, or
the file exists and is bigger than a certain value M;
otherwise, perform rsync over GET requests.
My question is: how can I determine the optimal value of M?
I'm certain that for some file sizes, a simple upload is faster than going through the rsync steps, especially with multiple files.
Thanks
If you're using rsync correctly, I'd bet that it's always faster, especially with multiple files.
Rsync is specifically built to check differences between directory trees and update the target directory incrementally.
The following is a one-liner to keep in mind whenever you need to sync two directory trees.
rsync -av --delete /path/to/src /path/to/target
(also works over SSH, if necessary.)
Just keep in mind that rsync is picky about trailing slashes on directory paths.
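To illustrate (paths are placeholders):
rsync -av /path/to/src/ /path/to/target   # copies the contents of src directly into target
rsync -av /path/to/src /path/to/target    # creates target/src and copies the contents there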

SFTP - remove files

I have a batch file with SFTP instructions to download txt files (cron job), basically: get *.txt
Wondering what the best method is to delete those files after the server has downloaded them. The only problem being that the directory is constantly being updated with new files, so running rm *.txt afterwards won't work.
I've thought of a couple of complex ways of doing this, but no command-line-based methods. So I thought I'd shoot a query out to you guys and see if there's something I haven't thought of yet.
I suggest making a list of all the files that were downloaded and then issuing ftp delete/mdelete commands with the exact file names.
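A rough sketch of that approach with OpenSSH's sftp, assuming key-based (non-interactive) login and filenames without whitespace (user@host is a placeholder):
files=$(echo 'ls -1 *.txt' | sftp -q user@host | grep -v '^sftp>')
{
  for f in $files; do
    printf 'get %s\n' "$f"    # download this exact file
    printf 'rm %s\n' "$f"     # then delete it by name
  done
} | sftp -b - user@host
Because the deletions use the exact names captured before the download, any files that arrive in the meantime are left untouched for the next run.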

Unix directory structure: managing file name collision

Usually when `make install' is run, files are not put in a program-specific directory like /usr/prog1. Instead, the files are put in directories that already contain files from other programs, like /usr/lib and /usr/bin. I believe this has been common practice for a long time, and it surely increases the probability of file name collisions.
Since my googling returned no good discussion on this matter, I am wondering how people manage file name collisions. Do they simply try this or that name and, if something goes wrong, a user files a bug and the developer picks another name? Or do they simply prefix the names of their files? Is anyone aware of a good discussion on this matter?
Usually people choose the name they want, and if something collides then the problem gets resolved by the distribution. That's what happened with ack (a Kanji converter, packaged as ack in Debian) and ack (a text search utility, packaged as ack-grep in Debian).
Collisions don't seem to be that common, though. A quick web search should tell you if the name is already used somewhere. If it's not searchable, it's probably not included in many distributions, and that means you're not likely to actually conflict.
When compiling programs, you can usually specify a prefix path like this: ./configure --prefix=/usr/local/prog1 or ./configure --prefix=/opt/prog1 (whether you use /usr/local or /opt doesn't really matter). Then when you run make install, it puts the files under the specified prefix. You can then either add /opt/prog1/bin to your PATH or make a symlink to the executable in /usr/local/bin, which should already be on your PATH.
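As a sketch (with /opt/prog1 as a placeholder prefix and prog1 as a placeholder binary name):
./configure --prefix=/opt/prog1
make
sudo make install
export PATH="/opt/prog1/bin:$PATH"                      # either extend PATH...
sudo ln -s /opt/prog1/bin/prog1 /usr/local/bin/prog1    # ...or symlink into a directory already on PATH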
Best thing is to use your distribution's package manager, though.

Rsync and SSH: Only rename folder when renamed at the source

I have been reading the rsync documentation for a few hours, but I can't figure out how to get rsync to only rename destination folders (rather than re-uploading the folder and its contents) when they are renamed at the source.
I'm connecting to the destination with SSH; the local folder is the source and the remote server is the destination. If I rename a folder containing files, rsync re-uploads all the content of the source folder. I'm not using rsync's daemon mode; maybe it would work if I were to do that?
I have encountered the same behavior with lftp, and that tool doesn't seem to have such an option either. Even when matching is based on file dates, files inside the renamed folder are removed and re-uploaded.
Thanks in advance if someone knows how to manage this :)
I've been looking for something similar.
So far, the best solution I have found is at:
http://serenadetoacuckooo.blogspot.com/2009/07/rsync-and-directory-renaming.html
It basically mentions including a meta-file in each folder that indicates the folder's name.
Essentially, you would want to check that file against the directory name, and rsync only if they are the same (otherwise, issue a remote rename command).
It depends on the scope of what you're using rsync for, but I hope that this information can help you.
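A minimal sketch of that idea, where the local folders are the source, remote_host:dev is a placeholder destination, .dirname is a made-up meta-file name, and folder names are assumed to contain no single quotes:
for dir in */; do
    dir=${dir%/}
    old=$(cat "$dir/.dirname" 2>/dev/null)
    if [ -n "$old" ] && [ "$old" != "$dir" ]; then
        ssh remote_host "mv dev/'$old' dev/'$dir'"   # rename on the destination first
    fi
    printf '%s\n' "$dir" > "$dir/.dirname"           # record the current name for next time
done
rsync -avz ./ remote_host:dev/                       # then sync as usual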
How would rsync or any other program know what constitutes a rename? What if two directories are very similar and either one could plausibly be a rename of what went before? There's no way to tell for sure. I think you're stuck with uploading everything again.
You know about the --delete option, right?
--delete delete files that don't exist on the sending side
Note also the --force option:
--force force deletion of directories even if not empty
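For example (paths are placeholders), --delete removes entries from the destination once they no longer exist at the source, so the directory under its old name disappears after the rename has been uploaded under the new one:
rsync -av --delete --force /path/to/src/ remote_host:/path/to/dest/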

Which archiving utility should I use in Ubuntu?

I am a Mac/Ubuntu user. I have folders such as "AWK", "awk", "awk_tip" and "awk_notes". I need to archive them, but the variety of utilities confuse me. I had a look at Tar, cpio and pax, but Git has started to fascinate me. I occasionally need encryption and backups.
Please, list the pros and cons of different archiving utilities.
Tar, cpio and pax are ancient Unix utilities. For instance, tar (which is probably the most common of these) was originally intended for making backups on tapes (hence the name, tar = tape archive).
The most commonly used archive formats today are:
tar (in Unix/Linux environments)
tar.gz or tgz (a gzip compressed tar file)
zip (in Windows environments)
If you want just one simple tool, take zip. It works right out of the box on most platforms, and it can be password protected (although the protection is technically weak).
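For instance, a password-protected archive of one of the folders from the question could look like this with the common Info-ZIP zip tool:
zip -er awk_notes.zip awk_notes/   # -e prompts for a password, -r recurses into the folder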
If you need stronger protection (encryption), check out TrueCrypt. It is very good.
Under what OS / toolchain are you working? This might limit the range of existing solutions. Your name suggests Unix, but which one? Further, do you need portability or not?
The standard Linux solution (at least to a newbie like me) might be to tar and gzip or bzip2 the folders, then encrypt them with GnuPG if you really have to (encrypting awk tutorials seems like overkill to me). You can also use a full-fledged backup solution like Bacula, or sync to a different location with rsync (perhaps to a backup server?).
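As a sketch of that route (gpg -c does simple symmetric, passphrase-based encryption; the folder name is taken from the question):
tar czf awk_notes.tar.gz awk_notes/
gpg -c awk_notes.tar.gz            # produces awk_notes.tar.gz.gpg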
If you're backing up directories from an ext2/ext3 filesystem, you may want to consider using dump. Some nice features:
it can back up a directory or a whole partition
it saves permissions and timestamps
it allows you to do incremental backups
it can compress (gzip or bzip2)
it will automatically split the archive into multiple parts based on a size limit if you want
it will back up over a network or to a tape as well as to a file
It doesn't support encryption, but you can always encrypt the dump files afterwards.
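A sketch of a full (level 0), compressed dump of a directory, with the encryption step added afterwards as noted (the paths are placeholders):
dump -0 -z -f /backup/home-full.dump /home
gpg -c /backup/home-full.dump      # encrypt the finished dump file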
