how to delete all files except the latest three in a folder - unix

I have a folder which contains some subversion revision checkouts (these are checked out when running a capistrano deployment recipe).
what I want to do really is that to keep the latest 3 revisions which the capistrano script checkouts and delete other ones, so for this I am planning to run some command on the terminal using a run command, actually capistrano hasn't got anything to do here, but a unix command.
I was trying to run a command to get a list of files except the lastest three and delete the rest, I could get the list of files using the following command.
(ls -t /var/path/to/folder |head -n 3; ls /var/path/to/folder)|sort|uniq -u|xargs
now if I add a rm -Rf to the end of this command it returns me with file not found to delete. so thats obvious because this returns only the name of the folder, not the full path to the folder.
is there anyway to delete these files / folders using one unix command?

Alright, there are a few things wrong with your script.
First, and most problematically, is this line:
ls -t /var/path/to/folder |head -n 3;
ls -t will return a list of files in order of their last modification time, starting with the most recently modified. head -n 3 says to only list the first three lines. So what this is saying is "give me a list of only the three most recently modified files", which I don't think is what you want.
I'm not really sure what you're doing with the second ls command, but I'm pretty sure that's just going to concatenate all the files in the directory into your list. That means when it gets sorted and uniq'ed, you'll just be left with an alphabetical list of all the files in that directory. When this gets passed to something like xargs rm, you'll wipe out everything in that directory.
Next, sort | uniq doesn't need the uniq part. You can just use the -u switch on sort to get rid of duplicates. You don't need this part anyway.
Finally, the actual removal of the directory. On that part, you had it right in your question: just use rm -r
Here's the easiest way I can think to do this:
ls -t1 /var/path/to/folder | tail -n +4 | xargs rm -r
Here's what's happening here:
ls -t1 is printing a list, one file/directory per line, of all files in /var/path/to/folder, ordering by the most recent modification date.
tail -n +4 is printing all lines in the output of ls -t1 starting with the fourth line (i.e. the three most recently modified files won't be listed)
xargs rm -r says to delete any file output from the tail. The -r means to recursively delete files, so if it encounters a directory, it will delete everything in that directory, then delete the directory itself.
Note that I'm not sorting anything or removing any duplicates. That's because:
ls only reports a file once, so there are no duplicates to remove
You're deleting every file passed anyway, so it doesn't matter in what order they're deleted.
Does all of that make sense?
Edit:
Since I was wrong about ls specifying the full path when passed an absolute directory, and since you might not be able to perform a cd, perhaps you could use tail instead.
For example:
ls -t1 /var/path/to/folder | tail -n +4 | xargs find /var/path/to/folder -name $1 | xargs rm -r

Below is a useful way of doing the task.......!!
for Linux and HP-UX:
ls -t1 | tail -n +50 | xargs rm -r # to leave latest 50 files/directories.
for SunOS:
rm `(ls -t |head -n 100; ls)|sort|uniq -u`

Hi I found a way to do this we can use the unix &&
so the command will look like this
cd /var/path/to/folder && ls -t1 /var/path/to/folder | tail -n +4 | xargs rm -r

Related

Mac OS: How to use RSYNC to copy files modified within the last 24 hours and keep folder structure?

It's a simple question that I can't seem to figure out. I'm on a Mac with Big Sur with all the latest updates, and I'm going through Terminal to get these commands to run. If there's a better way please let me know.
This is, in basic terms, what I'm trying to do--I want RSYNC to recursively go through a source directory (which in this case would ideally be an entire drive), find any files modified within the last 24 hours, and copy those to another drive, while preserving the folder structure. So if I have:
/Volumes/Drive1/Folder1/File1.file
/Volumes/Drive1/Folder1/File2.file
/Volumes/Drive1/Folder1/File3.file
And File1 has been modified in the last 24 hours, but the other two haven't, I want it to copy that file, so that on the second drive I wind up with:
/Volumes/Drive2/Folder1/File1.file
But without copying File2 and File3.
I've tried a lot of different solutions and strings, but I'm running into problems. The closest I've been able to get is this:
find /Volumes/Drive1/ -type f -mtime -1 -exec cp -a "{}" /Volumes/Drive2/ \;
The problem is that while this one does go through Drive1 and find all the files newer than a day like I want, when it copies them it just dumps them all into the root of Drive2.
This one also seems to come close:
rsync --progress --files-from=<(find /Volumes/Drive1/ -mtime -1 -type f -exec basename {} \;) /Volumes/Drive1/ /Volumes/Drive2/
This one also identifies all the files modified in the last 24 hours, but instead of copying them it gives an error, "link_stat (filename and path) failed: no such file or directory (2)."
I've spent several days trying to figure out what I'm doing wrong but I can't figure it out. Help please!
I think this'll work:
srcDir=/Volumes/Drive1
destDir=/Volumes/Drive2
(cd "$srcDir" && find . -type f -mtime -1 -print0) |
while IFS= read -r -d $'\0' filepath; do
mkdir -p "$(dirname "$destDir/$filepath")"
cp -a "$srcDir/$filepath" "$destDir/$filepath"
done
Explanation:
Using cd "$srcDir"; find . -whatever will generate relative paths (starting with "./") from the source directory to the found files; that means appending the results to $srcDir and $destDir will give the full source and destination paths for each file.
Putting it in parentheses makes it run in a subshell, so the cd won't affect other commands. Coupling cd and find with && means that if cd fails, it won't run find (which would run in the wrong place, generate a list of the wrong file file, and generally cause trouble).
Using -print0 and while IFS= read -r -d $'\0' is a standard weird-filename-safe way of iterating over found files (see BashFAQ #20). Note that if anything in the loop reads from standard input (e.g. cp -i asking for confirmation), it'll steal part of the file list; if this is a worry, use this variant (instead of the pipe) to send the file list over file descriptor #3 instead of standard input:
while IFS= read -r -d $'\0' filepath <&3; do
...
done 3< <(cd "$srcDir" && find . -type f -mtime -1 -print0)
Finally, mkdir -p is used to make sure the destination directory exists, and then cp to copy the file.

Git Status - List only the direct subfolders with changes, not inner files

I love using Git to organize version control and backup all my web files in Wordpress.
After updating plugins, I'd like to get the list of changes only on the direct subfolder by using git status. Typically if doing git status will a very long line of changes including the inner of each subfolder.
what I'd like is to limit the result to the subfolders with the changes inside the plugins directory.
For example, this git command:
git status project_folder/wp-content/plugins
will result to:
plugins/wpml-translation-management/classes/translation-basket/
plugins/wpml-translation-management/classes/translation-dashboard/
plugins/acfml/assets/
plugins/acfml/classes/class-wpml-acf-attachments.php
plugins/wordpress-seo/js/dist/commons-921.min.js
plugins/wordpress-seo/js/dist/components-921.min.js
Actually, the git command will make a really long list of lines like on the screenshot:
What I would love to know is the git command to output only:
plugins/wpml-translation-management/
plugins/acfml/
plugins/wordpress-seo/
Command such:
git status project_folder/wp-content/plugins --{display_only_direct_subfolders_with_changes}
try this:
git status --porcelain | awk '{print $2}' | xargs -n 1 dirname | uniq
awk '{print $2}'
get filename from list
xargs -n 1 dirname
extract dir from a full path
uniq
show only unique directories
uniq can be a little slower when you have many lines

Concatenating input to svn list command with output, then pass it to grep

I currently have the following shell command which is only partially working:
svn list $myrepo/libs/ |
xargs -P 10 -L 1 -I {} echo $myrepo/libs/ {} trunk |
sed 's/ //g' |
xargs -P 20 -L 1 svn list --depth infinity |
grep .xlsx
where $myrepo corresponds to the svn server address.
The libs folder contains a number of subfolders (currently about 30 although eventually up to 100), each which contain a number of tags, branches and a trunk. I wish to get a list of xlsx files contained only within the trunk folder of each of these subfolders. The command above works fine however it only returns the relative path from $myrepo/libs/subfolder/trunk/, so I get this back:
1/2/3/file.xlsx
Because of the potentially large number of files I would have to search through, I am performing it in two parallel steps by using xargs -P (I do not have and cannot use parallels). It am also trying to do this in one command so it can be used in php/perl/etc. and avoid multiple sytem calls.
What I would like to do is concatenate the input to this part of the command:
xargs -P 20 -L 1 svn list --depth infinity
with the output from it, to give the following:
$myrepo/libs/subfolder/trunk/1/2/3/file.xlsx
Then pass this to the grep to find the xlsx files.
I appreciate any assistance that could be provided.
If I manage to correctly divine your intention, something like this might work for you.
svn list "$myrepo/libs/" |
xargs -P 20 -n 1 sh -c 'svn list -R "$0/trunk/$1" |
sed -n "s%.*\.xlsx$%$0/trunk/$1/&%p"' "$myrepo"
Briefly, we postprocess the output from the inner svn list to filter to just .xslx files and tack the full SVN path back on at the same time. This way, the processing happens where the repo path is still known.
We hack things a bit by passing in "$myrepo" as "$0" to the subordinate sh so we don't have to export this variable. The input from the outer svn list comes as $1.
(The repos I have access to have a slightly different layout so there could be a copy/paste error somewhere.)

Pipes in unix - is the value implicitly supplied as an argument?

I would like to delete all files ending in .orig recursively from the current directory.
Will this do the trick?
ls -R | grep ".orig$" | rm
Are the results of grep passed implicitly as an argument to rm here?
How about something like:
find ./ -type f -name "*.orig" -exec rm "{}" \;
Seems to work for me, but it might be a good idea to test it with echo instead of rm first ;)
ls -R wont five quite the correct format output to pass directly to rm (through grep), as it lists files separately for each dir like:
.:
local1.orig local
./dir:
nested1.orig nested2.orig
If you wanted to do something similar using grep, you would need to use xargs like this:
grep ".orig$" | xargs rm
No, they are not. But that is the purpose of xargs:
ls -R | grep ".orig$" | xargs rm -i
will do what you want. The -i is not necessary, but is a good idea to use the first time you run this. (It will prompt you to delete a file. If you are confident that the answer is always yes, abort and re-run without the -i.)

Unix shell file copy flattening folder structure

On the UNIX bash shell (specifically Mac OS X Leopard) what would be the simplest way to copy every file having a specific extension from a folder hierarchy (including subdirectories) to the same destination folder (without subfolders)?
Obviously there is the problem of having duplicates in the source hierarchy. I wouldn't mind if they are overwritten.
Example: I need to copy every .txt file in the following hierarchy
/foo/a.txt
/foo/x.jpg
/foo/bar/a.txt
/foo/bar/c.jpg
/foo/bar/b.txt
To a folder named 'dest' and get:
/dest/a.txt
/dest/b.txt
In bash:
find /foo -iname '*.txt' -exec cp \{\} /dest/ \;
find will find all the files under the path /foo matching the wildcard *.txt, case insensitively (That's what -iname means). For each file, find will execute cp {} /dest/, with the found file in place of {}.
The only problem with Magnus' solution is that it forks off a new "cp" process for every file, which is not terribly efficient especially if there is a large number of files.
On Linux (or other systems with GNU coreutils) you can do:
find . -name "*.xml" -print0 | xargs -0 echo cp -t a
(The -0 allows it to work when your filenames have weird characters -- like spaces -- in them.)
Unfortunately I think Macs come with BSD-style tools. Anyone know a "standard" equivalent to the "-t" switch?
The answers above don't allow for name collisions as the asker didn't mind files being over-written.
I do mind files being over-written so came up with a different approach. Replacing each / in the path with - keep the hierarchy in the names, and puts all the files in one flat folder.
We use find to get the list of all files, then awk to create a mv command with the original filename and the modified filename then pass those to bash to be executed.
find ./from -type f | awk '{ str=$0; sub(/\.\//, "", str); gsub(/\//, "-", str); print "mv " $0 " ./to/" str }' | bash
where ./from and ./to are directories to mv from and to.
If you really want to run just one command, why not cons one up and run it? Like so:
$ find /foo -name '*.txt' | xargs echo | sed -e 's/^/cp /' -e 's|$| /dest|' | bash -sx
But that won't matter too much performance-wise unless you do this a lot or have a ton of files. Be careful of name collusions, however. I noticed in testing that GNU cp at least warns of collisions:
cp: will not overwrite just-created `/dest/tubguide.tex' with `./texmf/tex/plain/tugboat/tubguide.tex'
I think the cleanest is:
$ find /foo -name '*.txt' | xargs -i cp {} /dest
Less syntax to remember than the -exec option.
As far as the man page for cp on a FreeBSD box goes, there's no need for a -t switch. cp will assume the last argument on the command line to be the target directory if more than two names are passed.

Resources