Sorting file names by length of the file name - unix

ls displays the files in a directory. I want the file names to be displayed sorted by the length of the file name.
Any help will be highly appreciated.
Thanks in advance.

The simplest way is just:
$ ls | perl -e 'print sort { length($b) <=> length($a) } <>'

You can do it like this:
for i in `ls`; do LEN=`expr length $i`; echo $LEN $i; done | sort -n

make test files:
mkdir -p test; cd test
touch short-file-name medium-file-name loooong-file-name
the script:
ls |awk '{print length($0)"\t"$0}' |sort -n |cut --complement -f1
output:
short-file-name
medium-file-name
loooong-file-name
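By the way, cut --complement is a GNU extension; if your cut lacks it (e.g. on BSD/macOS), selecting from the second field onward should give the same result:
ls |awk '{print length($0)"\t"$0}' |sort -n |cut -f2-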

for i in *; do printf "%d\t%s\n" "${#i}" "$i"; done | sort -n | cut -f2-
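A small aside on ${#i}: bash counts characters there, not bytes, so in a UTF-8 locale a filename with multi-byte characters sorts by character count rather than byte count. A quick illustration (assuming bash and a UTF-8 locale):
s='naïve'; echo "${#s}"       # 5 characters
printf %s "$s" | wc -c        # 6 bytes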

TL;DR
Command:
find . -maxdepth 1 -type f -print0 | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | perl -F'/\0/' -ape '$_=join("\n", sort { length($b) <=> length($a) } @F)' | sed 's#/#/\\n/#g'
Alternate version of command that's easier to read:
find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
'$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g'
Not Parsing ls Output AND Benchmarking
There are good answers here. However, if one wants to follow the advice not to parse the output of ls, here are some ways to get the job done. These will especially take care of the situation where you have spaces in filenames. I'm going to benchmark everything here, as well as the parsing-ls examples. (Hopefully I get to that soon.) I've gathered a bunch of somewhat-random filenames that I've downloaded from different places over the last 25 years or so -- 73 to begin with. All 73 are 'normal' filenames, with only alphanumeric characters, underscores, dots, and hyphens. I'll add 2 more, which I create now (in order to show problems with some sorts).
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ mkdir ../dir_w_fnames__spaces
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ cp ./* ../dir_w_fnames__spaces/
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ cd ../dir_w_fnames__spaces/
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ touch "just one file with a really long filename that can throw off some counts bla so there"
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ mkdir ../dir_w_fnames__spaces_and_newlines
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ cp ./* ../dir_w_fnames__spaces_and_newlines/
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ cd ../dir_w_fnames__spaces_and_newlines/
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ touch $'w\nlf.aa'
This one, i.e. the filename,
w
lf.aa
stands for with linefeed - I make it like this to make it easier to see the problems. I don't know why I chose .aa as the file extension, other than the fact that it made this filename length easily visible in the sorts.
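If you want to convince yourself that the linefeed really is embedded in the filename, bash's printf %q shows a shell-quoted form (a quick check, assuming bash):
$ printf '%q\n' w$'\n'lf.aa
$'w\nlf.aa'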
Now, I'm going back to the orig_dir_73 directory; just trust me that this directory only contains files. We'll use a surefire way to get the number of files.
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ du --inodes
74 .
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ # The 74th inode is for the current directory, '.'; we have 73 files
There's a more surefire way, which doesn't depend on the directory only having files and doesn't require you to remember the extra '.' inode. I just looked through the man page, did some research, and did some experimentation. This command is
awk -F"\0" '{print NF-1}' < <(find . -maxdepth 1 -type f -print0) | awk '{sum+=$1}END{print sum}'
or, in more-readable fashion,
awk -F"\0" '{print NF-1}' < \
<(find . -maxdepth 1 -type f -print0) | \
awk '{sum+=$1}END{print sum}'
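The idea is that every NUL emitted by -print0 becomes a field separator, so each awk record contains NF-1 NULs; summing those counts the files. A minimal illustration with made-up data (assuming GNU awk, as above):
$ printf 'a\0b\0c\0' | awk -F'\0' '{print NF-1}'
3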
Let's find out how many files we have
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ awk -F"\0" '{print NF-1}' < \
<(find . -maxdepth 1 -type f -print0) | \
awk '{sum+=$1}END{print sum}'
73
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ cd ../dir_w_fnames__spaces
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ awk -F"\0" '{print NF-1}' < \
<(find . -maxdepth 1 -type f -print0) | \
awk '{sum+=$1}END{print sum}'
74
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ cd ../dir_w_fnames__spaces_and_newlines/
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ awk -F"\0" '{print NF-1}' < \
<(find . -maxdepth 1 -type f -print0) | \
awk '{sum+=$1}END{print sum}'
75
(See [ 1 ] for details, including an edge case in a previous solution that led to the command used here now.)
I'll be switching back and forth between these directories; just make sure you pay attention to the path - I won't note every switch.
* Usable even with weird filenames (containing spaces, linefeeds, etc.)
1a. Perl à la @tchrist with Additions
Using find with null separator. Hacking around newlines in a filename.
Command:
find . -maxdepth 1 -type f -print0 | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | perl -F'/\0/' -ape '$_=join("\n", sort { length($b) <=> length($a) } @F)' | sed 's#/#/\\n/#g'
Alternate version of command that's easier to read:
find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
'$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g'
I'll show part of the sorted results to demonstrate that the following command works. I'll also show how I'm checking that weird filenames aren't breaking anything.
Note that one wouldn't usually use head or tail if one wants the whole, sorted list (hopefully, it's not a sordid list). I'm using those commands for demonstration.
First, 'normal' filenames.
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
'$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
17f09d51d6280fb8393d5f321f344f616c461a57a8b9cf9cc3099f906b567c992.txt
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
'$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g' | tail -n 5
137.csv
13.csv
o6.dat
3.csv
a.dat
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ # No spaces in fnames, so...
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ find . -maxdepth 1 -type f | wc -l
73
Works for normal filenames
Next: spaces
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
'$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
just one file with a really long filename that can throw off some counts bla so there
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
Works for filenames containing spaces
Next: newline
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
'$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g' | tail -8
Lk3f.png
LOqU.txt
137.csv
w/\n/lf.aa
13.csv
o6.dat
3.csv
a.dat
If you prefer, you can also change this command a bit, so the filename comes out with the linefeed "evaluated".
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
'$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#\n#g' | tail -8
LOqU.txt
137.csv
w
lf.aa
13.csv
o6.dat
3.csv
a.dat
In either case, you will know, due to what we've been doing, that the list is sorted, even though it doesn't appear so.
(Visual on not appearing sorted by filename length)
********
********
*******
********** <-- Visual Problem
*****
*****
****
****
OR
********
*******
* <-- Visual
**** <-- Problems
*****
*****
****
****
Works for filenames containing newlines
* 2a. Very Close, but Doesn't Keep Newline Filename Together - à la @cpasm
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ for i in *; do printf "%d\t%s\n" "${#i}" "$i"; done | sort -n | cut -f2- | head
lf.aa
3.csv
a.dat
13.csv
o6.dat
137.csv
w
1UG5.txt
1uWj.txt
2Ese.txt
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ for i in *; do printf "%d\t%s\n" "${#i}" "$i"; done | sort -n | cut -f2- | tail -5
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
just one file with a really long filename that can throw off some counts bla so there
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
Note, for the head part, that the w in
w(\n)
lf.aa
is in the correct, sorted position for the 6-character-long filename that it is. However, the lf.aa is not in a logical place.
* Less-Easily Breakable (only '\n' and possibly control characters could be a problem)
1b. Perl à la @tchrist with find, not ls
Using find with null separator and xargs.
Command:
find . -maxdepth 1 -type f -print0 | xargs -I'{}' -0 echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | perl -e 'print sort { length($b) <=> length($a) } <>'
Alternate version of command that's easier to read:
find . -maxdepth 1 -type f -print0 | \
xargs -I'{}' -0 \
echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | \
perl -e 'print sort { length($b) <=> length($a) } <>'
Let's go for it.
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ find . -maxdepth 1 -type f -print0 | \
xargs -I'{}' -0 \
echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | \
perl -e 'print sort { length($b) <=> length($a) } <>' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
17f09d51d6280fb8393d5f321f344f616c461a57a8b9cf9cc3099f906b567c992.txt
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ find . -maxdepth 1 -type f -print0 | \
xargs -I'{}' -0 \
echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | \
perl -e 'print sort { length($b) <=> length($a) } <>' | tail -8
IKlT.txt
Lk3f.png
LOqU.txt
137.csv
13.csv
o6.dat
3.csv
a.dat
Works for normal filenames
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ find . -maxdepth 1 -type f -print0 | \
xargs -I'{}' -0 \
echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | \
perl -e 'print sort { length($b) <=> length($a) } <>' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
just one file with a really long filename that can throw off some counts bla so there
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
Works for filenames containing spaces
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ find . -maxdepth 1 -type f -print0 | \
xargs -I'{}' -0 \
echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | \
perl -e 'print sort { length($b) <=> length($a) } <>' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
just one file with a really long filename that can throw off some counts bla so there
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ find . -maxdepth 1 -type f -print0 | \
xargs -I'{}' -0 \
echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' |
perl -e 'print sort { length($b) <=> length($a) } <>' | tail -8
LOqU.txt
137.csv
13.csv
o6.dat
3.csv
a.dat
lf.aa
w
WARNING
BREAKS for filenames containing newlines
1c. Good for normal filenames and filenames with spaces, but breakable with filenames containing newlines - à la @tchrist
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ ls | perl -e 'print sort { length($b) <=> length($a) } <>' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
just one file with a really long filename that can throw off some counts bla so there
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ ls | perl -e 'print sort { length($b) <=> length($a) } <>' | tail -8
LOqU.txt
137.csv
13.csv
o6.dat
3.csv
a.dat
lf.aa
w
3a. Good for normal filenames and filenames with spaces, but breakable with filenames containing newlines - à la @Peter_O
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ ls | awk '{print length($0)"\t"$0}' | sort -n | cut --complement -f1 | head -n 8
w
3.csv
a.dat
lf.aa
13.csv
o6.dat
137.csv
1UG5.txt
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ ls | awk '{print length($0)"\t"$0}' | sort -n | cut --complement -f1 | tail -5
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
just one file with a really long filename that can throw off some counts bla so there
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
* More-Easily Breakable
4a. Good for normal filenames - à la @Raghuram
This version is breakable with filenames containing either spaces or newlines (or both).
I will say that I like the display of the actual string length, if just for analysis purposes.
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ for i in `ls`; do LEN=`expr length $i`; echo $LEN $i; done | sort -n | head -n 20
1 a
1 w
2 so
3 bla
3 can
3 off
3 one
4 file
4 just
4 long
4 some
4 that
4 with
5 3.csv
5 a.dat
5 lf.aa
5 there
5 throw
6 13.csv
6 counts
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ for i in `ls`; do LEN=`expr length $i`; echo $LEN $i; done | sort -n | tail -5
69 17f09d51d6280fb8393d5f321f344f616c461a57a8b9cf9cc3099f906b567c992.txt
70 83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
76 79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
87 oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
238 68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
Explanation of Some Commands
For now, I'll only note that, with the works-for-all find command, I used '/' for the newline substitute because it is the only character that is illegal in a filename both on *NIX and Windows.
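The *NIX half of that claim is easy to check: the kernel treats '/' as a path separator, so it can never occur inside a filename. On a GNU system the failure looks like this:
$ touch 'a/b'
touch: cannot touch 'a/b': No such file or directory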
Note(s)
[ 1 ] The command used,
du --inodes --files0-from=<(find . -maxdepth 1 -type f -print0) | \
awk '{sum+=int($1)}END{print sum}'
will work in this case, because when there is a file with a newline, and therefore an "extra" line in the output of the find command, awk's int function will evaluate to 0 for the text of that line. Specifically, for our newline-containing filename, w\nlf.aa, i.e.
w
lf.aa
we will get
$ awk '{print int($1)}' < <(echo "lf.aa")
0
If you have a situation where the filename is something like
firstline\n3 and some other\n1\n2\texciting\n86stuff.jpg
i.e.
firstline
3 and some other
1
2 exciting
86stuff.jpg
well, I guess the computer has beaten me. If anyone has a solution, I'd be glad to hear it.
Edit: I think I'm way too deep into this question. From this SO answer and experimentation, I got this command. (I don't understand all the details, but I've tested it pretty well.)
awk -F"\0" '{print NF-1}' < <(find . -maxdepth 1 -type f -print0) | awk '{sum+=$1}END{print sum}'
More readably:
awk -F"\0" '{print NF-1}' < \
<(find . -maxdepth 1 -type f -print0) | \
awk '{sum+=$1}END{print sum}'

You can use
ls --color=never --indicator-style=none | awk '{print length, $0}' |
sort -n | cut -d" " -f2-
To see it in action, create some files
% touch a ab abc
and some directories
% mkdir d de def
Output of the normal ls command
% ls
a ab abc d/ de/ def/
Output from the proposed command
% ls --color=never --indicator-style=none | awk '{print length, $0}' |
sort -n | cut -d" " -f2-
a
d
ab
de
abc
def
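If your ls lacks the GNU --color and --indicator-style options (e.g. on BSD/macOS), a rough equivalent is to bypass any ls alias and force one-entry-per-line output; a sketch, with the same caveats about odd filenames:
command ls -1 | awk '{print length, $0}' | sort -n | cut -d" " -f2-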

Related

How to list the quantity of files per subdir on Unix

I have 60 subdirs in a directory; the directory name is, for example, test/queues.
The subdirs:
test/queues/subdir1
test/queues/subdir2
test/queues/subdir3
(...)
test/queues/subdir60
I want a command that outputs the number of files in each subdirectory, listed separately, for example:
test/queues/subdir1 - 45 files
test/queues/subdir2 - 76 files
test/queues/subdir3 - 950 files
(...)
test/queues/subdir60 - 213 files
Through my research, I only found the command ls -lat test/queues/* | wc -l, but this command outputs the total number of files in all of these subdirs. For example, it returns only 4587, which is the total number of files across all 60 subdirs. I want the output to list the quantity of files in each folder separately.
How can I do that?
Use a loop to count the lines for every subdirectory individually:
for d in test/queues/*/
do
echo "$d" - $(ls -lat "$d" | wc -l)
done
Note that the output of ls -lat some_directory will contain a few additional lines (the total line plus the ./ and ../ entries), so the count comes out too high:
total 123
drwxr-xr-x 1 user group 0 Feb 26 09:51 ../
drwxr-xr-x 1 user group 0 Jan 25 12:35 ./
If your ls command supports these options, you can use
for d in test/queues/*/
do
echo "$d" - $(ls -A1 "$d" | wc -l)
done
You can apply ls | wc -l in a loop to all subdirs:
for x in */; do echo "$x => $(ls "$x" | wc -l)"; done
If you want to restrict the output to directories that are one level deep and you only want a count of regular files, you could do:
find . -maxdepth 1 -type d -exec sh -c '
printf "%s\t" "$0"; find "$0" -maxdepth 1 -type f | wc -l' {} \; \
| column -t
You can get the "name - %d files" format with:
find . -maxdepth 1 -type d -exec sh -c '
printf "%s - %d files\n" "$0" \
"$(find "$0" -maxdepth 1 -type f | wc -l)"' {} \;
Using find and awk:
find test/queues -maxdepth 2 -mindepth 2 -printf "%h\n" | awk '{ map[$0]++ } END { for (i in map) { print i" - "map[i]} }'
Use maxdepth and mindepth to ensure that we search the directory structure only one level down. Print only the leading directories through printf "%h". Pipe the output into awk and create an incrementing map array with the directories as the index. At the end, loop through the map array, printing the directories and the counts.
On Unix systems where find has no -printf option, use -exec dirname instead:
find test/queues -maxdepth 2 -mindepth 2 -exec dirname {} \; | awk '{ map[$0]++ } END { for (i in map) { print i" - "map[i]} }'

Unix command for list all the files by grouping and sorting by file type and name

There are lots of files in a directory, and the output should be grouped and sorted as below: first executable files
without any file extension, then sql files ending with "body", then sql files ending with "spec", then
other sql files, then "sh", then "txt" files.
abc
1_spec.sql
1_body.sql
2_body.sql
other.sql
a1.sh
a1.txt
find . -maxdepth 1 -type f ! -name "*.*"
find . -type f -name "*body*.sql"
find . -type f -name "*spec*.sql"
I'm finding it difficult to combine all of this and sort the groups in order.
With ls, grep and sort you could do something like this script I hacked together:
#!/bin/sh
ls | grep -v '\.[a-zA-Z0-9]*$' | sort
ls | grep '_body\.sql$' | sort
ls | grep '_spec\.sql$' | sort
ls | grep -vE '_body\.sql$|_spec\.sql$' | grep '\.sql$' | sort
ls | grep '\.sh$' | sort
ls | grep '\.txt$' | sort
normal ls:
$ ls -1
1_body.sql
1_spec.sql
2_body.sql
a1.sh
a1.txt
abc
bar.sql
def
foo.sh
other.sql
script
$
sorting script:
$ ./script
abc
def
script
1_body.sql
2_body.sql
1_spec.sql
bar.sql
other.sql
a1.sh
foo.sh
a1.txt
$

Can't add double quotes to file's directory

I need to get this result, in this format:
"hadoop fs -ls -d -C -t /hdfs/data/t1/t11/34/1EX4/ | grep indicateurs-PUB_1ELPC | grep "^d" | sort -k6,7 | tail -1 | tr -s ' ' | cut -d' ' -f8 "
So I tried to use this instruction:
paste0("hadoop fs -ls -d -C -t /hdfs/data/t1/t11/34/1EX4/ | grep indicateurs-PUB_","1ELPC",cat(" grep \"^d\" | sort -k6,7 | tail -1 | tr -s ' ' | cut -d' ' -f8 "),sep = "")
But this returns
grep "^d" | sort -k6,7 | tail -1 | tr -s ' ' | cut -d' ' -f8 [1] "hadoop fs -ls -d -C -t /hdfs/data/t1/t11/34/1EX4/ | grep indicateurs-PUB_1EPSE"
So the problem is about using the cat function; in fact, I need its result to be in quoted format. Also, I can't understand why the result was inverted here.
I'm assuming you split up the arguments to paste0 for a specific reason. As @RuiBarradas mentions, cat is for printing, not for returning an actual object (it always returns NULL):
paste0("hadoop fs -ls -d -C -t /hdfs/data/t1/t11/34/1EX4/ | grep indicateurs-PUB_",
"1ELPC",
" grep \"^d\" | sort -k6,7 | tail -1 | tr -s ' ' | cut -d' ' -f8 ",
sep = "")
returns:
[1] "hadoop fs -ls -d -C -t /hdfs/data/t1/t11/34/1EX4/ | grep indicateurs-PUB_1ELPC grep \"^d\" | sort -k6,7 | tail -1 | tr -s ' ' | cut -d' ' -f8 "
which looks to me like what you want.
Do note that, in the output \" is one character (a double quote). i.e.,
> nchar("\"")
[1] 1
To further illustrate the point:
temp <- paste0("hadoop fs -ls -d -C -t /hdfs/data/t1/t11/34/1EX4/ | grep indicateurs-PUB_",
"1ELPC",
" grep \"^d\" | sort -k6,7 | tail -1 | tr -s ' ' | cut -d' ' -f8 ",
sep = "")
> cat(temp)
hadoop fs -ls -d -C -t /hdfs/data/t1/t11/34/1EX4/ | grep indicateurs-PUB_1ELPC grep "^d" | sort -k6,7 | tail -1 | tr -s ' ' | cut -d' ' -f8
> print(temp, quote = FALSE)
[1] hadoop fs -ls -d -C -t /hdfs/data/t1/t11/34/1EX4/ | grep indicateurs-PUB_1ELPC grep "^d" | sort -k6,7 | tail -1 | tr -s ' ' | cut -d' ' -f8

How to find count of a particular word in Different Files in Unix

How do I find the count of a particular word in different files in Unix?
I have: 50 files in a directory (abc.txt, abc.txt.1, abc.txt.2, etc.)
What I want: to find the number of instances of the word 'Hello' in each file.
What I have used is grep -c Hello abc* | grep -v :0
It gave me results in the form
<<File name>> : <<count>>
I want the output to be in the form
<<Date>> <<File_Name>> <<Number of instances of the word Hello in the file>>
1-1-2001 abc.txt 23
1-1-2014 abc.txt.19 57
2-5-2015 abc.txt.49 16
You can use GNU awk >= 4.0 (due to ENDFILE) to get the number.
If we knew where the date is supposed to come from, I could add it.
awk '{for (i=1;i<=NF;i++) if ($i~/Hello/) a++} ENDFILE {print FILENAME,a;a=0}' abc.txt*
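A quick way to sanity-check that command with throwaway files (the file contents here are made up; GNU awk >= 4.0 assumed):
$ printf 'Hello world\nHello again\n' > abc.txt
$ printf 'no match on this line\n' > abc.txt.1
$ awk '{for (i=1;i<=NF;i++) if ($i~/Hello/) a++} ENDFILE {print FILENAME,a;a=0}' abc.txt*
abc.txt 2
abc.txt.1 0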
### Sample code for you to tweak for your needs:
touch test.txt
echo "ravi chandran marappan 30" > test.txt
echo "ramesh kumar marappan 24" >> test.txt
echo "ram lakshman marappan 22" >> test.txt
sed -e 's/ /\n/g' test.txt | sort | uniq | awk '{print "echo """,$1,
"""`grep -wc ",$1," test.txt`"}' | sh
Results:
22 -1
24 -1
30 -1
chandran -1
kumar -1
lakshman -1
marappan -3
ram -1
ramesh -1
ravi -1

compare two diff fields in two files

I need to compare field1, field5 in fileA to field5, field6 in fileB
and print out when there are no matches:
file A
ZEROC_ZAR,MKT,M,ZAR,3YEAR,7.59
ZEROC_AED,MKT,M,ZAR,4YEAR,7.84
ZEROC_ZAR,MKT,M,ZAR,5YEAR,8.03
ZEROC_AED,MKT,M,ZAR,7YEAR,8.33
file B
TKS,010690226,02977,AED,ZEROC_AED,3YEAR
TKS,010690231,02977,AED,ZEROC_AED,4YEAR
TKS,010690233,02977,AED,ZEROC_AED,5YEAR
TKS,010690235,02977,AED,ZEROC_AED,7YEAR
TKS,010690236,02977,AED,ZEROC_AED,10YEAR
This oneliner prints the non-matching lines of fileB:
$ cut -d, -f1,5 fileA | xargs -n1 -I{} grep {} fileB | cat - fileB | sort | uniq -u
TKS,010690226,02977,AED,ZEROC_AED,3YEAR
TKS,010690233,02977,AED,ZEROC_AED,5YEAR
TKS,010690236,02977,AED,ZEROC_AED,10YEAR
Explanation:
First combine fields 1 and 5 of fileA:
$ cut -d, -f1,5 fileA
ZEROC_ZAR,3YEAR
ZEROC_AED,4YEAR
ZEROC_ZAR,5YEAR
ZEROC_AED,7YEAR
Use these strings to grep for matching lines in fileB:
$ cut -d, -f1,5 fileA | xargs -n1 -I{} grep {} fileB
TKS,010690231,02977,AED,ZEROC_AED,4YEAR
TKS,010690235,02977,AED,ZEROC_AED,7YEAR
Then use cat - fileB | sort to combine these two lines with the content of fileB:
$ cut -d, -f1,5 fileA | xargs -n1 -I{} grep {} fileB | cat - fileB | sort
TKS,010690226,02977,AED,ZEROC_AED,3YEAR
TKS,010690231,02977,AED,ZEROC_AED,4YEAR
TKS,010690231,02977,AED,ZEROC_AED,4YEAR
TKS,010690233,02977,AED,ZEROC_AED,5YEAR
TKS,010690235,02977,AED,ZEROC_AED,7YEAR
TKS,010690235,02977,AED,ZEROC_AED,7YEAR
TKS,010690236,02977,AED,ZEROC_AED,10YEAR
Finally, use uniq -u to keep only the lines that occur exactly once, which drops the matched (duplicated) lines:
$ cut -d, -f1,5 fileA | xargs -n1 -I{} grep {} fileB | cat - fileB | sort | uniq -u
TKS,010690226,02977,AED,ZEROC_AED,3YEAR
TKS,010690233,02977,AED,ZEROC_AED,5YEAR
TKS,010690236,02977,AED,ZEROC_AED,10YEAR
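For comparison, the same no-match filter can be done in a single awk pass, which avoids one grep invocation per line of fileA; a sketch, assuming comma-separated fields as shown:
$ awk -F, 'NR==FNR {seen[$1","$5]; next} !(($5","$6) in seen)' fileA fileB
TKS,010690226,02977,AED,ZEROC_AED,3YEAR
TKS,010690233,02977,AED,ZEROC_AED,5YEAR
TKS,010690236,02977,AED,ZEROC_AED,10YEAR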
