Can't add double quotes to file's directory - r

I need to build a string in exactly this format:
"hadoop fs -ls -d -C -t /hdfs/data/t1/t11/34/1EX4/ | grep indicateurs-PUB_1ELPC | grep "^d" | sort -k6,7 | tail -1 | tr -s ' ' | cut -d' ' -f8 "
So I tried to use this instruction :
paste0("hadoop fs -ls -d -C -t /hdfs/data/t1/t11/34/1EX4/ | grep indicateurs-PUB_","1ELPC",cat(" grep \"^d\" | sort -k6,7 | tail -1 | tr -s ' ' | cut -d' ' -f8 "),sep = "")
But this returns:
grep "^d" | sort -k6,7 | tail -1 | tr -s ' ' | cut -d' ' -f8 [1] "hadoop fs -ls -d -C -t /hdfs/data/t1/t11/34/1EX4/ | grep indicateurs-PUB_1EPSE"
So the problem is with the cat function: I need its result to be part of the quoted string. Also, I can't understand why the output order is reversed here.

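I'm assuming you split up the arguments to paste0 for a specific reason. As @RuiBarradas mentions, cat is for printing and does not return an actual object (it always returns NULL). That is also why the order looks reversed: cat prints its text while paste0's arguments are being evaluated, before the result is shown. Passing the string directly works: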
paste0("hadoop fs -ls -d -C -t /hdfs/data/t1/t11/34/1EX4/ | grep indicateurs-PUB_",
"1ELPC",
" | grep \"^d\" | sort -k6,7 | tail -1 | tr -s ' ' | cut -d' ' -f8 ",
sep = "")
returns:
[1] "hadoop fs -ls -d -C -t /hdfs/data/t1/t11/34/1EX4/ | grep indicateurs-PUB_1ELPC | grep \"^d\" | sort -k6,7 | tail -1 | tr -s ' ' | cut -d' ' -f8 "
which looks to me like what you want (note the | before the second grep, which your attempt left out).
Do note that, in the output, \" is one character (a double quote), i.e.,
> nchar("\"")
[1] 1
To further illustrate the point:
temp <- paste0("hadoop fs -ls -d -C -t /hdfs/data/t1/t11/34/1EX4/ | grep indicateurs-PUB_",
"1ELPC",
" | grep \"^d\" | sort -k6,7 | tail -1 | tr -s ' ' | cut -d' ' -f8 ",
sep = "")
> cat(temp)
hadoop fs -ls -d -C -t /hdfs/data/t1/t11/34/1EX4/ | grep indicateurs-PUB_1ELPC | grep "^d" | sort -k6,7 | tail -1 | tr -s ' ' | cut -d' ' -f8
> print(temp, quote = FALSE)
[1] hadoop fs -ls -d -C -t /hdfs/data/t1/t11/34/1EX4/ | grep indicateurs-PUB_1ELPC | grep "^d" | sort -k6,7 | tail -1 | tr -s ' ' | cut -d' ' -f8


jq parsing date to timestamp

I have the following script:
curl -s -S 'https://bittrex.com/Api/v2.0/pub/market/GetTicks?marketName=BTC-NBT&tickInterval=thirtyMin&_=1521347400000' | jq -r '.result|.[] |[.T,.O,.H,.L,.C,.V,.BV] | @tsv | tostring | gsub("\t";",") | "(\(.))"'
This is the output:
(2018-03-17T18:30:00,0.00012575,0.00012643,0.00012563,0.00012643,383839.45768188,48.465051)
(2018-03-17T19:00:00,0.00012643,0.00012726,0.00012642,0.00012722,207757.18765437,26.30099514)
(2018-03-17T19:30:00,0.00012726,0.00012779,0.00012698,0.00012779,97387.01596624,12.4229077)
(2018-03-17T20:00:00,0.0001276,0.0001278,0.00012705,0.0001275,96850.15260027,12.33316229)
I want to replace the date with timestamp.
I can make this conversion with date in the shell
date -d '2018-03-17T18:30:00' +%s%3N
1521325800000
I want this result:
(1521325800000,0.00012575,0.00012643,0.00012563,0.00012643,383839.45768188,48.465051)
(1521327600000,0.00012643,0.00012726,0.00012642,0.00012722,207757.18765437,26.30099514)
(1521329400000,0.00012726,0.00012779,0.00012698,0.00012779,97387.01596624,12.4229077)
(1521331200000,0.0001276,0.0001278,0.00012705,0.0001275,96850.15260027,12.33316229)
This data is stored in MySQL.
Is it possible to execute the date conversion with jq or another command like awk, sed, perl in a single command line?
Here is an all-jq solution that assumes the "Z" (UTC+0) timezone.
In brief, simply replace .T by:
((.T + "Z") | fromdate | tostring + "000")
To verify this, consider:
timestamp.jq
[splits("[(),]")]
| .[1] |= ((. + "Z")|fromdate|tostring + "000") # milliseconds
| .[1:length-1]
| "(" + join(",") + ")"
Invocation
jq -rR -f timestamp.jq input.txt
Output
(1521311400000,0.00012575,0.00012643,0.00012563,0.00012643,383839.45768188,48.465051)
(1521313200000,0.00012643,0.00012726,0.00012642,0.00012722,207757.18765437,26.30099514)
(1521315000000,0.00012726,0.00012779,0.00012698,0.00012779,97387.01596624,12.4229077)
(1521316800000,0.0001276,0.0001278,0.00012705,0.0001275,96850.15260027,12.33316229)
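These values are exactly 4 hours (14,400,000 ms) earlier than the ones asked for in the question. That suggests the question's date -d call read the timestamps in a local zone 4 hours behind UTC, while fromdate treats them as UTC. A minimal sketch of the difference, assuming GNU date; the zone name XXX in the POSIX TZ value is a made-up placeholder for any zone at UTC-4:

```shell
ts='2018-03-17T18:30:00'

# Interpret the string as UTC, as the jq filter does.
utc_ms="$(date -u -d "$ts" +%s)000"

# Interpret the same string as local time in a zone 4 hours behind UTC.
# "XXX4" is a POSIX TZ value: placeholder name XXX, offset UTC-4.
local_ms="$(TZ=XXX4 date -d "$ts" +%s)000"

echo "UTC:   $utc_ms"     # UTC:   1521311400000
echo "UTC-4: $local_ms"   # UTC-4: 1521325800000
```

Which of the two is right depends on the zone the API's timestamps are actually in, which the thread does not state.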
Here is an unportable awk solution. It is not portable because it relies on the system date command; on the system I'm using, the relevant invocation looks like: date -j -f "%Y-%m-%eT%T" STRING "+%s"
awk -F, 'BEGIN{OFS=FS}
NF==0 { next }
{ sub(/\(/,"",$1);
cmd="date -j -f \"%Y-%m-%eT%T\" " $1 " +%s";
cmd | getline $1;
$1=$1 "000"; # milliseconds
printf "%s", "(";
print;
}' input.txt
Output
(1521325800000,0.00012575,0.00012643,0.00012563,0.00012643,383839.45768188,48.465051)
(1521327600000,0.00012643,0.00012726,0.00012642,0.00012722,207757.18765437,26.30099514)
(1521329400000,0.00012726,0.00012779,0.00012698,0.00012779,97387.01596624,12.4229077)
(1521331200000,0.0001276,0.0001278,0.00012705,0.0001275,96850.15260027,12.33316229)
Solution with sed:
sed -e 's/(\([^,]\+\)\(,.*\)/echo "(\$(date -d \1 +%s%3N)\2"/g' | ksh
Test:
<curl_command> | sed -e 's/(\([^,]\+\)\(,.*\)/echo "(\$(date -d \1 +%s%3N)\2"/g' | ksh
or:
<curl_command> > results_curl.txt
cat results_curl.txt | sed -e 's/(\([^,]\+\)\(,.*\)/echo "(\$(date -d \1 +%s%3N)\2"/g' | ksh
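The sed approach turns every input line into a shell command and pipes it to ksh, so anything unexpected in the data would be executed. A possible alternative is to do the conversion in a plain read loop. This is only a sketch, assuming GNU date; to_epoch_ms is a hypothetical helper name, and -u treats the timestamps as UTC (drop it to use the local zone instead):

```shell
# Convert "(ISO-timestamp,rest...)" lines on stdin to "(epoch-ms,rest...)".
to_epoch_ms() {
  while IFS= read -r line; do
    [ -n "$line" ] || continue      # skip empty lines
    body=${line#"("}                # drop the leading "("
    stamp=${body%%,*}               # text before the first comma
    rest=${body#*,}                 # everything after the first comma
    printf '(%s000,%s\n' "$(date -u -d "$stamp" +%s)" "$rest"
  done
}

printf '%s\n' '(2018-03-17T18:30:00,0.00012575,48.465051)' | to_epoch_ms
# (1521311400000,0.00012575,48.465051)
```

It would be used the same way as the sed version, at the end of the curl | jq pipeline.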

I am trying to run a script in -vx mode , but getting error

" ././tst.ksh: line 16: ONL P BL9_RATED_EVENT_D 1,295 780 4,063,232 60 LOCA SYST AUTO: cannot open [No such file or directory] "
I am trying to execute the script below in -xv mode,
and I don't understand why I get this error in the output:
#!/bin/ksh -xv
#for i in `cat /tefnfs/tef/tools/tooladm/Users/Jithesh/prd3cust.log | grep ONL | column -t`
while i= read -r line
do
echo $i
stat=`echo $i | cut -d" " -f1`
typ=`echo $i | cut -d" " -f2`
tbs=`echo $i | cut -d" " -f3`
tot=`echo $i | cut -d" " -f4`
free=`echo $i | cut -d" " -f5`
lrg=`echo $i | cut -d" " -f6`
fr=`echo $i | cut -d" " -f7`
Ext=`echo $i | cut -d" " -f8`
All=`echo $i | cut -d" " -f9`
spc=`echo $i | cut -d" " -f10`
done < `cat /tefnfs/tef/tools/tooladm/Users/Jithesh/prd3cust.log | grep ONL | column -t`
+ cat /tefnfs/tef/tools/tooladm/Users/Jithesh/prd3cust.log | grep ONL | column -t+ cat /tefnfs/tef/tools/tooladm/Users/Jithesh/prd3cust.log
+ column -t
+ grep ONL
././tst.ksh: line 16: ONL P BL9_RATED_EVENT_D 1,295 780 4,063,232 60 LOCA SYST AUTO: cannot open [No such file or directory]
There are some issues that I will explain here.
The read command fills the variable whose name is given after the command. In your code, line is the variable that gets filled, so i stays empty; the i= doesn't belong here. The first improvement is:
while read -r i
do
echo $i
stat=`echo $i | cut -d" " -f1`
typ=`echo $i | cut -d" " -f2`
tbs=`echo $i | cut -d" " -f3`
tot=`echo $i | cut -d" " -f4`
free=`echo $i | cut -d" " -f5`
lrg=`echo $i | cut -d" " -f6`
fr=`echo $i | cut -d" " -f7`
Ext=`echo $i | cut -d" " -f8`
All=`echo $i | cut -d" " -f9`
spc=`echo $i | cut -d" " -f10`
done < /tefnfs/tef/tools/tooladm/Users/Jithesh/prd3cust.log
I also changed the last line. The < redirection expects a file name, not the output of a command; your backticks expanded to the command's output, which ksh then tried to open as a file (hence "cannot open").
Note: Bash examples can be confusing here. To redirect the output of a command into a while loop, Bash offers special syntax (process substitution, done < <(command)).
In Bash you need that so the variables set inside the while loop are still known after the loop finishes.
In your case, with ksh, you can simply start with your command and pipe it into the while loop.
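If the script ever has to run under more than one shell, one portable way to keep the loop in the current shell is a here-document that holds a command substitution. This is a different technique from the Bash syntax mentioned above, and the three input lines are made-up sample data:

```shell
# The loop body runs in the current shell because its input arrives through
# a here-document rather than a pipeline, so count and last survive the loop.
count=0
last=
while read -r i; do
  count=$((count + 1))
  last=$i
done <<EOF
$(printf '%s\n' 'ONL P TBS_A 10' 'OFF P TBS_B 20' 'ONL P TBS_C 30' | grep ONL)
EOF

echo "$count lines, last: $last"   # 2 lines, last: ONL P TBS_C 30
```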
Fixing your code without fixing the logic gives
cat /tefnfs/tef/tools/tooladm/Users/Jithesh/prd3cust.log | grep ONL | column -t | while read -r i
do
echo $i
stat=`echo $i | cut -d" " -f1`
typ=`echo $i | cut -d" " -f2`
tbs=`echo $i | cut -d" " -f3`
tot=`echo $i | cut -d" " -f4`
free=`echo $i | cut -d" " -f5`
lrg=`echo $i | cut -d" " -f6`
fr=`echo $i | cut -d" " -f7`
Ext=`echo $i | cut -d" " -f8`
All=`echo $i | cut -d" " -f9`
spc=`echo $i | cut -d" " -f10`
done
Using column -t did not help. The fields must be separated by exactly one space before you split them with cut -d" ": replace tabs with spaces and squeeze runs of spaces into one.
You can use expand -1 | tr -s " " for this.
I would suggest replacing the backticks with the $(subcommand) notation, but you can instead assign all the variables directly with the read command.
My suggestion would be
grep ONL /tefnfs/tef/tools/tooladm/Users/Jithesh/prd3cust.log | expand -1 | tr -s " " |
while read -r stat typ tbs tot free lrg fr Ext All spc; do
echo "Processed line starting with ${stat}"
done
Now you should do something in the while loop. The variables change every time the while loop processes the next line. After the loop (in ksh), they hold the values from the last line processed.
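With the default IFS, read itself splits on runs of spaces and tabs, so the squeezing step mainly matters for the cut-based version. A small sketch of the multi-variable read; print_tablespace is a hypothetical helper, and the sample line reuses the values from the error message in the question, with irregular spacing added:

```shell
# Read whitespace-separated columns straight into named variables.
print_tablespace() {
  while read -r stat typ tbs tot free lrg fr ext all spc; do
    printf '%s %s free=%s\n' "$stat" "$tbs" "$free"
  done
}

# Tabs and repeated spaces between the columns are handled by read.
printf 'ONL  P\tBL9_RATED_EVENT_D   1,295\t780  4,063,232 60 LOCA SYST AUTO\n' |
  print_tablespace
# ONL BL9_RATED_EVENT_D free=780
```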

R - system commands with multiple pipe do not return stdout

bash:
ps -aux | grep -E "^.*\b[^grep](python).*(runserver).*$" 2>/dev/null | tr -s " " | cut -d " " -f 2
It returns the correct result, e.g.:
1450
1452
The same code in R:
vLog <- system('ps -aux | grep -E "^.*\b[^grep](python).*(runserver).*$" 2>/dev/null | tr -s " " | cut -d " " -f 2', intern = TRUE)
returns character(0).
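Just replace \b with \\b: inside an R string literal, \b is a backspace character, so grep never sees the word-boundary escape. Also be aware that [^grep] matches any single character other than g, r, e, or p; it does not exclude the word grep.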
vLog <- system('ps -aux | grep -E "^.*\\b[^grep](python).*(runserver).*$" 2>/dev/null | tr -s " " | cut -d " " -f 2', intern = TRUE)
Example:
> system('ps -aux | grep -E "^.*\\bpython" 2>/dev/null | tr -s " " | cut -d " " -f 2', intern = TRUE)
[1] "2519" "2526" "3285" "3291"
> system('ps -aux | grep -E "^.*\bpython" 2>/dev/null | tr -s " " | cut -d " " -f 2', intern = TRUE)
character(0)
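On the shell side, a common way to keep grep from matching its own entry in the ps output is to wrap one letter of the pattern in brackets rather than using [^grep]. The sketch below runs on made-up ps-like lines instead of a live process list, and it assumes GNU grep, where \b is a word boundary; sample_ps and runserver_pids are hypothetical helper names:

```shell
# Made-up process list; the third line mimics grep's own ps entry.
sample_ps() {
  printf '%s\n' \
    'user 1450 0.0 0.1 /usr/bin/python manage.py runserver' \
    'user 1452 0.0 0.1 /usr/bin/python manage.py runserver 8001' \
    'user 1500 0.0 0.0 grep -E \b[p]ython\b.*runserver' \
    'user 1600 0.0 0.0 /usr/bin/mypython3 runserver.py'
}

# [p]ython matches the text "python", but grep's own command line contains
# "[p]ython" and therefore does not match; \b rejects "mypython3".
runserver_pids() {
  grep -E '\b[p]ython\b.*runserver' | tr -s ' ' | cut -d ' ' -f 2
}

sample_ps | runserver_pids
# 1450
# 1452
```

Inside an R string the same pattern would need the backslashes doubled, as described above.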

compare two diff fields in two files

I need to compare field1, field5 in fileA to field5, field6 in fileB
and print out when there are no matches:
file A
ZEROC_ZAR,MKT,M,ZAR,3YEAR,7.59
ZEROC_AED,MKT,M,ZAR,4YEAR,7.84
ZEROC_ZAR,MKT,M,ZAR,5YEAR,8.03
ZEROC_AED,MKT,M,ZAR,7YEAR,8.33
file B
TKS,010690226,02977,AED,ZEROC_AED,3YEAR
TKS,010690231,02977,AED,ZEROC_AED,4YEAR
TKS,010690233,02977,AED,ZEROC_AED,5YEAR
TKS,010690235,02977,AED,ZEROC_AED,7YEAR
TKS,010690236,02977,AED,ZEROC_AED,10YEAR
This oneliner prints the non-matching lines of fileB:
$ cut -d, -f1,5 fileA | xargs -n1 -I{} grep {} fileB | cat - fileB | sort | uniq -u
TKS,010690226,02977,AED,ZEROC_AED,3YEAR
TKS,010690233,02977,AED,ZEROC_AED,5YEAR
TKS,010690236,02977,AED,ZEROC_AED,10YEAR
Explanation:
First combine fields 1 and 5 of fileA:
$ cut -d, -f1,5 fileA
ZEROC_ZAR,3YEAR
ZEROC_AED,4YEAR
ZEROC_ZAR,5YEAR
ZEROC_AED,7YEAR
Use these strings to grep for matching lines in fileB:
$ cut -d, -f1,5 fileA | xargs -n1 -I{} grep {} fileB
TKS,010690231,02977,AED,ZEROC_AED,4YEAR
TKS,010690235,02977,AED,ZEROC_AED,7YEAR
Then use cat - fileB | sort to combine these two lines with the content of fileB:
$ cut -d, -f1,5 fileA | xargs -n1 -I{} grep {} fileB | cat - fileB | sort
TKS,010690226,02977,AED,ZEROC_AED,3YEAR
TKS,010690231,02977,AED,ZEROC_AED,4YEAR
TKS,010690231,02977,AED,ZEROC_AED,4YEAR
TKS,010690233,02977,AED,ZEROC_AED,5YEAR
TKS,010690235,02977,AED,ZEROC_AED,7YEAR
TKS,010690235,02977,AED,ZEROC_AED,7YEAR
TKS,010690236,02977,AED,ZEROC_AED,10YEAR
Finally, use uniq -u to print only the lines that occur once, which drops every line that was matched (and therefore duplicated):
$ cut -d, -f1,5 fileA | xargs -n1 -I{} grep {} fileB | cat - fileB | sort | uniq -u
TKS,010690226,02977,AED,ZEROC_AED,3YEAR
TKS,010690233,02977,AED,ZEROC_AED,5YEAR
TKS,010690236,02977,AED,ZEROC_AED,10YEAR
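The pipeline above treats each key as a regular expression matched anywhere in the line, and uniq -u would also hide a non-matching line that happens to occur twice in fileB. A sketch of an alternative that compares the fields directly in awk; unmatched_in_b is a hypothetical helper name, and the data is the sample from the question:

```shell
# Print the lines of the second file whose (field5, field6) pair does not
# occur as a (field1, field5) pair in the first file.
unmatched_in_b() {
  awk -F, '
    NR == FNR { seen[$1 FS $5] = 1; next }   # first file: remember its keys
    !(($5 FS $6) in seen)                    # second file: print unseen keys
  ' "$1" "$2"
}

tmpdir=$(mktemp -d)
cat > "$tmpdir/fileA" <<'EOF'
ZEROC_ZAR,MKT,M,ZAR,3YEAR,7.59
ZEROC_AED,MKT,M,ZAR,4YEAR,7.84
ZEROC_ZAR,MKT,M,ZAR,5YEAR,8.03
ZEROC_AED,MKT,M,ZAR,7YEAR,8.33
EOF
cat > "$tmpdir/fileB" <<'EOF'
TKS,010690226,02977,AED,ZEROC_AED,3YEAR
TKS,010690231,02977,AED,ZEROC_AED,4YEAR
TKS,010690233,02977,AED,ZEROC_AED,5YEAR
TKS,010690235,02977,AED,ZEROC_AED,7YEAR
TKS,010690236,02977,AED,ZEROC_AED,10YEAR
EOF

result=$(unmatched_in_b "$tmpdir/fileA" "$tmpdir/fileB")
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

For this data it prints the same three lines (3YEAR, 5YEAR, 10YEAR), in fileB's original order and without sorting.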

Sorting file names by length of the file name

ls displays the files available in a directory. I want the file names to be displayed based on the length of the file name.
Any help will be highly appreciated.
Thanks in Advance
The simplest way is just:
$ ls | perl -e 'print sort { length($b) <=> length($a) } <>'
You can do it like this:
for i in `ls`; do LEN=`expr length $i`; echo $LEN $i; done | sort -n
make test files:
mkdir -p test; cd test
touch short-file-name medium-file-name loooong-file-name
the script:
ls |awk '{print length($0)"\t"$0}' |sort -n |cut --complement -f1
output:
short-file-name
medium-file-name
loooong-file-name
for i in *; do printf "%d\t%s\n" "${#i}" "$i"; done | sort -n | cut -f2-
TL;DR
Command:
find . -maxdepth 1 -type f -print0 | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | perl -F'/\0/' -ape '$_=join("\n", sort { length($b) <=> length($a) } @F)' | sed 's#/#/\\n/#g'
Alternate version of command that's easier to read:
find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
'$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g'
Not Parsing ls Output AND Benchmarking
There are good answers here. However, if one wants to follow the advice not to parse the output of ls, here are some ways to get the job done. This will especially take care of the situation where you have spaces in filenames. I'm going to benchmark everything here as well as the parsing-ls examples. (Hopefully I get to that, soon.) I've put a bunch of somewhat-random filenames that I've downloaded from different places over the last 25 years or so -- 73 to begin with. All 73 are 'normal' filenames, with only alphanumeric characters, underscores, dots, and hyphens. I'll add 2 more which I make now (in order to show problems with some sorts).
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ mkdir ../dir_w_fnames__spaces
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ cp ./* ../dir_w_fnames__spaces/
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ cd ../dir_w_fnames__spaces/
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ touch "just one file with a really long filename that can throw off some counts bla so there"
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ mkdir ../dir_w_fnames__spaces_and_newlines
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ cp ./* ../dir_w_fnames__spaces_and_newlines/
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ cd ../dir_w_fnames__spaces_and_newlines/
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ touch $'w\nlf.aa'
This one, i.e. the filename,
w
lf.aa
stands for with linefeed - I make it like this to make it easier to see the problems. I don't know why I chose .aa as the file extension, other than the fact that it made this filename length easily visible in the sorts.
Now, I'm going back to the orig_dir_73 directory; just trust me that this directory only contains files. We'll use a surefire way to get the number of files.
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ du --inodes
74 .
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ # The 74th inode is for the current directory, '.'; we have 73 files
There's a more surefire way, which doesn't depend on the directory only having files and doesn't require you to remember the extra '.' inode. I just looked through the man page, did some research, and did some experimentation. This command is
awk -F"\0" '{print NF-1}' < <(find . -maxdepth 1 -type f -print0) | awk '{sum+=$1}END{print sum}'
or, in more-readable fashion,
awk -F"\0" '{print NF-1}' < \
<(find . -maxdepth 1 -type f -print0) | \
awk '{sum+=$1}END{print sum}'
Let's find out how many files we have
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ awk -F"\0" '{print NF-1}' < \
<(find . -maxdepth 1 -type f -print0) | \
awk '{sum+=$1}END{print sum}'
73
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ cd ../dir_w_fnames__spaces
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ awk -F"\0" '{print NF-1}' < \
<(find . -maxdepth 1 -type f -print0) | \
awk '{sum+=$1}END{print sum}'
74
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ cd ../dir_w_fnames__spaces_and_newlines/
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ awk -F"\0" '{print NF-1}' < \
<(find . -maxdepth 1 -type f -print0) | \
awk '{sum+=$1}END{print sum}'
75
(See [ 1 ] for details and an edge case for a previous solution that led to the command here now.)
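A shorter variant of the same idea is to count the NUL terminators that find -print0 writes, one per file, whatever characters the names contain. This is a sketch, assuming a find that supports -maxdepth and -print0; count_regular_files is a hypothetical helper name and the three files are throwaway test files:

```shell
# Count regular files directly inside a directory by counting the NUL bytes
# that terminate each name in find's -print0 output.
count_regular_files() {
  find "$1" -maxdepth 1 -type f -print0 | tr -dc '\0' | wc -c
}

demo=$(mktemp -d)
# One plain name, one with a space, one with an embedded newline.
touch "$demo/a.dat" "$demo/two words.txt" "$demo/$(printf 'w\nlf.aa')"
n=$(count_regular_files "$demo")
echo "$n"   # 3
rm -rf "$demo"
```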
I'll be switching back and forth between these directories; just make sure you pay attention to the path - I won't note every switch.
* Usable even with weird filenames (containing spaces, linefeeds, etc.)
1a. Perl à la @tchrist with Additions
Using find with null separator. Hacking around newlines in a filename.
Command:
find . -maxdepth 1 -type f -print0 | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | perl -F'/\0/' -ape '$_=join("\n", sort { length($b) <=> length($a) } @F)' | sed 's#/#/\\n/#g'
Alternate version of command that's easier to read:
find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
'$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g'
I'll actually show part of the sort results to show that the following command works. I'll also show how I'm checking that weird filenames aren't breaking anything.
Note that one wouldn't usually use head or tail if one wants the whole, sorted list (hopefully, it's not a sordid list). I'm using those commands for demonstration.
First, 'normal' filenames.
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
'$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
17f09d51d6280fb8393d5f321f344f616c461a57a8b9cf9cc3099f906b567c992.txt
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
'$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g' | tail -n 5
137.csv
13.csv
o6.dat
3.csv
a.dat
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ # No spaces in fnames, so...
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ find . -maxdepth 1 -type f | wc -l
73
Works for normal filenames
Next: spaces
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
'$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
just one file with a really long filename that can throw off some counts bla so there
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
Works for filenames containing spaces
Next: newline
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
'$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g' | tail -8
Lk3f.png
LOqU.txt
137.csv
w/\n/lf.aa
13.csv
o6.dat
3.csv
a.dat
If you prefer, you can also change this command a bit, so the filename comes out with the linefeed "evaluated".
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
'$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#\n#g' | tail -8
LOqU.txt
137.csv
w
lf.aa
13.csv
o6.dat
3.csv
a.dat
In either case, you will know, due to what we've been doing, that the list is sorted, even though it doesn't appear so.
(Visual on not appearing sorted by filename length)
********
********
*******
********** <-- Visual Problem
*****
*****
****
****
OR
********
*******
* <-- Visual
**** <-- Problems
*****
*****
****
****
Works for filenames containing newlines
* 2a. Very Close, but Doesn't Keep Newline Filename Together - à la @cpasm
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ for i in *; do printf "%d\t%s\n" "${#i}" "$i"; done | sort -n | cut -f2- | head
lf.aa
3.csv
a.dat
13.csv
o6.dat
137.csv
w
1UG5.txt
1uWj.txt
2Ese.txt
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ for i in *; do printf "%d\t%s\n" "${#i}" "$i"; done | sort -n | cut -f2- | tail -5
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
just one file with a really long filename that can throw off some counts bla so there
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
Note, for the head part, that the w in
w(\n)
lf.aa
is in the correct, sorted position for the 7-character-long filename that it is (w, the linefeed, and lf.aa). However, the lf.aa is not in a logical place.
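One way to keep such a name in one piece with the same length-prefix idea is to terminate each record with NUL instead of a newline and let sort and cut work on NUL-terminated records. This is a sketch that assumes GNU sort and GNU cut (both accept -z); sort_names_by_length is a hypothetical helper name, and embedded newlines are shown as ? purely for display:

```shell
# List the names in a directory, shortest first, one per output line.
sort_names_by_length() {
  (
    cd "$1" || exit 1
    for f in *; do
      printf '%d\t%s\0' "${#f}" "$f"   # length, tab, name, NUL terminator
    done
  ) | sort -z -n | cut -z -f2- | tr '\n' '?' | tr '\0' '\n'
}

demo=$(mktemp -d)
touch "$demo/a" "$demo/bbb" "$demo/two words" "$demo/$(printf 'w\nlf.aa')"
sorted=$(sort_names_by_length "$demo")
printf '%s\n' "$sorted"
# a
# bbb
# w?lf.aa
# two words
rm -rf "$demo"
```

Like ls, the * glob skips dot files, and in an empty directory it stays a literal *, so a real script would want to guard against that case.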
* Less-Easily Breakable (only '\n' and possibly command characters could be a problem)
1b. Perl à la @tchrist with find, not ls
Using find with null separator and xargs.
Command:
find . -maxdepth 1 -type f -print0 | xargs -I'{}' -0 echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | perl -e 'print sort { length($b) <=> length($a) } <>'
Alternate version of command that's easier to read:
find . -maxdepth 1 -type f -print0 | \
xargs -I'{}' -0 \
echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | \
perl -e 'print sort { length($b) <=> length($a) } <>'
Let's go for it.
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ find . -maxdepth 1 -type f -print0 | \
xargs -I'{}' -0 \
echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | \
perl -e 'print sort { length($b) <=> length($a) } <>' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
17f09d51d6280fb8393d5f321f344f616c461a57a8b9cf9cc3099f906b567c992.txt
bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ find . -maxdepth 1 -type f -print0 | \
xargs -I'{}' -0 \
echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | \
perl -e 'print sort { length($b) <=> length($a) } <>' | tail -8
IKlT.txt
Lk3f.png
LOqU.txt
137.csv
13.csv
o6.dat
3.csv
a.dat
Works for normal filenames
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ find . -maxdepth 1 -type f -print0 | \
xargs -I'{}' -0 \
echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | \
perl -e 'print sort { length($b) <=> length($a) } <>' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
just one file with a really long filename that can throw off some counts bla so there
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
Works for filenames containing spaces
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ find . -maxdepth 1 -type f -print0 | \
xargs -I'{}' -0 \
echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | \
perl -e 'print sort { length($b) <=> length($a) } <>' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
just one file with a really long filename that can throw off some counts bla so there
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ find . -maxdepth 1 -type f -print0 | \
xargs -I'{}' -0 \
echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' |
perl -e 'print sort { length($b) <=> length($a) } <>' | tail -8
LOqU.txt
137.csv
13.csv
o6.dat
3.csv
a.dat
lf.aa
w
WARNING
BREAKS for filenames containing newlines
1c. Good for normal filenames and filenames with spaces, but breakable with filenames containing newlines - à la @tchrist
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ ls | perl -e 'print sort { length($b) <=> length($a) } <>' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
just one file with a really long filename that can throw off some counts bla so there
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ ls | perl -e 'print sort { length($b) <=> length($a) } <>' | tail -8
LOqU.txt
137.csv
13.csv
o6.dat
3.csv
a.dat
lf.aa
w
3a. Good for normal filenames and filenames with spaces, but breakable with filenames containing newlines - à la @Peter_O
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ ls | awk '{print length($0)"\t"$0}' | sort -n | cut --complement -f1 | head -n 8
w
3.csv
a.dat
lf.aa
13.csv
o6.dat
137.csv
1UG5.txt
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ ls | awk '{print length($0)"\t"$0}' | sort -n | cut --complement -f1 | tail -5
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
just one file with a really long filename that can throw off some counts bla so there
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
* More-Easily Breakable
4a. Good for normal filenames - à la @Raghuram
This version is breakable with filenames containing either spaces or newlines (or both).
I do want to add that I do like the display of the actual string length, if just for analysis purposes.
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ for i in `ls`; do LEN=`expr length $i`; echo $LEN $i; done | sort -n | head -n 20
1 a
1 w
2 so
3 bla
3 can
3 off
3 one
4 file
4 just
4 long
4 some
4 that
4 with
5 3.csv
5 a.dat
5 lf.aa
5 there
5 throw
6 13.csv
6 counts
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ for i in `ls`; do LEN=`expr length $i`; echo $LEN $i; done | sort -n | tail -5
69 17f09d51d6280fb8393d5f321f344f616c461a57a8b9cf9cc3099f906b567c992.txt
70 83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
76 79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
87 oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
238 68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
Explanation of Some Commands
For now, I'll only note that, with the works-for-all find command, I used '/' for the newline substitute because it is the only character that is illegal in a filename both on *NIX and Windows.
Note(s)
[ 1 ] The command used,
du --inodes --files0-from=<(find . -maxdepth 1 -type f -print0) | \
awk '{sum+=int($1)}END{print sum}'
will work in this case, because when there is a file with a newline, and therefore an "extra" line in the output of the find command, awk's int function will evaluate to 0 for the text of that line. Specifically, for our newline-containing filename, w\nlf.aa, i.e.
w
lf.aa
we will get
$ awk '{print int($1)}' < <(echo "lf.aa")
0
If you have a situation where the filename is something like
firstline\n3 and some other\n1\n2\texciting\n86stuff.jpg
i.e.
firstline
3 and some other
1
2 exciting
86stuff.jpg
well, I guess the computer has beaten me. If anyone has a solution, I'd be glad to hear it.
Edit: I think I'm way too deep into this question. From this SO answer and experimentation, I got this command (I don't understand all the details, but I've tested it pretty well).
awk -F"\0" '{print NF-1}' < <(find . -maxdepth 1 -type f -print0) | awk '{sum+=$1}END{print sum}'
More readably:
awk -F"\0" '{print NF-1}' < \
<(find . -maxdepth 1 -type f -print0) | \
awk '{sum+=$1}END{print sum}'
You can use
ls --color=never --indicator-style=none | awk '{print length, $0}' |
sort -n | cut -d" " -f2-
To see it in action, create some files
% touch a ab abc
and some directories
% mkdir d de def
Output of the normal ls command
% ls
a ab abc d/ de/ def/
Output from the proposed command
% ls --color=never --indicator-style=none | awk '{print length, $0}' |
sort -n | cut -d" " -f2-
a
d
ab
de
abc
def
