How to properly grep filenames only from ls -al - unix

How do I tell grep to only print out lines if the "filename" matches when I'm piping through ls? I want it to ignore everything on each line until after the timestamp. There must be some easy way to do this in a single command.
As you can see, without it, if I searched for the file "rwx", it would return not only the line with rwx.c, but also the first three lines because of permissions. I was going to use AWK but I want it to display the whole last line if I search for "rwx".
Any ideas?
EDIT: Thanks for the hacks below. However, it would be great to have a more bug-free method. For example, if I had a file named "rob rob", I wouldn't be able to use the stated solutions.
drwxrwxr-x 2 rob rob 4096 2012-03-04 18:03 .
drwxrwxr-x 4 rob rob 4096 2012-03-04 12:38 ..
-rwxrwxr-x 1 rob rob 13783 2012-03-04 18:03 a.out
-rw-rw-r-- 1 rob rob 4294 2012-03-04 18:02 function1.c
-rw-rw-r-- 1 rob rob 273 2012-03-04 12:54 function1.c~
-rw-rw-r-- 1 rob rob 16 2012-03-04 18:02 rwx.c
-rw-rw-r-- 1 rob rob 16 2012-03-04 18:02 rob rob

The following will list only file name, and one file in each row.
$ ls -1
To include . files
$ ls -1a
Please note that the argument is number "1", not letter "l".

Why don't you use grep and match the file name following the timestamp?
grep -P "[0-9]{2}:[0-9]{2} $FILENAME(\.[a-zA-Z0-9]+)?$"
The [0-9]{2}:[0-9]{2} is for the time, the $FILENAME is where you'd put rob rob or rwx, and the trailing (\.[a-zA-Z0-9]+)? is to allow for an optional extension.
Edit: @JonathanLeffler below points out that when files are older than about 6 months the time column gets replaced by a year - this is what happens on my computer anyhow. You could do ([0-9]{2}:[0-9]{2}|(19|20)[0-9]{2}) to allow time OR year, but you may be best off using awk (?).
[foo@bar ~/tmp]$ls -al
total 8
drwxrwxr-x 2 foo foo 4096 Mar 5 09:30 .
drwxr-xr-- 83 foo foo 4096 Mar 5 09:30 ..
-rw-rw-r-- 1 foo foo 0 Mar 5 09:30 foo foo
-rw-rw-r-- 1 foo foo 0 Mar 5 09:29 rwx.c
-rw-rw-r-- 1 foo foo 0 Mar 5 09:29 tmp
[foo@bar ~/tmp]$export filename='foo foo'
[foo@bar ~/tmp]$echo $filename
foo foo
[foo@bar ~/tmp]$ls -al | grep -P "[0-9]{2}:[0-9]{2} $filename(\.[a-zA-Z0-9]+)?$"
-rw-rw-r-- 1 foo foo 0 Mar 5 09:30 foo foo
(You could additionally extend to matching the whole line if you wanted:
^ # start of line
[d-]([r-][w-][x-]){3} + # permissions & space (note: is there a 't' or 's'
# sometimes where the 'd' can be??)
[0-9]+ # whatever that number is
[\w-]+ [\w-]+ + # user/group (are spaces allowed in these?)
[0-9]+ + # file size (modify for -h switch??)
(19|20)[0-9]{2}- # yyyy (modify if you want to allow <1900)
(1[012]|0[1-9])- # mm
(0[1-9]|[12][0-9]|3[01]) + # dd
([01][0-9]|2[0-3]):[0-5][0-9] +# HH:MM (24hr)
$filename(\.[a-zA-Z0-9]+)? # filename & optional extension
$ # end of line
. You get the point, tailor to your needs.)
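One caveat with interpolating $FILENAME straight into the pattern: a name such as rwx.c contains a regex metacharacter (the dot), so it gets treated as a pattern rather than as literal text. Here is a minimal sketch of one way around that, assuming a POSIX sed and grep -E. The function name match_name and the sample listing are made up for illustration, and the time-or-year alternation is the one suggested in the edit above.

```shell
# match_name NAME: filter an `ls -l` style listing on stdin, keeping lines whose
# file name is NAME, optionally followed by an extension (hypothetical helper).
match_name() {
    # Backslash-escape ERE metacharacters so NAME is matched literally.
    escaped=$(printf '%s\n' "$1" | sed 's/[][\.*^$+?(){}|]/\\&/g')
    # Time (HH:MM) or year, one space, the name, optional extension, end of line.
    grep -E "([0-9]{2}:[0-9]{2}|(19|20)[0-9]{2}) ${escaped}(\.[a-zA-Z0-9]+)?\$"
}

# Sample listing (illustrative data in the question's date format).
listing='drwxrwxr-x 2 rob rob 4096 2012-03-04 18:03 .
-rwxrwxr-x 1 rob rob 13783 2012-03-04 18:03 a.out
-rw-rw-r-- 1 rob rob 16 2012-03-04 18:02 rwx.c
-rw-rw-r-- 1 rob rob 16 2012-03-04 18:02 rwxxc
-rw-rw-r-- 1 rob rob 16 2012-03-04 18:02 rob rob'

printf '%s\n' "$listing" | match_name 'rwx.c'    # prints only the rwx.c line
printf '%s\n' "$listing" | match_name 'rob rob'  # prints only the "rob rob" line
```

This still depends on the date format of your ls, so treat it as a starting point rather than something portable.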

Assuming that you aren't prepared to do:
ls -ld $(ls -a | grep rwx)
then you need to exploit the fact that there are 8 columns with space separation before the file name starts. Using egrep (or grep -E), you could do:
ls -al | egrep "^([^ ]+ +){8}.*rwx"
This looks for 'rwx' after the 8th column. If you want the name to start with rwx, omit the .*. If you want the name to end with rwx, add a $ at the end. Note that I used double quotes so you could interpolate a variable in place of the literal rwx.
This was tested on Mac OS X 10.7.3; the ls -l command consistently gives three columns for the date field:
-r--r--r-- 1 jleffler staff 6510 Mar 17 2003 README,v
-r--r--r-- 1 jleffler staff 26676 Mar 3 21:44 ccs.nmd
Your ls -l seems to be giving just two columns, so you'd need to change the {8} to {7} for your machine - and beware migrating between systems.
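Note that the ls -ld $(ls -a | grep rwx) form word-splits its arguments, so it breaks on a name like "rob rob". A rough sketch of the same idea that matches on the name itself and never parses ls output is below. The function name list_matching and its two parameters are hypothetical, and it does a plain substring match, not a regex.

```shell
# list_matching DIR SUBSTRING: long-list entries in DIR whose name contains
# SUBSTRING (hypothetical helper; substring match, not a regex).
list_matching() {
    (
        cd "$1" || exit 1
        # Loop over visible and hidden entries; the glob keeps each name intact,
        # spaces included.
        for name in * .*; do
            case $name in
                .|..) continue ;;
                *"$2"*)
                    # Skip the literal pattern left behind when a glob matches nothing.
                    if [ -e "$name" ] || [ -L "$name" ]; then
                        ls -ld -- "$name"
                    fi ;;
            esac
        done
    )
}

list_matching . rwx
```

Because the permission string is never part of what gets matched, searching for rwx cannot hit the first column by accident.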

Well, if you're working with filenames that don't have spaces in them, you could do something like this:
grep 'rwx\S*$'

Aside from the fact that you can use pattern matching with ls (for example in ksh and bash),
which is probably what you should do, you can use the fact that the filename occurs in a
fixed position. awk (gawk, nawk or whatever you have) is a better choice for this.
If you have to use grep, it smells like homework to me. Please tag it that way.
Assume the filename starts at column 56, based on this output from ls -l on Linux:
-rwxr-xr-x 1 Administrators None 2052 Feb 28 20:29 vote2012.txt
ls -l | awk ' substr($0,56) ~/your pattern even with spaces goes here/'
e.g.,
ls -l | awk ' substr($0,56) ~/^val/'
will find files starting with "val"

As a simple hack, just add a space before your filename so you don't match the beginning of the output:
ls -al | grep '\srwx'
Edit: OK, this is not as robust as it should be. Here's awk:
ls -l | awk ' $9 ~ /rwx/ { print $0 }'
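Keep in mind that $9 only holds the first word of a name that contains spaces, and the field number changes with the date format. A sketch that strips a given number of leading columns and tests whatever is left follows. The function name filter_by_name, the skip count and the sample listing are assumptions for illustration; the sample uses the question's format, which has 7 columns before the name.

```shell
# filter_by_name PATTERN SKIP: read an `ls -l` style listing on stdin and print
# the lines whose file name (everything after the first SKIP columns) matches
# the awk regex PATTERN. Hypothetical helper.
filter_by_name() {
    awk -v pat="$1" -v skip="$2" '
    NF <= skip { next }              # e.g. the "total 8" line has no name
    {
        name = $0
        # Remove one leading column (and the spaces after it) per pass.
        for (i = 1; i <= skip; i++)
            sub(/^ *[^ ]+ +/, "", name)
        if (name ~ pat) print
    }'
}

# Sample listing (illustrative data in the format shown in the question).
sample='drwxrwxr-x 2 rob rob 4096 2012-03-04 18:03 .
-rwxrwxr-x 1 rob rob 13783 2012-03-04 18:03 a.out
-rw-rw-r-- 1 rob rob 16 2012-03-04 18:02 rwx.c
-rw-rw-r-- 1 rob rob 16 2012-03-04 18:02 rob rob'

printf '%s\n' "$sample" | filter_by_name 'rwx' 7          # prints the rwx.c line
printf '%s\n' "$sample" | filter_by_name '^rob rob$' 7    # prints the "rob rob" line
```

With the three-column date that many ls versions print (Mar 5 09:30), the skip count would be 8 instead of 7.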

This works for me, unlike ls -l & others as some folks pointed out. I like this because it's really generic and gives me the base file name, stripping the path names before the file.
ls -1 /path_name |awk -F/ '{print $NF}'

You only need one command for this:
ls -al | gawk '{print $9}'

You can use this:
ls -p | grep -v /

this is super old, but i needed the answer and had a hard time finding it. i didn't really care about the one-liner part; i just needed it done. this is down and dirty and requires that you count the columns. i'm not looking for an upvote here, just leaving some options for future searcher-ers.
the helpful awk trick is here -- Using awk to print all columns from the nth to the last
if
YOUR_FILENAME="rob rob"
and
WHERE_FILENAMES_START=8
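ls -al | while IFS= read -r x; do
# pass the shell variable into awk with -v; inside single quotes awk cannot see it
y=$(printf '%s\n' "$x" | awk -v start="$WHERE_FILENAMES_START" '{for(i=start; i<=NF; ++i) printf "%s%s", $i, FS; print ""}')
[[ "$YOUR_FILENAME " = "$y" ]] && echo "$x"
done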
if you save it as a bash script and swap out the vars with $2 and $1, throw the script in your usr bin... then you'll have your clean simple one-liner ;)
output will be:
> -rw-rw-r-- 1 rob rob 16 2012-03-04 18:02 rob rob
the question was for a one-liner so...
ls -al | while IFS= read -r x; do [[ "$YOUR_FILENAME " = "$(printf '%s\n' "$x" | awk -v start="$WHERE_FILENAMES_START" '{for(i=start; i<=NF; ++i) printf "%s%s", $i, FS; print ""}')" ]] && echo "$x" ; done
(lol ;P)
on another note: mathematical.coffee your answer was rad. it didn't solve my version of this problem, so i didn't upvote, but i liked your regex breakdown :D

Related

grep and awk, combine commands?

I have file that looks like:
This is a RESTRICTED site.
All connections are monitored and recorded.
Disconnect IMMEDIATELY if you are not an authorized user!
sftp> cd outbox
sftp> ls -ltr
-rw------- 1 0 0 1911 Jun 12 20:40 61N0584832_EDIP000749728818_MFC_20190612203409.txt
-rw------- 1 0 0 1878 Jun 13 06:01 613577165_EDIP000750181517_MFC_20190613055207.txt
I want to print only the .txt file names, ideally in one command.
I can do:
grep -e '^-' outfile.log > outfile.log2
..which gives only the lines that start with '-'.
-rw------- 1 0 0 1911 Jun 12 20:40 61N0584832_EDIP000749728818_MFC_20190612203409.txt
-rw------- 1 0 0 1878 Jun 13 06:01 613577165_EDIP000750181517_MFC_20190613055207.txt
And then:
awk '{print $9}' outfile.log2 > outfile.log3
..which gives the desired output:
61N0584832_EDIP000749728818_MFC_20190612203409.txt
613577165_EDIP000750181517_MFC_20190613055207.txt
And so the question is, can these 2 commands be combined into 1?
You may use a single awk:
awk '/^-/{ print $9 }' file > outputfile
Or
awk '/^-/{ print $9 }' file > tmp && mv tmp file
It works like this:
/^-/ - finds each line starting with -
{ print $9 } - prints Field 9 of the matching lines only.
Seems like matching the leading - is not really what you want. If you want to just get the .txt files as output, filter on the file name:
awk '$9 ~ /\.txt$/{print $9}' input-file
Using grep with PCRE enabled (-P) flag:
grep -oP '^-.* \K.*' outfile.log
61N0584832_EDIP000749728818_MFC_20190612203409.txt
613577165_EDIP000750181517_MFC_20190613055207.txt
'^-.* \K.*' : Line starting with - till last white space are matched but ignored (anything left of \K will be matched and ignored) and matched part right of \K will be printed.
Since the asker clearly writes "I want to print only the .txt file names", we should test for .txt files, and since the file name is always the last column, we can make it more portable by testing only the last field, like this:
awk '$NF ~ /\.txt$/{print $NF}' outfile.log > outfile.log2
61N0584832_EDIP000749728818_MFC_20190612203409.txt
613577165_EDIP000750181517_MFC_20190613055207.txt

Format output of concatenating 2 variables in unix

I am coding a simple shell script that checks the space of the target path and the space utilization per directory on that target path (example, I am checking space of /path1/home, and also checks how all the folders on /path1/home is consuming the total space.) My question is regarding the output it produces, it is not that pleasing to the eye (uneven spacing). See sample output lines below.
SIZE USER_FOLDER DATE_LAST_MODIFIED
83G FOLDER 1 Apr 15 03:45
34G FOLDER 10 Mar 9 05:02
26G FOLDER 11 Mar 29 13:01
8.2G FOLDER 100 Apr 1 09:42
1.8G FOLDER 101 Apr 11 13:50
1.3G FOLDER 110 Feb 16 09:30
I just want the output format to be in line with the header so it will look neat because I will use it as a report. Here is the code I am using for this part.
ls -1 | grep -v "lost+found" |grep -v "email_body.tmp" > $v_path/Users.tmp
for user in `cat $v_path/Users.tmp | grep -v "Users.tmp"`
do
folder_size=`du -sh $user 2>/dev/null` # should be run using a more privileged user so that other folders can be read (2>/dev/null was used to discard error messages i.e. "du: cannot read directory `./marcnad/.gnupg': Permission denied")
folder_date=`ls -ltr | tr -s " " | cut -f6,7,8,9, -d" " | grep -w $user | cut -f1,2,3, -d" "`
folder_size="$folder_size $folder_date"
echo $folder_size >> $v_path/Users_Usage.tmp
done
echo "Summary of $v_path Disk Space Utilization per folder." >> email_body.tmp
echo "" >> email_body.tmp
echo "SIZE USER_FOLDER DATE_LAST_MODIFIED" >> email_body.tmp
for i in T G M K
do
cat $v_path/Users_Usage.tmp | grep [0-9]$i | sort -nr -k 1 >> $v_path/email_body.tmp
done
Thanks!
EDIT: Formatting
When you print the data use printf instead of echo
cat $v_path/Users_Usage.tmp | while read a b c d e f
do
printf '%-5s%-7s%-4s%-4s%-3s%-6s\n' "$a" "$b" "$c" "$d" "$e" "$f"
done
See here
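To make the printf idea concrete, here is a small sketch that prints the header and the data rows through the same format string, so the columns cannot drift apart. The function names print_row and report, the column widths and the sample folder names are all assumptions. It also assumes folder names contain no spaces.

```shell
# print_row SIZE FOLDER DATE: one shared format keeps header and data aligned.
print_row() {
    printf '%-6s %-12s %s\n' "$1" "$2" "$3"
}

# report: read "size folder month day time" lines on stdin, print the table.
report() {
    print_row SIZE USER_FOLDER DATE_LAST_MODIFIED
    while read -r size folder month day time; do
        print_row "$size" "$folder" "$month $day $time"
    done
}

# Sample rows (made-up data).
report <<'EOF'
83G folder1 Apr 15 03:45
8.2G folder100 Apr 1 09:42
EOF
```

Widen the %-12s field if your folder names are longer than 12 characters.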

How to list the files greater than specific timestamp in its pattern in Unix?

Can you please tell me how I can accomplish the scenario below with a Unix ksh command?
I have a job J1 which is completed by the time HH:MM. I would like to list all the files created by this job J1, The file has the timestamp in its pattern YYYYMMDDHHMMSS_?
where YYYYMMDD is the date, HHMMSS is the system timestamp. I want to list the files if the job's timestamp is less than the file time stamp as the job creates the files, the timestamp of the job would be greater than the file timestamp?
Regards
Ben
You can use something like this: (Assuming the files listed)
$ ls -la
total 44K
drwxr-xr-x 2 gp users 4.0K Oct 27 14:56 .
drwxr-xr-x 11 gp users 4.0K Oct 27 14:57 ..
-rw-r--r-- 1 gp users 0 Oct 23 14:45 logfile
-rw-r--r-- 1 gp users 137 Oct 27 15:09 t2t2
prw-r--r-- 1 gp users 0 Oct 23 12:34 testpipe
-rw-r--r-- 1 gp users 0 Oct 23 14:51 tmpfile
-rw-r--r-- 1 gp users 7 Oct 27 14:58 ttt
# Find newer files
$ find . -newer ttt -print
./t2t2
# Find files that are NOT newer
$ find . ! -newer ttt -print
.
./tmpfile
./testpipe
./logfile
./ttt
# You can eliminate the directories (all of them) from the output this way:
$ find . ! -newer ttt ! -type d -print
./tmpfile
./testpipe
./logfile
./ttt
# or this way
$ find . ! -newer ttt -type f -print
Note that the different forms of the "newer" option (like anewer, cnewer) will not compare the other files against the same timestamp. You might have to do a few tests to see which version suits you better.
If you must use the timestamp in the file name, and the different options of "find", including "mmin" are not acceptable, then you will have to examine the embedded timestamp of each file name. I suggest checking into these commands:
# You have to escape the < or > signs you use.
$ expr "fabc" \< "cde"
0
$ expr "abc" \< "cde"
1
and this:
FILENAME="ABC_20141026101112.log" ; TIMESTAMP="`expr \"$FILENAME\" : \".*_20\([0-9]\{12\}\).*$\"`";echo $TIMESTAMP
So a "while read" loop, looking at all the file names and comparing their timestamps using the above "expr" compares should do the job. Ideally, I'd try to see if "find" can do the job because reading and examining each file will be slower. If you have thousands of files in that directory, then I would try some other solution. If you are interested in more options, let me know.
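A sketch of what that "while read" loop might look like is below. The function name newer_than_stamp, the cutoff value and the sample file names are made up for illustration. It assumes the name contains exactly one run of 14 digits (YYYYMMDDHHMMSS); if there are more digits in the run, the last 14 are taken.

```shell
# newer_than_stamp CUTOFF: read file names on stdin and print those whose
# embedded YYYYMMDDHHMMSS stamp is later than CUTOFF (hypothetical helper).
newer_than_stamp() {
    cutoff=$1
    while IFS= read -r name; do
        # Pull out the 14-digit run; expr prints nothing when there is none.
        stamp=$(expr "$name" : '.*\([0-9]\{14\}\)')
        [ -n "$stamp" ] || continue
        # expr exits 0 only when the comparison is true.
        if expr "$stamp" \> "$cutoff" >/dev/null; then
            printf '%s\n' "$name"
        fi
    done
}

# Sample names (made up); only the second one is later than the cutoff.
printf '%s\n' J1_20141026101112.log J1_20141027090000.log notes.txt |
    newer_than_stamp 20141026120000
```

In practice you would feed it real names, for example from ls -1 in the job's output directory. As noted above, this starts a couple of processes per file, so it will be slow on directories with thousands of entries.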

Unix utilities, sum the data under the same entries

I have this little problem that I want to ask:
So I have a file named "quest", which has:
Tom 100 John 10 Tom 100
How do I use awk to output something like:
Tom 200
I'd appreciate your help. I tried to look it up online but I am not sure what I am looking for. Thanks ahead!!
I do know how to use regular expression /Tom/ to grep the entry, but I am not sure how to proceed from there.
You can try something like:
$ awk '{
for(i=1; i<=NF; i+=2)
names[$i] = ((names[$i]) ? names[$i]+$(i+1) : $(i+1))
}
END{
for (name in names) print name, names[name]
}' quest
Tom 200
John 10
You basically iterate over the fields creating keys for all odd fields and assigning values of even fields to them. If the key already exists, you just add to the existing value.
This expects your file format to have Names on odd fields (for eg. 1, 3, 5 .. etc) and values on even fields (eg 2, 4, 6 .. etc).
In the END block, you just print entire array content.
I guess you need to calculate all users' marks, not only Tom's. Here is the code:
xargs -n2 < file|awk '{a[$1]+=$2}END{for (i in a) print i,a[i]}'
Tom 200
John 10
and one-liner of awk
awk '{for (i=1;i<=NF;i+=2) a[$i]+=$(i+1)}END{for (i in a) print i,a[i]}' file
Tom 200
John 10
$ echo 'Tom 100 John 10 Tom 100' | grep -o '[0-9]*' | paste -sd+ | bc
210
grep -o '[0-9]*' produces
100
10
100
paste -sd+ produces
100+10+100
bc calculates the result.
However, this only works for small input since bc has limitation in input size.
In that case you can use awk '{s+=$0}END{print s}' instead of paste -sd+ | bc.
However, note that GNU Awk treats all numbers as floating point, so it produces inaccurate results when a number is too large.
awk '/Tom/{
for(i=1;i<=NF;i++)
if($i=="Tom")s+=$(i+1);
print "Tom",s;s=0}' your_file
Test
Here is a way to do it in awk (no loop):
awk -v RS=" " '{n=$1;getline;a[n]+=$1} END {for (i in a) print i,a[i]}' quest
Tom 200
John 10
If there is more than one line like this
cat quest
Tom 100 John 10 Tom 100
Paul 20 Tom 40 John 10
Then do this with gnu awk:
awk -v RS=" |\n" '{n=$1;getline;a[n]+=$1} END {for (i in a) print i,a[i]}' quest
Paul 20
Tom 240
John 20
And if you do not like getline
awk -v RS=" |\n" 'NR%2 {n=$1;next}{a[n]+=$1} END {for (i in a) print i,a[i]}' quest

What does ^ character mean in grep ^d?

When I do ls -l | grep ^d it lists only directories in the current directory.
What I'd like to know is what does the character ^ in ^d mean?
The caret ^ and the dollar sign $ are meta-characters that respectively match the empty string at the beginning and end of a line. The grep is matching only lines that start with "d".
To complement the good answer by The New Idiot, I want to point out that this:
ls -l | grep ^d
Shows all directories in the current directory. That's because the ls -l adds a d in the beginning of the directories info.
The format of ls -l is like:
-rwxr-xr-x 1 user group 0 Jun 12 12:25 exec_file
-rw-rw-r-- 1 user group 0 Jun 12 12:25 normal_file
drwxr-xr-x 16 user group 4096 May 24 12:46 dir
^
|___ see the "d"
To make it more clear, you can ls -lF to include a / to the end of the directories info:
-rwxr-xr-x 1 user group 0 Jun 12 12:25 exec_file*
-rw-rw-r-- 1 user group 0 Jun 12 12:25 normal_file
drwxr-xr-x 16 user group 4096 May 24 12:46 dir/
So ls -lF | grep /$ will do the same as ls -l | grep ^d.
It has two meanings. One is as 'The New Idiot' pointed out above. The other, equally useful, is within a character class expression, where it means negation: grep -E '[^[:digit:]]' accepts any character except a digit. The ^ must be the first character within [].
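A quick way to see both meanings side by side, using made-up sample lines (the variable names listing, dirs and nondigit are just for illustration):

```shell
# Two sample `ls -l` lines (illustrative data).
listing='-rw-rw-r-- 1 user group 0 Jun 12 12:25 normal_file
drwxr-xr-x 16 user group 4096 May 24 12:46 dir'

# Anchor: ^d matches only lines whose first character is d.
dirs=$(printf '%s\n' "$listing" | grep '^d')
printf '%s\n' "$dirs"

# Negation: [^[:digit:]] matches any single non-digit character, so only a
# line made up entirely of digits is rejected.
nondigit=$(printf '%s\n' 12345 12a45 abc | grep '[^[:digit:]]')
printf '%s\n' "$nondigit"
```

The first grep prints only the dir line; the second prints 12a45 and abc but not 12345.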
