Unix - Using ls with grep

How can I use ls (or other commands) and grep together to search from specific files for a certain word inside that file?
For example, I have a file 201503003_315_file.txt and other files in my dir.
I only want to search files that have a file name that contains _315_ and inside that file, search for the word "SAMPLE".
Hope this is clear and thanks in advance for any help.

You can do:
ls *_315_* | xargs grep "SAMPLE"
The first part, ls *_315_*, lists only the files that have _315_ in their name. That list is piped to xargs, which passes the names to grep as arguments, and grep then scans each file for "SAMPLE".
UPDATE
A simpler (and actually safer) approach was mentioned by David in the comments below:
grep "SAMPLE" *_315_*.txt
The reason it's safer is that xargs splits the output of ls on whitespace, so file names containing spaces or other special characters are not handled well.
Another option, as mentioned by Charles Duffy in the comments below:
printf '%s\0' *_315_* | xargs -0 grep "SAMPLE"
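A minimal sketch of the glob-based approach, run in a throwaway directory so it can be tried safely. The file names are made up for illustration (one deliberately contains a space), and -l is added only so that grep prints file names instead of matching lines.

```shell
# Sandbox with three hypothetical files: two match the name filter, one does not.
demo_dir=$(mktemp -d)
printf 'SAMPLE here\n' > "$demo_dir/2015 03_315_file.txt"   # name contains a space
printf 'nothing\n'     > "$demo_dir/201504_315_other.txt"
printf 'SAMPLE here\n' > "$demo_dir/201503_999_file.txt"    # wrong name, never searched

# The shell expands *_315_* into separate, correctly quoted arguments,
# so no ls and no xargs are needed. "--" protects against names starting with "-".
matches=$(cd "$demo_dir" && grep -l -- "SAMPLE" *_315_*)
printf '%s\n' "$matches"

rm -rf "$demo_dir"
```

Because the shell does the expansion, the name with a space reaches grep as a single argument.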

Change to that directory (using cd dir) and try:
grep SAMPLE *_315_*
If you really MUST use ls AND grep try this:
ls *_315_* | xargs grep SAMPLE
The first example, however, requires less typing...

Related

terminal command to act on filenames that don't contain text

I have a directory full of files with names such as:
file_name_is_001
file_name_001
file_name_is_002
file_name_002
file_name_is_003
file_name_003
I want to copy only the files that don't contain 'is'. I'm not sure how to do this. I have tried to search for it, but can't seem to google the right phrase to find the results.

Details depend on operating system, shell, etc.
For a unix system a quite verbose but easy to understand approach could look like this (please mind that I didn't test it):
mkdir some_temporary_directory
mv *_is_* some_temporary_directory
cp * where_ever_you_want_to_copy_it
mv some_temporary_directory/* .
rmdir some_temporary_directory

You can do this using bash. First, here's a command to get you a list of files that don't contain the text _is_:
ls | grep -v "_is_"
This takes the output of ls and uses grep -v to keep only the names which DO NOT contain _is_.
In order to then copy these files, we need to turn the lines output by grep into arguments of cp. We can do this using xargs:
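ls | grep -v "_is_" | xargs -J % cp % new_folder
(The -J option is specific to BSD/macOS xargs; with GNU xargs, use xargs -I % cp % new_folder instead.)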
From the xargs man page, it is a tool to "build and execute command lines from standard input".
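The same selection can be sketched as a plain shell loop, which avoids parsing ls output and copes with names containing spaces. The sandbox layout below is an assumption for illustration; only the file names and new_folder come from the question and answer.

```shell
# Throwaway sandbox: src holds the files, new_folder is the copy target.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/src" "$demo_dir/new_folder"
touch "$demo_dir/src/file_name_is_001" "$demo_dir/src/file_name_001" \
      "$demo_dir/src/file_name_is_002" "$demo_dir/src/file_name_002"

for f in "$demo_dir"/src/*; do
  [ -f "$f" ] || continue          # skip directories and an unmatched glob
  case ${f##*/} in                 # test only the file name, not the path
    *_is_*) continue ;;            # name contains _is_: do not copy
  esac
  cp -- "$f" "$demo_dir/new_folder/"
done

copied=$(ls "$demo_dir/new_folder")
printf '%s\n' "$copied"
rm -rf "$demo_dir"
```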

how can I highlight just one item from the ls output

real beginner in Unix commands so not sure if the following is actually possible but here goes.
Is it possible to highlight just one item in a ls output?
I.e.: in a directory I use the following
ls -l --color=auto
this lists 4 items in green
file1.xls
file2.xls
file3.xls
file4.xls
But I want to highlight a specific item, in this case file2.
Is this possible?

The ls program will not do this for you. But you could filter the results from ls through a custom script which modifies the text to highlight just one item. It would be simpler if no color was originally given; then you could match on the given filename (for example as the pattern in an awk script, or in a sed script) and modify just that one item, adding colors.
That is, certainly it is possible. Writing a sample script is a different question.
How you approach the problem depends on what you want from the output. If that is (literally) the output from ls with a single filename in color, then a script would be the normal approach. You could use grep as suggested in the other answer, which raises a few issues:
commenting on ls -l --color=auto makes it sound as if you are using GNU ls, hence likely using Linux. An appropriate tag for the question would be linux rather than unix. If you ask for unix, the answers should differ.
supposing that you are using Linux. Then likely you have GNU grep, which can do colors. That would let you do something like this:
ls -l | grep --color=always file2 |less -R
however, there is a known bug in GNU grep's use of color (see xterm FAQ "grep --color" does not show the right output).
using grep like this shows only the matching lines. For ls that might be a good choice. For matches in a manual page -- definitely not.
Alternatively, less (which is found more often on Unix systems than GNU grep) also can highlight matches (not in color) and would show the file you are looking for in context. You could do this:
ls -l | less -p file2
(Both grep and less use patterns, aka regular expressions, but I left the example simple; read the documentation to learn more.)
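If the goal is to keep every line and highlight only one name, a common trick is to add an alternative that matches on every line. This is a sketch assuming a grep that supports --color and -E (GNU grep does); the file names are the ones from the question, created in a temporary directory.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/file1.xls" "$demo_dir/file2.xls" \
      "$demo_dir/file3.xls" "$demo_dir/file4.xls"

# 'file2|$' matches either the text to highlight or the (empty) end of line,
# so every line is selected, but only the text "file2" is coloured.
highlighted=$(ls -1 "$demo_dir" | grep --color=always -E 'file2|$')
printf '%s\n' "$highlighted"

rm -rf "$demo_dir"
```

When the output goes straight to a terminal, --color=auto is enough; --color=always is needed only when the result is captured or piped onward, for example into less -R.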

If you're a beginner I would strongly suggest you learn the grep command if you want to filter results. It is a Unix user's best friend (mine anyway).
Use grep to display only the list items you want to see...
ls -l | grep "file2"
NOTE: This is no different to typing ls -l file2* by the way, but your pattern could be expanded based on what you actually want displayed on the screen.
So if you had a directory full of ".txt", ".xls" and ".doc" files and you wanted to see only the ".txt" files with the word "work" in the name (work1.txt), you could write:
ls -ls | grep "work" | grep "txt"
This would list work1.txt, work2.txt, work3.txt and so on.
This is a very basic example but I use grep extensively whilst in the unix shell and would advise using this to filter all results instead of colours.
A little side note: grep -v will show you everything but the pattern you give it.
ls -l | grep -v '\.txt$' will show everything BUT .txt files. (The dot is escaped and the pattern anchored with $ so that only names ending in .txt are excluded.)

Unix [Homework]: Get a list of /home/user/ directories in /etc/passwd

I'm very new to Unix, and currently taking a class learning the basics of the system and its commands.
I'm looking for a single command line to list off all of the user home directories in alphabetical order from the /etc/passwd directory. This applies only to the home directories, and not the contents within them. There should be no duplicate entries. I've tried many permutations of commands such as the following:
sort -d | find /etc/passwd /home/* -type -d | uniq | less
I've tried using -path, -name, removing -type, using -prune, and changing the search pattern to things like /home/*/$, but haven't gotten good results once. At best I can get a list of my own directory (complete with every directory inside it, which is bad), and the directories of the other students on the server (without the contained directories, which is good). I just can't get it to display the /home/user directories and nothing else for my own account.
Many thanks in advance.

/etc/passwd is a file, not a directory. The home directory is usually field 6, with ":" as the delimiter. When a file uses a distinct delimiter character, use a tool that can split each line into fields: awk, cut, or even the shell with the IFS variable set can do the job. Using sort -u instead of plain sort also removes duplicate entries. For example:
awk -F":" '{print $6}' /etc/passwd | sort -u
cut -d":" -f6 /etc/passwd | sort -u
Or using the shell to read the file:
while IFS=":" read -r a b c d e home_dir g
do
echo "$home_dir"
done < /etc/passwd | sort -u
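A small sketch of the field-splitting idea, restricted to entries under /home and run against a made-up sample file rather than the real /etc/passwd. The user names and the sample file are assumptions for illustration.

```shell
# Hypothetical passwd-style data: two accounts share one home directory.
passwd_file=$(mktemp)
cat > "$passwd_file" <<'EOF'
root:x:0:0:root:/root:/bin/bash
zoe:x:1002:1002::/home/zoe:/bin/bash
adam:x:1001:1001::/home/adam:/bin/bash
adam2:x:1003:1003::/home/adam:/bin/sh
EOF

# Field 6 is the home directory; keep only those under /home,
# then sort alphabetically and drop duplicates in one step.
home_dirs=$(awk -F: '$6 ~ /^\/home\// { print $6 }' "$passwd_file" | sort -u)
printf '%s\n' "$home_dirs"

rm -f "$passwd_file"
```

Replacing "$passwd_file" with /etc/passwd gives the single command line the question asks for.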

I think the tools you want are grep, tr and awk. Grep will give you lines from the file that actually contain home directories. tr will let you break up the delimiter into spaces, which makes each line easier to parse.
Awk is just one program that would help you display the results that you want.
Good luck :)
Another hint, try ls --color=auto /etc, passwd isn't the kind of file that you think it is. Directories show up in blue.

In Unix, find is a command for finding files under one or more directories. I think you are looking for a command for finding lines within a file that match a pattern? Look into the command grep.

sed 's|\(.[^:]*\):\(.[^:]*\):\(.*\):\(.[^:]*\):\(.[^:]*\)|\4|' /etc/passwd|sort

I think all this processing could be avoided. There is a utility to list directory contents.
ls -1 /home
If you'd like the order of the sorting reversed
ls -1r /home
Granted, this lists just the directory names without the leading '/home/', but that can be added back easily enough if desired with something like this:
ls -1 /home | (while read -r line; do echo "/home/$line"; done)

I used something like :
ls -l -d $(cut -d':' -f6 /etc/passwd) 2>/dev/null | sort -u
The only thing I didn't do is sort alphabetically by directory name; I haven't figured that out yet.

Diff files present in two different directories

I have two directories with the same list of files. I need to compare all the files present in both the directories using the diff command. Is there a simple command line option to do it, or do I have to write a shell script to get the file listing and then iterate through them?

You can use the diff command for that:
diff -bur folder1/ folder2/
This will output a recursive diff that ignores whitespace changes, with a unified context:
b flag means ignoring changes in the amount of whitespace
u flag means a unified context (3 lines before and after)
r flag means recursive

If you are only interested in seeing which files differ, you may use:
diff -qr dir_one dir_two | sort
Option "q" reports only which files differ, not how their content differs, and "sort" arranges the output alphabetically.
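To see what that report looks like, here is a sketch that builds two small trees in a temporary directory and compares them. The file names and contents are invented for the demonstration.

```shell
work=$(mktemp -d)
mkdir -p "$work/dir_one/sub" "$work/dir_two/sub"
printf 'same\n' > "$work/dir_one/a.txt"
printf 'same\n' > "$work/dir_two/a.txt"          # identical: not reported
printf 'old\n'  > "$work/dir_one/sub/b.txt"
printf 'new\n'  > "$work/dir_two/sub/b.txt"      # differs: reported
printf 'x\n'    > "$work/dir_one/only_here.txt"  # missing on the other side

# One line per differing or missing file; identical files produce no output.
report=$(cd "$work" && diff -qr dir_one dir_two | sort)
printf '%s\n' "$report"

rm -rf "$work"
```

Files that exist on only one side are reported with an "Only in" line, so the same command also catches missing files.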

Diff has an option -r which is meant to do just that.
diff -r dir1 dir2

diff can not only compare two files, it can, by using the -r option, walk entire directory trees, recursively checking differences between subdirectories and files that occur at comparable points in each tree.
$ man diff
...
-r --recursive
Recursively compare any subdirectories found.
...
Another nice option is the über-diff-tool diffoscope:
$ diffoscope a b
It can also emit diffs as JSON, html, markdown, ...

If you specifically don't want to compare the contents of files and only want to check which ones are missing from one of the directories, you can compare lists of files generated by another command.
diff <(find DIR1 -printf '%P\n' | sort) <(find DIR2 -printf '%P\n' | sort) | grep '^[<>]'
-printf '%P\n' tells find to not prefix output paths with the root directory.
I've also added sort to make sure the order of files will be the same in both calls of find.
The grep at the end removes information about identical input lines.
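Since -printf is a GNU find extension and <( ) needs bash, a more portable variant of the same idea can be sketched with temporary list files and comm. The directory names DIR1 and DIR2 follow the answer; the files inside them are made up.

```shell
work=$(mktemp -d)
mkdir -p "$work/DIR1/sub" "$work/DIR2/sub"
touch "$work/DIR1/common.txt" "$work/DIR2/common.txt"
touch "$work/DIR1/sub/only_in_1.txt" "$work/DIR2/only_in_2.txt"

# Running find from inside each directory yields relative paths (./...),
# which plays the role of -printf '%P\n' in the original command.
(cd "$work/DIR1" && find . -type f | sort) > "$work/list1"
(cd "$work/DIR2" && find . -type f | sort) > "$work/list2"

only_in_1=$(comm -23 "$work/list1" "$work/list2")   # lines unique to list1
only_in_2=$(comm -13 "$work/list1" "$work/list2")   # lines unique to list2
printf 'only in DIR1: %s\nonly in DIR2: %s\n' "$only_in_1" "$only_in_2"

rm -rf "$work"
```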

If it's GNU diff then you should just be able to point it at the two directories and use the -r option.
Otherwise, try using
for i in $(\ls -d ./dir1/*); do diff ${i} dir2; done
N.B. As pointed out by Dennis in the comments section, you don't actually need to do the command substitution on the ls. I've been doing this for so long that I'm pretty much doing this on autopilot and substituting the command I need to get my list of files for comparison.
Also I forgot to add that I do '\ls' to temporarily disable my alias of ls to GNU ls so that I lose the colour formatting info from the listing returned by GNU ls.
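A sketch of the same loop written with a plain glob instead of ls, which sidesteps both the alias problem and names containing spaces. It is not recursive; dir1 and dir2 are created in a temporary directory with invented files.

```shell
work=$(mktemp -d)
mkdir "$work/dir1" "$work/dir2"
printf 'a\n' > "$work/dir1/one.txt"; printf 'a\n' > "$work/dir2/one.txt"
printf 'b\n' > "$work/dir1/two.txt"; printf 'B\n' > "$work/dir2/two.txt"

changed=""
for f in "$work"/dir1/*; do
  [ -f "$f" ] || continue                 # regular files only
  name=${f##*/}                           # strip the directory part
  # diff exits non-zero when the files differ or the counterpart is missing
  if ! diff -q -- "$f" "$work/dir2/$name" >/dev/null 2>&1; then
    changed="$changed$name "
  fi
done
printf 'changed or missing: %s\n' "$changed"

rm -rf "$work"
```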

When working with git/svn or multiple git/svn instances on disk this has been one of the most useful things for me over the past 5-10 years, that somebody might find useful:
diff -burN /path/to/directory1 /path/to/directory2 | grep +++
or:
git diff /path/to/directory1 | grep +++
It gives you a snapshot of the different files that were touched without having to "less" or "more" the output. Then you just diff on the individual files.

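In practice the question often arises together with some constraints. In that case the following solution template may come in handy. Note that dir2 must be a path that is valid from inside dir1 (an absolute path, or something like ../dir2).
cd dir1
find . \( -name '*.txt' -o -iname '*.md' \) | xargs -I '{}' diff -u '{}' 'dir2/{}'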

Here is a script to show differences between files in two folders. It works recursively. Change dir1 and dir2.
(search() { for i in $1/*; do [ -f "$i" ] && (diff "$1/${i##*/}" "$2/${i##*/}" || echo "files: $1/${i##*/} $2/${i##*/}"); [ -d "$i" ] && search "$1/${i##*/}" "$2/${i##*/}"; done }; search "dir1" "dir2" )

Try this:
diff -rq /path/to/folder1 /path/to/folder2

How to do a mass rename?

I need to rename files with names like this
transform.php?dappName=Test&transformer=YAML&v_id=XXXXX
to just this
XXXXX.txt
How can I do it?
I understand that I need more than one mv command because there are at least 25000 files.

Easiest solution is to use "mmv"
You can write:
mmv "long_name*.txt" "short_#1.txt"
Where the "#1" is replaced by whatever is matched by the first wildcard.
Similarly #2 is replaced by the second, etc.
So you do something like
mmv "index*_type*.txt" "t#2_i#1.txt"
To rename index1_type9.txt to t9_i1.txt
mmv is not standard in many Linux distributions but is easily found on the net.

If you are using zsh you can also do this:
autoload zmv
zmv 'transform.php?dappName=Test&transformer=YAML&v_id=(*)' '$1.txt'

You write a fairly simple shell script in which the trickiest part is munging the name.
The outline of the script is easy (bash syntax here):
for i in 'transform.php?dappName=Test&transformer=YAML&v_id='*
do
mv "$i" <modified name>
done
Modifying the name has many options. I think the easiest is probably an awk one-liner like
`echo $i | awk -F'=' '{print $4}'`
so...
for i in 'transform.php?dappName=Test&transformer=YAML&v_id='*
do
mv "$i" "$(echo "$i" | awk -F'=' '{print $4}')".txt
done
update
Okay, as pointed out below, this won't necessarily work for a large enough list of files; the * will overrun the command line length limit. So, then you use:
$ find . -name 'transform.php?dappName=Test&transformer=YAML&v_id=*' -prune -print |
while read -r
do
mv "$REPLY" "$(echo "$REPLY" | awk -F'=' '{print $4}')".txt
done
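The name munging can also be done with shell parameter expansion, which avoids starting an awk process for each of the 25000 files. This is a sketch run in a temporary directory; the two ids are invented, while the file name prefix is the one from the question.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/transform.php?dappName=Test&transformer=YAML&v_id=12345" \
      "$demo_dir/transform.php?dappName=Test&transformer=YAML&v_id=67890"

(
  cd "$demo_dir" || exit 1
  for f in 'transform.php?dappName=Test&transformer=YAML&v_id='*; do
    [ -e "$f" ] || continue       # glob matched nothing
    id=${f##*v_id=}               # drop everything up to and including "v_id="
    mv -- "$f" "$id.txt"
  done
)

renamed=$(ls "$demo_dir")
printf '%s\n' "$renamed"
rm -rf "$demo_dir"
```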

Try the rename command
Or you could pipe the results of an ls into a perl regex.

You may use whatever you want to transform the name (perl, sed, awk, etc.). I'll use a python one-liner:
for file in 'transform.php?dappName=Test&transformer=YAML&v_id='*; do
mv "$file" "$(echo "$file" | python3 -c "print(input().split('=')[-1])")".txt;
done
Here's the same script entirely in Python:
import glob, os
PATTERN = "transform.php?dappName=Test&transformer=YAML&v_id=*"
for filename in glob.iglob(PATTERN):
    # Everything after the last "=" is the id that becomes the new name.
    newname = filename.split('=')[-1] + ".txt"
    print(filename, '==>', newname)
    os.rename(filename, newname)
Side note: you would have had an easier life saving the pages with the right name while grabbing them...

find -name '*v_id=*' | perl -lne'rename($_, qq($1.txt)) if /v_id=(\S+)/'

vimv lets you rename multiple files using Vim's text editing capabilities.
Entering vimv opens a Vim window which lists down all files and you can do pattern matching, visual select, etc to edit the names. After you exit Vim, the files will be renamed.
[Disclaimer: I'm the author of the tool]

I'd use ren-regexp, which is a Perl script that lets you mass-rename files very easily.
21:25:11 $ ls
transform.php?dappName=Test&transformer=YAML&v_id=12345
21:25:12 $ ren-regexp 's/transform.php.*v_id=(\d+)/$1.txt/' transform.php*
transform.php?dappName=Test&transformer=YAML&v_id=12345
1 12345.txt
21:26:33 $ ls
12345.txt

This should also work:
prfx='transform.php?dappName=Test&transformer=YAML&v_id='
ls "$prfx"* | sed "s/$prfx//" | xargs -Ipsx mv "$prfx"psx psx.txt

this renamer command would do it:
$ renamer --regex --find 'transform\.php\?dappName=Test&transformer=YAML&v_id=(\w+)' --replace '$1.txt' *

Ok, you need to be able to run a windows binary for this.
But if you can run Total Commander, do this:
Select all files with *, and hit ctrl-M
In the Search field, paste "transform.php?dappName=Test&transformer=YAML&v_id="
(Leave Replace empty)
Press Start
It doesn't get much simpler than that.
You can also rename using regular expressions via this dialog, and you see a realtime preview of how your files are going to be renamed.
