Using grep to get the count of files where a keyword exists - unix

I am trying to get the count of files that contain a matching keyword in a directory. The code I used is:
grep -r -i --include=\*.sas 'keyword'
Can anyone help me figure out how to get the count of the files that contain the keyword?
Thanks

You will need to do two things. The first is to suppress grep's normal output and print only the file name, with -l. The second is to pipe the output through wc -l to count the lines, and hence the files.
grep -ril "keyword" --include="*.sas" * | wc -l
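If any of the .sas file names could contain newlines, counting lines miscounts the files; a sketch using NUL-terminated output instead (GNU grep assumed, search rooted at the current directory):
grep -rilZ --include='*.sas' 'keyword' . | tr -cd '\0' | wc -c
Here -Z terminates each file name with a NUL byte, tr -cd '\0' deletes everything except those NULs, and wc -c counts them, one per matching file.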

Related

find directories

I have been trying to count all the empty folders in a certain directory, sub-directories excluded. I used the code below, but I don't know how to tell apart empty folders from folders that contain files.
echo "$(ls -l | egrep -l $1/* | wc -l)"
The $1 will be the user argument on the command line. Example: ./script.sh ~/Desktop/backups/March2021.
Edit: I'm not allowed to use the find command.
Edit 2: ls -l * | awk '/total 0/{print last}{last=$0}' | wc -l works, but it lists all folders, whether the directory contains files and data or is empty.
What about this:
grep -v "." *
I mean the following: "." means any character (I'm not sure the syntax is correct), so basically you look for every file that does not contain even a single character.
You should not parse ls (it breaks on directories or file names containing newlines), so this solution is only for the assignment:
ls -d */ */* | cut -d/ -f1 | sort | uniq -u | wc -l
Explanation:
ls -d */ shows all directories. This is combined with ls -d */*, which also lists the contents of those directories.
In the combined output, every directory name appears at least once.
Empty directories appear exactly once, so you want to look for the lines that occur only once.
With the cut you only see the name of the directory, not the file inside it.
The sort could be skipped here, since ls gives sorted output. When you change the solution to find (next assignment?) the sort might be needed.
uniq can look for lines that occur once. The -u flag removes all lines that have duplicates, so it shows only the unique lines in the output.
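Putting it together as the script the question asks for (a sketch: it assumes $1 is the directory to inspect, and it inherits ls's limitations with unusual file names and hidden entries):
#!/bin/sh
# count the empty folders directly under the directory given as $1
cd "$1" || exit 1
# every directory appears once for itself (*/) and once per entry (*/*),
# so empty directories appear exactly once and survive uniq -u
ls -d -- */ */* 2>/dev/null | cut -d/ -f1 | sort | uniq -u | wc -l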

UNIX egrep multiple strings

I am attempting to search multiple files within a directory. Using multiple egrep pipes I can organize the data I need, but it matches every single line. I only need it to match one line, then continue to the next file in the directory.
Ex:
egrep -i "stringname" * | egrep -i "anotherstringname"
Is there another way? Any recommendations? I am new to Unix.
You probably want something like:
egrep -m 1 "string1|string2|string3" *.txt
(where I assume your files have names matching *.txt). For instance, on my own computer if I type:
egrep -m 1 "def|append" *.py
it will print at most one line from each Python file: the first line matching either "def" or "append".
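If what you actually need is the first line that contains both strings (which is what chaining two egreps approximates), a sketch is to fold both orderings into one pattern:
egrep -m 1 "stringname.*anotherstringname|anotherstringname.*stringname" *
This still stops after one matching line per file, but a line only matches if both strings appear on it.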

Unix Command to get the count of lines in a csv file

Hi, I am new to UNIX and I have to get the count of lines from incoming CSV files. I have used the following command to get the count:
wc -l filename.csv
Consider files coming in with one record: I am getting some files with * at the start, and for those files, issuing the same command gives a count of 0. Does * mean anything here? Also, if I get a file with Ctrl-M (CR) line endings instead of NL, how do I get the count of lines in that file? Please give me a command that solves the issue.
The following command helps you to get the count:
cat FILE_NAME | wc -l
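wc -l counts newline characters, so a file that uses only CR (Ctrl-M) line endings reports 0. A sketch for that case, translating each CR to NL before counting:
tr '\r' '\n' < FILE_NAME | wc -l
Files with CRLF (DOS) endings are fine with plain wc -l, since every line still ends in NL.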
All of the answers are wrong. CSV files accept line breaks between quotes, and those should still be counted as part of the same record. If you have either Python or PHP on your machine, you could do something like this:
Python
# From stdin
cat *.csv | python -c "import csv; import sys; print(sum(1 for i in csv.reader(sys.stdin)))"
# From file name
python -c "import csv; print(sum(1 for i in csv.reader(open('csv.csv'))))"
PHP
# From stdin
cat *.csv | php -r 'for($i=0; fgetcsv(STDIN); $i++);echo "$i\n";'
# From file name
php -r 'for($i=0, $fp=fopen("csv.csv", "r"); fgetcsv($fp); $i++);echo "$i\n";'
I have also created a script to simulate the output of wc -l: https://github.com/dhulke/CSVCount
In case you have multiple .csv files in the same folder, use
cat *.csv | wc -l
to get the total number of lines in all CSV files in the current directory.
Note that -c counts bytes and -m counts characters (identical as long as you use only ASCII). You can also use wc to count the number of files, e.g. ls | wc -l (with ls -l, the leading "total" line would inflate the count by one).
wc -l mytextfile
Or to only output the number of lines:
wc -l < mytextfile
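For example, with a hypothetical mytextfile of 42 lines:
wc -l mytextfile     # prints: 42 mytextfile
wc -l < mytextfile   # prints: 42
When the file is given as an argument, wc echoes its name; reading from stdin, it prints only the count.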
Usage: wc [OPTION]... [FILE]...
or: wc [OPTION]... --files0-from=F
Print newline, word, and byte counts for each FILE, and a total line if
more than one FILE is specified. With no FILE, or when FILE is -,
read standard input.
-c, --bytes print the byte counts
-m, --chars print the character counts
-l, --lines print the newline counts
--files0-from=F read input from the files specified by
NUL-terminated names in file F;
If F is - then read names from standard input
-L, --max-line-length print the length of the longest line
-w, --words print the word counts
--help display this help and exit
--version output version information and exit
You can also use xsv for that. It also supports many other subcommands that are useful for CSV files.
xsv count file.csv
Or, to print just the number without the file name:
wc -l file_name.csv | awk '{print $1}'

List files that contain telephone numbers in a unix directory

I have a directory called testDir and it contains 1000 files; some of them contain telephone numbers and some of them don't. The telephone number format is "12-3456789".
How do I get the number of files that contain telephone numbers?
EDIT: I am not familiar with Unix, so I couldn't work this out myself.
A simple solution could be:
grep -lE "[0-9]{2}-[0-9]{7}" * | wc -l
EDIT:
grep searches for a pattern in files.
-E enables extended regular expressions (you could use egrep instead).
-l filters grep's output: only the file name is printed.
wc counts.
-l counts lines (-w would count words, but that can give incorrect results when file names contain spaces).
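One caveat: [0-9]{2}-[0-9]{7} also matches inside longer digit runs (for example within "123-45678901"). If that matters, a sketch using word boundaries (GNU grep):
grep -lE "\b[0-9]{2}-[0-9]{7}\b" * | wc -l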

grep -l and grep -ln

According to the manual for grep:
-l, --files-with-matches
Suppress normal output; instead print the name of each input
file from which output would normally have been printed. The
scanning will stop on the first match.
With grep -l, this seems fine: when a match is found, the file name containing the match is echoed.
However, when I do grep -ln, grep echoes every matching line.
Does grep -l really mean that scanning stops at the first match, while grep -ln ignores the -l flag?
Those options are incompatible. Use grep -Hnm 1 if you want to display the line number of the first match (and only the first match) in each file.
-H, --with-filename
Print the filename for each match.
-n, --line-number
Prefix each line of output with the line number within its input file.
-m NUM, --max-count=NUM
Stop reading a file after NUM matching lines.
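For example, run against some hypothetical .sas files from the first question, it prints one entry per file and then stops reading that file:
grep -Hnm 1 "keyword" *.sas
# output format: filename:line-number:first matching line, e.g.
# report1.sas:12:...keyword...
# report2.sas:3:...keyword...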
