Find word count using wc and assign to a variable - unix

I want the word count from wc -w to be assigned to a variable.
I've tried something like this, but I'm getting an error. What is wrong?
winget="this is the first line"
wdCount=$winget | wc -w
echo $wdCount

You need to use command substitution, $(...), to assign the result:
wdCount=$(echo "$winget" | wc -w)
Or you could avoid echo by using a here-string (bash, ksh, zsh):
wdCount=$(wc -w <<<"$winget")
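As a minimal sketch of the command-substitution approach, the snippet below reuses the question's variable names. The arithmetic-expansion step is an addition of mine, not part of the answer: some wc implementations pad the number with spaces, and this is one way to normalise it.

```shell
winget="this is the first line"

# Command substitution captures whatever the pipeline prints on stdout.
wdCount=$(printf '%s\n' "$winget" | wc -w)

# Some wc implementations pad the count with leading spaces;
# arithmetic expansion reduces it to a bare integer.
wdCount=$((wdCount))

echo "$wdCount"
```

With the sample string this should print 5.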

You can get the word count without the filename using the following:
num_of_words=$(< "$file" wc -w)
See https://unix.stackexchange.com/a/126999/320461

You can use this to store the word count in a variable:
word_count=$(wc -w filename.txt | awk '{print $1}')
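To see why the filename matters, here is a small sketch that compares the two ways of calling wc on a file. The temporary file and its contents are made up for illustration.

```shell
# Hypothetical sample file created just for this demonstration.
tmpfile=$(mktemp)
printf 'one two three\nfour five\n' > "$tmpfile"

# Given a filename argument, wc prints the count followed by the name.
with_name=$(wc -w "$tmpfile")

# Reading from stdin leaves wc with no name to print.
word_count=$(wc -w < "$tmpfile")
word_count=$((word_count))   # drop any padding around the number

echo "$with_name"
echo "$word_count"

rm -f "$tmpfile"
```

The redirect form avoids the extra awk process that is otherwise needed to cut off the filename.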

Related

tcsh passing a variable inside a shell script

I've defined a variable inside a shell script and I want to use it. For some reason, I cannot pass it into the command line where I need it.
Here's my script, which fails at the last line:
#! /usr//bin/tcsh -f
if ( $# != 2 ) then
echo "Usage: jump_sorter.sh <jump> <field to sort on>"
exit;
endif
set a = `cat $1 | tail -1` #prepares last row for check with loop
set b = $2 #this is the value last row will be checked for
set counter = 0
foreach i ($a)
if ($i == "$b") then
set bingo = $counter
echo "$bingo is the field to print from $a"
endif
set counter = `expr $counter + 1`
end
echo $bingo #this prints the correct value for using in the command below
cat $1 | awk '{print($bingo)}' | sort | uniq -c | sort -nr #but this doesn't work.
#when I use $9 instead of $bingo, it does work.
How can I pass $bingo into the final line correctly, please?
Update: following the accepted answer from Martin Tournoij, the correct way to handle the "$" sign in the command is:
cat $1 | awk "{print("\$"$bingo)}" | sort | uniq -c | sort -nr
It doesn't work because variables are only substituted inside double quotes ("), not single quotes ('), and you're using single quotes:
cat $1 | awk '{print($bingo)}' | sort | uniq -c | sort -nr
Switching to double quotes lets the shell expand $bingo, but awk still needs a literal $ in front of the field number, so escape that $ outside the quotes:
cat $1 | awk "{print("\$"$bingo)}" | sort | uniq -c | sort -nr
You also have an error here:
#! /usr//bin/tcsh -f
That should be:
#!/usr/bin/tcsh -f
Note that csh isn't usually recommended for scripting; it has many quirks and lacks some features like functions. Unless you really need to use csh, it's recommended to use a Bourne shell (/bin/sh, bash, zsh) or a scripting language (Python, Ruby, etc.) instead.
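For comparison, here is a rough Bourne-shell sketch of the same idea. The function name jump_sorter and the awk variables want and col are hypothetical, and the sketch assumes the goal is to print the column whose value in the last row equals the second argument (using awk's 1-based field numbers). Passing the number with awk -v sidesteps the quoting problem entirely.

```shell
#!/bin/sh
# Hypothetical Bourne-shell take on jump_sorter.sh; not the asker's script.
jump_sorter() {
    if [ "$#" -ne 2 ]; then
        echo "Usage: jump_sorter <jump> <field to sort on>" >&2
        return 1
    fi

    # Find the 1-based index of the field in the last row that equals $2.
    col=$(tail -n 1 "$1" | awk -v want="$2" '
        { for (i = 1; i <= NF; i++) if ($i == want) { print i; exit } }')

    if [ -z "$col" ]; then
        echo "no field in the last row equals $2" >&2
        return 1
    fi

    # -v hands the shell value to awk, so the awk program can stay in
    # single quotes and $col is resolved by awk rather than by the shell.
    awk -v col="$col" '{ print $col }' "$1" | sort | uniq -c | sort -nr
}
```

It could be called as jump_sorter somefile somevalue, where both arguments are placeholders.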

How to change the field sequence in cut command in unix

I want to print the fields in a specific order.
Input :
col1|col2|col3|col4
I used cat file | cut -d '|' -f 3,1,4
output :
col1|col3|col4
But my expected output is:
col3|col1|col4
Can anyone help me with this?
From man cut:
Selected input is written in the same order that it is read, and is written exactly once
You should do:
$ awk -F'|' -vOFS='|' '{print $3,$1,$4}' <<< "col1|col2|col3|col4"
col3|col1|col4
Even though awk is good, here is a Perl solution:
perl -F"\|" -ane 'print join "|",@F[2,0,3]'
Tested:
> echo "col1|col2|col3|col4" | perl -F"\|" -ane 'print join "|",@F[2,0,3]'
col3|col1|col4
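Since cut always writes fields in input order, one option is a small awk wrapper that takes the desired order as an argument. This is only a sketch: the function name reorder_fields is hypothetical, and it assumes a single-character delimiter.

```shell
# Hypothetical helper: print fields in the order given by a comma-separated list.
# $1 = single-character delimiter, $2 = field order such as 3,1,4
reorder_fields() {
    awk -F"$1" -v OFS="$1" -v order="$2" '{
        n = split(order, idx, ",")
        # Emit each requested field, then the delimiter or the line ending.
        for (i = 1; i <= n; i++)
            printf "%s%s", $(idx[i]), (i < n ? OFS : ORS)
    }'
}

reordered=$(echo "col1|col2|col3|col4" | reorder_fields '|' 3,1,4)
echo "$reordered"
```

With the sample line this should print col3|col1|col4.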

How do I Get the distinct List of Special Characters from a File using GREP or SED?

I have a file which contains about 30000 Records delimited by '|'. I need to get a distinct list of special characters only from the file.
For Eg:
123|fasdf|%df&|pap,came|!
234|%^&asdf|34|'":|
My output should be:
|%&,!^'":
Any help would be greatly appreciated.
Thanks,
Velraj.
grep -o '[|%&,!^":]' input | sort -u
You have to list all your special characters inside brackets.
This will return each unique special character on its own line. If you really need a string with these characters you have to remove newlines afterwards, e.g.:
grep -o '[|%&,!^":]' input | sort -u | tr -d '\n'
UPDATE:
If you need to list every character that is not in the set 'a-zA-Z0-9', you can use this one:
grep -o '[^a-zA-Z0-9]' input | sort -u | tr -d '\n'
echo "123|fasdf|%df&|pap,came|! 234|%^&asdf|34|'\":|" \
| { tr -d '[:alnum:]'; printf "\n"; } \
| sed 's/\(.\)/\1_/g' \
| awk -v 'RS=_' '{print $0}' \
| sort -u \
| awk '{printf $0}END{printf "\n"}'
output
!"%&',:^||
You can replace the first line echo .... with cat fileName
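The negated-class idea can also be written with a POSIX character class, which avoids listing the special characters by hand. The sketch below feeds in the two sample records from the question and assumes a grep that supports -o (GNU and BSD grep do). LC_ALL=C is my addition, to make the sort order predictable.

```shell
specials=$(
    printf '%s\n' '123|fasdf|%df&|pap,came|!' "234|%^&asdf|34|'\":|" |
    LC_ALL=C grep -o '[^[:alnum:]]' |   # one non-alphanumeric character per line
    LC_ALL=C sort -u |                  # keep each character once, in byte order
    tr -d '\n'                          # join them back into a single string
)
printf '%s\n' "$specials"
```

For the sample records this should print !"%&',:^| (the same characters as the expected output, in byte order).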

counting records in unix file

This was an interview question, nevertheless still a programming question.
I have a unix file with two columns name and score. I need to display count of all the scores.
like
jhon 100
dan 200
rob 100
mike 100
the output should be
100 3
200 1
Only built-in Unix utilities may be used, so I am assuming shell scripts, regular expressions, or Unix commands.
I understand that looping would be one way to do it: store all the values you have already seen, then grep every record for unseen values. Is there a more efficient way of doing it?
Try this:
cut -d ' ' -f 2 < /tmp/foo | sort -n | uniq -c \
| (while read n v ; do printf "%s %s\n" "$v" "$n" ; done)
The initial cut could be replaced with another while read loop, which would be more resilient to input file format variations (extra whitespace). If some of the names consist of several words, simple field extraction will not work as easily, but sed can do it.
Otherwise, use your favorite programming language. Perl would probably shine. It is not difficult either in Java or even in C or Forth.
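A sketch of the while read variant mentioned above, run on made-up input with irregular whitespace. The variable names name, score, and result are mine.

```shell
result=$(
    printf 'jhon   100\ndan 200\n  rob 100\nmike\t100\n' |
    # read splits on runs of spaces and tabs, so extra whitespace is harmless.
    while read -r name score; do printf '%s\n' "$score"; done |
    sort -n | uniq -c |
    # uniq -c prints "count value"; swap them to get "value count".
    while read -r n v; do printf '%s %s\n' "$v" "$n"; done
)
printf '%s\n' "$result"
```

For this input it should print 100 3 and then 200 1.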
$ cat foo.txt
jhon 100
dan 200
rob 100
mike 100
$ awk '{print $2}' foo.txt | sort | uniq -c
3 100
1 200
It's a pity you can't do a count with sort or uniq alone.
Edit: I just noticed I have the count in front ... to get it exactly the same you can do:
$ awk '{print $2}' foo.txt | sort | uniq -c | awk '{ print $2 " " $1 }'
Not very complicated in perl:
#!/usr/bin/perl -w
use strict;
use warnings;
my %count = ();
while (<>) {
chomp;
my ($name, $score) = split(/ /);
$count{$score}++;
}
foreach my $key (sort keys %count) {
print "$key ", $count{$key}, "\n";
}
You could go with awk:
awk '{ a[$2]++ } END { for (x in a) print x, a[x] }' record_file.txt
Alternatively with shell commands:
for i in $(awk '{print $2}' inputfile | sort -u)
do
printf '%s ' "$i"
grep -w -- "$i" inputfile | wc -l
done
The first awk command gives a list of all the different scores (e.g. 100 and 200), which the for loop then iterates over, counting each one separately. It is not very efficient, but it is simple. If the file is not too big, this should not be a problem.
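A single-pass alternative, sketched here with a hypothetical function name count_scores: awk tallies the second column in an associative array, and sort -n fixes the output order, since awk's for-in order is unspecified.

```shell
# Hypothetical helper: tally the second column of each record.
# Reads the named files, or stdin when no file is given.
count_scores() {
    awk '{ count[$2]++ } END { for (s in count) print s, count[s] }' "$@" |
    sort -n
}

tally=$(printf 'jhon 100\ndan 200\nrob 100\nmike 100\n' | count_scores)
printf '%s\n' "$tally"
```

On the question's data this should print 100 3 and then 200 1, reading the file only once.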

Parsing each field and process it using 'awk'/'gawk'

Here is a query:
grep bar 'foo.txt' | awk '{print $3}'
The fields emitted by the awk query are mangled C++ symbol names. I want to pass each one to dem and finally print the output of dem, i.e. the demangled symbols.
Assume that the field separator is a ' ' (space).
awk is a pattern matching language. The grep is totally unnecessary.
awk '/bar/{print $3}' foo.txt
does what your example does.
Edit: fixed up a bit after reading the comments on the preceding answer (I don't know a thing about dem...):
You can make use of the system call in awk with something like:
awk '/bar/{cline="dem " $3; system(cline)}' foo.txt
but this would spawn an instance of dem for each symbol processed, which is very inefficient.
So let's get more clever:
awk '/bar/{list = list " " $3;}END{cline="dem " list; system(cline)}' foo.txt
BTW-- Untested as I don't have dem or your input.
Another thought: if you're going to use the xargs formulation offered by other posters, cut might well be more efficient than awk. At that point, however, you would need grep again.
How about
grep bar 'foo.txt' | awk '{ print $3 }' | xargs dem | awk '{ print $3 }'
This will print the demangled symbols, complete with argument lists in the case of methods:
awk '/bar/ { print $3 }' foo.txt | xargs dem | sed -e 's:.* == ::'
This will print the demangled symbols, without argument lists in the case of methods:
awk '/bar/ { print $3 }' foo.txt | xargs dem | sed -e 's:.* == \([^(]*\).*:\1:'
Cheers,
V.
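The batching behaviour of the xargs pipelines can be tried without dem installed. In this sketch echo is a stand-in for the demangler (an assumption for demonstration only), and the input file and its mangled names are made up. Where dem or c++filt is available, it can be substituted for echo.

```shell
# Stand-in for the demangler; replace with dem (or c++filt) where available.
demangler=echo

# Hypothetical input: the third field holds a mangled symbol name.
tmp=$(mktemp)
printf 'x bar _Z3foov\ny baz _Z3quxv\nz bar _Z3bari\n' > "$tmp"

# awk selects field 3 of lines matching "bar"; xargs hands all of the
# selected symbols to the command in as few invocations as possible.
batched=$(awk '/bar/ { print $3 }' "$tmp" | xargs "$demangler")
echo "$batched"

rm -f "$tmp"
```

With echo as the stand-in, the two matching symbols come back on one line, which shows that they were passed together rather than one process per symbol.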
