Customizing print output after getting a column using 'cut' command - unix

I'm trying to print the first column of output in a "customized" way, after executing a program that prints out a table. I know how to get the first column from the output, but I want to print each row between single quotes. So, right now I have the commands that can get me the first column:
./genTable | cut -f2 | xargs -0
What can I add to this command so that it prints the values between quotes? For example, the output right now looks like
apple
cider
vinegar
I want it to look like
'apple'
'cider'
'vinegar'

I'd use Perl. A single quote cannot be escaped inside a single-quoted shell string, so write it as \x27:
./genTable | perl -nwla -e 'print "\x27$F[1]\x27"'

I'd use awk ;-) , i.e.
./genTable | awk -v singleQ="'" '{print singleQ $1 singleQ}'
And of course, if you want super-minimalist, change all references from singleQ to Q ;-)
output
'apple'
'cider'
'vinegar'
IHTH
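If neither Perl nor awk is at hand, sed can do the same thing. This is only a minimal sketch, assuming the values contain no embedded newlines; quote_lines is a hypothetical helper name, and the printf line merely stands in for the real ./genTable | cut -f2 output.

```shell
# Hypothetical helper: wrap every input line in single quotes.
# "&" in the sed replacement stands for the whole matched line.
quote_lines() {
  sed "s/.*/'&'/"
}

# Sample data standing in for: ./genTable | cut -f2
printf 'apple\ncider\nvinegar\n' | quote_lines
```

The sed script is wrapped in double quotes so that the single quotes can appear literally inside it.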

Related

find similar rows in a text file in unix system

I have a file named tt.txt and the contents of this file is as follows:
fdgs
jhds
fdgs
I am trying to get the similar row as the output in a text file.
my expected output is:
fdgs
fdgs
to do so, I used this command:
uniq -u tt.txt > output.txt
but it returns:
fdgs
jhds
fdgs
do you know how to fix it?
I assume that by "similar row" you mean rows with the same content.
According to the uniq man page, uniq only compares adjacent lines, so you need to sort the input first and then use the -D option to print all duplicated lines, as below. Note that -D is specific to the GNU implementation, and that sorting changes the order of the output relative to the input.
sort tt.txt | uniq -D
If you want the output to be in the same order as the input, you need to record each line's number and sort by it again afterwards, like this:
cat -n tt.txt | sort -k 2 | uniq -f 1 -D | sort -n -k 1,1 | sed -E 's/^\s*[0-9]+\t//'
cat -n prints the content with line numbers
sort -k 2 sorts the input starting at the 2nd column
uniq -f 1 ignores the first column when comparing
sort -n -k 1,1 sorts the output back into the original line-number order
sed -E 's/^\s*[0-9]+\t//' deletes the line-number column (-E is needed so that + works as a quantifier)
The uniq -u command outputs only the lines that are not repeated, which is the exact opposite of what you want.
One in awk:
$ awk '++seen[$0]==2;seen[$0]>1' file
fdgs
fdgs
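For completeness, here is a sketch that keeps the original line order without sort or GNU-only options, by reading the file twice with awk. dup_lines is a hypothetical helper name, and the temporary file only stands in for tt.txt.

```shell
# Hypothetical helper: print every line of file "$1" that occurs more than
# once, in the original order. The file is read twice.
dup_lines() {
  awk 'NR == FNR { count[$0]++; next }   # first pass: count each line
       count[$0] > 1                     # second pass: print repeated lines
      ' "$1" "$1"
}

# Sample data standing in for tt.txt
tmp=$(mktemp)
printf 'fdgs\njhds\nfdgs\n' > "$tmp"
dup_lines "$tmp"
rm -f "$tmp"
```

Because the input is read twice, this needs a regular file rather than a pipe.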

unix command to print every 2nd line of duplicate

I have a text file that has 110132 lines and looks like this,
b3694658:heccc 238622
b3769025:heccc 238622
b3694659:heccc 238623
b3769026:heccc 238623
b3694660:heccc 238624
b3769027:heccc 238624
b3694661:heccc 238625
b3769028:heccc 238625
Notice that every 2nd line has a duplicate entry at heccc etc. I want an output that only has the 2nd occurrence of each duplicate, so it would look like this:
b3769025:heccc 238622
b3769026:heccc 238623
b3769027:heccc 238624
b3769028:heccc 238625
Thanks for your help!
It appears that you are just looking to output unique values. If that is so, just do this:
cat textfile | sort | uniq
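uniq -f1 file.txt
should do in this case: -f1 skips the first field when comparing, so each pair collapses to one line. Note that uniq keeps the first line of each pair, not the second.
See how the -f and -s options work with the uniq command.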
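If it really has to be the second line of each pair, as in the expected output, awk can count occurrences of the key. This is a sketch only, assuming the key is the second whitespace-separated field; second_of_each_key is a hypothetical helper name.

```shell
# Hypothetical helper: print a line only when its key (field 2) is seen
# for the second time. seen[$2]++ yields the count *before* this line.
second_of_each_key() {
  awk 'seen[$2]++ == 1'
}

printf '%s\n' \
  'b3694658:heccc 238622' \
  'b3769025:heccc 238622' \
  'b3694659:heccc 238623' \
  'b3769026:heccc 238623' | second_of_each_key
```

Unlike uniq, this does not require the duplicates to be adjacent.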

how to take substring in ksh

I have a file named "output.txt" having data in format:
400949703|2000025967912|20130614010652|20130614131543
355949737|2144050263|20120407100407|20120407101307
355499738|2144500262|20110911010901|20110911135601
I am executing an awk command as shown below:
awk -F"|" '{num1="`echo $3| cut -c1-8`"; print $num1}' output.txt
My expected output is :
20130614
20120407
20110911
But I am getting output as what is actually the input.
400949703|2000025967912|20130614010652|20130614131543
355949737|2144050263|20120407100407|20120407101307
355499738|2144500262|20110911010901|20110911135601
I am not able to find out the reason. My task is to compare the first 8 characters of the 3rd and 4th columns, but I am stuck at this part.
Experts, kindly help me see what I am missing.
What about using cut twice?
$ cut -d'|' -f3 file | cut -c-8
20130614
20120407
20110911
The first cut gets the 3rd field, using | as the delimiter.
The second cut gets the first 8 characters (note that cut -c-8 is the same as your cut -c1-8).
You're mixing bash with awk, one tool is just enough:
awk -F\| 'a=substr($3, 1, 8){if(a==substr($4, 1, 8)){print a}}' output.txt
This takes the first 8 characters of columns 3 and 4, compares them, and prints the value if they match.
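As for why the original command echoed the whole line: awk does not run backticks, so num1 is just a string, and $num1 converts that string to 0, which makes it $0. A sketch that shows both substrings and the result of the comparison follows; compare_first8 is a hypothetical helper name, and the printf lines stand in for output.txt.

```shell
# Hypothetical helper: show the first 8 characters of columns 3 and 4
# and whether they match.
compare_first8() {
  awk -F'|' '{
    d3 = substr($3, 1, 8)   # first 8 characters of column 3
    d4 = substr($4, 1, 8)   # first 8 characters of column 4
    print d3, d4, (d3 == d4 ? "same" : "different")
  }'
}

printf '%s\n' \
  '400949703|2000025967912|20130614010652|20130614131543' \
  '355949737|2144050263|20120407100407|20120407101307' | compare_first8
```

substr returns strings, so the == here is a string comparison.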

Unix uniq, sort & cut command remove duplicate lines

If we have the following result:
Operating System,50
Operating System,40
Operating System,30
Operating System,23
Data Structure,87
Data Structure,21
Data Structure,17
Data Structure,8
Data Structure,3
Crypo,33
Crypo,31
C++,65
C Language,39
C Language,19
C Language,4
Java 1.6,16
Java 1.6,11
Java 1.6,10
Java 1.6,2
I only want to compare the first field (book name), and remove duplicate lines except the first line of each book, which records the largest number. So the result is as below:
Operating System,50
Data Structure,87
Crypo,33
C++,65
C Language,39
Java 1.6,16
Can anyone help me do this using the uniq, sort and cut commands? Maybe using tr, head or tail?
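Most elegant in this case would seem
rev input | uniq -f1 | rev
Note that uniq -f splits fields on blanks, so this only works when every title contains a space. With the sample data, the one-word titles Crypo and C++ have nothing left to compare after the skipped field and are treated as duplicates of each other.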
If your input is sorted, you can use GNU awk like this:
awk -F, '!array[$1]++' file.txt
Results:
Operating System,50
Data Structure,87
Crypo,33
C++,65
C Language,39
Java 1.6,16
If your input is unsorted, you can use GNU awk like this:
awk -F, 'FNR==NR { if ($2 > array[$1]) array[$1]=$2; next } !dup[$1]++ { if ($1 in array) print $1 FS array[$1] }' file.txt{,}
Results:
Operating System,50
Data Structure,87
Crypo,33
C++,65
C Language,39
Java 1.6,16
awk -F, '{if(p!=$1)print;p=$1}' your_file
This could be done in different ways, but I've tried to restrict myself to the tools you suggested:
cut -d, -f1 file | uniq | xargs -I{} grep -m 1 "{}" file
Alternatively, if you are sure that no two different titles share the same first three characters, you can simply use: uniq -w3 file. This tells uniq (GNU) to compare no more than the first three characters of each line.
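If the input is not guaranteed to be sorted by count, a single awk pass can track the largest value per title while remembering the order in which titles first appear. This is a sketch under the assumption that the second field is an integer and that titles contain no commas; max_per_key is a hypothetical helper name.

```shell
# Hypothetical helper: for each title (field 1), print the largest number
# (field 2), in order of first appearance.
max_per_key() {
  awk -F, '
    !($1 in max)     { order[++n] = $1; max[$1] = $2 + 0 }  # first time seen
    $2 + 0 > max[$1] { max[$1] = $2 + 0 }                   # larger value found
    END { for (i = 1; i <= n; i++) print order[i] "," max[order[i]] }
  '
}

printf '%s\n' 'Crypo,31' 'C++,65' 'Crypo,33' 'Java 1.6,2' 'Java 1.6,16' | max_per_key
```

It reads from standard input, so it works in a pipeline and does not depend on the file{,} brace expansion.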

Forcing the order of output fields from cut command

I want to do something like this:
cat abcd.txt | cut -f 2,1
and I want the order to be 2 and then 1 in the output. On the machine I am testing (FreeBSD 6), this is not happening (it's printing in 1,2 order). Can you tell me how to do this?
I know I can always write a shell script to do this reversing, but I am looking for something using the 'cut' command options.
I think I am using version 5.2.1 of coreutils containing cut.
This can't be done using cut. According to the man page:
Selected input is written in the same order that it is read, and is
written exactly once.
Patching cut has been proposed many times, but even complete patches have been rejected.
Instead, you can do it using awk, like this:
awk '{print $2 "\t" $1}' abcd.txt
Replace the \t with whatever you're using as the field separator.
Lars' answer was great, but I found an even better one. The issue with his is that awk's default splitting treats consecutive tabs (\t\t) as a single separator, so empty columns are lost. To fix this, use the following:
awk -v OFS="  " -F"\t" '{print $2, $1}' abcd.txt
Where:
-F"\t" is what to cut on exactly (tabs).
-v OFS="  " is what to separate with (two spaces).
Example:
printf 'A\tB\t\tD\n' | awk -v OFS="  " -F"\t" '{print $2, $4, $1, $3}'
This outputs (the empty third column is printed last, so the line ends with a trailing separator):
B  D  A
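To get output that is still tab-separated, as cut would produce, set OFS to a tab as well. A minimal sketch, assuming tab-delimited input; swap_first_two is a hypothetical helper name.

```shell
# Hypothetical helper: print column 2 then column 1 of tab-separated input,
# keeping tab as the output separator.
swap_first_two() {
  awk -F'\t' -v OFS='\t' '{ print $2, $1 }'
}

printf 'a\tb\nc\td\n' | swap_first_two
```

Because -F'\t' splits on each individual tab, an empty column stays an empty column instead of being skipped.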
