Convert a specific column of a file to upper case in Unix (without using awk or sed) - unix

My file, named test, looks like this:
1 abc
2 xyz
3 pqr
How can I convert the second column of the file to upper case without using awk or sed?

You can use tr to transform from lowercase to uppercase. cut will extract the single columns and paste will combine the separated columns again.
Assumption: Columns are delimited by tabs.
paste <(cut -f1 file) <(cut -f2 file | tr '[:lower:]' '[:upper:]')
Replace file with your file name (that is test in your case).
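The sample in the question appears to be space-delimited rather than tab-delimited. As a minimal sketch under that assumption, the same cut/tr/paste idea can be written with an explicit delimiter and a temporary file, which also avoids process substitution for shells that lack it. The names col1.tmp and test_upper are hypothetical, and the sketch is best run in a scratch directory because it creates files:

```shell
# Recreate the sample file from the question (assumed to be space-delimited).
printf '1 abc\n2 xyz\n3 pqr\n' > test

# Save column 1, then upper-case column 2 and join the two columns again.
# The "-" tells paste to read its second input from standard input.
cut -d' ' -f1 test > col1.tmp
cut -d' ' -f2 test | tr '[:lower:]' '[:upper:]' | paste -d' ' col1.tmp - > test_upper
rm -f col1.tmp

cat test_upper
```

If the real file uses tabs, dropping the -d options from cut and paste restores their default tab delimiter.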

In pure bash (the ${col2^^} expansion requires bash 4.0 or later):
#!/bin/bash
while read -r col1 col2; do
    printf "%s%7s\n" "$col1" "${col2^^}"
done < file > output-file
Input-file
$ cat file
1 abc
2 xyz
3 pqr
Output-file
$ cat output-file
1 ABC
2 XYZ
3 PQR

Related

Find all words that match a pattern file (txt file)

On a Unix system, I want to find all words in a text file that match the keywords listed in a pattern file.
Example pattern file (pattern.txt):
1
2
3
Example a.txt file, from which we want to find the words that contain 1, 2, or 3:
a
2
4
3
5
4
1
2
Expected result:
2
3
1
2
I tried awk, but it did not work well:
awk '/1/,/2/,/3/,.... a.txt
You want to have an exact match between pattern.txt and a.txt. This implies if pattern.txt contains a line:
foo
then this line can only match "foo" and not
bar foo
foo bar
foo123
For an exact whole-line match, either of the following works:
$ awk '(NR==FNR){a[$0];next}($0 in a)' pattern.txt a.txt
$ grep -xFf pattern.txt a.txt
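As a small check of the exact-match behaviour described above, the sketch below recreates the foo example in scratch files and compares grep with and without -x. The output names exact.txt, loose.txt and exact_awk.txt are hypothetical, and the script overwrites pattern.txt and a.txt, so it is meant for an empty working directory:

```shell
# Sample data taken from the foo example above.
printf 'foo\n' > pattern.txt
printf 'foo\nbar foo\nfoo bar\nfoo123\n' > a.txt

# -x: the whole line must match; -F: fixed strings; -f: read patterns from a file.
grep -xFf pattern.txt a.txt > exact.txt

# Without -x, every line that merely contains "foo" is reported.
grep -Ff pattern.txt a.txt > loose.txt

# The awk command from the answer should select the same lines as grep -xFf.
awk '(NR==FNR){a[$0];next}($0 in a)' pattern.txt a.txt > exact_awk.txt

cat exact.txt
```

Only the line consisting of exactly foo ends up in exact.txt, while loose.txt contains all four lines.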

Finding amount of sequence matches per line

I'm looking to use grep or something similar to find the total number of matches of a 5-letter sequence (AATTC) in every line of a file, and then print the result to a new file. For example:
File 1:
GGGGGAATTCGAATTC
GGGGGAATTCGGGGGG
GGGGGAATTCCAATTC
Then it prints the number of matches for each line to another file:
File 2:
2
1
2
Awk solution:
awk '{ print gsub(/AATTC/,"") }' file1 > file2
The gsub() function returns the number of substitutions made, which here equals the number of non-overlapping matches on the line.
$ cat file2
2
1
2
If you have to use grep, put it in a while loop and redirect the output of the whole loop, so that file2 is overwritten instead of growing on every run:
$ while read -r line; do grep -o 'AATTC' <<<"$line" | wc -l; done < file1 > file2
$ cat file2
2
1
2
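Another way, using perl (assigning the matches to an empty list yields their count, including 0 for a line without any match):
$ perl -ne 'print scalar(() = /AATTC/g), "\n"' file1 > file2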
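A quick way to compare the awk and grep approaches is to run both on the same input, including a line with no match at all. The sketch below is an illustration only: the third input line and the file names counts_awk and counts_grep are made up, printf is used instead of a here-string so that it also runs in plain sh, and tr strips the padding that some wc implementations add.

```shell
# Two lines from the question plus a made-up line without any match.
printf 'GGGGGAATTCGAATTC\nGGGGGAATTCGGGGGG\nGGGGGGGGGGGGGGGG\n' > file1

# awk: gsub() returns the number of replacements made on each line.
awk '{ print gsub(/AATTC/, "") }' file1 > counts_awk

# grep: -o prints each match on its own line, wc -l counts those lines.
while read -r line; do
    printf '%s\n' "$line" | grep -o 'AATTC' | wc -l | tr -d ' '
done < file1 > counts_grep

# cmp is silent and returns success when both files are identical.
cmp counts_awk counts_grep && echo "both methods agree"
cat counts_awk
```

Both methods count non-overlapping matches, which makes no difference for AATTC because the sequence cannot overlap with itself.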

Finding common elements from one file in a column of another file and output the entire row of the latter

I need to extract all entries from one list (List.txt) that can be found in one of the columns of another file (here Data.txt) and write the matching rows to a third file (output.txt).
Data.txt (tab delimited)
some_data more_data other_data here yet_more_data etc
A B 2 Gee;Whiz;Hello 13 12
A B 2 Gee;Whizz;Hi 56 32
E 4 Btm;Lol 16 2
T 3 Whizz 13 3
List.txt
Gee
Whiz
Lol
Ideally output.txt looks like
some_data more_data other_data here yet_more_data etc
A B 2 Gee;Whiz;Hello 13 12
A B 2 Gee;Whizz;Hi 56 32
E 4 Btm;Lol 16 2
So I tried a shell script
for ids in List.txt
do
grep $ids Data.txt >> output.txt
done
except I typed out everything (cut and paste actually) in List.txt in said script.
Unfortunately it gave me an output.txt including the last line, I assume as 'Whizz' contains 'Whiz'.
I also tried cat Data.txt | egrep -F "List.txt" and that resulted in grep: conflicting matchers specified -- I suppose that was too naive of me. The actual files: List.txt contains a sorted list of 985 words, Data.txt has 115576 rows with 17 columns.
Some help/guidance would be much appreciated thanks.
Try something like this:
while read -r ids
do
    grep "[TAB;]$ids[TAB;]" Data.txt >> output.txt
done < List.txt
But it has two drawbacks:
"Data.txt" is scanned once for every word in the list.
The same line can be printed multiple times.
If that is a problem, try this two-step version:
sed -e 's/.*/[TAB;]&[TAB;]/' List.txt > List_mod.txt
grep -f List_mod.txt Data.txt > output.txt
Note:
Here TAB stands for a literal tab character. On the command line it can be inserted by pressing Ctrl-V followed by the Tab key; in an editor, use the Tab key. Check that your editor does not convert the tab into a series of spaces.
The UNIX tool for general text processing is "awk":
awk '
    NR==FNR { list[$0]; next }
    {
        for (word in list) {
            if ($0 ~ "[\t;]" word "[\t;]") {
                print
                next
            }
        }
    }
' List.txt Data.txt > output.txt
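A possible variation on the awk approach, sketched under assumptions that the thread does not confirm: the semicolon-separated words always sit in the fourth tab-delimited column, the first row of Data.txt is a header that should be kept, and the third data row has an empty second column. Splitting the column on ";" and looking each piece up in the list gives exact word matches with a single pass over Data.txt. The script overwrites List.txt, Data.txt and output.txt, so it is meant for a scratch directory:

```shell
# Rebuild the sample files; the empty second column in the "E" row is an assumption.
printf 'Gee\nWhiz\nLol\n' > List.txt
{
    printf 'some_data\tmore_data\tother_data\there\tyet_more_data\tetc\n'
    printf 'A\tB\t2\tGee;Whiz;Hello\t13\t12\n'
    printf 'A\tB\t2\tGee;Whizz;Hi\t56\t32\n'
    printf 'E\t\t4\tBtm;Lol\t16\t2\n'
    printf 'T\t\t3\tWhizz\t13\t3\n'
} > Data.txt

awk -F'\t' '
    NR == FNR { list[$0]; next }     # first file: remember every word
    FNR == 1  { print; next }        # second file: keep the header row
    {
        n = split($4, parts, ";")    # break column 4 into single words
        for (i = 1; i <= n; i++) {
            if (parts[i] in list) {  # exact lookup, so Whizz does not match Whiz
                print
                next                 # print each row at most once
            }
        }
    }
' List.txt Data.txt > output.txt

cat output.txt
```

With this sample the header and the first three data rows are printed, and the row containing only Whizz is left out.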

Using grep -f and -w together

I have two files like this:
abc.txt
a
b
z
1
10
and abcd.txt
a
b
c
d
1
10
100
1000
I would like:
a
b
1
10
I would like to use grep -fw abc.txt abcd.txt to search for every line of abc.txt and print only the lines that match as an entire word. If I just use grep -f, I also get lines such as 100, since the pattern '10' matches '100'. But grep -f -w abc.txt abcd.txt produces:
a
b
1
and doesn't print out the 10. So, I guess, what is the best way to match every line in abc.txt with the entire line of abcd.txt ?
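One option that may fit here, offered as a sketch rather than a confirmed fix for the behaviour described: grep's -x option requires the whole line to match, and -F treats each pattern as a fixed string. Note also that -f takes the text that follows it as the pattern file name, so grep -fw abc.txt abcd.txt would read patterns from a file named w. The output name common.txt is hypothetical, and the script overwrites abc.txt and abcd.txt in the current directory:

```shell
# Recreate the two sample files from the question.
printf 'a\nb\nz\n1\n10\n' > abc.txt
printf 'a\nb\nc\nd\n1\n10\n100\n1000\n' > abcd.txt

# -x: match whole lines only; -F: fixed strings; -f: read patterns from abc.txt.
grep -xFf abc.txt abcd.txt > common.txt

cat common.txt
```

If a line such as 10 is still missing with real data, trailing whitespace or carriage returns in either file are a possible cause worth checking.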

Create a CSV File with specific format from a String obtained after WS Invocation

The title is self explanatory. I am calling a web service which is returning a String like this :
First Name="Kunal";Middle Name="";Last Name="Bhowmick";Address 1="HGB";Address 2="cvf";Address 3="tfg";City="DF";State="KL";Country="MN";Postal Code="0012";Telephone="(+98)6589745623"
Now I have to write a shell script that creates a CSV file named CSV_Output.csv, formatted with the content of the string.
The format must be something like this:
Field Name(in yellow color) Value(in yellow color)
First Name Kunal
Middle Name
Last Name Bhowmick
Address 1 HGB
Address 2 cvf
Address 3 tfg
City DF
State KL
Country MN
Postal Code 0012
Telephone (+98)6589745623
I can easily generate a CSV file using redirection (>>), but how can I create and format the CSV file as shown above?
Sorry to be blunt, but I have no code to show, as I do not understand what to use here.
Kindly provide some suggestions (sample code). Any help is greatly appreciated.
An awk one-liner can convert the format (a regular expression as RS needs an awk that supports it, such as gawk or mawk):
awk -v RS="\\n|;" -v OFS="\t" -F= '{gsub(/"/,"");$1=$1}7' file
If you want the output aligned in columns, change the OFS and pass the output to column:
awk -v RS="\\n|;" -v OFS="#" -F= '{gsub(/"/,"");$1=$1}7' file|column -s"#" -t
The output is:
kent$ awk -v RS="\\n|;" -v OFS="#" -F= '{gsub(/"/,"");$1=$1}7' f|column -s"#" -t
First Name Kunal
Middle Name
Last Name Bhowmick
Address 1 HGB
Address 2 cvf
Address 3 tfg
City DF
State KL
Country MN
Postal Code 0012
Telephone (+98)6589745623
Short explanation:
awk                # the awk command
-v RS="\\n|;"      # set the record separator to \n (newline) or ; (semicolon)
-v OFS="\t"        # set the output field separator to <tab>
-F=                # use "=" as the input field separator
'{gsub(/"/,"");    # remove all double quotes
$1=$1}             # assign $1 to itself so awk rebuilds the record with the given OFS
7'                 # a non-zero (true) pattern without an action, so each record is printed
This can also be achieved using tr and column:
$ cat input
First Name="Kunal";Middle Name="";Last Name="Bhowmick";Address 1="HGB";Address 2="cvf";Address 3="tfg";City="DF";State="KL";Country="MN";Postal Code="0012";Telephone="(+98)6589745623"
$ cat input | tr ";" "\n" | column -s= -t | tr -d \"
First Name Kunal
Middle Name
Last Name Bhowmick
Address 1 HGB
Address 2 cvf
Address 3 tfg
City DF
State KL
Country MN
Postal Code 0012
Telephone (+98)6589745623
Split the input on ";", pipe the output to column with "=" as the delimiter, then strip the quotes.
EDIT: I didn't realize that you want a CSV. In that case, use:
$ cat input | tr ";" "\n" | tr "=" "\t" | tr -d \"
which results in TAB-delimited output.
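If a comma-separated file with a header row is wanted, a minimal sketch along the same lines could look like the following. It assumes that no value contains a comma, a semicolon, a double quote or a newline (such values would need proper CSV quoting), and it uses a shortened, hypothetical input string. Colours such as the yellow header cannot be expressed in CSV, which is plain text.

```shell
# Shortened sample of the web service string (hypothetical test data).
printf '%s\n' 'First Name="Kunal";Middle Name="";Last Name="Bhowmick";Postal Code="0012"' > input

{
    # Header row using the column titles from the question.
    echo 'Field Name,Value'
    # One name="value" pair per line, then turn ="value" into ,value
    tr ';' '\n' < input | sed 's/="\(.*\)"$/,\1/'
} > CSV_Output.csv

cat CSV_Output.csv
```

The empty Middle Name value ends up as an empty second field, so its line reads "Middle Name," in the output.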
