I have a file with multiple lines. I'm trying to find lines that match a certain pattern and then get them appended to an output file, all on the same line.
Ex:
Input file:
ABCD
other text
EFGH
other text
IJKLM
I'm trying to get the output to be:
ABCD EFGH IJKLM
An easy way to make grep output its matches separated by spaces instead of newlines is to wrap it in command substitution with $(...) and let echo rejoin the words, like this:
echo $(grep -o '^[A-Z]*$' input.txt) >> output.txt
Or you could use tr:
grep -o '^[A-Z]*$' input.txt | tr '\n' ' ' >> output.txt
Or perl:
grep -o '^[A-Z]*$' input.txt | perl -pe 'chomp; s/$/ /'
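Another option, assuming paste is available, is to let it join the matches with spaces; unlike the tr version it also ends the output with a newline:
grep -o '^[A-Z]*$' input.txt | paste -sd' ' >> output.txt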
You can use tr to translate the newlines to spaces:
grep $EXPRESSION $INPUT_FILE | tr '\n' ' ' >> $OUTPUT_FILE
If you like Perl, you can also use the following, where -l40 sets the output record separator to octal 040 (a space):
perl -nl40e 'print if /PATTERN/' files....
For example,
perl -nl40e 'print if /[A-Z]/' file
for your input produces
ABCD EFGH IJKLM
Here is a short awk one-liner:
awk 'NR%2==1' ORS=" " file
ABCD EFGH IJKLM
It prints every odd-numbered line (NR%2==1) and, with ORS set to a space, joins them onto one line.
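If the matching lines are not strictly every other line, a pattern-based sketch (here matching the all-uppercase lines from the example) may be more robust:
awk '/^[A-Z]+$/' ORS=" " file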
I need to get the delimiters at the start of each line; below are sample input and output files for reference. The actual delimiters used are £{ and ^$^.
Note: the file to be rearranged contains a huge amount of data.
I have tried the below, but it does not work:
tr £{ \\n
sed 's/£{/\n/g'
awk '{ gsub("£{", "\n") } 1'
Input File:
£{firstlinecontinues£{secondstartsfromhereandit
keepsoncontinueingtillend£{herecomes3rdand£{fi
nallyfourthisalsohere
Output File:
£{firstlinecontinues
£{secondstartsfromhereanditkeepsoncontinueingtillend
£{herecomes3rdand
£{finallyfourthisalsohere
With GNU awk for multi-char RS and \s:
$ awk -v RS='£{' 'NR>1{gsub(/\s/,""); print RS $0}' file
£{firstlinecontinues
£{secondstartsfromhereanditkeepsoncontinueingtillend
£{herecomes3rdand
£{finallyfourthisalsohere
With GNU awk, both delimiters (£{ and ^$^) can be handled by using them as the record separator and re-attaching the matched separator (RT) at the start of each new output line:
awk 'BEGIN{RS="£\\{|\\^\\$\\^"; OFS=ORS=""}{$1=$1; print $0 (FNR>1?"\n":"") RT}' file
Since the £ symbol is represented by two bytes, octal 302 and 243 (its UTF-8 encoding), I was able to produce the desired result with this perl command:
perl -pe 's/(\302\243)/\n$1/g' data.txt
NOTE: Here's what I see on my system:
echo "£" | od -c
0000000 302 243 \n
0000003
Below are the full file names.
qwertyuiop.abcdefgh.1234567890.txt
qwertyuiop.1234567890.txt
I am trying to use
awk -F'.' '{print $1}'
How can I use the awk command to extract the output below?
qwertyuiop.abcdefgh
qwertyuiop
Edit
I have a list of files in a directory, and I am trying to extract the time, size, owner, and filename into separate variables.
For the filenames:
NAME=$(ls -lrt /tmp/qwertyuiop.1234567890.txt | awk -F'/' '{print $3}' | awk -F'.' '{print $1}')
$ echo $NAME
qwertyuiop
$
NAME=$(ls -lrt /tmp/qwertyuiop.abcdefgh.1234567890.txt | awk -F'/' '{print $3}' | awk -F'.' '{print $1}')
$ echo $NAME
qwertyuiop
$
Expected:
qwertyuiop.abcdefgh
With GNU awk and other versions that allow manipulation of NF
$ awk -F. -v OFS=. '{NF-=2} 1' ip.txt
qwertyuiop.abcdefgh
qwertyuiop
NF-=2 will effectively delete the last two fields.
1 is an awk idiom to print the contents of $0.
Note that this assumes there are at least two fields in every line, otherwise you'd get an error.
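As a small sketch of how to guard against that, you could only drop fields on lines that have more than two of them:
awk -F. -v OFS=. 'NF>2{NF-=2} 1' ip.txt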
Similar concept with perl; it prints an empty line if the number of fields in the line is less than 3:
$ perl -F'\.' -lane 'print join ".", @F[0..$#F-2]' ip.txt
qwertyuiop.abcdefgh
qwertyuiop
With sed, you can preserve lines if the number of fields is less than 3:
$ sed 's/\.[^.]*\.[^.]*$//' ip.txt
qwertyuiop.abcdefgh
qwertyuiop
EDIT: Taking inspiration from Sundeep's solution, I am adding the following to this mix too.
awk 'BEGIN{FS=OFS="."} {$(NF-1)=$NF="";sub(/\.+$/,"")} 1' Input_file
Could you please try the following.
awk -F'.' '{for(i=(NF-1);i<=NF;i++){$i=""};sub(/\.+$/,"")} 1' OFS="." Input_file
OR
awk 'BEGIN{FS=OFS="."} {for(i=(NF-1);i<=NF;i++){$i=""};sub(/\.+$/,"")} 1' Input_file
Explanation: here is an explanation of the above code.
awk '
BEGIN{                        ##Mentioning BEGIN section of awk program here.
  FS=OFS="."                  ##Setting FS and OFS variables for awk to DOT here, as per OPs sample Input_file.
}                             ##Closing BEGIN section here.
{
  for(i=(NF-1);i<=NF;i++){    ##Starting for loop with i going from (NF-1) to NF for all lines.
    $i=""                     ##Setting value of the respective field to NULL.
  }                           ##Closing for loop block here.
  sub(/\.+$/,"")              ##Substituting the trailing DOTs with NULL in current line.
}
1                             ##Mentioning 1 here to print the edited/non-edited current line.
' Input_file                  ##Mentioning Input_file name here.
I have the following list in a text file:
10.1.2.200
10.1.2.201
10.1.2.202
10.1.2.203
I want to enclose each value in "double quotes", comma-separate them, and join the values into one string.
Can this be done in sed or awk?
Expected output:
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203","10.1.2.204"
The easiest approach is something like this (in pseudocode):
Read a line;
Put the line in quotes;
Keep that quoted line in a stack or string;
At the end (or while constructing the string), join the lines together with a comma.
Depending on the language, that is fairly straightforward to do:
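For instance, here is a minimal plain-bash sketch of those pseudocode steps (assuming the addresses are in file):
out=""
while IFS= read -r line; do
  out="${out:+$out,}\"$line\""
done < file
printf '%s\n' "$out"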
With awk:
$ awk 'BEGIN{OFS=","}{s=s ? s OFS "\"" $1 "\"" : "\"" $1 "\""} END{print s}' file
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
Or, with less of a 'wall of quotes', define a quote character:
$ awk 'BEGIN{OFS=",";q="\""}{s=s ? s OFS q$1q : q$1q} END{print s}' file
With sed:
$ sed -E 's/^(.*)$/"\1"/' file | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g'
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
(With Perl and Ruby, which have a join function, it is easiest to push the elements onto an array and then join that.)
Perl:
$ perl -lne 'push @a, "\"$_\""; END{print join(",", @a)}' file
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
Ruby:
$ ruby -ne 'BEGIN{@arr=[]}; @arr.push "\"#{$_.chomp}\""; END{puts @arr.join(",")}' file
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
Here is another alternative:
sed 's/.*/"&"/' file | paste -sd,
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
awk -F'\n' -v RS="\0" -v OFS='","' -v q='"' '{NF--}$0=q$0q' file
It should work for the given example.
Tested with gawk:
kent$ cat f
10.1.2.200
10.1.2.201
10.1.2.202
10.1.2.203
kent$ awk -F'\n' -v RS="\0" -v OFS='","' -v q='"' '{NF--}$0=q$0q' f
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
$ awk '{o=o (NR>1?",":"") "\""$0"\""} END{print o}' file
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
I have a text file.
Input file:
sno|name|lab|result|dep
1|aaa|ALB|<= 3.67|CHE
2|bbb|WBC|> 7.2|FVC
3|ccc|RBC|> 14|CHE
Output file:
sno|name|lab|result|dep
1|aaa|ALB|<=3.67|CHE
2|bbb|WBC|>7.2|FVC
3|ccc|RBC|>14|CHE
How can I remove the whitespace in column 4 (result)?
If you can remove spaces from everything, just use sed:
sed 's/ //g' input.txt > output.txt
Or even tr (translate):
tr -d ' ' < input.txt > output.txt
Otherwise, if you need to edit just the fourth column, use awk. The following command treats | as the field separator (-F \|) and then writes the output using | as the output field separator (-vOFS=\|).
awk -F \| -vOFS=\| '{gsub(/ /, "", $4); print; }' input.txt > output.txt
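If only the single space after the comparison operator should be removed, a GNU sed sketch (it relies on the \| alternation extension) could also work:
sed 's/\(<=\|>\) /\1/' input.txt > output.txt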
I have a file which contains about 30,000 records delimited by '|'. I need to get a distinct list of only the special characters in the file.
For Eg:
123|fasdf|%df&|pap,came|!
234|%^&asdf|34|'":|
My output should be:
|%&,!^'":
Any help would be greatly appreciated.
grep -o '[|%&,!^":]' input | sort -u
You have to list all your special characters inside brackets.
This will return each unique special character on its own line. If you really need a single string with these characters, you have to remove the newlines afterwards, e.g.:
grep -o '[|%&,!^":]' input | sort -u | tr -d '\n'
UPDATE:
If you instead want to match every character that is not in the a-zA-Z0-9 set, then you can use this one:
grep -o '[^a-zA-Z0-9]' input | sort -u | tr -d '\n'
echo "123|fasdf|%df&|pap,came|! 234|%^&asdf|34|'\":|" \
| { tr -d '[[:alnum:]]'; printf "\n"; } \
| sed 's/\(.\)/\1_/g' \
| awk -v 'RS=_' '{print $0}' \
| sort -u \
| awk '{printf $0}END{printf "\n"}'
Output:
!"%&',:^||
You can replace the first line (the echo ...) with cat fileName.
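A shorter pipeline in the same spirit, sketched with GNU grep's -o, splits the input into one character per line first (note it will also report a space if the input contains one):
grep -o . fileName | grep -v '[[:alnum:]]' | sort -u | tr -d '\n'; echo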