Trim Leading and trailing Spaces in Awk - unix

I have a file which contains 1 line like below
VINOTH |KARTHICK |RAVI
I'm using the command below to remove the leading and trailing spaces, but it's not working.
awk '{ gsub(/^[ \t]+|[ \t]+$/, ""); print }' Input_File
Please help.
Required Output.
VINOTH|KARTHICK|RAVI

You may use
sed 's/[ \t]*|[ \t]*/|/g;s/^[ \t]*\|[ \t]*$//g' Input_File
There are two regexps here:
s/[ \t]*|[ \t]*/|/g replaces every | surrounded by optional whitespace with a single | (an unescaped | in the regex matches a literal | char in POSIX BRE syntax)
s/^[ \t]*\|[ \t]*$//g removes all whitespace at the start and end of each line. Note that \| here is an OR operator; it is escaped because BRE syntax is used, and it is a GNU extension rather than part of POSIX BRE.
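For a sed without the GNU extensions, the same two steps can be written with POSIX character classes and separate expressions in place of \|. This is a minimal sketch; the function name trim_around_pipes is only an illustration.

```shell
# Trim blanks around every | and at both ends of each line.
# Uses only POSIX BRE features: [[:blank:]] matches a space or a tab.
trim_around_pipes() {
  sed -e 's/[[:blank:]]*|[[:blank:]]*/|/g' \
      -e 's/^[[:blank:]]*//' \
      -e 's/[[:blank:]]*$//' "$@"
}

printf 'VINOTH |KARTHICK |RAVI\n' | trim_around_pipes
```

With no file argument the function reads standard input; with one, it behaves like the sed command above.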

Could you please try the following (not tested, since your sample input and expected output are not clear).
awk '{gsub(/^[[:space:]]+|[[:space:]]+$/,"")} 1' Input_file
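Note that a gsub anchored with ^ and $ on the whole record only trims the two ends of the line, so the spaces before each | in the sample stay in place. A possible awk sketch that trims every field instead, assuming | is the only delimiter (trim_fields is a made-up name):

```shell
# Trim leading and trailing blanks from every |-separated field.
trim_fields() {
  awk 'BEGIN { FS = OFS = "|" }
       {
         for (i = 1; i <= NF; i++)
           gsub(/^[ \t]+|[ \t]+$/, "", $i)   # trim one field at a time
         print
       }' "$@"
}

printf 'VINOTH |KARTHICK |RAVI\n' | trim_fields
```

Setting OFS to the same character as FS keeps the delimiters intact when awk rebuilds the line after a field is modified.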

Related

Replace two characters with sed dynamically

I have the following strings
C:/data
D:/backups
C:/Users/Guest/old_data
F:/files/new
How can I replace the first two characters with /cygdrive/LOWERCASE_DRIVE_LETTER?
RESULT
/cygdrive/c/data
/cygdrive/d/backups
/cygdrive/c/Users/Guest/old_data
/cygdrive/f/files/new
awk -F':' 'sub(/../,"/cygdrive/"tolower($1))' file
Brief explanation,
-F':': set ':' as the field separator.
tolower($1): returns $1 in lower case
sub(/../,"/cygdrive/"tolower($1)): replaces the first 2 characters with "/cygdrive/"tolower($1)
This might work for you (GNU sed):
sed 's/\(.\):/\/cygdrive\/\l\1/' file
This captures the first character, which must be followed by a :, in a group. The replacement then inserts /cygdrive/ followed by the lowercased group, i.e. the drive letter.
Could you please try the following.
awk 'BEGIN{FS=OFS="/"}{sub(/:/,"",$1);$1=tolower($1);print "/cygdrive/" $0}' Input_file
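The answers above assume every line starts with a drive letter and a colon. A hedged variant that rewrites only such lines and passes everything else through unchanged might look like this (to_cygdrive is a hypothetical helper name):

```shell
# Convert "X:/path" to "/cygdrive/x/path"; other lines are printed as-is.
to_cygdrive() {
  awk '/^[A-Za-z]:/ {
         # drive letter lowercased, then the rest of the line after "X:"
         $0 = "/cygdrive/" tolower(substr($0, 1, 1)) substr($0, 3)
       }
       { print }' "$@"
}

printf 'C:/data\nrelative/path\n' | to_cygdrive
```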

how to grep nth string

How to use "grep" shell command to show specific word from a line starting with a specific word.
Ex:
I want to print a string "myFTPpath/folderName/" from the line starting with searchStr in the below mentioned line.
searchStr:somestring:myFTPpath/folderName/:somestring
Something like this with awk:
awk -F: '/^searchStr/{print $3}' File
For every line starting with searchStr, print the 3rd field (field separator set to :)
Sample:
AMD$ cat File
someStr:somestring:myFTPpath/folderName/:somestring
someStr:somestring:myFTPpath/folderName/:somestring
searchStr:somestring:myFTPpath/folderName/:somestring
someStr:somestring:myFTPpath/folderName/:somestring
AMD$ awk -F: '/^searchStr/{print $3}' File
myFTPpath/folderName/
Remember that grep isn't the only tool that can usefully do searches.
In this particular case, where the lines are naturally broken into fields, awk is probably the best solution, as @A.M.D's answer suggests.
For more general case edits, however, remember sed's -n option, which suppresses printing out a line after edits:
sed -n 's/searchStr:[^:]*:\([^:]*\):.*/\1/p' input-file
The -n suppresses automatic printing of the line, and the trailing /p flag explicitly prints out lines on which there is a substitution.
This matching pattern is fiddly – use awk in this fielded case – but don't forget sed -n.
You could get the desired output with grep itself but you need to enable -P and -o parameters.
$ echo 'searchStr:somestring:myFTPpath/folderName/:somestring' | grep -oP '^searchStr:[^:]*:\K[^:]*'
myFTPpath/folderName/
\K discards everything matched before it from the reported match, so only the text matched by the pattern that follows \K is printed. Here \K is used in place of a variable-length positive lookbehind assertion.
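If grep -P is not available, plain grep can still select the lines while cut extracts the field. A small sketch, assuming the fields never contain an embedded colon (third_field is an illustrative name):

```shell
# Keep lines starting with "searchStr:" and print their 3rd :-separated field.
third_field() {
  grep '^searchStr:' "$@" | cut -d: -f3
}

printf 'someStr:a:other/:b\nsearchStr:somestring:myFTPpath/folderName/:somestring\n' |
  third_field
```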

Check if file contains some text (not regex) in Unix

I want to check if a multiline text matches an input. grep comes close, but I couldn't find a way to make it interpret pattern as plain text, not regex.
How can I do this, using only Unix utilities?
Use grep -F:
-F, --fixed-strings
Interpret PATTERN as a list of fixed strings, separated by newlines, any of which is to be matched. (-F is specified by
POSIX.)
EDIT: Initially I didn't understand the question well enough. If the pattern itself contains newlines, use -z option:
-z, --null-data
Treat the input as a set of lines, each terminated by a zero
byte (the ASCII NUL character) instead of a newline. Like the
-Z or --null option, this option can be used with commands like
sort -z to process arbitrary file names.
I've tested it, multiline patterns worked.
From man grep
-F, --fixed-strings
Interpret PATTERN as a list of fixed strings, separated by
newlines, any of which is to be matched. (-F is specified by
POSIX.)
If the input string you are trying to match does not contain a blank line (i.e., it does not have two consecutive newlines), you can do:
awk 'index( $0, "needle\nwith no consecutive newlines" ) { m=1 }
END{ exit !m }' RS= input-file && echo matched
If you need to find a string with consecutive newlines, set RS to some string that is not in the file. (Note that the results of awk are unspecified if you set RS to more than one character, but most awk will allow it to be a string.) If you are willing to make the sought string a regex, and if your awk supports setting RS to more than one character, you could do:
awk 'END{ exit NR == 1 }' RS='sought regex' input-file && echo matched
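Another way to sidestep the RS restrictions is to rebuild the file in memory and call index() once at the end. The following is a sketch under the assumption that the file fits in memory; contains_text is a hypothetical helper, and the needle is passed through the environment so that awk does not interpret backslashes in it.

```shell
# contains_text NEEDLE FILE
# Succeeds if FILE contains NEEDLE as literal text; NEEDLE may span lines.
contains_text() {
  needle=$1 awk '
    { buf = buf $0 "\n" }                              # collect the whole file
    END { exit (index(buf, ENVIRON["needle"]) == 0) }  # status 0 when found
  ' "$2"
}

tmp=$(mktemp)
printf 'abc\nfoo\nbar\n' > "$tmp"
contains_text "$(printf 'foo\nbar')" "$tmp" && echo matched
contains_text 'a.c' "$tmp" || echo 'a.c is not present as literal text'
rm -f "$tmp"
```

The second call shows the difference from a regex search: a.c would match abc as a pattern, but it does not occur literally.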

grep -w with only space as delimiter

grep -w uses punctuations and whitespaces as delimiters.
How can I set grep to only use whitespaces as a delimiter for a word?
If you only need spaces as delimiters, replace grep -w foo with grep " foo ". If you also want to match line endings or tabs you can start doing things like: grep '\(^\| \)foo\($\| \)', but you're probably better off with perl -ne 'print if /\sfoo\s/'
You cannot change the way grep -w works. However, you can replace punctuation characters with, say, an X character using tr or sed and then use grep -w; that will do the trick.
The --word-regexp flag is useful, but limited. The grep man page says:
-w, --word-regexp
Select only those lines containing matches that form whole
words. The test is that the matching substring must either be
at the beginning of the line, or preceded by a non-word
constituent character. Similarly, it must be either at the end
of the line or followed by a non-word constituent character.
Word-constituent characters are letters, digits, and the
underscore.
If you want to use custom field separators, awk may be a better fit for you. Or you could just write an extended regular expression with egrep or grep --extended-regexp that gives you more control over your search pattern.
Use tr to replace spaces with new lines. Then grep your string. The contiguous string I needed was being split up with grep -w because it has colons in it. Furthermore, I only knew the first part, and the second part was the unknown data I needed to pull. Therefore, the following helped me.
echo "$your_content" | tr ' ' '\n' | grep 'string'
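The awk suggestion above can be sketched as follows: awk's default field splitting uses runs of blanks only, so comparing each field to the word gives a whitespace-only notion of "whole word". The helper name word_lines is made up for illustration.

```shell
# Print lines that contain WORD as a complete whitespace-delimited token.
word_lines() {
  word=$1; shift
  awk -v w="$word" '{
    for (i = 1; i <= NF; i++)
      if (($i "") == (w "")) { print; next }   # string comparison, no regex
  }' "$@"
}

printf 'foo bar\nfoo-bar baz\nsay foo\n' | word_lines foo
```

The line foo-bar baz is skipped here, whereas grep -w foo would select it because - counts as a non-word character. Note that awk processes backslash escapes in -v values, so this sketch assumes the word contains no backslashes.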

UNIX: Replace Newline w/ Colon, Preserving Newline Before EOF

I have a text file ("INPUT.txt") of the format:
A<LF>
B<LF>
C<LF>
D<LF>
X<LF>
Y<LF>
Z<LF>
<EOF>
which I need to reformat to:
A:B:C:D:X:Y:Z<LF>
<EOF>
I know you can do this with 'sed'. There's a billion google hits for doing this with 'sed'. But I'm trying to emphasize readability, simplicity, and using the correct tool for the correct job. 'sed' is a line editor that consumes and hides newlines. Probably not the right tool for this job!
I think the correct tool for this job would be 'tr'. I can replace all the newlines with colons with the command:
cat INPUT.txt | tr '\n' ':'
There's 99% of my work done. I have a problem, now, though. By replacing all the newlines with colons, I not only get an extraneous colon at the end of the sequence, but I also lose the newline at the end of the input. It looks like this:
A:B:C:D:X:Y:Z:<EOF>
Now, I need to remove the colon from the end of the input. However, if I attempt to pass this processed input through 'sed' to remove the final colon (which would now, I think, be a proper use of 'sed'), I find myself with a second problem. The input is no longer terminated by a newline at all! 'sed' fails outright, for all commands, because it never finds the end of the first line of input!
It seems like appending a newline to the end of some input is a very, very common task, and considering I myself was just sorely tempted to write a program to do it in C (which would take about eight lines of code), I can't imagine there's not already a very simple way to do this with the tools already available to you in the Linux kernel.
This should do the job (cat and echo are unnecessary):
tr '\n' ':' < INPUT.TXT | sed 's/:$/\n/'
Using only sed:
sed -n ':a; $ ! {N;ba}; s/\n/:/g;p' INPUT.TXT
Bash without any externals:
string=($(<INPUT.TXT))
string=${string[@]/%/:}
string=${string//: /:}
string=${string%*:}
Using a loop in sh:
colon=''
while read -r line
do
string=$string$colon$line
colon=':'
done < INPUT.TXT
Using AWK:
awk '{a=a colon $0; colon=":"} END {print a}' INPUT.TXT
Or:
awk '{printf "%s%s", colon, $0; colon=":"} END {printf "\n" }' INPUT.TXT
Edit:
Here's another way in pure Bash:
string=($(<INPUT.TXT))
saveIFS=$IFS
IFS=':'
newstring="${string[*]}"
IFS=$saveIFS
Edit 2:
Here's yet another way which does use echo:
echo "$(tr '\n' ':' < INPUT.TXT | head -c -1)"
Old question, but
paste -sd: INPUT.txt
Here's yet another solution (assumes a character set where ':' is octal 72, e.g. ASCII):
perl -l72 -pe '$\="\n" if eof' INPUT.TXT
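To check that a candidate solution really ends with exactly one newline and no trailing colon, the output can be inspected byte by byte. A small sketch using the paste answer above (join_with_colon is a hypothetical wrapper; - tells paste to read standard input):

```shell
# Join all input lines with ":" and terminate the result with one newline.
join_with_colon() {
  paste -s -d: "${1:--}"
}

printf 'A\nB\nC\n' | join_with_colon            # the joined line
printf 'A\nB\nC\n' | join_with_colon | od -c    # shows the final \n explicitly
```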
