Break string into two separate lines - unix

I have a string in this format, (PUNC, on one line
and I need to break it into two separate lines, as follows:
(
PUNC
How can I do that?

You can do:
s='(PUNC'
r="${s/\(/(\n}"
echo -e "$r"
(
PUNC
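The same substitution can be written without relying on echo -e, by putting a real newline character into the replacement. This is only a minimal sketch: the function name split_after_paren and the helper variable nl are illustrative and do not come from the answer above.

```shell
#!/usr/bin/env bash
# split_after_paren: hypothetical helper name, used only for illustration.
split_after_paren() {
    local s=$1
    local nl=$'\n'    # a real newline character, no echo -e needed
    # Replace the first "(" with "(" followed by the newline.
    printf '%s\n' "${s/\(/(${nl}}"
}

split_after_paren '(PUNC'
```

Using printf instead of echo -e avoids differences between echo implementations when the string itself contains backslashes.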

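Try this:
echo "(
PUNC"
A literal newline inside double quotes is kept as part of the string, so echo prints two lines. Do not end the first line with a "\": inside double quotes, a backslash followed by a newline is a line continuation, and the shell would join the two lines back into (PUNC.
echo is only an example here. Show your code, then I can adapt the answer.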

Related

Adding previous lines to current line unless pattern is found in unix shell script

I am facing an issue while adding previous lines to current line for a pattern. I have a 43 MB file in unix. The snippet is shown below:
AAA7034 new value and a old value
A
78698 new line and old value
BCA0987 old value and new value
new value
What I want is :
AAA7034 new value and a old value A 78698 new line and old value
BCA0987 old value and new value new value
That is, I have to append all the following lines to the line where a pattern was found, until the next pattern is found (the first pattern is AAA and the next pattern is BCA).
Because the files are so large, I am not sure whether awk/sed will work. Any bash script is appreciated.
You can combine all patterns and perform a regex match. Try something like this (it is just a rough sketch; trim the output if you need to):
#!/bin/bash
patterns="^(AAA|BCS|BABA|BCA)"
file="$1"
while IFS= read -r line; do
    if [[ "$line" =~ $patterns ]]; then
        echo    # start a new output line before each matching line
    fi
    printf '%s ' "$line"    # print the line itself and a space as a separator
done < "$file"
echo    # terminate the last output line
You can redirect the output to a file, of course.
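The loop above prints an empty line before the first record and leaves a trailing space on every line. A variant that buffers each record avoids both; this is a sketch under the assumption that lines appearing before the first match should simply start a record of their own. The function name join_records is hypothetical.

```shell
#!/usr/bin/env bash
# join_records: hypothetical helper. Reads stdin and appends each
# non-matching line to the most recent line that matched the pattern.
join_records() {
    local pattern=$1 line current='' have_record=0
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line =~ $pattern ]] && (( have_record )); then
            # A new record starts: flush the buffered one first.
            printf '%s\n' "$current"
            current=$line
        elif (( have_record )); then
            current+=" $line"
        else
            # First line of the input: start the first record.
            current=$line
            have_record=1
        fi
    done
    if (( have_record )); then
        printf '%s\n' "$current"
    fi
}

printf '%s\n' 'AAA7034 new value and a old value' 'A' \
    '78698 new line and old value' \
    'BCA0987 old value and new value' 'new value' |
    join_records '^(AAA|BCS|BABA|BCA)'
```

A pure bash read loop is likely to be slow on a 43 MB file; the awk and perl answers in this thread should handle that size more comfortably.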
It's not really clear precisely what you want. You've stated that you want to match the patterns 'AAA' and 'BCA', and later expanded that to "patter shall be like: AAA, BCS, BABA, BCA". I don't know if that means that you only want to match 'AAA', 'BCS', 'BABA', and 'BCA', or if you want to match 3- or 4-character strings containing only 'A', 'B', 'C', and 'S', but it sounds like you are just looking for:
awk '/[A-Z]{3,4}/{printf "\n"} { printf "%s ", $0} END {printf "\n"}' input-file
Change the pattern as needed when your requirements are made more precise.
Based on the comment, it is trivial to convert any awk program to perl. Here is (basically) the output of a2p on the above awk script, with changes to reflect the stated pattern:
#!/usr/bin/env perl
while (<>) {
    chomp;
    if (/AAA|BCA|BCS|BABA/) {
        printf "\n";
    }
    printf '%s ', $_;
}
printf "\n";
You can simplify that a bit:
perl -ne 'chomp; printf "\n" if /AAA|BCA|BCS|BABA/; printf "%s ", $_' input-file; echo
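The awk and perl versions above emit a newline before the first record and a trailing space after each line. If that matters, the record can be buffered and printed once it is complete. The following is a sketch, not the answerer's code: the wrapper name join_records_awk and the awk variables pat, buf and have are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# join_records_awk: hypothetical wrapper around awk. Reads stdin.
join_records_awk() {
    awk -v pat="$1" '
        # A line matching the start pattern flushes the previous record.
        $0 ~ pat { if (have) print buf; buf = $0; have = 1; next }
        # Any other line is appended to the current record.
        { buf = have ? buf " " $0 : $0; have = 1 }
        END { if (have) print buf }
    '
}

printf '%s\n' 'AAA7034 new value and a old value' 'A' \
    '78698 new line and old value' \
    'BCA0987 old value and new value' 'new value' |
    join_records_awk '^(AAA|BCS|BABA|BCA)'
```

Anchoring the pattern with ^ restricts matches to the start of a line, which seems to be what the question describes.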

Split line into multiple lines of 42 Unix after last given char

I have a text file in unix formed from multiple long lines
ALTER Tit como(titel('42423432;434235111;757567562;2354679;5543534;6547673;32322332;54545453'))
ALTER Mit como(Alt('432322;434434211;754324237562;2354679;5543534;6547673;32322332;54545453'))
I need to split each line into multiple lines of no more than 42 characters.
Each split should be made after the last ";" that still fits, and
so my ideal output file would be:
ALTER Tit como(titel('42423432;434235111; -
757567562;2354679;5543534;6547673; -
32322332;54545453'))
ALTER Mit como(Alt('432322;434434211; -
754324237562;2354679;5543534;6547673; -
32322332;54545453'))
I used fold -w 42 givenfile.txt | sed 's/ $/ -/g'
It splits the lines, but it doesn't add the "-" at the end of each line and doesn't split after the ";".
Any help is much appreciated.
Thanks!
awk -F';' '
w{
    print""
}
{
    w=length($1)
    printf "%s",$1
    for (i=2;i<=NF;i++){
        if ((w+length($i)+1)<42){
            w+=length($i)+1
            printf";%s",$i
        } else {
            w=length($i)
            printf"; -\n%s",$i
        }
    }
}
END{
    print""
}
' file
This produces the output:
ALTER Tit como(titel('42423432;434235111; -
757567562;2354679;5543534;6547673; -
32322332;54545453'))
ALTER Mit como(Alt('432322;434434211; -
754324237562;2354679;5543534;6547673; -
32322332;54545453'))
How it works
Awk implicitly loops through each line of its input and each line is divided into fields. This code uses a single variable w to keep track of the current width of the output line.
-F';'
Tell awk to break fields on semicolons.
w{print""}
If the last line was not completed, w>0, then print a newline to terminate it before we start with a new line.
w=length($1); printf "%s",$1
Print the first field of the new line and set w according to its length.
Loop over the remaining fields:
for (i=2;i<=NF;i++){
    if ((w+length($i)+1)<42){
        w+=length($i)+1
        printf";%s",$i
    } else {
        w=length($i)
        printf"; -\n%s",$i
    }
}
This loops over the second to final fields of this line. Whenever we reach the point where we can't print another field without exceeding the 42 character limit, we print ; -\n.
END{print""}
Print a newline at the end of the file.
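To experiment with other limits, the same logic can be wrapped in a shell function that takes the width as an argument. This is a sketch: the function name wrap_at_semicolons and the awk variable max are not part of the answer, and the newline is printed at the end of each record rather than through the w{print""} rule.

```shell
#!/usr/bin/env bash
# wrap_at_semicolons: hypothetical wrapper. Reads stdin; the optional
# first argument is the width limit (defaults to 42, as in the question).
wrap_at_semicolons() {
    awk -F';' -v max="${1:-42}" '
    {
        w = length($1)              # width of the current output line
        printf "%s", $1
        for (i = 2; i <= NF; i++) {
            if (w + length($i) + 1 < max) {
                w += length($i) + 1
                printf ";%s", $i
            } else {
                w = length($i)      # the field starts a new output line
                printf "; -\n%s", $i
            }
        }
        print ""                    # terminate the record
    }'
}

printf '%s\n' "ALTER Tit como(titel('42423432;434235111;757567562;2354679;5543534;6547673;32322332;54545453'))" |
    wrap_at_semicolons 42
```

As in the original, the width check counts only the field text and the semicolons between fields; the trailing "; -" marker is not included in w.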
This might work for you (GNU sed):
sed -r 's/.{1,42}$|.{1,41};/& -\n/g;s/...$//' file
This globally matches either 1 to 41 characters followed by a ; or 1 to 42 characters up to the end of the line, and appends " -\n" to each match. The last piece of the line therefore ends with three unwanted characters (space, hyphen, newline), which the second substitution deletes.

Why does awk function only return last line from file?

I am using awk to reformat some fields in a file and an awk function to fix one field value if it is negative. Here is my awk command:
awk 'function fix_neg(value) {\
if(value < 0)\
return '$new_value'\
else\
return value\
} END { print $2,$1,fix_neg($3) }' input_file.txt
where $new_value was set before this call. I do not understand why this only returns the reformatted last line of input_file.txt (which contains multiple lines of data).
Thanks for your help.
Try this:
awk -v newV="$new_value" '{print $2,$1,($3<0?newV:$3)}' inputfile
In your program, you only got the last line data because you put your print statement in the END{..} block. It is triggered after the whole file was processed, not for each line. Drop the END and it would work as you intended.
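If you would rather keep the function, the same fix applies: move the print out of END into a main rule and pass the shell value in with -v instead of splicing it into the program text. A minimal sketch, where the wrapper name fix_fields is hypothetical:

```shell
#!/usr/bin/env bash
# fix_fields: hypothetical wrapper. Reads stdin; $1 is the replacement
# value to use when the third field is negative.
fix_fields() {
    awk -v newV="$1" '
        # value + 0 forces a numeric comparison.
        function fix_neg(value) { return (value + 0 < 0) ? newV : value }
        # A rule without END runs once for every input line.
        { print $2, $1, fix_neg($3) }
    '
}

printf '%s\n' 'a b -1' 'c d 5' | fix_fields 0
```

For the two sample lines this prints "b a 0" and "d c 5", one output line per input line.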

pattern matching and delete all the lines except the last occurrence

I have a txt file with 100+ lines. I want to search for a pattern and delete all the matching lines except the last occurrence.
Here are the lines from the txt file.
My search patterns are "string1=", "string2=", "string3=", "string4=" and "string5=".
string1=hi
string2=hello
string3=welcome
string3=welcome1
string3=
string4=hi
string5=hello
I want to go through each line, keep the empty "string3=" line in the file, and remove the "string3=welcome" and "string3=welcome1" lines.
Please help me.
For a single pattern, you can start with something like this:
grep "string3" input | tail -1
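#!/usr/bin/perl
my %h;
while (<STDIN>) {
    # Split on the first "=" only, so values that contain "=" stay intact.
    my ($k, $v) = split /=/, $_, 2;
    $h{$k} = $v;    # a later line with the same key overwrites the earlier one
}
foreach my $k ( sort keys %h ) {
    print "$k=$h{$k}";    # the value still carries its trailing newline
}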
The perl script here reads your list from stdin and produces the output you describe. It assumes you want the keys (string*) in sorted order.
If you only want the lines whose keys are string1 through string5, you can put a match at the beginning of the while loop, like so:
next if ! /^string[1-5]=/;
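If the original line order should be preserved instead of sorted, one option is to read the file twice with awk: the first pass records the line number of the last occurrence of each key, and the second pass prints only those lines. This is a sketch that assumes the key is everything before the first "="; the function name keep_last_per_key is hypothetical.

```shell
#!/usr/bin/env bash
# keep_last_per_key: hypothetical helper. $1 is the input file, which
# awk reads twice.
keep_last_per_key() {
    local file=$1
    awk -F= '
        NR == FNR { last[$1] = FNR; next }   # pass 1: remember last line per key
        last[$1] == FNR                      # pass 2: print only that line
    ' "$file" "$file"
}

# Usage (file names are placeholders):
# keep_last_per_key input.txt > output.txt
```

Because the file is read twice, this works on regular files but not on a pipe.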

Creating string of repeated characters in shell script [duplicate]

I need to generate a string of dots (. characters) as a variable.
I.e., in my Bash script, for input 15 I need to generate this string of length 15: ...............
I need to do so variably. I tried using this as a base (from Unix.com):
for i in {1..100};do printf "%s" "#";done;printf "\n"
But how do I get the 100 to be a variable?
You can get as many NUL bytes as you want from /dev/zero. You can then turn these into other characters. The following prints 16 lowercase a's:
head -c 16 < /dev/zero | tr '\0' '\141'
len=100 ch='#'
printf '%*s' "$len" | tr ' ' "$ch"
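Easiest and shortest way without an explicit loop:
VAR=15
Prints as many dots as VAR says (change the first dot to any other character if you like). Bash performs brace expansion before variable expansion, so {1..$VAR} does not expand to a sequence; seq generates the list instead:
printf '.%.0s' $(seq 1 "$VAR")
Saves the dotted line in a variable to be used later:
line=$(printf '.%.0s' $(seq 1 "$VAR"))
echo "Sign here $line"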
-Blatantly stolen from dogbane's answer https://stackoverflow.com/a/5349842/3319298
Edit: Since I have now switched to fish shell, here is a function defined in config.fish that does this with convenience in that shell:
function line -a char -a length
    printf '%*s\n' $length "" | tr ' ' $char
end
Usage: line = 8 produces ========, line \" 8 produces """""""".
On most systems, you could get away with a simple
N=100
myvar=`perl -e "print '.' x $N;"`
I demonstrated a way to accomplish this task with a single command in another question, assuming it's a fixed number of characters to be produced.
I added an addendum to the end about producing a variable number of repeated characters, which is what you asked for, so my previous answer is relevant here:
https://stackoverflow.com/a/17030976/2284005
I provided a full explanation of how it works there. Here I'll just add the code to accomplish what you're asking for:
n=20 # This the number of characters you want to produce
variable=$(printf "%0.s." $(seq 1 $n)) # Fill $variable with $n periods
echo $variable # Output content of $variable to terminal
Outputs:
....................
You can use C-style for loops in Bash:
num=100
string=$(for ((i=1; i<=$num; i++));do printf "%s" "#";done;printf "\n")
Or without a loop, using printf without using any externals such as sed or tr:
num=100
printf -v string "%*s" $num ' ' '' $'\n'
string=${string// /#}
The solution without loops:
N=100
myvar=`seq 1 $N | sed 's/.*/./' | tr -d '\n'`
num=100
myvar=$(jot -b . -s '' $num)
echo $myvar
When I have to create a string that contains $x repetitions of a known character with $x below a constant value, I use this idiom:
base='....................'
# 0 <= $x <= ${#base}
x=5
expr "x$base" : "x\(.\{$x\}\)" # Will output '\n' too
Output:
.....
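Several of the answers above share one idea: pad an empty string to the desired width, then replace the padding with the wanted character. Wrapped in a function, that could look like the following sketch. The name repeat_char is hypothetical, and the code assumes bash, since printf -v and ${var//pattern/replacement} are not POSIX sh features.

```shell
#!/usr/bin/env bash
# repeat_char: hypothetical helper. Prints character $1 repeated $2 times.
repeat_char() {
    local ch=$1 count=$2 pad out
    # Build a string of $count spaces without running an external command.
    printf -v pad '%*s' "$count" ''
    # Replace every space with the requested character.
    out=${pad// /"$ch"}
    printf '%s\n' "$out"
}

dots=$(repeat_char . 15)
echo "Sign here $dots"
```

Quoting "$ch" inside the replacement keeps characters such as & literal in bash versions that treat an unquoted & in the replacement specially.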

Resources