Parsing line continuations - unix

What is the simplest way to parse line continuation characters? This seems like such a basic action that I'm surprised there's no basic command for doing this. 'while read' and 'while read -r' loops don't do what I want, and the easiest solution I've found is the sed solution below. Is there a way to do this with something basic like tr?
$ cat input
Output should be \
one line with a '\' character.
$ while read l; do echo $l; done < input
Output should be one line with a '' character.
$ while read -r l; do echo $l; done < input
Output should be \
one line with a '\' character.
$ sed '/\\$/{N; s/\\\n//;}' input
Output should be one line with a '\' character.
$ perl -0777 -pe 's/\\\n//s' input
Output should be one line with a '\' character.

If by "simplest" you mean concise and legible, I'd suggest your perl-ism with one small modification:
$ perl -pe 's/\\\n//' /tmp/line-cont
No need for the possibly memory intensive ... -0777 ... (whole file slurp mode) switch.
If, however, by "simplest" you mean not leaving the shell, this will suffice:
$ { while read -r LINE; do
printf "%s" "${LINE%\\}"; # strip line-continuation, if any
test "${LINE##*\\}" && echo; # emit newline for non-continued lines
done; } < /tmp/input
(I prefer printf "%s" $USER_INPUT to echo $USER_INPUT because echo cannot portably be told to stop looking for switches, and printf is commonly a built-in anyway.)
Just tuck that in a user-defined function and never be revolted by it again. Caution: this latter approach will add a trailing newline to a file which lacks one.
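A minimal sketch of such a function, under a few assumptions: the name join_continuations is made up for illustration, it takes the file name as its only argument, and it uses IFS= so leading whitespace survives. Like the loop above, it still ends the output with a newline even if the input lacked one.

```shell
# join_continuations FILE  (hypothetical helper name, not a standard command)
# Joins each line that ends in a backslash with the line that follows it.
join_continuations() {
    # "|| [ -n "$line" ]" also processes a final line that has no newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            *\\) printf '%s' "${line%\\}" ;;   # continued: drop the backslash, emit no newline
            *)   printf '%s\n' "$line" ;;      # ordinary line: print it unchanged
        esac
    done < "$1"
}
```

Usage would be join_continuations input. Unlike the test "${LINE##*\\}" variant, this sketch also emits a newline for empty lines.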

The regex way looks like the way to go.

I would go with the Perl solution simply because it will likely be the most extensible if you want to add more functionality later.

Related

Attempted to use awk sqrt but only returns 0

I am attempting to use the sqrt function from awk command in my script, but all it returns is 0. Is there anything wrong with my script below?
echo "enter number"
read root
awk 'BEGIN{ print sqrt($root) }'
This is my first time using the awk command, are there any mistakes that I am not understanding here?
Maybe you can try this.
echo "enter number"
read root
echo "$root" | awk '{print sqrt($0)}'
You have to give a data input to awk. So, you can pipe 'echo'.
The BEGIN statement is to do things, like print a header...etc before
awk starts reading the input.
$ echo "enter number"
enter number
$ read root
3
$ awk -v root="$root" 'BEGIN{ print sqrt(root) }'
1.73205
See the comp.unix.shell FAQ for the 2 correct ways to pass the value of a shell variable to an awk script.
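For reference, here is a sketch of two approaches that avoid splicing shell text into the awk program: -v and ENVIRON. Whether these are exactly the two ways the FAQ has in mind is an assumption on my part, and the function names sqrt_v and sqrt_env are made up for illustration.

```shell
# 1. -v assigns the shell value to an awk variable before BEGIN runs.
sqrt_v() {
    awk -v n="$1" 'BEGIN { print sqrt(n) }'
}

# 2. ENVIRON reads the value from the environment that awk was started with.
sqrt_env() {
    n="$1" awk 'BEGIN { print sqrt(ENVIRON["n"]) }'
}

sqrt_v 3
sqrt_env 3
```

One difference worth knowing: awk processes backslash escape sequences in a value passed with -v, while a value read through ENVIRON is taken literally.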
UPDATE : My proposed solution turns out to be potentially dangerous. See Ed Morton's answer for a better solution. I'll leave this answer here as a warning.
Because of the single quotes, $root is interpreted by awk, not by the shell. awk treats root as an uninitialized variable, whose value is the empty string, treated as 0 in a numeric context. $root is therefore the root'th field of the current line, which here means $0, the entire line. Since it's in a BEGIN block, there is no current line, so $root is the empty string, which again is treated as 0 when passed to sqrt().
You can see this by changing your command line a bit:
$ awk 'BEGIN { print sqrt("") }'
0
$ echo 2 | awk '{ print sqrt($root) }'
1.41421
NOTE: The above is merely to show what's wrong with the original command, and how it's interpreted by the shell and by awk.
One solution is to use double quotes rather than single quotes. The shell expands variable references within double quotes:
$ echo "enter number"
enter number
$ read x
2
$ awk "BEGIN { print sqrt($x) }" # DANGEROUS
1.41421
You'll need to be careful when doing this kind of thing. The interaction between quoting and variable expansion in the shell vs. awk can be complicated.
UPDATE: In fact, you need to be extremely careful. As Ed Morton points out in a comment, this method can result in arbitrary code execution given a malicious value for $x, which is always a risk for a value read from user input. His answer avoids that problem.
(Note that I've changed the name of your shell variable from $root to $x, since it's the number whose square root you want, not the root itself.)
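If the value really does come from user input, one extra precaution is to reject anything that doesn't look like a number before it reaches awk at all. A rough sketch, assuming only non-negative decimal numbers are wanted (is_nonneg_number is a made-up name):

```shell
# is_nonneg_number VALUE  (hypothetical helper name)
# Succeeds only for digits with at most one decimal point, e.g. 2 or 1.5.
is_nonneg_number() {
    case $1 in
        ''|.|*[!0-9.]*|*.*.*) return 1 ;;   # empty, lone dot, other characters, two dots
        *) return 0 ;;
    esac
}

x=2
if is_nonneg_number "$x"; then
    awk -v x="$x" 'BEGIN { print sqrt(x) }'
else
    echo "not a number: $x" >&2
fi
```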

Unix add a comma in the hundredths place and a $ to the last field

I have a number in the last field of my text file and I need to add a dollar sign to each line and a comma in the hundredths place of the number. So 10000 would now be $10,000.
one of the lines looks like this
World fair:399-454-9999:832 ponce Drive, Gary, IN 87878:3/22/62:24500
need it to look like this
World fair:399-454-9999:832 ponce Drive, Gary, IN 87878:3/22/62:$24,500
You can use the ' printf format flag to get the thousands groupings.
(I can't find a good reference for it but it is in the printf man page at least.)
The SUSv2 specifies one further flag character.
'
For decimal conversion (i, d, u, f, F, g, G) the output is to be grouped with thousands' grouping characters if the locale information indicates any. Note that many versions of gcc(1) cannot parse this option and will issue a warning. SUSv2 does not include %'F.
Then you just need a fairly simple application of awk.
awk -F : -v OFS=: '{$NF="$"sprintf("%\047d", $NF)}7' file
-F : sets the field separator to : so we get just the number in the final field
-v OFS=: sets the output field separator to : so awk puts the colons back for us
\047 is the octal code for a single quote to embed it in the single-quoted string easily
7 is a truth-y value to cause awk to print the line
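Since the ' flag only groups digits when the locale defines a thousands separator (the C locale does not), here is a sketch that inserts the commas by hand instead. It assumes the last field is a non-negative integer; add_dollar_commas and commify are names made up for this example.

```shell
# add_dollar_commas FILE  (hypothetical helper name)
add_dollar_commas() {
    awk -F: -v OFS=: '
        # Peel off three digits at a time from the right, prefixing each group with a comma.
        function commify(n,    out) {
            out = ""
            while (length(n) > 3) {
                out = "," substr(n, length(n) - 2) out
                n = substr(n, 1, length(n) - 3)
            }
            return n out
        }
        { $NF = "$" commify($NF); print }
    ' "$1"
}
```

Called as add_dollar_commas file, it prints the modified lines to stdout and leaves the file itself untouched.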
The Perl Cookbook offers this regex solution:
sub commify {
my $text = reverse $_[0];
$text =~ s/(\d\d\d)(?=\d)(?!\d*\.)/$1,/g;
return scalar reverse $text;
}
This can be incorporated into a specific solution:
perl -lpe 'BEGIN{sub commify {$t=reverse shift; $t=~s/(\d\d\d)(?=\d)(?!\d*\.)/$1,/g; reverse $t}} s/(\d+)$/chr(044).commify($1)/e' file
output:
World fair:399-454-9999:832 ponce Drive, Gary, IN 87878:3/22/62:$24,500
A solution using unpack:
perl -lpe 'BEGIN{sub commify {$b=reverse shift; @c=unpack("(A3)*", $b); reverse join ",", @c}} s/(\d+)$/chr(044).commify($1)/e' file
If you have the Number::Format library installed, there is a shorter solution:
perl -lpe 'BEGIN{use Number::Format "format_number"} s/(\d+)$/chr(044).format_number($1)/e' file
All of the above solutions use Perl's s/foo/bar/e substitute operator with the e flag, which eval's the bar section.
chr(044) is used to print the $ (otherwise it would be eval'd)
You can add the dollar signs and the comma separately:
sed -i "s/:\([0-9]*\)$/:\$\1/g" file.txt
sed -i "s/\([0-9]\)\([0-9]\{3\}\)$/\1,\2/g" file.txt
sed -i "s/\([0-9]\)\([0-9]\{3\}\)\([0-9]\{3\}\)$/\1,\2,\3/" zeros.txt

Unix command to prepend text to a file

Is there a Unix command to prepend some string data to a text file?
Something like:
prepend "to be prepended" text.txt
printf '%s\n%s\n' "to be prepended" "$(cat text.txt)" >text.txt
sed -i.old '1s;^;to be prepended;' inFile
-i writes the change in place and takes a backup if an extension is given (in this case, .old).
1s;^;to be prepended; substitutes the beginning of the first line by the given replacement string, using ; as a command delimiter.
Process Substitution
I'm surprised no one mentioned this.
cat <(echo "before") text.txt > newfile.txt
which is arguably more natural than the accepted answer (printing something and piping it into a substitution command is lexicographically counter-intuitive).
...and hijacking what ryan said above, with sponge you don't need a temporary file:
sudo apt-get install moreutils
<<(echo "to be prepended") < text.txt | sponge text.txt
EDIT: Looks like this doesn't work in Bourne Shell /bin/sh
Here String (zsh only)
Using a here-string - <<<, you can do:
<<< "to be prepended" < text.txt | sponge text.txt
This is one possibility:
(echo "to be prepended"; cat text.txt) > newfile.txt
you'll probably not easily get around an intermediate file.
Alternatives (can be cumbersome with shell escaping):
sed -i '0,/^/s//to be prepended/' text.txt
If it's acceptable to replace the input file:
Note:
Doing so may have unexpected side effects, notably potentially replacing a symlink with a regular file, ending up with different permissions on the file, and changing the file's creation (birth) date.
sed -i, as in Prince John Wesley's answer, tries to at least restore the original permissions, but the other limitations apply as well.
Here's a simple alternative that uses a temporary file (it avoids reading the whole input file into memory the way that shime's solution does):
{ printf 'to be prepended'; cat text.txt; } > tmp.txt && mv tmp.txt text.txt
Using a group command ({ ...; ...; }) is slightly more efficient than using a subshell ((...; ...)), as in 0xC0000022L's solution.
The advantages are:
It's easy to control whether the new text should be directly prepended to the first line or whether it should be inserted as new line(s) (simply append \n to the printf argument).
Unlike the sed solution, it works if the input file is empty (0 bytes).
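The same idea can be wrapped in a small function. This is only a sketch under some assumptions: mktemp is available (it is widespread but not POSIX), the name prepend_text is made up, and the text is inserted as a whole line. It copies the result back with cat rather than mv, which keeps the original inode, symlink and permissions at the cost of not being atomic.

```shell
# prepend_text TEXT FILE  (hypothetical helper name)
prepend_text() {
    tmp=$(mktemp) || return 1
    # Build the new content in the temporary file, then copy it back over the original.
    { printf '%s\n' "$1"; cat -- "$2"; } > "$tmp" && cat -- "$tmp" > "$2"
    status=$?
    rm -f -- "$tmp"
    return "$status"
}
```

Usage: prepend_text 'to be prepended' text.txt. It also works when text.txt is empty.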
The sed solution can be simplified if the intent is to prepend one or more whole lines to the existing content (assuming the input file is non-empty):
sed's i function inserts whole lines:
With GNU sed:
# Prepends 'to be prepended' *followed by a newline*, i.e. inserts a new line.
# To prepend multiple lines, use '\n' as part of the text.
# -i.old creates a backup of the input file with extension '.old'
sed -i.old '1 i\to be prepended' inFile
A portable variant that also works with macOS / BSD sed:
# Prepends 'to be prepended' *followed by a newline*
# To prepend multiple lines, escape the ends of intermediate
# lines with '\'
sed -i.old -e '1 i\
to be prepended' inFile
Note that the literal newline after the \ is required.
If the input file must be edited in place (preserving its inode with all its attributes):
Using the venerable ed POSIX utility:
Note:
ed invariably reads the input file as a whole into memory first.
To prepend directly to the first line (as with sed, this won't work if the input file is completely empty (0 bytes)):
ed -s text.txt <<EOF
1 s/^/to be prepended/
w
EOF
-s suppresses ed's status messages.
Note how the commands are provided to ed as a multi-line here-document (<<EOF\n...\nEOF), i.e., via stdin; by default string expansion is performed in such documents (shell variables are interpolated); quote the opening delimiter to suppress that (e.g., <<'EOF').
1 makes the 1st line the current line
function s performs a regex-based string substitution on the current line, as in sed; you may include literal newlines in the substitution text, but they must be \-escaped.
w writes the result back to the input file (for testing, replace w with ,p to only print the result, without modifying the input file).
To prepend one or more whole lines:
As with sed, the i function invariably adds a trailing newline to the text to be inserted.
ed -s text.txt <<EOF
0 i
line 1
line 2
.
w
EOF
0 i makes 0 (the beginning of the file) the current line and starts insert mode (i); note that line numbers are otherwise 1-based.
The following lines are the text to insert before the current line, terminated with . on its own line.
This will work to form the output. The - means standard input, which is provided via the pipe from echo.
echo -e "to be prepended \n another line" | cat - text.txt
To rewrite the file, a temporary file is required, because you cannot pipe back into the input file.
echo "to be prepended" | cat - text.txt > text.txt.tmp
mv text.txt.tmp text.txt
Prefer Adam's answer
We can make it easier by using sponge. Now we don't need to create a temporary file and rename it:
echo -e "to be prepended \n another line" | cat - text.txt | sponge text.txt
Probably nothing built-in, but you could write your own pretty easily, like this:
#!/bin/bash
echo -n "$1" > /tmp/tmpfile.$$
cat "$2" >> /tmp/tmpfile.$$
mv /tmp/tmpfile.$$ "$2"
Something like that at least...
Editor's note:
This command will result in data loss if the input file happens to be larger than your system's pipeline buffer size, which is typically 64 KB nowadays. See the comments for details.
In some circumstances the text to prepend may be available only from stdin.
Then this combination shall work.
echo "to be prepended" | cat - text.txt | tee text.txt
If you want to omit tee output, then append > /dev/null.
Another way using sed:
sed -i.old '1 {i to be prepended
}' inFile
If the line to be prepended is multiline:
sed -i.old '1 {i\
to be prepended\
multiline
}' inFile
Solution:
printf '%s\n%s' 'text to prepend' "$(cat file.txt)" > file.txt
Note that this is safe on all kinds of input, because there are no expansions. For example, if you want to prepend !@#$%^&*()ugly text\n\t\n, it will just work:
printf '%s\n%s' '!@#$%^&*()ugly text\n\t\n' "$(cat file.txt)" > file.txt
The last part left for consideration is whitespace removal at end of file during command substitution "$(cat file.txt)". All work-arounds for this are relatively complex. If you want to preserve newlines at end of file.txt, see this: https://stackoverflow.com/a/22607352/1091436
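For what it's worth, one common workaround is to append a sentinel character inside the command substitution and strip it afterwards. A sketch, not necessarily what the linked answer does, with prepend_keep_newlines as a made-up name:

```shell
# prepend_keep_newlines TEXT FILE  (hypothetical helper name)
prepend_keep_newlines() {
    # The trailing "x" keeps $(...) from stripping the file's final newlines.
    content=$(cat -- "$2"; printf x)
    content=${content%x}
    printf '%s\n%s' "$1" "$content" > "$2"
}
```

Because the whole file is held in a shell variable, this is only suitable for text files of modest size without NUL bytes.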
As tested in Bash (in Ubuntu), if starting with a test file via;
echo "Original Line" > test_file.txt
you can execute;
echo "$(echo "New Line"; cat test_file.txt)" > test_file.txt
or, if the version of bash is too old for $(), you can use backticks;
echo "`echo "New Line"; cat test_file.txt`" > test_file.txt
and receive the following contents of "test_file.txt";
New Line
Original Line
No intermediary file, just bash/echo.
Another fairly straightforward solution is:
$ echo -e "string\n$(cat file)"
% echo blaha > blaha
% echo fizz > fizz
% cat blaha fizz > buzz
% cat buzz
blaha
fizz
You can do that easily with awk
cat text.txt|awk '{print "to be prepended"$0}'
The question, however, seems to be about prepending a string to the file, not to each line of it. In that case, as suggested by Tom Ekberg, the following command should be used instead.
awk 'BEGIN{print "to be prepended"} {print $0}' text.txt
If you like vi/vim, this may be more your style.
printf '0i\n%s\n.\nwq\n' prepend-text | ed file
For future readers who want to prepend one or more lines of text (with variables or even subshell code) and keep it readable and formatted, you may enjoy this:
echo "Lonely string" > my-file.txt
Then run
cat <<EOF > my-file.txt
Hello, there!
$(cat my-file.txt)
EOF
Results of cat my-file.txt:
Hello, there!
Lonely string
This works because the read of my-file.txt happens first and in a subshell. I use this trick all the time to prepend important rules to config files in Docker containers rather than copy over entire config files.
you can use variables
Even though a bunch of answers here work pretty well, I want to contribute this one-liner, just for completeness. At least it is easy to keep in mind and maybe contributes to some general understanding of bash for some people.
PREPEND="new line 1"; FILE="text.txt"; printf "${PREPEND}\n`cat $FILE`" > $FILE
In this snippet, just replace text.txt with the text file you want to prepend to and new line 1 with the text to prepend.
example
$ printf "old line 1\nold line 2" > text.txt
$ cat text.txt; echo ""
old line 1
old line 2
$ PREPEND="new line 1"; FILE="text.txt"; printf "${PREPEND}\n`cat $FILE`" > $FILE
$ cat text.txt; echo ""
new line 1
old line 1
old line 2
$
# create a file with content..
echo foo > /tmp/foo
# prepend a line containing "jim" to the file
sed -i "1s/^/jim\n/" /tmp/foo
# verify the content of the file has the new line prepended to it
cat /tmp/foo
I'd recommend defining a function and then importing and using that where needed.
prepend_to_file() {
file=$1
text=$2
# create the file first if it does not exist yet
if ! [[ -f $file ]]; then
touch "$file"
fi
echo "$text" | cat - "$file" > "$file.new"
mv -f "$file.new" "$file"
}
Then use it like so:
prepend_to_file test.txt "This is first"
prepend_to_file test.txt "This is second"
Your file contents will then be:
This is second
This is first
I'm about to use this approach for implementing a change log updater.
With ex,
ex - $file << PREPEND
-1
i
prepended text
.
wq
PREPEND
The ex commands are
-1 Go to the very beginning of the file
i Begin insert mode
. End insert mode
wq Save (write) and quit

Interpret as fixed string/literal and not regex using sed

For grep there's a fixed string option, -F (fgrep) to turn off regex interpretation of the search string.
Is there a similar facility for sed? I couldn't find anything in the man. A recommendation of another gnu/linux tool would also be fine.
I'm using sed for the find and replace functionality: sed -i "s/abc/def/g"
Do you have to use sed? If you're writing a bash script, you can do
#!/bin/bash
pattern='abc'
replace='def'
file=/path/to/file
tmpfile="${TMPDIR:-/tmp}/$( basename "$file" ).$$"
while read -r line
do
echo "${line//$pattern/$replace}"
done < "$file" > "$tmpfile" && mv "$tmpfile" "$file"
With an older Bourne shell (such as ksh88 or POSIX sh), you may not have that cool ${var/pattern/replace} structure, but you do have ${var#pattern} and ${var%pattern}, which can be used to split the string up and then reassemble it. If you need to do that, you're in for a lot more code - but it's really not too bad.
If you're not in a shell script already, you could pretty easily make the pattern, replace, and filename parameters and just call this. :)
PS: The ${TMPDIR:-/tmp} structure uses $TMPDIR if that's set in your environment, or uses /tmp if the variable isn't set. I like to stick the PID of the current process on the end of the filename in the hopes that it'll be slightly more unique. You should probably use mktemp or similar in the "real world", but this is ok for a quick example, and the mktemp binary isn't always available.
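To give an idea of what the ${var#pattern} / ${var%pattern} approach might look like, here is a sketch in plain POSIX sh. The name replace_all is made up, and the variables it uses are global because POSIX sh has no local. Quoting "$pat" inside the expansions makes the shell treat it as a literal string rather than a glob pattern.

```shell
# replace_all STRING PATTERN REPLACEMENT  (hypothetical helper name)
# Prints STRING with every literal occurrence of PATTERN replaced.
replace_all() {
    rest=$1 pat=$2 rep=$3 out=
    # An empty pattern would match forever, so pass the string through unchanged.
    [ -n "$pat" ] || { printf '%s\n' "$rest"; return; }
    while :; do
        case $rest in
            *"$pat"*)
                out=$out${rest%%"$pat"*}$rep   # text before the first match, then the replacement
                rest=${rest#*"$pat"}           # carry on with the text after that match
                ;;
            *) break ;;
        esac
    done
    printf '%s\n' "$out$rest"
}
```

Inside the while read -r line loop above, echo "${line//$pattern/$replace}" would then become replace_all "$line" "$pattern" "$replace".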
Option 1) Escape regexp characters. E.g. sed 's/\$0\.0/0/g' will replace all occurrences of $0.0 with 0.
Option 2) Use perl -p -e in conjunction with quotemeta (\Q...\E). E.g. perl -p -e 's/\Q.\E/,/g' will replace all occurrences of . with ,.
You can use option 2 in scripts like this:
SEARCH="C++"
REPLACE="C#"
cat $FILELIST | perl -p -e "s/\\Q$SEARCH\\E/$REPLACE/g" > $NEWLIST
If you're not opposed to Ruby or long lines, you could use this:
alias replace='ruby -e "File.write(ARGV[0], File.read(ARGV[0]).gsub(ARGV[1]) { ARGV[2] })"'
replace test3.txt abc def
This loads the whole file into memory, performs the replacements and saves it back to disk. Should probably not be used for massive files.
If you don't want to escape your string, you can reach your goal in 2 steps:
fgrep the line (getting the line number) you want to replace, and
afterwards use sed for replacing this line.
E.g.
#!/bin/sh
PATTERN='foo*[)*abc' # we need it literal
LINENUMBER="$( fgrep -n "$PATTERN" "$FILE" | cut -d':' -f1 )"
NEWSTRING='my new string'
sed -i "${LINENUMBER}s/.*/$NEWSTRING/" "$FILE"
You can do this in two lines of bash code if you're OK with reading the whole file into memory. This is quite flexible -- the pattern and replacement can contain newlines to match across lines if needed. It also preserves any trailing newline or lack thereof, which a simple loop with read does not.
mapfile -d '' < file
printf '%s' "${MAPFILE//"$pat"/"$rep"}" > file
For completeness, if the file can contain null bytes (\0), we need to extend the above, and it becomes
mapfile -d '' < <(cat file; printf '\0')
last=${MAPFILE[-1]}; unset "MAPFILE[-1]"
printf '%s\0' "${MAPFILE[@]//"$pat"/"$rep"}" > file
printf '%s' "${last//"$pat"/"$rep"}" >> file
perl -i.orig -pse 'while (($i = index($_,$s)) >= 0) { substr($_,$i,length($s), $r)}' -- \
-s='$_REQUEST['\'old\'']' -r='$_REQUEST['\'new\'']' sample.txt
-i.orig in-place modification with backup.
-p print lines from the input file by default
-s enable rudimentary parsing of command line arguments
-e run this script
index($_,$s) search for the $s string
substr($_,$i,length($s), $r) replace the string
while (($i = index($_,$s)) >= 0) repeat until
-- end of perl parameters
-s='$_REQUEST['\'old\'']', -r='$_REQUEST['\'new\'']' - set $s,$r
You still need to "escape" ' chars but the rest should be straight forward.
Note: this started as an answer to How to pass special character string to sed hence the $_REQUEST['old'] strings, however this question is a bit more appropriately formulated.
You should be using replace instead of sed.
From the man page:
The replace utility program changes strings in place in files or on the
standard input.
Invoke replace in one of the following ways:
shell> replace from to [from to] ... -- file_name [file_name] ...
shell> replace from to [from to] ... < file_name
from represents a string to look for and to represents its replacement.
There can be one or more pairs of strings.

UNIX: Replace Newline w/ Colon, Preserving Newline Before EOF

I have a text file ("INPUT.txt") of the format:
A<LF>
B<LF>
C<LF>
D<LF>
X<LF>
Y<LF>
Z<LF>
<EOF>
which I need to reformat to:
A:B:C:D:X:Y:Z<LF>
<EOF>
I know you can do this with 'sed'. There's a billion google hits for doing this with 'sed'. But I'm trying to emphasize readability, simplicity, and using the correct tool for the correct job. 'sed' is a line editor that consumes and hides newlines. Probably not the right tool for this job!
I think the correct tool for this job would be 'tr'. I can replace all the newlines with colons with the command:
cat INPUT.txt | tr '\n' ':'
There's 99% of my work done. I have a problem now, though. By replacing all the newlines with colons, I not only get an extraneous colon at the end of the sequence, but I also lose the newline at the end of the input. It looks like this:
A:B:C:D:X:Y:Z:<EOF>
Now, I need to remove the colon from the end of the input. However, if I attempt to pass this processed input through 'sed' to remove the final colon (which would now, I think, be a proper use of 'sed'), I find myself with a second problem. The input is no longer terminated by a newline at all! 'sed' fails outright, for all commands, because it never finds the end of the first line of input!
It seems like appending a newline to the end of some input is a very, very common task, and considering I myself was just sorely tempted to write a program to do it in C (which would take about eight lines of code), I can't imagine there's not already a very simple way to do this with the tools already available to you in the Linux kernel.
This should do the job (cat and echo are unnecessary):
tr '\n' ':' < INPUT.TXT | sed 's/:$/\n/'
Using only sed:
sed -n ':a; $ ! {N;ba}; s/\n/:/g;p' INPUT.TXT
Bash without any externals:
string=($(<INPUT.TXT))
string=${string[@]/%/:}
string=${string//: /:}
string=${string%*:}
Using a loop in sh:
colon=''
while read -r line
do
string=$string$colon$line
colon=':'
done < INPUT.TXT
Using AWK:
awk '{a=a colon $0; colon=":"} END {print a}' INPUT.TXT
Or:
awk '{printf colon $0; colon=":"} END {printf "\n" }' INPUT.TXT
Edit:
Here's another way in pure Bash:
string=($(<INPUT.TXT))
saveIFS=$IFS
IFS=':'
newstring="${string[*]}"
IFS=$saveIFS
Edit 2:
Here's yet another way which does use echo:
echo "$(tr '\n' ':' < INPUT.TXT | head -c -1)"
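Note that head -c with a negative count is a GNU extension. A sketch of the same idea that leans on parameter expansion instead, assuming the joined text fits comfortably in a shell variable (join_with_colons is a made-up name):

```shell
# join_with_colons FILE  (hypothetical helper name)
join_with_colons() {
    joined=$(tr '\n' ':' < "$1")    # e.g. A:B:C: (every newline became a colon)
    printf '%s\n' "${joined%:}"     # drop the final colon and put one newline back
}
```

Usage: join_with_colons INPUT.txt.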
Old question, but
paste -sd: INPUT.txt
Here's yet another solution: (assumes a character set where ':' is
octal 72, eg ascii)
perl -l72 -pe '$\="\n" if eof' INPUT.TXT

Resources