sed usage not able to understand - unix

I have come across the following usage of the Unix sed command and am not able to understand what it does. Could you please help me understand it? If possible, please share a reference for understanding such uses of sed.
sed -i '/^export JAVA_HOME/ s:.*:export JAVA_HOME=/usr/java/default\nexport HADOOP_PREFIX=/usr/local/hadoop\nexport HADOOP_HOME=/usr/local/hadoop\n:' $HADOOP_PREFIX/etc/hadoop/hadoop-env.sh

The command is simple, though it assumes GNU sed because of the way it uses the -i option; for macOS Sierra and related systems, you'd need to use -i '' in place of just -i.
Overall, it corresponds to:
sed -i '/Pattern/ s:.*:Replacement:' file
where:
-i means overwrite each input file with its edited output without creating a backup copy.
/Pattern/ is ^export JAVA_HOME; a line starting with the word export and then JAVA_HOME separated by a single space.
s:.*:Replacement: is a substitute command, using : instead of the more conventional / (often s/.*/Replacement/) as the pattern delimiter. This is done because the replacement text contains slashes. The .* matches the whole line. The rest of the material is written in place of the original export JAVA_HOME line. The \n sequence expands to a newline, so it actually produces a number of lines in the output.
file is $HADOOP_PREFIX/etc/hadoop/hadoop-env.sh
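To see the effect without touching a real Hadoop installation, the same command can be run against a scratch file. This is only a sketch: the file contents are made up for illustration, and it assumes GNU sed (for -i without a suffix and for \n in the replacement).

```shell
# Scratch file standing in for hadoop-env.sh (hypothetical contents).
tmp=$(mktemp)
printf '%s\n' '# some comment' 'export JAVA_HOME=${JAVA_HOME}' 'export FOO=bar' > "$tmp"

# Same command as in the question, pointed at the scratch file. GNU sed assumed.
sed -i '/^export JAVA_HOME/ s:.*:export JAVA_HOME=/usr/java/default\nexport HADOOP_PREFIX=/usr/local/hadoop\nexport HADOOP_HOME=/usr/local/hadoop\n:' "$tmp"

result=$(cat "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

The single export JAVA_HOME line becomes three export lines followed by an empty line (from the trailing \n in the replacement); the other lines are left alone.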

As others have pointed out, this is a sed command invocation. The name is short for "Stream EDitor", and the tool is quite useful for modifying files programmatically. Your best bet is to read the man page (man sed), but I've broken down your particular command here for instructive purposes:
sed # The command
-i # Edit file in place (no backup)
'/^export JAVA_HOME/ # For every line that begins with 'export JAVA_HOME'...
s: # substitute...
.*: # the entire line with...
export JAVA_HOME=/usr/java/default
export HADOOP_PREFIX=/usr/local/hadoop
export HADOOP_HOME=/usr/local/hadoop
:' # End of command
$HADOOP_PREFIX/etc/hadoop/hadoop-env.sh # Run on the following file
Points of interest:
Commands can be limited to a particular address range or scope. Here, the scope was a search.
The substitute command can be delimited by almost any character (usually it is /, but in this case : was chosen to avoid having to escape the / characters in the file paths).
The sed expression was enclosed in ' to prevent shell expansion of variables. Although no expansions would have taken place in this scenario, it is fairly common to see the expression wrapped in ' to eliminate the possibility.
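The points about the delimiter and the address can be checked with a small experiment. The sample lines below are invented for illustration; the sed expressions follow the shape of the one in the question.

```shell
line='export JAVA_HOME=/old/path'

# Default / delimiter: every slash in the replacement has to be escaped.
with_slash=$(printf '%s\n' "$line" | sed 's/.*/export JAVA_HOME=\/usr\/java\/default/')

# Alternative : delimiter, as in the question: slashes can stay as they are.
with_colon=$(printf '%s\n' "$line" | sed 's:.*:export JAVA_HOME=/usr/java/default:')

# The address limits the substitution to matching lines;
# the second input line does not match and passes through unchanged.
scoped=$(printf '%s\n' "$line" 'export OTHER=/x' | sed '/^export JAVA_HOME/ s:.*:replaced:')

printf '%s\n' "$with_slash" "$with_colon" "$scoped"
```

Both delimiter styles produce the same line; the : form is simply easier to read when the text contains paths.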

Related

Function of d and t in sed substitution context

I used a substitution generator for sed and it gives me
sed -E 's/([^ ]+)│m/\1│T/gm;t;d'
I am familiar with regular expression flags g and m, but I have never seen the t and d. After looking them up it seems that t is for testing and d for deleting. But in this particular context, what does that mean? What do they contribute to the full command?
Quoting sed manual:
d: Delete pattern space. Start next cycle.
t label: If a s/// has done a successful substitution since the last input line was read and since the last t or T command, then branch to label; if label is omitted, branch to end of script.
In other words, if the s command succeeded, print the resulting pattern space, else skip (delete) the line.
Another way to express this is:
sed -n -E 's/([^ ]+) m/\1 T/gmp'
sed is its own "language" with its own commands. It's not a "regular expression tool", but rather a "Stream EDitor".
I am familiar with regular expression flags g and m,
The s command has its own flags, and m is a GNU extension. I do not really see what it accomplishes here: sed reads one line at a time by default, and m only changes how ^ and $ behave when the pattern space holds several lines.
But in this particular context, what does that mean?
In any context, the t command jumps to the label if any s/// command has made a successful substitution since the last input line was read (or since the last t). If the label is omitted, it jumps to the end of the script.
The d command deletes the pattern space, which in typical line-by-line processing removes the line from the output.
The t;d pair is an idiom for removing the line if the preceding s command was unsuccessful. I prefer /pattern/!d; s//replacement/g, which is more readable to my eyes.
A reference for sed behavior would be the POSIX sed documentation and the GNU sed documentation.
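The three spellings discussed above can be compared side by side. This is a sketch under a few assumptions: the input lines are invented, the pattern is simplified to use a plain space and no m flag, and GNU sed is assumed (other sed implementations may not accept a semicolon directly after t).

```shell
input=$(printf '%s\n' 'foo m' 'bar x' 'baz m')

# 1. Substitute; on success t branches to the end of the script (the line is printed),
#    otherwise d deletes the line.
a=$(printf '%s\n' "$input" | sed -E 's/([^ ]+) m/\1 T/g;t;d')

# 2. Suppress automatic printing with -n and print only where s succeeded.
b=$(printf '%s\n' "$input" | sed -n -E 's/([^ ]+) m/\1 T/gp')

# 3. Delete non-matching lines first; the empty regex in s// reuses the last regex.
c=$(printf '%s\n' "$input" | sed -E '/([^ ]+) m/!d; s//\1 T/g')

printf '%s\n' "$a"
```

All three keep only the lines where the substitution applied and drop the rest.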

unix SED command to replace part of key value pair

We have a requirement where I need to replace part of a parameter value in our configuration file.
Example
key1=123-456
I need to replace the value after the hyphen with a new value.
I got a command which is being used in other projects, but I am not sure how it works.
Command
[test]$ cat test_sed_key_value.txt
key1=123-456
[test]$ sed -i -e '/key1/ s/-.*$/-789/' test_sed_key_value.txt
[test]$
[test]$ cat test_sed_key_value.txt
key1=123-789
[test]$
It would be helpful if someone could explain how the above command works, or whether there is a simpler way to do this using sed.
Here is a list of parts of that commandline, each followed by a short explanation:
sed
which tool to use
-i
flag: apply the effect directly to the processed file (without creating a copy of the input file)
-e
expression parameter: the sed code to apply follows
/key1/
"address": only process lines on which this regex applies, i.e. those containing the text "key1"
s/replacethis/withthis/
command: do a search-and-replace; "replacethis" and "withthis" are explained in the next two entries
-.*$
regex: (what is actually in the commandline instead of "replacethis") a regular expression representing a "minus" followed by anything, in any number, until the end of the line
-789
literal: (what is actually in the commandline instead of "withthis") simply that string "-789"
test_sed_key_value.txt
file parameter: process this file
I cannot think of any way to do this simpler. The shown command already uses some assumptions on the formatting of the input file.
I'd add to Yunnosch's answer that here the "replacethis" is a regexp:
-.*$
See the GNU sed manual for an overview of the syntax of sed's regular expressions.
An asterisk means zero or more repetitions of the preceding item and a dot means any single character, so .* means any sequence of characters (possibly empty).
$ is the end of the line.
You might want to be a bit more restrictive, since here you'd lose something in a line like this one for instance:
key1=123-456, key2=abc-def
replacing it by:
key1=123-789
removing completely the key2 part (since the .* takes all characters after the first dash until end of line).
So depending on the format of your values, you might prefer something like
-[0-9]*
(without the $), meaning a sequence of digits after the -
or
-[0-9a-zA-Z_]*
meaning a sequence of digits, letters, or underscores after the -
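The difference between the greedy pattern and the narrower one shows up on the two-key line used as the example above. A minimal sketch, feeding the line through a pipe instead of editing a file:

```shell
line='key1=123-456, key2=abc-def'

# Greedy: .*$ swallows everything after the first dash, including key2.
greedy=$(printf '%s\n' "$line" | sed '/key1/ s/-.*$/-789/')

# Narrower: only the digits directly after the first dash are replaced.
narrow=$(printf '%s\n' "$line" | sed '/key1/ s/-[0-9]*/-789/')

printf '%s\n' "$greedy" "$narrow"
```

Without the g flag only the first match on the line is replaced, so the dash in key2's value is left alone by the narrower pattern.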

Unix Text Processing - how to remove part of a file name from the results?

I'm searching through text files using grep and sed commands and I also want the file names displayed before my results. However, I'm trying to remove part of the file name when it is displayed.
The file names are formatted like this: aja_EPL_1999_03_01.txt
I want to have only the date without the beginning letters and without the .txt extension.
I've been searching for an answer and it seems like it's possible to do that with a sed or a grep command by using something like this to look forward and back and extract between _ and .txt:
(?<=_)\d+(?=\.)
But I must be doing something wrong, because it hasn't worked for me and I possibly have to add something as well, so that it doesn't extract only the first number, but the whole date. Thanks in advance.
Edit: Adding also the working command I've used just in case. I imagine whatever command is needed would have to go at the beginning?
sed '/^$/d' *.txt | grep -P '(^([A-ZÖÄÜÕŠŽ].*)?[Pp][Aa][Ll]{2}.*[^\.]$)' *.txt --colour -A 1
The results look like this:
aja_EPL_1999_03_02.txt:PALLILENNUD : korraga üritavad ümbermaailmalendu kaks meeskonda
A desired output would be this:
1999_03_02:PALLILENNUD : korraga üritavad ümbermaailmalendu kaks meeskonda
First off, you might want to think about your regular expression. While the one you have you say works, I wonder if it could be simplified. You told us:
(^([A-ZÖÄÜÕŠŽ].*)?[Pp][Aa][Ll]{2}.*[^\.]$)
It looks to me as if this is intended to match lines that start with a case-insensitive "PALL", possibly preceded by any number of other characters that start with a capital letter, and that must not end in a dot. (With grep -P, the \. inside the brackets is simply an escaped dot; in a POSIX bracket expression it would exclude a backslash as well.) So valid lines might be any of:
PALLILENNUD : korraga üritavad etc etc
Õlu on kena. Do I have appalling speling?
Peeter Pall is a limnologist at EMU!
If you'd care to narrow down this description a little and perhaps provide some examples of lines that should be matched or skipped, we may be able to do better. For instance, your outer parentheses are probably unnecessary.
Now, let's clarify what your pipe isn't doing.
sed '/^$/d' *.txt
This reads all your .txt files as an input stream, deletes any empty lines, and prints the output to stdout.
grep -P 'regex' *.txt --otheroptions
This reads all your .txt files, and prints any lines that match regex. It does not read stdin.
So .. in the command line you're using right now, your sed command is utterly ignored, as sed's output is not being read by grep. You COULD instruct grep to read from both files and stdin:
$ echo "hello" > x.txt
$ echo "world" | grep "o" x.txt -
x.txt:hello
(standard input):world
But that's not what you're doing.
By default, when grep reads from multiple files, it will precede each match with the name of the file from whence that match originated. That's also what you're seeing in my example above -- two inputs, one x.txt and the other - a.k.a. stdin, separated by a colon from the match they supplied.
While grep does include the most minuscule capability for filtering (with -o, or \K when GNU grep is used with Perl-compatible regular expressions via -P), it does NOT provide you with any options for formatting the filename. Since grep itself can't reformat its output, you're limited to either parsing the output you've got, or using some other tool.
Parsing is easy, if your filenames are predictably structured as they seem to be from the two examples you've provided.
For this, we can ignore that these lines contain a file and data. For the purpose of the filter, they are a stream which follows a pattern. It looks like you want to strip off all characters from the beginning of each line up to and not including the first digit. You can do this by piping through sed:
sed 's/^[^0-9]*//'
Or you can achieve the same effect by using grep's minimal filtering to return every match starting from the first digit:
grep -o '[0-9].*'
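Attached to the end of a grep over several files, the sed filter looks roughly like this. It is a sketch with assumptions: the sample files and their contents are made up, the grep pattern is reduced to a plain PALL, and the second sed expression (dropping the .txt extension, which the question asked for) is an addition rather than part of the filter above.

```shell
dir=$(mktemp -d)
printf '%s\n' 'PALLILENNUD : korraga' 'some other line' > "$dir/aja_EPL_1999_03_02.txt"
printf '%s\n' 'nothing to match' > "$dir/aja_EPL_1999_03_03.txt"

# grep prefixes each match with its file name because several files are searched.
# First sed expression: strip everything before the first digit.
# Second sed expression: drop the .txt extension in front of the colon (an addition).
out=$(cd "$dir" && grep 'PALL' *.txt | sed 's/^[^0-9]*//; s/\.txt:/:/')

printf '%s\n' "$out"
rm -rf "$dir"
```

When context lines are requested with -A 1, GNU grep separates the file name from a context line with - instead of :, so the second expression would need adjusting for those lines.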
If this kind of pipe-fitting is not to your liking, you may want to replace your entire grep with something in awk that combines functionality:
$ awk '
  /\.$/ {next}                              # skip lines ending in a dot
  /^([A-ZÖÄÜÕŠŽ].*)?[Pp][Aa][Ll][Ll]/ {     # lines to match, as in the grep pattern
    f=FILENAME
    sub(/^[^0-9]*/,"",f)                    # strip unwanted part of filename, like sed
    printf "%s:%s\n", f, $0
    if ((getline) > 0)                      # simulate the "-A 1" from grep
      printf "%s:%s\n", f, $0
  }' *.txt
Note that I haven't tested this, because I don't have your data to work with.
Also, awk doesn't include any of the fancy terminal-dependent colourization that GNU grep provides through the --colour option.

What does this command do

I have an executable file, nw, the Node-webkit application binary.
I am running Ubuntu 14.04 LTS. I tried to open nw with ./nw, but it doesn't open.
After that I ran the following command, and it started working:
sed -i 's/udev\.so\.0/udev.so.1/g' nw
I am curious to know what that command did to my executable.
I know sed is for matching patterns in files with regular expressions. How can it operate on an executable (application/x-executable) file?
Could anyone please explain?
The sed command that you've used searches the nw executable (treating it like an ordinary text file) for every occurrence of the string udev.so.0 and replaces it with udev.so.1. The backslashes before the dots escape the . character, which is special in sed regular expressions: it matches any single character, much like the ? wildcard in shell filename patterns. The g flag at the end tells sed to keep searching a line after the first occurrence has been found, so every occurrence in the file is replaced. Because the replacement is exactly as long as the original string, the rest of the binary stays byte-for-byte the same; presumably nw refers to the shared library libudev.so.0, while Ubuntu 14.04 ships libudev.so.1, so after the edit the library can be found.
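That sed is happy to edit a file containing non-text bytes, and that a same-length replacement leaves the file size unchanged, can be tried on a throwaway file. This is only an illustration: the bytes below are invented and are not a real executable, and GNU sed is assumed for -i.

```shell
# Throwaway file with a few control bytes around the library name (not a real ELF binary).
f=$(mktemp)
printf 'ELF\001\002libudev.so.0\003\004\n' > "$f"

before=$(wc -c < "$f")
sed -i 's/udev\.so\.0/udev.so.1/g' "$f"    # same command as in the question
after=$(wc -c < "$f")

# Count the lines that now contain the new name.
hits=$(grep -c 'libudev\.so\.1' "$f")

echo "size before: $before, size after: $after, lines with new name: $hits"
rm -f "$f"
```

A replacement of a different length would shift every byte after it, which is generally not safe for an executable; here both strings are nine characters long.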

To replace a set of strings in a file with another string in a unix file

I have a parameter name like
PAR="DBS_OUT"
and I have a text file(Repla.txt) with below values:
DB_TECH
DB_ADMIN
DB_TERA
DB_APS
The values in the file can differ, but the parameter value will remain the same.
Now I have some Unix shell scripts in which I need to find all the values listed in the file (Repla.txt)
and replace them with the parameter (PAR). Since the values in Repla.txt are not fixed, I am not able to use a plain sed command, e.g.:
sed 's/old/new/g' input.txt > output.txt
Can anyone please help me.
Thanks
I'm not sure I completely understand what you are trying to do but if you are trying to use the values contained in Repla.txt as the strings that you want to replace in other files then the following bash line will do what you want:
PAR="DBS_OUT"; while IFS= read -r FIND; do find /path/to/files -name 'test?.txt' -exec sed -i "s/$FIND/$PAR/g" '{}' \; ; done < Repla.txt
It will replace the strings contained in Repla.txt with the string DBS_OUT in all files that match test?.txt in the dir (and subdirs) /path/to/files. You will need to understand how find works.
Also note that I am not telling sed to make a backup, so you probably want to try this on some test files before you execute it for real. Hopefully you also have your scripts in source control, so it's not a big deal if you mess things up.
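The loop can be tried safely in a temporary directory first. A minimal sketch, with invented file contents and the find step left out so that only one known file is edited; it assumes GNU sed and that the values in Repla.txt contain no slashes or regular-expression metacharacters.

```shell
dir=$(mktemp -d)
printf '%s\n' 'DB_TECH' 'DB_ADMIN' > "$dir/Repla.txt"
printf '%s\n' 'select * from DB_TECH.t1 join DB_ADMIN.t2' > "$dir/test1.txt"   # hypothetical script content

PAR="DBS_OUT"
# Read one search value per line and replace it everywhere in the target file.
while IFS= read -r FIND; do
  [ -n "$FIND" ] || continue               # skip empty lines in Repla.txt
  sed -i "s/$FIND/$PAR/g" "$dir/test1.txt"
done < "$dir/Repla.txt"

replaced=$(cat "$dir/test1.txt")
printf '%s\n' "$replaced"
rm -rf "$dir"
```

The sed expression is in double quotes on purpose here, so that the shell expands $FIND and $PAR before sed sees them.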
I hope your replacement applies to capital letters only.
sed 's/DB_[A-Z]*/DBS_OUT/g' Repla.txt > destination_file
or, to print the result to standard output:
sed 's/DB_[A-Z]*/DBS_OUT/g' Repla.txt

Resources