Unix ksh loop and put the result into variable

I have a simple thing to do, but I'm novice in UNIX.
So, I have a file and on each line I have an ID.
I need to go through the file and put all ID's into one variable.
I've tried something like what I would write in Java, but it does not work.
for variable in `cat myFile.txt`
do
param=`echo "${param} ${variable}"`
done
It does not seem to add all the values to param.
Thanks.

I'd use:
param=$(<myFile.txt)
The parameter has white space (actually newlines) between the names. When used without quotes, the shell will expand those to spaces, as in:
cat $param
If used with quotes, the file names will remain on separate lines, as in:
echo "$param"
Note that the Korn shell special-cases the '$(<file)' notation and does not fork and execute any command.
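To make the quoting difference concrete, here is a minimal sketch. The sample IDs and the temporary file are made up for illustration, and it uses $(cat file) only so that it also runs in shells that lack the $(<file) shortcut:

```shell
# Hypothetical sample data standing in for myFile.txt.
tmp=$(mktemp)
printf '%s\n' 101 102 103 > "$tmp"

# Portable spelling; in ksh or bash, param=$(<"$tmp") avoids running cat.
param=$(cat "$tmp")

# Unquoted: the shell splits on whitespace, so echo receives three arguments
# and joins them with single spaces.
unquoted=$(echo $param)

# Quoted: echo receives one argument with the newlines still inside it.
quoted=$(echo "$param")

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:\n%s\n' "$quoted"
rm -f "$tmp"
```

This assumes the default IFS (space, tab, newline); with a modified IFS the unquoted case splits differently.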
Also note that your original idea can be made to work more simply:
param=
for variable in `cat myFile.txt`
do
param="${param} ${variable}"
done
This introduces a blank at the front of the parameter; it seldom matters. Interestingly, you can avoid the blank at the front by having one at the end, using param="${param}${variable} ". This also works without messing things up, though it looks as though it jams things together. Also, the '${var}' notation is not necessary, though it does no harm either.
And, finally for now, it is better to replace the back-tick command with '$(cat myFile.txt)'. The difference becomes crucial when you need to nest commands:
perllib=$(dirname $(dirname $(which perl)))/lib
vs
perllib=`dirname \`dirname \\\`which perl\\\`\``/lib
I know which I prefer to type (and read)!

Try this:
param=`cat myFile.txt | tr '\n' ' '`
The tr command translates all occurrences of \n (new line) to spaces. Then we assign the result to the param variable.
Lovely.
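One detail worth checking with the tr approach: the newline at the end of the file also becomes a space, so the result ends in a blank. A small sketch, using a made-up temporary file in place of myFile.txt:

```shell
# Hypothetical sample data standing in for myFile.txt.
tmp=$(mktemp)
printf '%s\n' 101 102 103 > "$tmp"

# Redirecting into tr avoids the extra cat process.
param=$(tr '\n' ' ' < "$tmp")

# The file's final newline became a trailing space; strip one if present.
trimmed=${param% }

printf '[%s]\n' "$param"
printf '[%s]\n' "$trimmed"
rm -f "$tmp"
```

Whether the trailing blank matters depends on how param is used later; unquoted expansions will not notice it.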

param="$(< myFile.txt)"
or
while read line
do
param="$param$line"$'\n'
done < myFile.txt

awk
var=$(awk '1' ORS=" " file)
ksh
while read -r line
do
t="$t $line"
done < file
echo $t

Related

Parsing variable in loop incorrectly [duplicate]

I want to run certain actions on a group of lexicographically named files (01-09 before 10). I have to use a rather old version of FreeBSD (7.3), so I can't use yummies like echo {01..30} or seq -w 1 30.
The only working solution I found is printf "%02d " {1..30}. However, I can't figure out why I can't use $1 and $2 instead of 1 and 30. When I run my script (bash ~/myscript.sh 1 30), printf says {1..30}: invalid number
AFAIK, variables in bash are typeless, so why can't printf accept an integer argument as an integer?
Bash supports C-style for loops:
s=1
e=30
for ((i=s; i<=e; i++)); do printf "%02d " "$i"; done
The syntax you attempted doesn't work because brace expansion happens before parameter expansion, so when the shell tries to expand {$1..$2}, it's still literally {$1..$2}, not {1..30}.
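A quick way to see this for yourself, plus a loop that needs neither brace expansion nor bash's C-style for, is sketched below. The values of s and e are shortened to 1 and 3 for illustration, and the while loop is my own portable stand-in, not something from the question:

```shell
s=1
e=3

# Brace expansion sees the text {$s..$e}, which is not a valid sequence,
# so it is left alone; parameter expansion then turns it into {1..3}.
literal=$(echo {$s..$e})

# Portable counting loop that also works in a plain POSIX sh.
padded=$(
  i=$s
  while [ "$i" -le "$e" ]; do
    printf '%02d ' "$i"
    i=$((i + 1))
  done
)

printf '%s\n' "$literal"
printf '%s\n' "$padded"
```

The while loop may be handy on older systems where only /bin/sh is guaranteed to be present.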
The answer given by @Kent works because eval goes back to the beginning of the parsing process. I tend to suggest avoiding making habitual use of it, as eval can introduce hard-to-recognize bugs -- if your command were whitelisted to be run by sudo and $1 were, say, '$(rm -rf /; echo 1)', the C-style-for-loop example would safely fail, and the eval example... not so much.
Granted, 95% of the scripts you write may not be accessible to folks executing privilege escalation attacks, but the remaining 5% can really ruin one's day; following good practices 100% of the time avoids being in sloppy habits.
Thus, if one really wants to pass a range of numbers to a single command, the safe thing is to collect them in an array:
a=( )
for ((i=s; i<=e; i++)); do a+=( "$i" ); done
printf "%02d " "${a[@]}"
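In the same spirit of not trusting $1 and $2, one option is to reject anything that is not a plain run of digits before using it. This is only a sketch: is_uint and print_range are names I made up, and it assumes the numbers are given without leading zeros (a value such as 08 would be read as invalid octal by the arithmetic):

```shell
# is_uint is a hypothetical helper: succeeds only for a non-empty string of digits.
is_uint() {
  case $1 in
    '' | *[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    *) return 0 ;;
  esac
}

# print_range START END: prints zero-padded numbers, refusing non-numeric input.
print_range() {
  is_uint "$1" && is_uint "$2" || {
    echo "usage: print_range START END" >&2
    return 2
  }
  i=$1
  while [ "$i" -le "$2" ]; do
    printf '%02d ' "$i"
    i=$((i + 1))
  done
  printf '\n'
}

print_range 8 11
```

With this check in place, an argument like '$(rm -rf /; echo 1)' is rejected as text and never reaches anything that could evaluate it.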
I guess you are looking for this trick:
#!/bin/bash
s=1
e=30
printf "%02d " $(eval echo {$s..$e})
Ok, I finally got it!
#!/bin/bash
#BSD-only iteration method
#for day in `jot $1 $2`
for ((day=$1; day<$2; day++))
do
printf '%02d\n' "$day"
done
I initially wanted to use the loop counter as a "day" in file names, but now I see that in my exact case it's easier to iterate over plain numbers (1, 2, 3, etc.) and convert them into zero-padded ones inside the loop. When using jot, remember that the first argument ($1 here) is how many numbers to generate and the second ($2 here) is the starting value.

Unix Text Processing - how to remove part of a file name from the results?

I'm searching through text files using grep and sed commands and I also want the file names displayed before my results. However, I'm trying to remove part of the file name when it is displayed.
The file names are formatted like this: aja_EPL_1999_03_01.txt
I want to have only the date without the beginning letters and without the .txt extension.
I've been searching for an answer and it seems like it's possible to do that with a sed or a grep command by using something like this to look forward and back and extract between _ and .txt:
(?<=_)\d+(?=\.)
But I must be doing something wrong, because it hasn't worked for me and I possibly have to add something as well, so that it doesn't extract only the first number, but the whole date. Thanks in advance.
Edit: Adding also the working command I've used just in case. I imagine whatever command is needed would have to go at the beginning?
sed '/^$/d' *.txt | grep -P '(^([A-ZÖÄÜÕŠŽ].*)?[Pp][Aa][Ll]{2}.*[^\.]$)' *.txt --colour -A 1
The results look like this:
aja_EPL_1999_03_02.txt:PALLILENNUD : korraga üritavad ümbermaailmalendu kaks meeskonda
A desired output would be this:
1999_03_02:PALLILENNUD : korraga üritavad ümbermaailmalendu kaks meeskonda
First off, you might want to think about your regular expression. While the one you have you say works, I wonder if it could be simplified. You told us:
(^([A-ZÖÄÜÕŠŽ].*)?[Pp][Aa][Ll]{2}.*[^\.]$)
It looks to me as if this is intended to match lines that start with a case-insensitive "PALL", possibly preceded by any number of other characters that start with a capital letter, and that lines must not end in a dot (inside a PCRE character class, \. is simply an escaped dot). So valid lines might be any of:
PALLILENNUD : korraga üritavad etc etc
Õlu on kena. Do I have appalling speling?
Peeter Pall is a limnologist at EMU!
If you'd care to narrow down this description a little and perhaps provide some examples of lines that should be matched or skipped, we may be able to do better. For instance, your outer parentheses are probably unnecessary.
Now, let's clarify what your pipe isn't doing.
sed '/^$/d' *.txt
This reads all your .txt files as an input stream, deletes any empty lines, and prints the output to stdout.
grep -P 'regex' *.txt --otheroptions
This reads all your .txt files, and prints any lines that match regex. It does not read stdin.
So .. in the command line you're using right now, your sed command is utterly ignored, as sed's output is not being read by grep. You COULD instruct grep to read from both files and stdin:
$ echo "hello" > x.txt
$ echo "world" | grep "o" x.txt -
x.txt:hello
(standard input):world
But that's not what you're doing.
By default, when grep reads from multiple files, it will precede each match with the name of the file from whence that match originated. That's also what you're seeing in my example above -- two inputs, one x.txt and the other - a.k.a. stdin, separated by a colon from the match they supplied.
While grep does include the most minuscule capability for filtering (with -o, or GNU grep's \K with optional Perl compatible RE), it does NOT provide you with any options for formatting the filename. Since you can't change how grep formats its output, you're limited to either parsing the output you've got, or using some other tool.
Parsing is easy, if your filenames are predictably structured as they seem to be from the two examples you've provided.
For this, we can ignore that these lines contain a file and data. For the purpose of the filter, they are a stream which follows a pattern. It looks like you want to strip off all characters from the beginning of each line up to and not including the first digit. You can do this by piping through sed:
sed 's/^[^0-9]*//'
Or you can achieve the same effect by using grep's minimal filtering to return every match starting from the first digit:
grep -o '[0-9].*'
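Note that stripping up to the first digit still leaves the .txt extension in place. A small sketch of one way to drop it as well, using a shortened copy of the sample line from the question; the second sed expression is my addition and assumes every file name ends in .txt:

```shell
line='aja_EPL_1999_03_02.txt:PALLILENNUD : korraga'
context='aja_EPL_1999_03_02.txt-next line'

# The filter from above: everything before the first digit is removed.
digits_on=$(printf '%s\n' "$line" | sed 's/^[^0-9]*//')

# Additionally drop ".txt" when it is followed by grep's separator,
# ":" for matching lines or "-" for the context lines printed by -A.
strip_name() {
  sed -e 's/^[^0-9]*//' -e 's/\.txt\([:-]\)/\1/'
}
stripped=$(printf '%s\n' "$line" | strip_name)
stripped_context=$(printf '%s\n' "$context" | strip_name)

printf '%s\n' "$digits_on" "$stripped" "$stripped_context"
```

In real use this filter would sit at the end of the grep pipeline; keep in mind that --colour output may contain escape sequences that get in the way of the match.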
If this kind of pipe-fitting is not to your liking, you may want to replace your entire grep with something in awk that combines functionality:
$ awk '
  /\.$/ {next}                            # skip lines ending in a dot
  /^([A-ZÖÄÜÕŠŽ].*)?[Pp][Aa][Ll][Ll]/ {   # lines to match
    f=FILENAME
    sub(/^[^0-9]*/,"",f)                  # strip unwanted part of filename, like sed
    sub(/\.txt$/,"",f)                    # drop the extension as well
    printf "%s:%s\n", f, $0
    getline                               # simulate the "-A 1" from grep
    printf "%s:%s\n", f, $0
  }' *.txt
Note that I haven't tested this, because I don't have your data to work with.
Also, awk doesn't include any of the fancy terminal-dependent colourization that GNU grep provides through the --colour option.

Renaming multiple files using parameter unix

I have a rename script, rename.sh, shown below. I want to introduce a variable so that I can pass a date argument when executing the script, like ./rename.sh 20151103, such that 20151103 replaces 20140306 in the script.
for f in *.CDR*; do
echo mv "$f" "${f/-20140306/-0-20140306}"
done
I'm thinking of automating this, as I don't want to manually edit the script each time I'm doing a rename. Any other method will be highly welcomed.
#!/bin/bash
pattern="$1"
for f in *.CDR*; do
echo mv "$f" "${f/-${pattern}/-0-${pattern}}"
done
Explanation:
The #!-line says we're running this as a bash script.
The script will populate the variables $1, $2 etc. with the arguments handed to it on the command line. These are called the positional parameters ($0 usually holds the name of the script).
We take $1, because we know that should contain the pattern we're replacing, and assign it to the variable $pattern. In much more complex scripts, here is where we would handle command line switches (with getopts, but that's an answer for another day).
We quote $1, just because. (It's good practice to quote user input, just to be sure no shell-globbing characters, such as *, get expanded.)
The rest is the script like you had from before, but with the string 20140306 replaced by ${pattern}. I'm using ${pattern} rather than $pattern here for readability only. In general, you need to use ${a} rather than $a if you, for example, interpolate a string like "${a}nospaceafter".
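The last point, about ${a} versus $a, is easy to try out. In this sketch the variable names and values are made up; the second variable exists only to show which name the shell actually looks up:

```shell
a=backup
anospaceafter='(a different variable)'

# With braces the name ends at the closing brace, so the text is appended to $a.
braced="${a}nospaceafter"

# Without braces the shell reads the longest valid name, i.e. $anospaceafter.
bare="$anospaceafter"

printf '%s\n' "$braced"
printf '%s\n' "$bare"
```

If anospaceafter were not set at all, the unbraced form would simply expand to an empty string.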
Then it should just be a matter of making the script executable before testing it:
$ chmod +x rename.sh
This is one method you can consider:
#!/bin/bash
input=$1
for f in *.CDR*; do
echo mv "$f" "${f/-$input/-0-$input}"
done

Zsh: How to force file completion everywhere following a set of characters?

I'm trying to figure out how to get file completion to work at any word position on the command line after a set of characters. As listed in a shell these characters would be [ =+-\'\"()] (the whitespace is tab and space). Zsh will do this, but only after the backtick character, '`', or $(. mksh does this except not after the characters [+-].
By word position on the command line, I'm talking about each set of characters you type out which are delimited by space and a few other characters. For example,
print Hello World,
has three words at positions 1-3. At position 1, when you're first typing stuff in, completion is pretty much perfect. File completion works after all of the characters I mentioned. After the first word, the completion system gets more limited since it's smart. This is useful for commands, but limiting where you can do file completion isn't particularly helpful.
Here are some examples of where file completion doesn't work for me but should in my opinion:
: ${a:=/...}
echo "${a:-/...}"
make LDFLAGS+='-nostdlib /.../crt1.o /.../crti.o ...'
env a=/... b=/... ...
I've looked at rebinding '^I' (tab) with the handful of different completion widgets Zsh comes with and changing my zstyle ':completion:*' lines. Nothing has worked so far to change this default Zsh behaviour. I'm thinking I need to create a completion function that I can add to the end of my zstyle ':completion:*' completer ... line as a last resort completion.
In the completion function, one route would be to cut out the current word I want to complete, complete it, and then re-insert the completion back into the line if that's possible. It could also be more like _precommand which shifts the second word to the first word so that normal command completion works.
I was able to modify _precommand so that you can complete commands at any word position. This is the new file, I named it _commando and added its directory to my fpath:
#compdef -
# precommands is made local in _main_complete
precommands+=($words[1,$(( CURRENT -1 ))])
shift words
CURRENT=1
_normal
To use it I added it to the end of my ':completion:*' completer ... line in my zshrc so it works with every program in $path. Basically whatever word you're typing in is considered the first word, so command completion works at every word position on the command line.
I'm trying to figure out a way to do the same thing for file completion, but it looks a little more complicated at first glance. I'm not really sure where to go with this, so I'm looking to get some help on this.
I took a closer look at some of Zsh's builtin functions and noticed a few that have special completion behaviour. They belong to the typeset group, which has a function _typeset in the default fpath. I only needed to extract a few lines for what I wanted to do. These are the lines I extracted:
...
elif [[ "$PREFIX" = *\=* ]]; then
compstate[parameter]="${PREFIX%%\=*}"
compset -P 1 '*='
_value
...
These few lines allow typeset completion after each slash in a command like this:
typeset file1=/... file2=~/... file3=/...
I extrapolated from this to create the following function. You can modify it to put in your fpath. I just defined it in my zshrc like this:
_reallyforcefilecompletion() {
    local prefix_char
    for prefix_char in ' ' $'\t' '=' '+' '-' "'" '"' ')' ':'; do
        if [[ "$PREFIX" = *${prefix_char}* ]]; then
            if [[ "$PREFIX" = *[\'\"]* ]]; then
                compset -q -P "*${prefix_char}"
            else
                compset -P "*${prefix_char}"
            fi
            _value
            break
        fi
    done
}
You can use this by adding it to a zstyle line like this:
zstyle ':completion:*' completer _complete _reallyforcefilecompletion
This way, it's only used as a last resort so that smarter completions can try before it. Here's a little explanation of the function starting with the few variables and the command involved:
prefix_char: This gets set to each prefix character we want to complete after. For example, env a=123 has the prefix character =.
PREFIX: Initially this will be set to the part of the current word from the beginning of the word up to the position of the cursor; it may be altered to give a common prefix for all matches.
IPREFIX (not shown in code): compset moves string matches from PREFIX to IPREFIX so that the rest of PREFIX can be completed.
compset: This command simplifies modification of the special parameters, while its return status allows tests on them to be carried out.
_value: Not really sure about this one. The documentation states it plays some sort of role in completion.
Documentation for the completion system
The function: In the second line, we declare prefix_char local to avoid variable pollution. In line three, we start a for loop selecting each prefix_char we want to complete after. In the next if block, we check whether the variable PREFIX contains one of the prefix_chars we want to complete after, and then whether PREFIX contains any quotes. If PREFIX contains quotes, we use compset -q to basically allow quotes to be ignored so we can complete in them. compset -P strips the matched part of PREFIX and moves it to IPREFIX, basically so it gets ignored and completion can work.
The else branch is for a PREFIX containing prefix_char but no quotes, so we only use compset -P. The break exits the loop after the first match. A more correct way to make this function would be in a case statement, but we're not using the compset return value, so this works. You don't see anything about file completion besides _value. For the most part we just told the system to ignore part of the word.
Basically this is what the function does. We have a line that looks like:
env TERM=linux PATH=/<---cursor here
The cursor is at the end of that slash. This function allows PREFIX, which is PATH=, to be ignored, so we have:
env TERM=linux /<---cursor here
You can complete a file there with PATH= removed. The function doesn't actually remove the PATH= though, it just recategorizes it as something to ignore.
With this function, you can now complete in all of the examples I listed in the question and a lot more.
One last thing to mention, adding this force-list line in your zshrc cripples this function somehow. It still works but seems to choke. This new force-list function is way better anyway.
zstyle ':completion:*' force-list always
EDIT: There were a couple lines I forgot to copy into the function. Probably should have checked before posting. I think it's good now.

Extract Middle Substring from a given String in Unix

I have a string in different ranges :
WATSON_AJAY_AB04_DOTHING.data
WATSON_NAVNEET_CK4_DOTHING.data
WATSON_PRASHANTH_KJ56_DOTHING.data
WATSON_ABHINAV_KD323_DOTHING.data
On these above string how can I extract
AB04,CK4,KJ56,KD323
in Unix?
echo "$string" | cut -d'_' -f3
You could use sed or grep for this task. But since the string is so simple, I don't think you will need to.
One method is to use the cut command, a standard utility rather than a bash builtin. Below is an example run directly at the bash command line:
jimm@pi$ string='WATSON_AJAY_AB04_DOTHING.data'
jimm@pi$ cut -d '_' -f 3 <<< "$string"
AB04 <-- outputs the result directly
(edit: of course Lucas' answer above is also a quick 'one-liner' that does the same thing as above - he beat me to it) :)
The cut will take an _ character as the delimiter (the -d '_' part), then display the 3rd slice of the string (the -f 3 part).
Or, if you want to output that 3rd slice from a list of content (using your list above), you can write a simple shell script.
First, save the lines above ('WATSON...etc') into something like text.txt. Then open up your favorite text editor and type:
#!/bin/sh
cut -d '_' -f 3 < "$1"
Save that script to some useful name like slice.sh, and make sure it is executable with something like chmod 775 slice.sh.
Then at the command line you can execute the script against your text file, and immediately get an output of those parts of the file you want (in this case the third set of text, separated by the _ character):
$ ./slice.sh text.txt
AB04
CK4
KJ56
KD323
Hope that helps! Bear in mind that the commands above may vary a bit, depending on the flavor of *nix you are using, but it should at least point you in the right direction.
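If the comma-separated form from the question (AB04,CK4,KJ56,KD323) is what you are after, one possible sketch is to join the cut output with paste. The names variable below just holds the four sample strings from the question:

```shell
names='WATSON_AJAY_AB04_DOTHING.data
WATSON_NAVNEET_CK4_DOTHING.data
WATSON_PRASHANTH_KJ56_DOTHING.data
WATSON_ABHINAV_KD323_DOTHING.data'

# cut picks the third "_"-separated field from each line;
# paste -s joins all lines into one, using "," as the delimiter.
codes=$(printf '%s\n' "$names" | cut -d '_' -f 3 | paste -s -d , -)

printf '%s\n' "$codes"
```

This assumes the code is always the third underscore-separated field, as it is in the four examples.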
