I have many part-00001, part-00002, ... files.
I want to process them like this:
for ((i=0;i<1000;i++)); do <some command> <formatted string with i>; done.
How can I format "part-000xx"-like string with number i in zsh?
It could be done with:
typeset -Z 5 i (using builtin typeset -Z [N])
printf "part-%05d" $i (using builtin printf "%05d" $i)
${(l:5::0:)i} (using parameter expansion flags l:expr::string1::string2:)
like below:
typeset -Z 5 j
for ((i=0;i<1000;i++)); do
# <some command> <formatted string with i>
j=$i; echo "part-$j" # typeset -Z 5 zero-pads $j on assignment; the next two lines format $i directly
echo "$(printf "part-%05d" $i)"
echo "part-${(l:5::0:)i}"
done
# This outputs below:
# >> part-00000
# >> part-00000
# >> part-00000
# >> part-00001
# >> part-00001
# >> part-00001
# >> ...
# >> part-00999
Here is the description for each item.
typeset
-Z [N]
Specially handled if set along with the -L flag. Otherwise, similar to -R, except that leading zeros are used for padding instead of blanks if the first non-blank character is a digit. Numeric parameters are specially handled: they are always eligible for padding with zeroes, and the zeroes are inserted at an appropriate place in the output.
-- zshbuiltins(1), typeset, Shell Builtin Commands
printf
Print the arguments according to the format specification. Formatting rules are the same as used in C.
-- zshbuiltins(1), printf, Shell Builtin Commands
l:expr::string1::string2:
Pad the resulting words on the left. Each word will be truncated if required and placed in a field expr characters wide.
The arguments :string1: and :string2: are optional; neither, the first, or both may be given. Note that the same pairs of delimiters must be used for each of the three arguments. The space to the left will be filled with string1 (concatenated as often as needed) or spaces if string1 is not given. If both string1 and string2 are given, string2 is inserted once directly to the left of each word, truncated if necessary, before string1 is used to produce any remaining padding.
-- zshexpn(1), Parameter Expansion Flags
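Of the three, only printf carries over to other shells; here is a minimal sketch of that portable variant (the part-NNNNN names are just the pattern from the question):

```shell
# Zero-pad with printf; this works in zsh, bash, and plain sh.
i=7
name=$(printf 'part-%05d' "$i")
echo "$name"    # part-00007

# The same idea in a portable loop:
i=0
while [ "$i" -lt 3 ]; do
    printf 'part-%05d\n' "$i"
    i=$((i + 1))
done
```

The typeset -Z and ${(l:...)} forms above are zsh-only; prefer printf if the script might run under another shell.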
Related
I am attempting to use the sqrt function from awk command in my script, but all it returns is 0. Is there anything wrong with my script below?
echo "enter number"
read root
awk 'BEGIN{ print sqrt($root) }'
This is my first time using the awk command, are there any mistakes that I am not understanding here?
Maybe you can try this.
echo "enter number"
read root
echo "$root" | awk '{print sqrt($0)}'
You have to give awk some input to read, so you can pipe in the output of echo.
The BEGIN block is for doing things, like printing a header etc., before awk
starts reading the input.
$ echo "enter number"
enter number
$ read root
3
$ awk -v root="$root" 'BEGIN{ print sqrt(root) }'
1.73205
See the comp.unix.shell FAQ for the 2 correct ways to pass the value of a shell variable to an awk script.
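As a sketch of the two usual ways to get a shell variable into awk, assuming a POSIX awk (variable names here are illustrative):

```shell
# Way 1: -v assigns the awk variable before BEGIN runs.
root=9
awk -v root="$root" 'BEGIN { print sqrt(root) }'    # prints 3

# Way 2: export the shell variable and read it from the ENVIRON array.
export root
awk 'BEGIN { print sqrt(ENVIRON["root"]) }'         # prints 3
```

Note that -v processes backslash escapes in the value, which is one reason the FAQ lists more than one method.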
UPDATE : My proposed solution turns out to be potentially dangerous. See Ed Morton's answer for a better solution. I'll leave this answer here as a warning.
Because of the single quotes, $root is interpreted by awk, not by the shell. awk treats root as an uninitialized variable, whose value is the empty string, treated as 0 in a numeric context. $root is then the root'th field of the current line -- in this case the same as $0, which is the entire line. Since it's in a BEGIN block, there is no current line, so $root is the empty string -- which again is treated as 0 when passed to sqrt().
You can see this by changing your command line a bit:
$ awk 'BEGIN { print sqrt("") }'
0
$ echo 2 | awk '{ print sqrt($root) }'
1.41421
NOTE: The above is merely to show what's wrong with the original command, and how it's interpreted by the shell and by awk.
One solution is to use double quotes rather than single quotes. The shell expands variable references within double quotes:
$ echo "enter number"
enter number
$ read x
2
$ awk "BEGIN { print sqrt($x) }" # DANGEROUS
1.41421
You'll need to be careful when doing this kind of thing. The interaction between quoting and variable expansion in the shell vs. awk can be complicated.
UPDATE: In fact, you need to be extremely careful. As Ed Morton points out in a comment, this method can result in arbitrary code execution given a malicious value for $x, which is always a risk for a value read from user input. His answer avoids that problem.
(Note that I've changed the name of your shell variable from $root to $x, since it's the number whose square root you want, not the root itself.)
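To see concretely why the double-quoted version is dangerous, here is a sketch with a hypothetical malicious input (the value of x stands in for whatever a user might type at the read prompt):

```shell
# Suppose the user enters awk code instead of a number:
x='system("echo pwned")'

# The double-quoted version pastes it into the program text and runs it:
awk "BEGIN { print sqrt($x) }"      # prints "pwned", then sqrt(0) = 0

# Passing it with -v keeps it inert: it is just a string,
# which evaluates to 0 in a numeric context:
awk -v x="$x" 'BEGIN { print sqrt(x) }'   # prints 0, runs nothing
```

With a nastier payload than echo, the first form amounts to handing the user a shell.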
Hello, I have a minimal example where I am trying to concatenate two variables and then pipe the result to cut.
all:
#echo $(APP_NAME)
#echo $(CURRENT_BRANCH)
#echo $(call EB_SAFE_NAME,$(CURRENT_BRANCH))
#echo $(shell echo "$(APP_NAME)-$(call EB_SAFE_NAME,$(CURRENT_BRANCH))" | cut -c 23)
Output:
$ cicdtest
$ issue#13-support-multi-branch
$ issue-13-support-multi-branch
$ o
If I remove the | cut -c 23 then the output is fine, but I do need to limit to 23 char. What am I doing wrong on the 4th echo statement above?
I saw different behavior in a test script than in make, but the issue is with the explicit use of cut, not with make. The following works as expected:
#echo $(shell echo $(APP_NAME)-$(call EB_SAFE_NAME,$(CURRENT_BRANCH)) | cut -c 1-23)
cut -c 23 selects only the 23rd character; to keep the first 23 you need a range, either the complete form -c 1-23 or the incomplete form -c -23:
Bytes, characters, and fields are numbered starting at 1 and separated by commas.
Incomplete ranges can be given: -M means 1-M ; N- means N through end of line or last field.
Options
-b BYTE-LIST
--bytes=BYTE-LIST
Print only the bytes in positions listed in BYTE-LIST. Tabs and
backspaces are treated like any other character; they take up 1
byte.
-c CHARACTER-LIST
--characters=CHARACTER-LIST
Print only characters in positions listed in CHARACTER-LIST. The
same as `-b' for now, but internationalization will change that.
Tabs and backspaces are treated like any other character; they
take up 1 character.
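The difference is easy to see on a fixed string (the alphabet here is just a stand-in for the concatenated make variables):

```shell
s="abcdefghijklmnopqrstuvwxyz"
echo "$s" | cut -c 23      # w  (character 23 only)
echo "$s" | cut -c 1-23    # abcdefghijklmnopqrstuvw  (characters 1 through 23)
echo "$s" | cut -c -23     # same as 1-23: an incomplete range -M means 1-M
```

This behavior is the same whether cut is invoked from make, bash, or anywhere else.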
Not a long question, what does this mean?
LogMsg "File:${@}"
LogMsg() is a method that logs a message with a timestamp.
But what the heck does
${@}
mean? I should also mention the script also has $1 and $2 as well. Google produces no results.
Literally:
f() { printf '%s\n' "File: $@"; }
f "First Argument" "Second Argument" "Third Argument"
will expand to and run the command:
printf '%s\n' "File: First Argument" "Second Argument" "Third Argument"
That is to say: It expands your argument list ($1, $2, $3, etc) while maintaining separation between subsequent arguments (not throwing away any information provided by the user by way of quoting).
This is different from:
printf '%s\n' File: $@
or
printf '%s\n' File: $*
which are both the same as:
printf '%s\n' "File:" "First" "Argument" "Second" "Argument" "Third" "Argument"
...these both string-split and glob-expand the argument list, so if the user had passed, say, "*" (inside quotes intended to make it literal), the unquoted use here would replace that character with the results of expanding it as a glob, ie. the list of files in the current directory. Also, string-splitting has other side effects such as changing newlines or tabs to spaces.
It is also different from:
printf '%s\n' "File: $*"
which is the same as:
printf '%s\n' "File: First Argument Second Argument Third Argument"
...which, as you can see above, combines all arguments by putting the first character in IFS (which is by default a space) between them.
In ksh there are two such special parameters, * and @.
"$*" is a single string that consists of all of the positional parameters, separated by the first character in the variable IFS (internal field separator), which is a space, TAB, and newline by default.
On the other hand, "$@" is equal to "$1" "$2" … "$N", where N is the number of positional parameters.
For more detailed information and example : http://oreilly.com/catalog/korn2/chapter/ch04.html
This is the set of the arguments of the command line.
If you launch a script via a command like cmd a b c d, $0 will be the command cmd, $1 the first argument a, $2 the second b, etc. ${@} will be all the arguments except the command.
The one piece that was not explained by the other posts is the use of {. $@ is the same as ${@}, but the braces let you append letters, etc. if needed without a space being added in. E.g. you could say ${foo}dog, and if $foo was set to little the result would be littledog with no spaces. In the case of ${@}dogdog, if the positional parameters are a b c d, the result is "a" "b" "c" "ddogdog".
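Putting the pieces above together, a small sketch (the function name is made up) shows "$@" keeping arguments separate while "$*" joins them and $# counts them:

```shell
# count_and_print shows how "$@" keeps arguments separate
# while "$*" joins them into one word and $# counts them.
count_and_print() {
    echo "count: $#"
    printf 'arg: %s\n' "$@"    # one output line per argument
    echo "joined: $*"          # one output line total
}

count_and_print "first arg" second
```

Called as above, this prints count: 2, then one arg: line per argument ("first arg" stays a single argument), then joined: first arg second.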
I have a file that has lines that look like this
LINEID1:FIELD1=ABCD,&FIELD2-0&FIELD3-1&FIELD4-0&FIELD9-0;
LINEID2:FIELD1=ABCD,&FIELD5-1&FIELD6-0;
LINEID3:FIELD1=ABCD,&FIELD7-0&FIELD8-0;
LINEID1:FIELD1=XYZ,&FIELD2-0&FIELD3-1&FIELD9-0
LINEID3:FIELD1=XYZ,&FIELD7-0&FIELD8-0;
LINEID1:FIELD1=PQRS,&FIELD3-1&FIELD4-0&FIELD9-0;
LINEID2:FIELD1=PQRS,&FIELD5-1&FIELD6-0;
LINEID3:FIELD1=PQRS,&FIELD7-0&FIELD8-0;
I'm interested in only the lines that begin with LINEID1, and only some elements (FIELD1, FIELD2, FIELD4 and FIELD9) from those lines. The output should look like this (no & signs; they can be replaced with |):
FIELD1=ABCD|FIELD2-0|FIELD4-0|FIELD9-0;
FIELD1=XYZ|FIELD2-0|FIELD9-0;
FIELD1=PQRS|FIELD4-0|FIELD9-0;
If additional information is required, do let me know, I'll post them in edits. Thanks!!
This is not exactly what you asked for, but no-one else is answering and it is close enough to get you started!
awk -F'[&:]' '/^LINEID1:/{print $2,$3,$5,$6}' OFS='|' file
Output
FIELD1=ABCD,|FIELD2-0|FIELD4-0|FIELD9-0;
FIELD1=XYZ,|FIELD2-0|FIELD9-0|
FIELD1=PQRS,|FIELD3-1|FIELD9-0;|
The -F sets the Input Field Separator to colon or ampersand. Then it looks for lines starting LINEID1: and prints the fields you need. The OFS sets the Output Field Separator to the pipe symbol |.
Pure awk:
awk -F ":" ' /LINEID1[^0-9]/{gsub(/FIELD[^1249]+[-=][A-Z0-9]+/,"",$2); gsub(/,*&+/,"|",$2); print $2} ' file
Updated to give proper formatting and to omit LINEID11, etc...
Output:
FIELD1=ABCD|FIELD2-0|FIELD4-0|FIELD9-0;
FIELD1=XYZ|FIELD2-0|FIELD9-0
FIELD1=PQRS|FIELD4-0|FIELD9-0;
Explanation:
awk -F ":" - split lines into LHS ($1) and RHS ($2) since output only requires RHS
/LINEID1[^0-9]/ - return only lines that match LINEID1 and also ignores LINEID11, LINEID100 etc...
gsub(/FIELD[^1249]+[-=][A-Z0-9]+/,"",$2) - remove all fields that aren't 1, 4 or 9 on the RHS
gsub(/,*&+/,"|",$2) - clean up the leftover delimiters on the RHS
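The same field-filtering idea can also be written as an explicit loop over the fields, which may be easier to tweak than the gsub regexes; a sketch (the sample file holds two LINEID1 lines from the question):

```shell
# Two of the LINEID1 lines from the question as sample input.
cat > sample.txt <<'EOF'
LINEID1:FIELD1=ABCD,&FIELD2-0&FIELD3-1&FIELD4-0&FIELD9-0;
LINEID1:FIELD1=XYZ,&FIELD2-0&FIELD3-1&FIELD9-0
EOF

awk -F'[:&]' '/^LINEID1:/ {
    out = ""
    for (i = 2; i <= NF; i++) {
        f = $i
        sub(/,$/, "", f)              # strip the comma left after FIELD1=...
        if (f ~ /^FIELD[1249][=-]/)   # keep only FIELD1, FIELD2, FIELD4, FIELD9
            out = out (out ? "|" : "") f
    }
    print out
}' sample.txt
```

On the sample this prints FIELD1=ABCD|FIELD2-0|FIELD4-0|FIELD9-0; and FIELD1=XYZ|FIELD2-0|FIELD9-0, matching the desired output.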
To select rows from data with Unix command lines, use grep, awk, perl, python, or ruby (in increasing order of power & possible complexity).
To select columns from data, use cut, awk, or one of the previously mentioned scripting languages.
First, let's get only the lines with LINEID1 (assuming the input is in a file called input).
grep '^LINEID1' input
will output all the lines beginning with LINEID1.
Next, extract the columns we care about:
grep '^LINEID1' input | # extract lines with LINEID1 in them
cut -d: -f2 | # extract column 2 (after ':')
tr ',&' '\n\n' | # turn ',' and '&' into newlines
egrep 'FIELD[1249]' | # extract only fields FIELD1, FIELD2, FIELD4, FIELD9
tr '\n' '|' | # turn newlines into '|'
sed -e $'s/\\|\\(FIELD1\\)/\\\n\\1/g' -e 's/\|$//'
The last line inserts newlines in front of the FIELD1 lines, and removes any trailing '|'.
That last sed pattern is a little more challenging because sed doesn't like literal newlines in its replacement patterns. To put a literal newline, a bash escape needs to be used, which then requires escapes throughout that string.
Here's the output from the above command:
FIELD1=ABCD|FIELD2-0|FIELD4-0|FIELD9-0;
FIELD1=XYZ|FIELD2-0|FIELD9-0
FIELD1=PQRS|FIELD4-0|FIELD9-0;
This command took only a couple of minutes to cobble together.
Even so, it's bordering on the complexity threshold where I would shift to perl or ruby because of their excellent string processing.
The same script in ruby might look like:
#!/usr/bin/env ruby
#
while line = gets do
if line.chomp =~ /^LINEID1:(.*)$/
f1, others = $1.split(',')
fields = others.split('&').map {|f| f if f =~ /FIELD[1249]/}.compact
puts [f1, fields].flatten.join("|")
end
end
Run this script on the same input file and the same output as above will occur:
$ ./parse-fields.rb < input
FIELD1=ABCD|FIELD2-0|FIELD4-0|FIELD9-0;
FIELD1=XYZ|FIELD2-0|FIELD9-0
FIELD1=PQRS|FIELD4-0|FIELD9-0;
I want to check if a multiline text matches an input. grep comes close, but I couldn't find a way to make it interpret pattern as plain text, not regex.
How can I do this, using only Unix utilities?
Use grep -F:
-F, --fixed-strings
Interpret PATTERN as a list of fixed strings, separated by newlines, any of which is to be matched. (-F is specified by
POSIX.)
EDIT: Initially I didn't understand the question well enough. If the pattern itself contains newlines, use -z option:
-z, --null-data
Treat the input as a set of lines, each terminated by a zero
byte (the ASCII NUL character) instead of a newline. Like the
-Z or --null option, this option can be used with commands like
sort -z to process arbitrary file names.
I've tested it; multiline patterns worked. One caveat: with -F the pattern itself is also split at newlines into a list of alternatives, so -z -F can report a match when any single line of the pattern appears somewhere in the file, even if the full multiline sequence does not. Drop -F (escaping any regex metacharacters in the pattern) if that distinction matters.
From man grep
-F, --fixed-strings
Interpret PATTERN as a list of fixed strings, separated by
newlines, any of which is to be matched. (-F is specified by
POSIX.)
If the input string you are trying to match does not contain a blank line (eg, it does not have two consecutive newlines), you can do:
awk 'index( $0, "needle\nwith no consecutive newlines" ) { m=1 }
END{ exit !m }' RS= input-file && echo matched
If you need to find a string with consecutive newlines, set RS to some string that is not in the file. (Note that the results of awk are unspecified if you set RS to more than one character, but most awk will allow it to be a string.) If you are willing to make the sought string a regex, and if your awk supports setting RS to more than one character, you could do:
awk 'END{ exit NR == 1 }' RS='sought regex' input-file && echo matched
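A runnable sketch of the paragraph-mode trick above, passing the sought string in with -v rather than embedding it in the program (the file and variable names are illustrative):

```shell
# haystack.txt has no blank lines, so with RS= (paragraph mode)
# awk reads the whole file as a single record.
printf 'one\ntwo\nthree\nfour\n' > haystack.txt

pattern='two
three'

if awk -v pat="$pattern" 'index($0, pat) { m = 1 } END { exit !m }' RS= haystack.txt
then
    echo matched
fi
```

index() does a plain substring search, so the pattern is matched as fixed text with no regex interpretation; the RS= assignment takes effect before haystack.txt is opened because command-line assignments are processed in order.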