Split a single string into two or three variables in UNIX - unix

I have a SESSION_TOKEN which gets generated dynamically every 30 minutes. It is more than 530 characters long, approximately 536.
How can I split this string in UNIX shell scripting? Need help.

You can use the "cut" utility for this kind of fixed length work:
echo "AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJKKK" | cut -c 10-20
CCCDDDDEEEE
The -c means "select by character" and the "10-20" says which characters to select.
You can also select by byte (using -b) which might make a difference if your data has some unusual encoding.
In your case, where you want to do multiple chunks of the same string, something like:
bradh@saxicola:~$ export somethingToChop="AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJKKK"
bradh@saxicola:~$ echo $somethingToChop | cut -c 1-10
AAAABBBBCC
bradh@saxicola:~$ echo $somethingToChop | cut -c 11-20
CCDDDDEEEE
bradh@saxicola:~$ echo $somethingToChop | cut -c 20-
EFFFFGGGGHHHHIIIIJJJJKKK
Would probably be the easiest to understand.
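Applied to the question, a sketch that captures each chunk of the token into its own variable (the 200/400 split points are placeholders; adjust them to wherever your token's fields actually begin):
part1=$(echo "$SESSION_TOKEN" | cut -c 1-200)
part2=$(echo "$SESSION_TOKEN" | cut -c 201-400)
part3=$(echo "$SESSION_TOKEN" | cut -c 401-)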

Bash variable expansion has substring operations built in:
$ string="abcdefghijklmnopqrstuvwxyz";
$ first=${string:0:8}
$ second=${string:8:8}
$ third=${string:16}
$ echo $first, $second, $third
abcdefgh, ijklmnop, qrstuvwxyz
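The same split applied to the token from the question, again with placeholder offsets; this version needs no external process at all:
first=${SESSION_TOKEN:0:200}
second=${SESSION_TOKEN:200:200}
third=${SESSION_TOKEN:400}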

Related

Divide the result of two grep and word count

I have a log file and I would like to divide the result of one grep and count by another grep and count.
$ echo $((cat log2.txt | grep timed\|error\|Error | wc -l)/(cat log2.txt | grep Duration | wc -l))
zsh: bad math expression: operator expected at `log2.txt |...'
It's ugly, it doesn't work, and I can probably do it in a better way, but I don't know how.
Also, I would like to know if it is possible to do this incrementally on a log stream read by tail, for example.
First of all, you should know that both grep | wc -l pipelines count the number of matched lines rather than the number of occurrences; I hope this is what you really want.
Regarding your requirement: indeed, your approach is ugly (7 processes), apart from the mistakes. The job can be done with a single awk line:
awk '/timed|[Ee]rror/{a++}/Duration/{b++}END{printf "%.2f\n",a/b}' log2.txt
The above line calculates the result based on matched number of lines, same as your grep|wc -l.
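For the follow-up about doing this incrementally on a log stream read by tail, a sketch along the same lines: drop the END block and print a running ratio each time a Duration line arrives (log2.txt is the file from the question):
tail -f log2.txt | awk '/timed|[Ee]rror/{a++} /Duration/{b++; printf "%.2f\n", a/b}'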
You have several problems:
You are trying to run shell commands directly inside an arithmetic expression.
You aren't passing the correct regular expression to grep.
You need to make sure at least one of the operands is a floating-point value to trigger zsh's floating-point division (a short comparison follows the corrected command below).
Each pipeline can also be reduced to a single command; use input redirection instead of cat, and use the -c option to get the number of lines that match the regular expression.
echo $(( 1.0 * $(grep -c 'timed\|error\|Error' log2.txt) / $(grep -c Duration log2.txt) ))
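To see why the 1.0 * is needed, compare integer and floating-point division in zsh:
$ echo $(( 3 / 2 ))
1
$ echo $(( 1.0 * 3 / 2 ))
1.5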
Basic regular expressions treat unescaped | as a literal character, not an alternation operator.
$ echo foo | grep foo\|bar
$ echo foo | grep foo\\\|bar # Pass a literal backslash as part of the regex
foo
$ echo foo | grep 'foo\|bar' # Use '...' instead of explicitly escaping \ and |
foo
$ echo foo | grep -E 'foo|bar' # Use extended regular expressions instead

GNU Make shell function breaks when piped to cut

Hello, I have an MVE (minimal example) where I am trying to concatenate two variables and then pipe the result to cut.
all:
@echo $(APP_NAME)
@echo $(CURRENT_BRANCH)
@echo $(call EB_SAFE_NAME,$(CURRENT_BRANCH))
@echo $(shell echo "$(APP_NAME)-$(call EB_SAFE_NAME,$(CURRENT_BRANCH))" | cut -c 23)
Output:
$ cicdtest
$ issue#13-support-multi-branch
$ issue-13-support-multi-branch
$ o
If I remove the | cut -c 23 then the output is fine, but I do need to limit it to 23 characters. What am I doing wrong in the 4th echo statement above?
The behavior differs between a test script and make, but the issue is with the explicit use of cut, not with make. The following works as expected:
@echo $(shell echo $(APP_NAME)-$(call EB_SAFE_NAME,$(CURRENT_BRANCH)) | cut -c 1-23)
cut does accept incomplete ranges, but -c 23 is not a range at all: it selects only the single character at position 23. To keep the first 23 characters you need -c 1-23 (or the incomplete range -c -23). From the cut documentation (a short illustration follows the quote):
Bytes, characters, and fields are numbered starting at 1 and separated by commas.
Incomplete ranges can be given: -M means 1-M ; N- means N through end of line or last field.
Options
-b BYTE-LIST
--bytes=BYTE-LIST
Print only the bytes in positions listed in BYTE-LIST. Tabs and
backspaces are treated like any other character; they take up 1
byte.
-c CHARACTER-LIST
--characters=CHARACTER-LIST
Print only characters in positions listed in CHARACTER-LIST. The
same as `-b' for now, but internationalization will change that.
Tabs and backspaces are treated like any other character; they
take up 1 character.
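As a quick illustration outside make (the string is a reconstruction of the APP_NAME-branch value from the question):
$ echo "cicdtest-issue-13-support-multi-branch" | cut -c 23
o
$ echo "cicdtest-issue-13-support-multi-branch" | cut -c 1-23
cicdtest-issue-13-suppo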

Get variables from HTTP and processing it with AppleScript

I get this result from a website:
Value2: 16
Value4: 34
It is possible to have multiple lines or just one. The name and value are always separated by a ":". The values should be used in AppleScript like this:
set Value2 to 16
set value4 to 34
...
This is what I have so far:
set someText to do shell script "curl http://asdress | textutil -stdin -stdout -format html -convert txt -encoding UTF-8"
set AppleScript's text item delimiters to {":"}
set delimitedList to every text item of someText
How can I set the variables individually?
Thanks for your help!
If you simulate the output of your curl command with echo like this:
echo -e "Value2: 16\n Value4: 34"
echo -e "Value2: 16 Value4: 34"
you can test out some filtering with grep. I would go for the following:
echo -e "Value2: 16\n Value4: 34" | grep -Ewo "\d+"
which will use Extended Regular Expressions (-E), allowing me to look for one or more digits with \d+; the -w says that what I find must have a word boundary on either side (so that I don't match the digits inside Value2 or Value4), and the -o says to only output the part that matches.
So, to answer your question, I would change your curl to
curl .... | grep -Ewo "\d+"
and you will just get the output
16
34
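If you also need the names and not just the numbers, a sed sketch over the same simulated input turns each line into a name/value pair (the pattern is an assumption about the site's exact output format):
echo -e "Value2: 16\n Value4: 34" | sed -E 's/^ *([A-Za-z0-9]+): *([0-9]+).*/\1 \2/'
Value2 16
Value4 34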
You don't. Either use a map/hash/dictionary/associative list/table structure to store arbitrary key-value pairs, or use if theKey == "Value1" then \ set Value1 to theValue \ else if theKey == "Value2" then ... etc. if you have a limited number of known keys and really must use separate variables for some rare reason.
Frustratingly, AppleScript doesn't include a key-value data type - it only has arrays (lists) and structs (records) - but you can easily roll your own quick-n-dirty associative list routines for looping over a list of {theKey:"...",theValue:"..."} records (naive performance should be fine up to several dozen items; larger sets will require more efficient algorithms). Or, if you're on 10.10+ and don't mind getting Cocoa in your AppleScript, you might consider using NSDictionary, which isn't totally ideal either but scales efficiently and saves you the hassle of writing your own code.

Returning Data Until N'th Occurence of Character X

Within a directory I have multiple files that contain multiple version numbers. I am grepping each file within the directory for these version numbers, sorting them to get the most recent version number, and then piping that into 'tail -1' so that I get only the most recent version number and not every grep result.
The data looks something like this:
file1: asdf garbage 1.2.4.1 garbagetext asdf
file2: fsdaf garbage asdfsda 4.3.2.10 fdsaf
and so on. I have already accomplished extracting the most recent version number. I did this with the following:
grep -o '[0-9]\{1,\}\.[0-9]\{1,\}\.[0-9]\{1,\}\.[0-9]\{1,\}' * | sort | tail -1
The next part is what I am having trouble with. I am trying to extract the number (whether it is one digit or two) before the first period and return that result. Then, I assume, a slightly different command would do the same thing for the number after the first period, and again for the numbers after the second and third periods.
I have little to no experience with sed or awk, but after a little research I believe either one of these tools are the way to accomplish this.
Thank you!
Edit: Alright I got it, but I am certain this can be done in a much easier way. What I have is the following:
grep -o '[0-9]\{1,\}\.[0-9]\{1,\}\.[0-9]\{1,\}\.[0-9]\{1,\}' * | sort | tail -1 | grep -o '[0-9]\{1,\}' | sed -n 2p
or sed -n 1p, 3p, 4p depending on which value I want.
to get the latest version number:
grep -P -o "\d+\.\d+\.\d+\.\d+" * |sed 's/.*://g'|awk -F'.' '{v[$0]=($1"."$2$3$4)+0;}END{m=0;for(x in v)if(v[x]>m){m=v[x];n=x;}print n}'
to extract numbers:
kent$ echo "10.2.30.4"|awk -F'.' -v OFS="\n" '$1=$1'
10
2
30
4
You can put the two lines together.
To extract a version number, without having to know how many dots are in it, I would use
grep -o '[0-9.]\+' filename | sort --version-sort | tail -1
(assuming you have GNU sort, with the --version-sort option)
To get just the major version number, pipe the above into one of
sed 's/\..*//'
while read line; do echo ${line%%.*}; done
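To pick out any one of the four components once you have the version string, splitting on the dot with cut is arguably the simplest route (here taking the second field of a sample value):
echo "10.2.30.4" | cut -d. -f2
2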

Generate a random filename in unix shell

I would like to generate a random filename in unix shell (say tcshell). The filename should consist of random 32 hex letters, e.g.:
c7fdfc8f409c548a10a0a89a791417c5
(to which I will add whatever is necessary). The point is being able to do it only in the shell without resorting to a program.
Assuming you are on a linux, the following should work:
cat /dev/urandom | tr -cd 'a-f0-9' | head -c 32
This is only pseudo-random if your system runs low on entropy, but is (on linux) guaranteed to terminate. If you require genuinely random data, cat /dev/random instead of /dev/urandom. This change will make your code block until enough entropy is available to produce truly random output, so it might slow down your code. For most uses, the output of /dev/urandom is sufficiently random.
If you are on OS X or another BSD, you need to modify it to the following:
cat /dev/urandom | env LC_CTYPE=C tr -cd 'a-f0-9' | head -c 32
Why not use the unix mktemp command:
$ TMPFILE=`mktemp tmp.XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX` && echo $TMPFILE
tmp.MnxEsPDsNUjrzDIiPhnWZKmlAXAO8983
One command, no pipe, no loop:
hexdump -n 16 -v -e '/1 "%02X"' -e '/16 "\n"' /dev/urandom
If you don't need the newline, for example when you're using it in a variable:
hexdump -n 16 -v -e '/1 "%02X"' /dev/urandom
Using "16" generates 32 hex digits.
uuidgen generates exactly this, except you have to remove hyphens. So I found this to be the most elegant (at least to me) way of achieving this. It should work on linux and OS X out of the box.
uuidgen | tr -d '-'
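For example, to turn it straight into a file name (the /tmp path and .dat suffix are only placeholders):
FILENAME="/tmp/$(uuidgen | tr -d '-').dat"
touch "$FILENAME"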
As you probably noticed from each of the answers, you generally have to "resort to a program".
However, without using any external executables, in Bash and ksh:
string=''; for i in {0..31}; do string+=$(printf "%x" $(($RANDOM%16)) ); done; echo $string
in zsh:
string=''; for i in {0..31}; do string+=$(printf "%x" $(($RANDOM%16)) ); dummy=$RANDOM; done; echo $string
Change the lower case x in the format string to an upper case X to make the alphabetic hex characters upper case.
Here's another way to do it in Bash but without an explicit loop:
printf -v string '%X' $(printf '%.2s ' $((RANDOM%16))' '{00..31})
In the following, "first" and "second" printf refers to the order in which they're executed rather than the order in which they appear in the line.
This technique uses brace expansion to produce a list of 32 random numbers mod 16, each followed by a space and one of the numbers in the braced range, followed by another space (e.g. 11 00). For each element of that list, the first printf strips off all but the first two characters using its format string (%.2s), leaving either a single digit followed by a space or two digits. The space in the format string ensures that there is then at least one space between each output number.
The command substitution containing the first printf is not quoted so that word splitting is performed and each number goes to the second printf as a separate argument. There, the numbers are converted to hex by the %X format string and they are appended to each other without spaces (since there aren't any in the format string) and the result is stored in the variable named string.
When printf receives more arguments than its format string accounts for, the format is applied to each argument in turn until they are all consumed. If there are fewer arguments, the unmatched format string (portion) is ignored, but that doesn't apply in this case.
I tested it in Bash 3.2, 4.4 and 5.0-alpha. But it doesn't work in zsh (5.2) or ksh (93u+) because RANDOM only gets evaluated once in the brace expansion in those shells.
Note that because of using the mod operator on a value that ranges from 0 to 32767 the distribution of digits using the snippets could be skewed (not to mention the fact that the numbers are pseudo random in the first place). However, since we're using mod 16 and 32768 is divisible by 16, that won't be a problem here.
In any case, the correct way to do this is using mktemp as in Oleg Razgulyaev's answer.
Tested in zsh, should work with any BASH compatible shell!
#!/bin/zsh
SUM=`md5sum <<EOF
$RANDOM
EOF`
FN=`echo $SUM | awk '// { print $1 }'`
echo "Your new filename: $FN"
Example:
$ zsh ranhash.sh
Your new filename: 2485938240bf200c26bb356bbbb0fa32
$ zsh ranhash.sh
Your new filename: ad25cb21bea35eba879bf3fc12581cc9
Yet another way[tm].
R=$(echo $RANDOM $RANDOM $RANDOM $RANDOM $RANDOM | md5 | cut -c -8)
FILENAME="abcdef-$R"
This answer is very similar to fmark's, so I cannot really take credit for it, but I found the cat and tr command combinations quite slow; this version is quite a bit faster. You need hexdump.
hexdump -e '/1 "%02x"' -n32 < /dev/urandom
Another thing you can add is running the date command as follows:
date +%S%N
This reads the nanosecond-precision time, and the result adds a lot of randomness.
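For instance, appended to a fixed prefix (the prefix is only an illustration; note that %N requires GNU date):
FILENAME="upload-$(date +%S%N)"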
The first answer is good, but why fork cat when it is not required?
tr -dc 'a-f0-9' < /dev/urandom | head -c32
Grab 16 bytes from /dev/random, convert them to hex, take the first line, remove the address, remove the spaces.
head /dev/random -c16 | od -tx1 -w16 | head -n1 | cut -d' ' -f2- | tr -d ' '
Assuming that "without resorting to a program" means "using only programs that are readily available", of course.
If you have openssl on your system you can use it to generate random hex strings (it can also do -base64) of a defined length. I found it pretty simple and usable in one-line cron jobs.
openssl rand -hex 32
8c5a7515837d7f0b19e7e6fa4c448400e70ffec88ecd811a3dce3272947cb452
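Note that -hex takes a byte count, so the example above prints 64 hex characters; for the 32 hex characters asked about in the question, halve it:
openssl rand -hex 16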
Hope to add a (maybe) better solution to this topic.
Notice: this only works with bash 4 and some implementations of mktemp (for example, the GNU one).
Try this
fn=$(mktemp -u -t 'XXXXXX')
echo ${fn/\/tmp\//}
This one is twice as fast as head /dev/urandom | tr -cd 'a-f0-9' | head -c 32, and eight times as fast as cat /dev/urandom | tr -cd 'a-f0-9' | head -c 32.
Benchmark:
With mktemp:
#!/bin/bash
# a.sh
for (( i = 0; i < 1000; i++ ))
do
fn=$(mktemp -u -t 'XXXXXX')
echo ${fn/\/tmp\//} > /dev/null
done
time ./a.sh
./a.sh 0.36s user 1.97s system 99% cpu 2.333 total
And the other:
#!/bin/bash
# b.sh
for (( i = 0; i < 1000; i++ ))
do
cat /dev/urandom | tr -dc 'a-zA-Z0-9' | head -c 32 > /dev/null
done
time ./b.sh
./b.sh 0.52s user 20.61s system 113% cpu 18.653 total
If you are on Linux, then Python will come pre-installed. So you can go for something similar to the below:
python -c "import uuid; print str(uuid.uuid1())"
If you don't like the dashes, then use replace function as shown below
python -c "import uuid; print str(uuid.uuid1()).replace('-','')"
