Awk script for subtracting the previous row's field - unix

Hi, I have an input file with one field:
30
58
266
274
296
322
331
I need the output to be the difference between the 2nd and 1st rows (58-30=28), the 3rd and 2nd rows (266-58=208), and so on.
My output should look like below:
30 30
58 28
266 208
274 8
Any help please?

data=`cat file | xargs`
echo $data | awk '{a=0; for(i=1; i<=NF;i++) { print $i, $i-a; a=$i}}'
30 30
58 28
266 208
274 8
296 22
322 26
331 9
Update upon comment: without cat/xargs:
awk '{printf "%d %d\n", $1, $1-a; a=$1;}' file

You don't actually need the for loop from Khachick's answer, as awk will go through all the rows anyway. Simpler is:
cat file | awk 'BEGIN { a=0 } { print $1, $1-a; a=$1 }'
However, it is also possible to skip the first row, which you don't really want, by initialising a flag in the BEGIN block and suppressing the print on the first line before updating the variables. Sort of like:
BEGIN { started=0 } { if (started == 0) { started = 1 } else { print $1, $1-a }; a = $1 }
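The same skip can also be written without the flag by testing NR; a sketch, not from the original answer:
awk 'NR > 1 { print $1, $1 - a } { a = $1 }' file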

Related

How can I use kbdgetc in xv6

I want to use kbdgetc() in user mode.
I need it to write a vim-like program for xv6.
I tried kernel mode, but I totally don't know how to do it.
I guess that you want some non-buffered reading on fd 0?
To achieve this behavior, you can modify the consoleintr function, which is responsible for it.
First, add a variable internal to the kernel that enables or disables buffering.
Let's call it non_buffering and set its default value to 0.
Then add a system call to change this value (or modify an existing one, as you prefer); a sketch of such a syscall follows the code below.
Change consoleintr this way (the modification is at line 221):
191 void
192 consoleintr(int (*getc)(void))
193 {
194   int c, doprocdump = 0;
195
196   acquire(&cons.lock);
197   while((c = getc()) >= 0){
198     switch(c){
....
216     default:
217       if(c != 0 && input.e-input.r < INPUT_BUF){
218         c = (c == '\r') ? '\n' : c;
219         input.buf[input.e++ % INPUT_BUF] = c;
220         consputc(c);
          /* NON_BUFFERING added here */
221         if(non_buffering || c == '\n' || c == C('D') || input.e == input.r+INPUT_BUF){
222           input.w = input.e;
223           wakeup(&input.r);
224         }
225       }
226       break;
227     }
228   }
229   release(&cons.lock);
230   if(doprocdump) {
231     procdump();  // now call procdump() wo. cons.lock held
232   }
233 }
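For the system call mentioned above, here is a minimal sketch. The name sys_nonbuf is an assumption, and it presumes the usual xv6 syscall plumbing (entries in syscall.h, syscall.c, usys.S and user.h) has been wired up for it:
// In console.c: the kernel-internal flag (the name non_buffering is ours).
int non_buffering = 0;  // 0 = normal line buffering, 1 = wake readers per character

// In sysproc.c: hypothetical syscall that toggles the flag.
int
sys_nonbuf(void)
{
  int on;

  if(argint(0, &on) < 0)  // fetch the first integer argument from the user stack
    return -1;
  non_buffering = (on != 0);
  return 0;
}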

Awk program to compare the number of space-separated fields of each line

I am trying to check whether every line in a file has the same length (or number of fields).
I am doing the following, but it does not seem to work:
NR==1 {length=NF}
NR>1 && NF!=length {print}
Can this be done with an awk one-liner? Or a full program is fine.
A sample of input would be:
12 34 54 56
12 89 34 33
12
29 56 42 42
My expected output would be "yes" or "no", depending on whether all lines have the same number of fields.
You could try this command, which checks the number of fields in each line and compares it to the number of fields of the first line. (Incidentally, the snippet in the question fails because length is a built-in awk function and cannot be used as a variable name.)
awk 'NR==1{a=NF; b=0} (NR>1 && NF!=a){print "No"; b=1; exit 1}END{if (b==0) print "Yes"}' test.txt
Checking stops at the first line whose number of fields differs from that of the first line of input.
For input
12 43 43
12 32
you will get "No"
Try:
awk 'BEGIN{a="yes"} last!="" && NF!=last{a="no"; exit} {last=NF} END{print a}' file
How it works
BEGIN{a="yes"}
This initializes the variable a to yes. (We assume all lines have the same number of fields until proven otherwise.)
last!="" && NF!=last{a="no"; exit}
If last has been assigned a value and the number of fields on the current line is not the same as last, then set a to no and exit.
{last=NF}
Update last to the number of fields on the current line.
END{print a}
Before exiting, print a.
Examples
$ cat file1
2 34 54 56
12 89 34 33
12
29 56 42 42
$ awk 'BEGIN{a="yes"} last!="" && NF!=last{a="no"; exit} {last=NF} END{print a}' file1
no
$ cat file2
2 34 54 56
12 89 34 33
29 56 42 42
$ awk 'BEGIN{a="yes"} last!="" && NF!=last{a="no"; exit} {last=NF} END{print a}' file2
yes
I am assuming that you want to check whether all lines have the same number of fields. If that is the case, then try the following.
awk '
FNR==1{
  value=NF
  count++
  next
}
{
  count=NF==value?++count:count
}
END{
  if(count==FNR){
    print "All lines are of same fields"
  }
  else{
    print "All lines are NOT of same fields."
  }
}
' Input_file
Additional stuff (only if required): in case you want to print the contents of a file whose lines all have the same number of fields, along with the "all lines are of same fields" message, then try the following.
awk '
{
  val=val?val ORS $0:$0
}
FNR==1{
  value=NF
  count++
  next
}
{
  count=NF==value?++count:count
}
END{
  if(count==FNR){
    print "All lines are of same fields" ORS val
  }
  else{
    print "All lines are NOT of same fields."
  }
}
' Input_file
This should do:
$ awk 'NR==1{p=NF} p!=NF{s=1; exit} END{print s?"No":"Yes"}' file
However, setting the exit status would be better if this will be part of a workflow.
Since equality is transitive, there is no need to keep any NF other than the first line's; and because 0 is the success value, s does not need to be initialized to a default.
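For instance, relying only on the exit status (a sketch along the same lines):
awk 'NR==1{p=NF} p!=NF{exit 1}' file && echo Yes || echo No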
An efficient even-fields shell function, using sed to construct a regex (based on the first line of input) to feed to GNU grep, which looks for field-count mismatches:
# Usage: ef filename
ef() { sed '1s/[^ ]*/[^ ]*/g;q' "$1" | grep -v -x -m 1 -q -f - "$1" \
&& echo no || echo yes ; }
For files with uneven fields, grep -m 1 quits after the first non-uniform line -- so if the file is a million lines long but the mismatch occurs on line #2, grep only needs to read two lines, not a million. On the other hand, if there's no mismatch, grep has to read all million lines.
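For instance, with file1 and file2 from the earlier answer, you would expect:
$ ef file1
no
$ ef file2
yes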

Unix Pipelining "AWK" - summation whilst matching

Below I have some raw data. My goal is to match on the 'column one' values and produce a single line of output per IP address with its total number of bytes.
For example output:
81.220.49.127 6654
81.226.10.238 328
81.227.128.93 84700
Raw Data:
81.220.49.127 328
81.220.49.127 328
81.220.49.127 329
81.220.49.127 367
81.220.49.127 5302
81.226.10.238 328
81.227.128.93 84700
Can anyone advise me on how to do this?
Using an associative array:
awk '{a[$1]+=$2}END{for (i in a){print i,a[i]}}' infile
Alternative to preserve order:
awk '!($1 in a){b[++cont]=$1}{a[$1]+=$2}END{for (c=1;c<=cont;c++){print b[c],a[b[c]]}}' infile
Another way, where arrays are not needed (see the note after the result):
awk 'lip != $1 && lip != ""{print lip,sum;sum=0}
{sum+=$NF;lip=$1}
END{print lip,sum}' infile
Result
81.220.49.127 6654
81.226.10.238 328
81.227.128.93 84700
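Note that the last, array-free variant relies on lines with the same IP address being adjacent, as in the sample. If the input may be ungrouped, sort it first, for example:
sort infile | awk 'lip != $1 && lip != ""{print lip,sum;sum=0} {sum+=$NF;lip=$1} END{print lip,sum}'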

Extract a string after a pattern

I want to extract the numbers following client_id and id and pair up client_id and id in each line.
For example, for the following lines of log,
User(client_id:03)) results:[RelatedUser(id:204, weight:10),_RelatedUser(id:491,_weight:10),_RelatedUser(id:29, weight: 20)
User(client_id:04)) results:[RelatedUser(id:209, weight:10),_RelatedUser(id:301,_weight:10)
User(client_id:05)) results:[RelatedUser(id:20, weight: 10)
I want to output
03 204
03 491
03 29
04 209
04 301
05 20
I know I need to use sed or awk. But I do not know exactly how.
Thanks
This may work for you:
awk -F "[):,]" '{ for (i=2; i<=NF; i++) if ($i ~ /id/) print $2, $(i+1) }' file
Results:
03 204
03 491
03 29
04 209
04 301
05 20
Here's an awk script that works (I put it on multiple lines and made it a bit more verbose so you can see what's going on):
#!/bin/bash
awk 'BEGIN{FS="[():,]"}
/client_id/ {
cid="no_client_id"
for (i=1; i<NF; i++) {
if ($i == "client_id") {
cid = $(i+1)
} else if ($i == "id") {
id = $(i+1);
print cid OFS id;
}
}
}' input_file_name
Output:
03 204
03 491
03 29
04 209
04 301
05 20
Explanation:
awk 'BEGIN{FS="[():,]"}: invoke awk, using ( ) : and , as delimiters to separate the fields
/client_id/ {: Only do the following for the lines that contain client_id:
for (i=1; i<NF; i++) {: iterate through the fields on each line one field at a time
if ($i == "client_id") { cid = $(i+1) }: if the field we are currently on is client_id, then its value is the next field in order.
else if ($i == "id") { id = $(i+1); print cid OFS id;}: otherwise if the field we are currently on is id, then print the client_id : id pair onto stdout
input_file_name: supply the name of your input file as first argument to the awk script.
This might work for you (GNU sed):
sed -r '/.*(\(client_id:([0-9]+))[^(]*\(id:([0-9]+)/!d;s//\2 \3\n\1/;P;D' file
/.*(\(client_id:([0-9]+))[^(]*\(id:([0-9]+)/!d: if the line doesn't contain the intended strings, delete it.
s//\2 \3\n\1/: re-arrange the line by copying out the client_id and consuming the first id, thus reducing the line for successive iterations.
P: print up to the introduced newline.
D: delete up to the introduced newline.
I would prefer awk for this, but if you were wondering how to do this with sed, here's one way that works with GNU sed.
parse.sed
/client_id/ {
:a
s/(client_id:([0-9]+))[^(]+\(id:([0-9]+)([^\n]+)(.*)/\1 \4\5\n\2 \3/
ta
s/^[^\n]+\n//
}
Run it like this:
sed -rf parse.sed infile
Or as a one-liner:
<infile sed '/client_id/ { :a; s/(client_id:([0-9]+))[^(]+\(id:([0-9]+)([^\n]+)(.*)/\1 \4\5\n\2 \3/; ta; s/^[^\n]+\n//; }'
Output:
03 204
03 491
03 29
04 209
04 301
05 20
Explanation:
The idea is to repeatedly match client_id:([0-9]+) and id:([0-9]+) pairs and append them to the end of the pattern space. On each pass the matched id:([0-9]+) is removed from the original line.
The final replace removes left-overs from the loop.
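If GNU awk is available, its three-argument match() gives yet another way to pair up the ids; a sketch, not one of the answers above:
gawk '{
  if (match($0, /client_id:([0-9]+)/, m)) {  # capture the client id once per line
    s = $0
    while (match(s, /\(id:([0-9]+)/, r)) {   # then walk every "(id:NNN" on the line
      print m[1], r[1]
      s = substr(s, RSTART + RLENGTH)        # continue after the current match
    }
  }
}' file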

Read a column value from previous line and next line but insert them as additional fields in the current line using awk

I hope you can help me out with my problem.
I have an input file with 3 columns of data which looks like this:
Apl_No Act_No Sfx_No
100 10 0
100 11 1
100 12 2
100 13 3
101 20 0
101 21 1
I need to create an output file which contains the data as in the input plus 3 additional fields. It should look like this:
Apl_No Act_No Sfx_No Crt_Act_No Prs_Act_No Cd_Act_No
100 10 0 - - -
100 11 1 10 11 12
100 12 2 11 12 13
100 13 3 12 13 10
101 20 0 - - -
101 21 1 20 21 20
Every Apl_No has a set of Act_No values mapped to it. 3 new fields need to be created: Crt_Act_No, Prs_Act_No and Cd_Act_No. When the first unique Apl_No is encountered, column values 4, 5 and 6 (Crt_Act_No, Prs_Act_No, Cd_Act_No) need to be dashed out. For every following occurrence of the same Apl_No, the Crt_Act_No is the Act_No of the previous line, the Prs_Act_No is the Act_No of the current line, and the Cd_Act_No is the Act_No of the next line. This continues for all the following rows bearing the same Apl_No except the last one. In the last row, Crt_Act_No and Prs_Act_No are filled in the same way as above, but Cd_Act_No needs to be pulled from the Act_No of the first row of that Apl_No.
I wish to achieve this using awk. Can anyone please help me with how to go about this?
One solution:
awk '
## Print the header on the first line.
FNR == 1 {
    printf "%s %s %s %s\n", $0, "Crt_Act_No", "Prs_Act_No", "Cd_Act_No";
    next;
}

## If the first field is not yet in the hash, this is the first unique "Apl_No",
## so print the line with dashes and save some data for later use.
## The "line" variable holds the content of the previous iteration; print it
## first if it is set.
! apl[ $1 ] {
    if ( line ) {
        sub( /-/, orig_act, line );
        print line;
        line = "";
    }
    printf "%s %s %s %s\n", $0, "-", "-", "-";
    orig_act = prev_act = $2;
    apl[ $1 ] = 1;
    next;
}

## For all repeated "Apl_No"...
{
    ## If it is the first one after the line with dashes ("line" not set), save
    ## its content in "line" along with the fields known so far ("Act_No").
    ## Note the dash left in the last field, to be substituted in the following
    ## iteration.
    if ( ! line ) {
        line = sprintf( "%s %s %s %s", $0, prev_act, $2, "-" );
        prev_act = $2;
        next;
    }

    ## Now the missing field is known, so substitute the dash with it, print,
    ## and repeat the process with the current line.
    sub( /-/, $2, line );
    print line;
    line = sprintf( "%s %s %s %s", $0, prev_act, $2, "-" );
    prev_act = $2;
}

END {
    if ( line ) {
        sub( /-/, orig_act, line );
        print line;
    }
}
' infile | column -t
That yields:
Apl_No Act_No Sfx_No Crt_Act_No Prs_Act_No Cd_Act_No
100 10 0 - - -
100 11 1 10 11 12
100 12 2 11 12 13
100 13 3 12 13 10
101 20 0 - - -
101 21 1 20 21 20
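For comparison, a shorter approach that buffers the whole file and resolves the next-line lookups in the END block; a sketch, not the answer above:
awk '
NR == 1 { print $0, "Crt_Act_No", "Prs_Act_No", "Cd_Act_No"; next }
{ n++; apl[n] = $1; act[n] = $2; line[n] = $0 }
END {
    for (i = 1; i <= n; i++) {
        if (apl[i] != apl[i-1]) {                       # first row of a group
            first = i
            print line[i], "-", "-", "-"
        } else {
            # the last row of a group wraps around to the group first Act_No
            cd = (i == n || apl[i+1] != apl[i]) ? act[first] : act[i+1]
            print line[i], act[i-1], act[i], cd
        }
    }
}' infile | column -t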
