Adding a counter to a specific string using unix

I am trying to add a counter to a specific string using unix. I have tried some sed and awk commands, but I can't seem to get it right.
My input file is:
Event_ A D L K
Event_ B P R
Event_ C F I
Event_ J K
M
N
O
Event_ Q S
X
Y
Z
G
T
What I'm hoping to get is:
Event_00000001 A D L K
Event_00000002 B P R
Event_00000003 C F I
Event_00000004 J K
M
N
O
Event_00000005 Q S
X
Y
Z
G
T
Can anyone help?

Use this awk:
awk '/^Event/{$1=sprintf("%s%08d", $1,++counter)}1' yourfile
If the fields are delimited by tabs (\t):
awk -F"\t" '/^Event/{$1=sprintf("%s%08d", $1,++counter)}1' OFS='\t' yourfile
Test:
$ awk '/^Event/{$1=sprintf("%s%08d", $1,++counter)}1' file
Event_00000001 A D L K
Event_00000002 B P R
Event_00000003 C F I
Event_00000004 J K
M
N
O
Event_00000005 Q S
X
Y
Z
G
T
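Assigning to $1 makes awk rebuild the line with single separators between fields. If the original spacing has to survive untouched, one possible variant (a sketch, not part of the answer above) rewrites only the prefix with sub(). The function name add_counter and the counter variable n are made up for illustration, and the eight-digit width is taken from the desired output in the question.

```shell
# add_counter: hypothetical helper name, not from the original answer.
# Numbers every line that starts with "Event_" using an 8-digit, zero-padded
# counter; all other lines pass through unchanged.
add_counter() {
  # sub() replaces only the leading "Event_", so the rest of the line
  # (including its original whitespace) is left as it was.
  awk '/^Event_/ { sub(/^Event_/, sprintf("Event_%08d", ++n)) } 1' "$@"
}

# Small demo on inline data (reads stdin when no file name is given).
printf 'Event_ A D\nM\nEvent_ Q S\n' | add_counter
# Event_00000001 A D
# M
# Event_00000002 Q S
```

With a file, the call would be add_counter yourfile.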

Related

Is there an infinite loop in my code? (OCaml)

I want to get the sum of the values of a function f(i) for i from a to b:
= f(a)+f(a+1)+...+f(b-1)+f(b)
So I wrote code like this.
let rec sigma : (int -> int) -> int -> int -> int
= fun f a b ->
if a=b then f a
else f b + sigma f a b-1 ;;
but the result is a stack overflow during evaluation. Is there an infinite loop, and if so, why?
Function application binds tighter than the - operator, so sigma f a b-1 is parsed as (sigma f a b) - 1 instead of what you intended, sigma f a (b-1). Since sigma f a b then calls sigma f a b recursively with the same arguments, it never stops.
The fix is to write the parentheses explicitly: sigma f a (b - 1). It is also good practice to put spaces around binary operators, as in sigma f a b - 1, so that the way the expression is actually parsed is harder to misread.

Breadth and depth first search on a graph with returning edges

I understand depth-first and breadth-first search, but this graph confuses me because some of its nodes point back to preceding nodes.
Let's say, for instance, that N is the goal state. Then using depth-first search we would have
A B E J K L F G M N
Is this correct? I don't repeat A because it was already visited, right?
Using breadth-first search I would go level by level, so I would have
A B C D E F G H I J K L M N
Is this correct?
And if we change the goal state to P,
then DFS will give us A B E J K L F G M N H O P
and BFS will give us A B C D E F G H I J K L M N O P
I feel I got this right, but I am uncertain because of the returning edges in this graph. I just want someone to confirm that I am on the right track.
That sounds correct to me. When pointing to a node that's already in your result set, it should not be added into the result set a second time.

Turning row-based data into columns by header

I have one (fairly large) file, formatted like this:
SET1
A B C D E F G
SET2
H I J K L M
SETX
(...)
etc.
I would prefer to have them
SET1 SET2 SETX
A H (...)
B I
C J
D K
E L
F M
G
Note that the columns are unequally long, and they are not ordered by size. My file is too big to use the column function inbuilt in unix, and attempts at getting cute by splicing the file and then pasting it together have had problematic results (that is, it has resulted in the empty columns getting the same content as the separator, which doesn't work for my purposes - they both ended up being "\t"). Note that each set may contain several hundred entries, and I have thousands of sets, making awk impractical (at least with my admittedly limited skills there).
Ideally, the output should be readable in R, but at this point I'd be very happy for something that is practically translatable into R input. Note that I can totally live with this having a non-whitespace separator if that is more practical.
Many thanks in advance for any help! Working in an external linux environment.
Edit:
I also have the file available as
SET1
A
B
C
D
E
F
G
SET2
H
I
J
K
L
M
If that could make it easier.
I guess this is more what you wanted:
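awk -v OFS="\t" '
# New header: store it, fold the previous set size into max_recs, reset the row counter
/^SET/ {sets[++cols]=$0; max_recs=(c>max_recs?c:max_recs); c=0; next}
# Non-empty data line: store it under (column, row)
NF{a[cols,++c]=$0}
END {
# The last set has no header after it, so account for its size here
max_recs=(c>max_recs?c:max_recs)
for (i=1;i<=cols; i++) printf "%s%s", sets[i], OFS
print ""
for (i=1; i<=max_recs; i++) {
for (j=1; j<=cols; j++) printf "%s%s", a[j,i], OFS
print ""
}
}' file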
For this given input:
SET1
B
C
D
E
F
G
SET2
H
I
J
K
L
M
AAA
SET3
A
B
C
D
It returns:
$ awk -v OFS="\t" '/^SET/ {sets[++cols]=$0; max_recs=(c>max_recs?c:max_recs); c=0; next} NF{a[cols,++c]=$0} END {max_recs=(c>max_recs?c:max_recs); for (i=1;i<=cols; i++) printf "%s%s", sets[i], OFS; print ""; for (i=1; i<=max_recs; i++) { for (j=1; j<=cols; j++) printf "%s%s", a[j,i], OFS; print ""}}' file
SET1 SET2 SET3
B H A
C I B
D J C
E K D
F L
G M
AAA
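Since the question asks for something readable in R and mentions that empty cells became indistinguishable from the separator, here is one possible variation on the same idea. It is only a sketch under assumptions that are not in the original answer: the function name transpose_sets is invented, missing cells are written as the literal string NA (which R's read.table treats as missing by default), and no separator is printed after the last column.

```shell
# transpose_sets: hypothetical helper name, not from the original answer.
# Input: blocks that start with a "SET..." header line, one entry per line.
# Output: one tab-separated column per set; short columns are padded with NA.
transpose_sets() {
  awk -v OFS='\t' '
    /^SET/ { sets[++cols] = $0; if (c > max) max = c; c = 0; next }
    NF     { cell[cols, ++c] = $0 }
    END {
      if (c > max) max = c                      # size of the last set
      for (j = 1; j <= cols; j++)               # header row
        printf "%s%s", sets[j], (j < cols ? OFS : ORS)
      for (i = 1; i <= max; i++)                # data rows
        for (j = 1; j <= cols; j++) {
          v = ((j, i) in cell) ? cell[j, i] : "NA"   # NA is an assumed filler
          printf "%s%s", v, (j < cols ? OFS : ORS)
        }
    }' "$@"
}

# Small demo: SET1 has two entries, SET2 has three.
printf 'SET1\nA\nB\nSET2\nH\nI\nJ\n' | transpose_sets
```

The demo prints a header row and three data rows, with NA in the SET1 column of the last row. Like the answer above, this keeps every cell in memory, so it assumes the file fits in RAM.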
Previous solution with just one block.
You can use paste to show files side by side.
In this case, let's use head and tail to get each half. Then xargs -n1 prints one item per line, so the two halves are ready to be pasted:
paste -d"\t" <(head -2 file | xargs -n1) <(tail -2 file | xargs -n1)
For your given input it returns:
SET1 SET2
A H
B I
C J
D K
E L
F M
G

How to remove/add spaces in all textfiles?

I have several files that look like these, e.g. test.in:
apple foo bar
hello world
I need to achieve this desired output, a space after every character:
a p p l e f o o b a r
h e l l o w o r l d
I thought I could first remove all spaces and then add a space after each character, like this:
sed 's/\s//g' test.in | sed -e 's/\(.\)/\1 /g'
but are there other ways?
This awk may do:
awk -v FS="" '{gsub(/ /,"");$1=$1}1' file
a p p l e f o o b a r
h e l l o w o r l d
This first removes all spaces. Then, since FS (the field separator) is set to the empty string, every character is its own field, and $1=$1 rebuilds the line with one space between fields.
Unlike most of the other sed and perl commands here, this does not add a space at the end of the line.
Or, based on the sed answer posted here:
awk '{gsub(/ /,"");gsub(/./,"& ")}1' file
a p p l e f o o b a r
h e l l o w o r l d
You can combine your two sed commands into a single command instead:
$ sed 's/\s//g;s/./& /g' test.in
a p p l e f o o b a r
h e l l o w o r l d
Note the use of . and & instead of \(.\) and \1.
On systems that do not support \s for matching whitespace, you can use [[:blank:]] instead:
$ sed 's/[[:blank:]]//g;s/./& /g' test.in
a p p l e f o o b a r
h e l l o w o r l d
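Both sed commands above leave one extra space at the end of every line. If that matters, a third substitution can strip it. The following is a minimal sketch rather than a command from any of the answers; the function name space_chars is hypothetical, and it uses [[:blank:]] so that it does not depend on \s.

```shell
# space_chars: hypothetical helper name, not from the original answers.
# 1) delete blanks, 2) put a space after every character,
# 3) remove the single trailing space that step 2 leaves behind.
space_chars() {
  sed 's/[[:blank:]]//g; s/./& /g; s/ $//' "$@"
}

# Small demo on the sample data from the question.
printf 'apple foo bar\nhello world\n' | space_chars
# a p p l e f o o b a r
# h e l l o w o r l d
```

For the files in the question, the call would be space_chars test.in.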
Through perl,
$ perl -ple 's/([^ ]|^)(?! )/\1 /g' file
a p p l e f o o b a r
h e l l o w o r l d
Add the in-place edit option -i to save the changes to the file:
perl -i -ple 's/([^ ]|^)(?! )/\1 /g' file
sed 's/ //g;s/./& /g' filename
&: refers to that portion of the pattern space which matched
Or maybe something like this with sed (add a space after every character, then squeeze each run of spaces down to one):
$ sed 's/./& /g;s/  */ /g' file
a p p l e f o o b a r
h e l l o w o r l d
This might work for you (GNU sed):
sed 's/\B/ /g' file

2-D FFT using MPI_Alltoall in MPI

I have a 4X6 matrix split as two 2X6 matrices on 2 processors. Since the rows have been split in a contiguous way (C Language) on the processors, we can carry out a 1-D FFT on the rows. The problem is we need an MPI_Alltoall() to collect contiguous columns on a particular processor for 1-D column FFT i.e. Processor 0 has:
A B C D E F
G H I J K L
Processor 1 has:
M N O P Q R
S T U V W X
The MPI_Alltoall() needs to convert this to the following on Processor 0:
A G M S
B H N T
C I O U
And the following on Processor 1:
D J P V
E K Q W
F L R X
I tried to define a vector with count=2, blocklength=1, stride=6 as the sending type, set its extent to that of one MPI_INT and its displacement to 0, and sent 3 instances of this vector using MPI_Alltoall(). As I understand it, this should send A G, B H, and C I to processor 0 and D J, E K, and F L to processor 1. Similarly, it should send M S, N T, and O U to processor 0 and P V, Q W, and R X to processor 1.
Is my interpretation correct? If yes, how do I receive these 3 instances of the vector on the receiving side (what datatype do I define, and how)?
