How to check the status of a parent and child process? - unix

I have just written a simple piece of code to check how the child and parent process run. But I am not getting the desired output.
#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
#include<sys/types.h>
#include<sys/wait.h>
int main()
{
    pid_t x;
    int n=1;
    x=fork();
    if(x>0)
    {
        n+=2;
        printf("Parent process exist %d\n",n);
    }
    else if(x==0)
    {
        n+=5;
        printf(" Child process %d\n ",n);
    }
    printf("done %d",n);
    return 0;
}
The code is very trivial but is there any hidden problem which gives unexpected output?

Warning: Answer is not rules lawyery!
Yes. n+=5 is not an atomic operation. It consists of three "sub-operations": load n, add 5, store n.
Except it doesn't even do that, necessarily, since the compiler is free to go "hey; all this loading and storing n into RAM is pointless when I can just keep the value in a register". Declare the variable volatile int to fix this.
The importance of that non-atomic thing can be seen with this example compilation and execution of your code (with the volatile change):
0 SYSCALL fork_into_register_zero
1 STORE 1 INTO RAM #386
2 COMPARE REGISTER #0 TO 0
3 JUMP IF <= TO INSTRUCTION #23
4 LOAD RAM #386 INTO REGISTER #1
5 ADD 2 TO REGISTER #1
6 STORE REGISTER #1 INTO RAM #386
7 LOAD "Parent process exist " INTO REGISTER #0
8 LOAD 1 INTO REGISTER #1
9 SYSCALL output_string_from_register_zero_to_file_descriptor_from_register_one
10 LOAD 1000000 INTO REGISTER #2
11 LOAD RAM #386 INTO REGISTER #1
12 STORE REGISTER #1 INTO REGISTER #0
13 DIVBY REGISTER #2 TO REGISTER #0
14 ADD '0' TO REGISTER #0
15 SYSCALL putch_from_register_zero
16 MODBY REGISTER #2 TO REGISTER #1
17 DIVBY 10 TO REGISTER #2
18 COMPARE REGISTER #2 TO 0
19 JUMP IF > TO INSTRUCTION #12
20 STORE '\n' TO REGISTER #0
21 SYSCALL putch_from_register_zero
22 COMPARE 1 TO 0
23 JUMP IF != TO INSTRUCTION #44
24 LOAD RAM #386 INTO REGISTER #1
25 ADD 5 TO REGISTER #1
26 STORE REGISTER #1 INTO RAM #386
27 LOAD " Child process " INTO REGISTER #0
28 LOAD 1 INTO REGISTER #1
29 SYSCALL output_string_from_register_zero_to_file_descriptor_from_register_one
30 LOAD 1000000 INTO REGISTER #2
31 LOAD RAM #386 INTO REGISTER #1
32 STORE REGISTER #1 INTO REGISTER #0
33 DIVBY REGISTER #2 TO REGISTER #0
34 ADD '0' TO REGISTER #0
35 SYSCALL putch_from_register_zero
36 MODBY REGISTER #2 TO REGISTER #1
37 DIVBY 10 TO REGISTER #2
38 COMPARE REGISTER #2 TO 0
39 JUMP IF > TO INSTRUCTION #32
40 STORE '\n' TO REGISTER #0
41 SYSCALL putch_from_register_zero
42 STORE ' ' TO REGISTER #0
43 SYSCALL putch_from_register_zero
44 LOAD "Done " INTO REGISTER #0
45 LOAD 1 INTO REGISTER #1
46 SYSCALL output_string_from_register_zero_to_file_descriptor_from_register_one
47 LOAD 1000000 INTO REGISTER #2
48 LOAD RAM #386 INTO REGISTER #1
49 STORE REGISTER #1 INTO REGISTER #0
50 DIVBY REGISTER #2 TO REGISTER #0
51 ADD '0' TO REGISTER #0
52 SYSCALL putch_from_register_zero
53 MODBY REGISTER #2 TO REGISTER #1
54 DIVBY 10 TO REGISTER #2
55 COMPARE REGISTER #2 TO 0
56 JUMP IF > TO INSTRUCTION #49
57 STORE '\n' TO REGISTER #0
58 SYSCALL putch_from_register_zero
59 STORE 0 TO REGISTER #0
60 RETURN
Here there's a kernel-level putch-to-stdout operation, because I don't think consistency is important enough to retype 23 lines of fake code. Note that SYSCALLs are handled by the kernel and so are atomic (when dealing with output to a file descriptor, at least).
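Each one of these instructions is atomic. However, everything after the SYSCALL fork_into_register_zero is run twice, once in the parent and once in the child, with different values of REGISTER #0, and the two runs can be interleaved in any way. Let that sink in. One caveat: fork() gives the child its own copy of the parent's memory, so the two processes do not share RAM #386. n is never going to be 8 at the end: the parent finishes with 3 and the child with 6. What can get mixed up is the output. In fact, it could be this:
Child process 6
Done Parent process exist 3
6
Done 3
Doesn't seem right? That's concurrent processes for you!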

Related

"Out of memory" in hybrid MPI/OpenMP for GPU acceleration

I have compiled Quantum ESPRESSO (Program PWSCF v.6.7MaX) for GPU acceleration (hybrid MPI/OpenMP) with the following options:
module load compiler/intel/2020.1
module load hpc_sdk/20.9
./configure F90=pgf90 CC=pgcc MPIF90=mpif90 --with-cuda=yes --enable-cuda-env-check=no --with-cuda-runtime=11.0 --with-cuda-cc=70 --enable-openmp BLAS_LIBS='-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core'
make -j8 pw
The compilation appears to finish successfully. I then run the program:
export OMP_NUM_THREADS=1
mpirun -n 2 /home/my_user/q-e-gpu-qe-gpu-6.7/bin/pw.x < silverslab32.in > silver4.out
The program starts running and prints the following information:
Parallel version (MPI & OpenMP), running on 8 processor cores
Number of MPI processes: 2
Threads/MPI process: 4
...
GPU acceleration is ACTIVE
...
Estimated max dynamical RAM per process > 13.87 GB
Estimated total dynamical RAM > 27.75 GB
But after 2 minutes of execution the job ends with an error:
0: ALLOCATE: 4345479360 bytes requested; status = 2(out of memory)
0: ALLOCATE: 4345482096 bytes requested; status = 2(out of memory)
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[47946,1],1]
Exit code: 127
--------------------------------------------------------------------------
This node has > 180GB of available RAM. I checked the memory use with the top command:
  PID USER     PR  NI   VIRT   RES   SHR S  %CPU %MEM    TIME+ COMMAND
89681 my_user  20   0  30.1g  3.6g  2.1g R 100.0  1.9  1:39.45 pw.x
89682 my_user  20   0  29.8g  3.2g  2.0g R 100.0  1.7  1:39.30 pw.x
I noticed that the process stops when RES memory reaches 4GB. These are the characteristics of the node:
(base) [my_user#gpu001]$ numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 28 29 30 31 32 33 34 35 36 37 38 39 40 41
node 0 size: 95313 MB
node 0 free: 41972 MB
node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55
node 1 size: 96746 MB
node 1 free: 70751 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10
(base) [my_user#gpu001]$ free -lm
              total        used        free      shared  buff/cache   available
Mem:         192059        2561      112716         260       76781      188505
Low:         192059       79342      112716
High:             0           0           0
Swap:          8191           0        8191
The version of MPI is:
mpirun (Open MPI) 3.1.5
This is a compute node in a cluster, but the error is the same whether I submit the job with SLURM or run it directly on the node.
Note that I compile on the login node and run on this GPU node; the difference is that the login node has no GPU attached.
I would really appreciate it if you could help me figure out what could be going on.
Thank you in advance!

Reliability or Robustness Testing a Web Server

I have just built a proof of concept for an asp.net MVC controller to (1) generate a barcode from user input using barcode rendering framework and (2) embed it in a PDF document using wkhtmltopdf.exe
Before telling my client it's a working solution, I want to know it's not going to bring down their website. My main concern is long-term reliability -- whether for instance creating and disposing the unmanaged system process for wkhtmltopdf.exe might leak something. (Peak performance and load is not expected to be such an issue - only a few requests per minute at peak).
So, I run a couple of tests from the Windows command line:
(1) 1,000 Requests in Sequence (ie 1 at a time)
for /l %i in (1,1,1000) do curl ^
"http://localhost:8003/Home/Test?text=Iteration_%i___012345&scale=2&height=50" -o output.pdf
(2) Up to 40 requests sent within 2 seconds
for /l %i in (1,1,40) do start "Curl %i" curl ^
"http://localhost:8003/Home/Test?text=Iteration_%i___012345&scale=2&height=50" -o output%i.pdf
And I record some performance counters in perfmon before, during & after. Specifically I look at total processes, threads, handles, memory, disk use on the machine and on the IIS process specifically.
So, my questions:
1) What would you consider acceptable evidence that the solution looks to be at low risk of bringing down the server? Would you amend what I've done, or would you do something completely different?
2) Given my concern is reliability, I think that the 'Before' vs 'After' figures are the ones I most care about. Agree or not?
3) Looking at the Before vs After figures, the only concern I see is the 'Processes Total Handle Count'. I conclude that launching wkhtmltopdf.exe nearly a thousand times has probably not leaked anything or destabilised the machine. But I might be wrong and should run the same tests for hours or days to reduce the level of doubt. Agree or not?
(The risk level: A couple of people's jobs might be on the line if it went pear-shaped. Revenue on the site is £1,000s per hour).
My perfmon results were as follows.
700 Requests in Sequence
                                               1-5 Mins              10 Mins
Counter                         Before Test    Min    Ave    Max     After Test
System
  System Processes              95             97     100    101     95
  System Threads                1220           1245   1264   1281    1238
  Memory Available MB           4888           4840   4850   4868    4837
  Memory % Committed            23             24     24     24      24
  Processes Total Handle Count  33255          34147  34489  34775   34029
  Processor % Processor Time    4 to 30        40     57     78      1 to 30
  Physical Disk % Disk Time     1              0      7      75      0 to 30
IIS Express
  % Processor Time              0              0      2      6       0
  Handle Count                  610            595    640    690     614
  Thread Count                  34             35     35     35      35
  Working Set                   138            139    139    139     139
  IO Data KB/sec                0              453    491    691     0
20 Requests sent within 2 seconds followed by 40 Requests sent within 3 seconds
                                               1-5 Mins              10 Mins
Counter                         Before Test    Min    Ave    Max     After Test
System
  System Processes              95             98     137    257     96
  System Threads                1238           1251   1425   1913    1240
  Memory Available MB           4837           4309   4694   4818    4811
  Memory % Committed            24             24     25     29      24
  Processes Total Handle Count  34029          34953  38539  52140   34800
  Processor % Processor Time    1 to 30        1      48     100     1 to 10
  Physical Disk % Disk Time     0 to 30        0      7      136     0 to 10
IIS Express
  % Processor Time              0              0      1      29      0
  Handle Count                  610            664    818    936     834
  Thread Count                  34             37     50     68      37
  Working Set                   138            139    142    157     141
  IO Data KB/sec                0              0      186    2559    0

Having issues with adding two positions of array

A question about adding array elements. I have the code below:
B[row][col] = B[row+1][col+1] + B[row][col+1];
Let's say row = 2 and col = 3. I don't quite understand what happens here. There is the assignment operator (=), so I'm guessing it assigns whatever is on the right-hand side, but I don't know how to evaluate it. In this example the right-hand side comes out as 13 for me, but that doesn't make sense. Would the value 13 be assigned to B[row][col]? The tracing program shows 2. I don't understand, please help!
I'm not entirely sure what it is you're asking but essentially you have a 2D array and the B[row][col] syntax is to access a specific "cell" within the 2D array. Think of it like a grid. So what you're doing with the assignment operator is taking the values in cells B[row+1][col+1] and B[row][col+1], adding them together, and assigning that resulting value to the cell B[row][col]. Does that make sense? Also it'll be good to make sure you don't get any index out of bounds exceptions doing this.
This does somewhat depend on the tool/language you are using. For instance, MATLAB starts indexing arrays at 1, so the first element of an array a is a(1), while languages like C/Java start indexing at 0, so the first element of an array a is a[0].
Let's assume that indexing is done like in C/Java, then consider a multidimensional array B
12 13 14 11
41 17 23 22
18 10 20 38
81 17 32 61
Then with row = 2 and col = 3 you will have that B[row][col] as the element that sits on the third row (remembering indexing starts at 0, so B[2] is the third row) and fourth column, marked here between * signs.
12 13 14 11
41 17 23 22
18 10 20 *38*
81 17 32 61
As for changing a value in the multidimensional array, it is done by assigning a new value to the index of the old value.
B[row][col] = B[row+1][col+1] + B[row][col+1];
With row=1 and col=0 we get
B[1][0] = B[2][1] + B[1][1];
B[1][0] = 10 + 17;
B[1][0] = 27;
Or:
 12  13  14  11          12  13  14  11
(41) 17  23  22         (27) 17  23  22
 18  10  20  38   ==>    18  10  20  38
 81  17  32  61          81  17  32  61

How to show one total row per CG Group in a report

I am using Axapta 3.0 with X++.
I am building a new report based on an existing one. The new report should show only one total row per CG Group instead of all the detail rows shown in the old report.
Example: existing report
CG Code    Amount   Current   1-30days   31-60days   61-180days   >180days
1.1        50       10        100        30          10           5
1.1        30       20        60         35          20           20
1.1        20       30        80         7           80           60
1.2        30       50        50         1           100          80
1.2        40       70        90         5           75           15
2.3        100      20        20         150         20           30
3.1        60       10        10         80          10           4
3.1        30       60        5          100         5            60
Sample of the new report:
CG Code    Amount   Current   1-30days   31-60days   61-180days   >180days
1.1 Total  100      60        240        92          110          85
1.2 Total  70       120       140        6           175          95
2.3 Total  100      20        20         150         0            30
3.1 Total  90       70        15         180         15           64
The code of the existing report contains this select statement:
select AmountMST from CustTransOpen where
custTransOpen.AccountNum == CustTable.AccountNum
&& custTransOpen.TransDate <= balanceAs
&& CustTransOpen.TransDate >= compareDate1
&& CustTransOpen.TransDate <= compareDate2
I have created a view named SKV_CustAging3 that gets data from two tables (CustTransOpen, CustTable), and I have also written a select statement that groups by CG Group:
select sum(AmountMST),StatisticsGroup from SKV_CustAging3
group by StatisticsGroup
where SKV_CustAging3.TransDate <= balanceAs
&& SKV_CustAging3.TransDate < compareDate1;
I also tried using a "Section Group" to total the amount for every CG Group, but the report still shows the detail records, with the total amount at the end of each section group.
What I want is a single total row per CG Group, as in the sample new report above.
Is there any way to show only one summed record for every CG?
Please help me. I am new to building reports like this, so I don't have much experience with Axapta. Thanks.
Try overriding the send() and fetch() methods of your report. Axapta calls fetch() to fetch the records that will be printed, and it calls send() for each row printed in the report. The Axapta developer's guide contains detailed information on these methods.
Override the fetch() method and select all the necessary data in it.
Use an instance of the Map class to group the data.
Call the send() method to print the data.
Search the AOT for "send" and "fetch" to find more examples.

cursor loop and continue statement : unexpected behaviour

I might be overlooking something due to deadline stress. But this behaviour amazes me.
It looks as if the cursor caches 100 rows and the continue statement flushes the cache
and begins with the first record of a new cache fetch.
I narrowed it down to the following script:
drop table test1;
create table test1 (test1_id NUMBER);
begin
  for i in 1..300
  loop
    insert into test1 values (i);
  end loop;
end;
/
declare
  cursor c_test1 is
    select *
    from test1;
begin
  for c in c_test1
  loop
    if mod(c.test1_id,10) = 0
    then
      dbms_output.put_line(c_test1%ROWCOUNT||' '||c.test1_id||' Continue');
      continue;
    end if;
    dbms_output.put_line(c_test1%ROWCOUNT||' '||c.test1_id||' Process');
  end loop;
end;
/
1 1 Process
2 2 Process
3 3 Process
4 4 Process
5 5 Process
6 6 Process
7 7 Process
8 8 Process
9 9 Process
10 10 Continue **Where are test1_id's 11 to 100?**
11 101 Process
12 102 Process
13 103 Process
14 104 Process
15 105 Process
16 106 Process
17 107 Process
18 108 Process
19 109 Process
20 110 Continue **Where are test1_id's 111 to 200?**
21 201 Process
22 202 Process
23 203 Process
24 204 Process
25 205 Process
26 206 Process
27 207 Process
28 208 Process
29 209 Process
30 210 Continue **Where are test1_id's 211 to 300?**
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - Production
PL/SQL Release 11.1.0.7.0 - Production
redhat release 5
2 node RAC
It's a bug: 7306422
Pawel Barut wrote:
http://pbarut.blogspot.com/2009/04/caution-for-loop-and-continue-in-oracle.html
Workaround :
SQL> ALTER SESSION SET PLSQL_OPTIMIZE_LEVEL = 1;
Regards,
Rob
