Why does aio_write() act wrong? (asynchronous I/O)

I want to write 2 files using aio_write.
I use a 32 KB buffer and repeat aio_write 2048 times per file (so each file should be 64 MB).
However, the resulting size is not 64 MB but 64 MB + 32 KB.
Also, sometimes the file is filled with garbage.
I want to fill the file with 'A' characters.
Please help me.
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <string.h>
#include <errno.h>
#include <stdlib.h>
#include <aio.h>
#include <fcntl.h>
#include <siginfo.h>
#define TNAME "testAio.c"
#define BUFFER_SIZE (32 * 1024) //(32 * 1024 * 1024)
#define FILE_COUNT 2
#define FILE_PATH 256
#define FALSE 0
#define TRUE 1
int main()
{
    char          sTmpFileName[FILE_COUNT][FILE_PATH];
    char         *sBuf;
    char         *sAlignedBuf;
    int           sFd[FILE_COUNT];
    struct aiocb  sAiocb[FILE_COUNT];
    int           sError;
    int           sRet;
    int           i;
    int           j;
    int           sWritten[FILE_COUNT];
    int           sWrittenSize;
    int           sWrittenCnt;
    int           sFrequence = 2048;

    sBuf        = (char*) malloc( BUFFER_SIZE + 512 );
    sAlignedBuf = (char*)( ((long)sBuf) + (512 - ((long)sBuf) % 512) );
    memset( sAlignedBuf, 0x41, BUFFER_SIZE );

    for( i = 0; i < FILE_COUNT; i++ )
    {
        memset( &sAiocb[i], 0, sizeof(struct aiocb) );
        sAiocb[i].aio_buf    = sAlignedBuf;
        sAiocb[i].aio_nbytes = BUFFER_SIZE;

        snprintf( sTmpFileName[i], FILE_PATH, "testAio_%d", i );
        unlink( sTmpFileName[i] );
        sFd[i] = open( sTmpFileName[i],
                       O_CREAT | O_RDWR | O_EXCL | O_DIRECT | O_LARGEFILE,
                       S_IRUSR | S_IWUSR );
        sAiocb[i].aio_fildes = sFd[i];
        if( sFd[i] == -1 )
        {
            printf( TNAME " Error at open(): %s\n", strerror( errno ) );
            exit(1);
        }
    }

    for( j = 0; j < sFrequence; j++ )
    {
        for( i = 0; i < FILE_COUNT; i++ )
        {
            if( ( sWrittenSize = aio_write( &sAiocb[i] ) ) == -1 )
            {
                printf( TNAME " Error at aio_write(): %s\n", strerror( errno ) );
                close( sFd[i] );
                exit(2);
            }
            sAiocb[i].aio_offset += sAiocb[i].aio_nbytes;
            // printf( "offset %ld\n", sAiocb[i].aio_offset );
        }
    }
    printf( "offset %ld %ld\n", sAiocb[0].aio_offset, sAiocb[1].aio_offset );

    /* Wait until completion */
    i = 0;
    sWritten[0] = FALSE;
    sWritten[1] = FALSE;
    sWrittenCnt = 0;
    while( 1 )
    {
        sError = aio_error( &sAiocb[i] );
        if( sError != EINPROGRESS )
        {
            if( sWritten[i] == FALSE )
            {
                sWrittenCnt++;
                sWritten[i] = TRUE;
            }
        }
        if( sWrittenCnt == FILE_COUNT )
        {
            break;
        }
        i = (i + 1) % FILE_COUNT;
    }

    for( i = 0; i < FILE_COUNT; i++ )
    {
        sError = aio_error( &sAiocb[i] );
        sRet   = aio_return( &sAiocb[i] );
        if( sError != 0 )
        {
            printf( TNAME " Error at aio_error() : %d, %s\n", i, strerror( sError ) );
            close( sFd[i] );
            exit(2);
        }
        if( sRet != BUFFER_SIZE )
        {
            printf( TNAME " Error at aio_return()\n" );
            close( sFd[i] );
            exit(2);
        }
    }

    for( i = 0; i < FILE_COUNT; i++ )
    {
        close( sFd[i] );
    }
    free( sBuf );
    printf( "Test PASSED\n" );
    return 0;
}

Most POSIX implementations enforce severe limits on the number of concurrent asynchronous I/O operations which can be in flight, both system-wide and per process. This limit is 16 on some major implementations. You therefore cannot simply call aio_write 2048 times in sequence; you may only schedule operations up to AIO_LISTIO_MAX, the maximum possible, always checking the error codes for system resource exhaustion before reaching that maximum. Even on NT, which has no hard limits, performance noticeably nosedives past a certain amount of concurrency when FILE_FLAG_NO_BUFFERING is on, especially on older Windows kernels.
Once you have scheduled as many aio_write operations as the system will take, you then need to call aio_suspend on what you've scheduled, retire any ops which complete, and try again to refill the pending I/O queue. If you'd like to see a production example of usage, try https://github.com/ned14/boost.afio/blob/master/include/boost/afio/v2.0/detail/impl/posix/io_service.ipp.
I should emphasise that POSIX aio scales poorly and provides virtually no performance benefit; on Linux or FreeBSD your "asynchronous I/O" is really a thread pool of workers which call the synchronous I/O APIs for you. Virtually no POSIX OS implements much asynchronicity in practice unless O_DIRECT or its equivalent is turned on; it's only really worth bothering with on NT.
As many other posts on Stack Overflow have said, async filesystem I/O is not worth the time or hassle for 99% of users; just use a thread pool calling the synchronous APIs instead. It scales far, far better, is portable across all platforms, and doesn't have daft problems with signals; besides, that's how POSIX aio is implemented anyway, always on Linux, and on FreeBSD whenever O_DIRECT is off.

Thanks for the comment.
Now I've noticed the mistake I made in the code, though I'm not certain about it.
I assume that each aio_write's aio_nbytes must be fully handled before the next request on the same aiocb:
after I call aio_write for a given file (e.g. file1), I have to wait for that call to complete before the next aio_write call for file1.
Is my assumption right?

Related

How could you develop a simple DEALER/ROUTER message flow, using ZeroMQ?

I'm fairly new to TCP messaging (and programming in general) and I am trying to develop a simple ROUTER/DEALER message pair with ZeroMQ but am struggling in getting the router to receive a message from the dealer and send one back.
I can do a simple REQ/REP pattern with no problem, where I can send one message from my machine to my VM.
However, when trying to develop a ROUTER/DEALER pair, I can't seem to get the ROUTER-instance to receive the message (ROUTER on VM, DEALER on main box). I have had some success where I could spam 50 messages in a while(){...} loop, but can't send a single message and have the ROUTER send one back.
So from what I have read, TCP messages in a ROUTER/DEALER pair are sent with a delimiter of 0 at the beginning, and this 0 must be sent first to the ROUTER to register an incoming message.
So I just want to send the message "ROUTER_TEST" to my server, and for my server to respond with "RECEIVED".
DEALER
#include <cstdlib>
#include <iostream>
#include <string.h>
#include <unistd.h>
#include <stdlib.h>
#include <assert.h>
#include <stdio.h>
#include "zmq.h"
const char connection[] = "tcp://10.0.10.76:5555";
int main(void)
{
    int major, minor, patch;
    zmq_version(&major, &minor, &patch);
    printf("\nInstalled ZeroMQ version: %d.%d.%d\n", major, minor, patch);
    printf("Connecting to: %s\n", connection);

    void *context = zmq_ctx_new();
    void *requester = zmq_socket(context, ZMQ_DEALER);
    int zc = zmq_connect(requester, connection);
    std::cout << "zmq_connect = " << zc << std::endl;
    int sm = zmq_socket_monitor(requester, connection, ZMQ_EVENT_ALL);
    std::cout << "zmq_socket_monitor = " << sm << std::endl;

    char messageSend[] = "ROUTER_TEST";
    int request_nbr;
    int n = zmq_send(requester, NULL, 0, ZMQ_DONTWAIT | ZMQ_SNDMORE);
    int ii = 0;
    if (n == 0) {
        std::cout << "n = " << n << std::endl;
        while (ii < 50)
        {
            n = zmq_send(requester, messageSend, 31, ZMQ_DONTWAIT);
            ii++;
        }
    }
    return 0;
}
ROUTER
// SERVER
#include <cstdlib>
#include <iostream>
#include <string.h>
#include <assert.h>
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include "zmq.h"
int main(void)
{
    void *context = zmq_ctx_new();
    void *responder = zmq_socket(context, ZMQ_ROUTER);
    printf("THIS IS WORKING - ROUTER\n");
    int rc = zmq_bind(responder, "tcp://*:5555");
    assert(rc == 0);

    zmq_pollitem_t pollItems[] = {
        { responder, 0, ZMQ_POLLIN, -1 } };
    int sm = zmq_socket_monitor(responder, "tcp://*:5555", ZMQ_EVENT_LISTENING);
    std::cout << "zmq_socket_monitor = " << sm << std::endl;

    uint8_t buffer[15];
    while (1)
    {
        int rc = zmq_recv(responder, buffer, 5, ZMQ_DONTWAIT);
        if (rc == 0)
        {
            std::cout << "zmq_recv = " << rc << std::endl;
            zmq_send(responder, "RECEIVED", 9, 0);
        }
        zmq_poll(pollItems, sizeof(pollItems), -1);
    }
    return 0;
}
Your code, on the DEALER side, calls a series of:
void *requester = zmq_socket( context,
ZMQ_DEALER // <-- .STO <ZMQ_DEALER>, *requester
);
...
int n = zmq_send( requester, // <~~ <~~ <~~ <~~ <~~ <~~ .STO 0, n
NULL, // NULL,sizeof(NULL)== 0
0, // explicitly declared 0
ZMQ_DONTWAIT // _DONTWAIT flag
| ZMQ_SNDMORE //---- 1x ZMQ_SNDMORE flag==
); // 1.Frame in 1st MSG
int ii = 0; // MSG-under-CONSTRUCTION
if ( n == 0 ) // ...until complete, not yet sent
{
std::cout << "PREVIOUS[" << ii << ".] CALL of zmq_send() has returned n = " << n << std::endl;
while ( ii < 50 )
{ ii++;
n = zmq_send( requester, //---------//---- 1x ZMQ_SNDMORE following
messageSend, // // 2.Frame in 1st MSG
31, // // MSG-under-CONSTRUCTION, but
ZMQ_DONTWAIT // // NOW complete & will get sent
); //---------//----49x monoFrame MSGs follow
}
}
...
What happens on the opposite side, in the ROUTER-side code?
...
while (1)
{
int rc = zmq_recv( responder, //----------------- 1st .recv()
buffer,
5,
ZMQ_DONTWAIT
);
if ( rc == 0 )
{
std::cout << "zmq_recv = " << rc << std::endl;
zmq_send( responder, // _____________________ void *socket
"RECEIVED", // _____________________ void *buffer
9, // _____________________________ size_t len
0 // _____________________________ int flags
);
}
zmq_poll( pollItems,
sizeof( pollItems ),
-1 // __________________________________ block ( forever )
);// till ( if ever ) ...?
}
Here, most probably, rc == 0 will occur at most once, if not missed entirely, and never again.
Kindly notice that your code does not detect in any way whether a .recv()-call is also being flagged by ZMQ_RCVMORE - signalling a need to first .recv() all the remaining frames of a multi-frame message, before becoming able to .send() any answer...
An application that processes multi-part messages must use the ZMQ_RCVMORE zmq_getsockopt(3) option after calling zmq_recv() to determine if there are further parts to receive.
Next, the buffer and messageSend message-"payloads" are fragile entities and ought to be re-composed each time ( for details, best read again how to carefully initialise, work with and safely touch any zmq_msg_t object(s) ), as after a successful .send()/.recv() the low-level API ( since 2.11.x+ ) considers them disposed of, not re-usable. Also note that messageSend is not (as imperatively put into the code) a 31-char[]-long buffer, is it? Was there any particular intention in doing this?
The zmq_send() function shall return number of bytes in the message if successful. Otherwise it shall return -1 and set errno to one of the values defined below. { EAGAIN, ENOTSUP, EINVAL, EFSM, ETERM, ENOTSOCK, EINTR, EHOSTUNREACH }
Not testing the error state means knowing nothing about the actual state ( see EFSM and the other potential trouble explainers ) of the REQ/REP and DEALER/ROUTER (extended) .send()/.recv()/.send()/.recv()/... mandatory dFSA order of steps.
"So from what I have read, a TCP message in a ROUTER/DEALER pair are sent with a delimiter of 0 at the beginning, and this 0 must be sent first to the ROUTER to register an incoming message."
This seems to be a misunderstood part. The app side is free to compose any number of mono-frame or multi-frame messages, yet the "trick" of the ROUTER-prepended identity frame is performed without the user's assistance ( this message labelling is performed automatically, before any ( now, principally all ) multi-frame(d) messages get delivered to the app side via the receiver's .recv() method ). Due handling of multi-frame messages was noted above.

qDebug function char FirstDriveFromMask( ULONG unitmask )

Trying to make a project in Qt, I need to detect any new USB device and return its drive letter in my main.cpp.
I found this code with Google and it should work, but I don't know how to print the drive letter in my main.cpp with a simple qDebug() by calling the function char FirstDriveFromMask(ULONG unitmask).
Could you help me?
void Main_OnDeviceChange( HWND hwnd, WPARAM wParam, LPARAM lParam )
{
    PDEV_BROADCAST_HDR lpdb = (PDEV_BROADCAST_HDR)lParam;
    TCHAR szMsg[80];

    switch( wParam )
    {
    case DBT_DEVICEARRIVAL:
        // Check whether a CD or DVD was inserted into a drive.
        if (lpdb->dbch_devicetype == DBT_DEVTYP_VOLUME)
        {
            PDEV_BROADCAST_VOLUME lpdbv = (PDEV_BROADCAST_VOLUME)lpdb;
            if (lpdbv->dbcv_flags & DBTF_MEDIA)
            {
                StringCchPrintf( szMsg, sizeof(szMsg)/sizeof(szMsg[0]),
                                 TEXT("Drive %c: Media has arrived.\n"),
                                 FirstDriveFromMask(lpdbv->dbcv_unitmask) );
                MessageBox( hwnd, szMsg, TEXT("WM_DEVICECHANGE"), MB_OK );
            }
        }
        break;

    case DBT_DEVICEREMOVECOMPLETE:
        // Check whether a CD or DVD was removed from a drive.
        if (lpdb->dbch_devicetype == DBT_DEVTYP_VOLUME)
        {
            PDEV_BROADCAST_VOLUME lpdbv = (PDEV_BROADCAST_VOLUME)lpdb;
            if (lpdbv->dbcv_flags & DBTF_MEDIA)
            {
                StringCchPrintf( szMsg, sizeof(szMsg)/sizeof(szMsg[0]),
                                 TEXT("Drive %c: Media was removed.\n"),
                                 FirstDriveFromMask(lpdbv->dbcv_unitmask) );
                MessageBox( hwnd, szMsg, TEXT("WM_DEVICECHANGE"), MB_OK );
            }
        }
        break;

    default:
        /*
           Process other WM_DEVICECHANGE notifications for other
           devices or reasons.
        */
        ;
    }
}
/*------------------------------------------------------------------
FirstDriveFromMask( unitmask )
Description
Finds the first valid drive letter from a mask of drive letters.
The mask must be in the format bit 0 = A, bit 1 = B, bit 2 = C,
and so on. A valid drive letter is defined when the
corresponding bit is set to 1.
Returns the first drive letter that was found.
--------------------------------------------------------------------*/
char FirstDriveFromMask( ULONG unitmask )
{
    char i;
    for (i = 0; i < 26; ++i)
    {
        if (unitmask & 0x1)
            break;
        unitmask = unitmask >> 1;
    }
    return( i + 'A' );
}
Either this:
#include <QDebug>
///
qDebug() <<
"Drive" << FirstDriveFromMask(lpdbv ->dbcv_unitmask) << ": Media has arrived";
or with a bit better formatting
qDebug() <<
QString("Drive %1: Media has arrived").arg(FirstDriveFromMask(lpdbv ->dbcv_unitmask));
And if that output goes to the default debug console rather than the Windows shell, you have to follow the answer to Qt qDebug() doesn't work in Windows shell and make a small change in the project's .pro file:
CONFIG += console

Array access is invalid in MQL5 error

I am trying to access the arrays delivered via the call signature of the system-invoked OnCalculate() event handler.
This is the way it is written:
int OnCalculate(const int rates_total,
const int prev_calculated,
const datetime &time[],
const double &open[],
const double &high[],
const double &low[],
const double &close[],
const long &tick_volume[],
const long &volume[],
const int &spread[]
)
{
/* The rest code is written here
...
*/
}
I am trying to merge the code with the OpenCL functions so that the program uses the GPU for the heavy calculations. The issue is that when I try to pass the values from OnCalculate() to the kernel for execution, I get an error. The following code is written inside OnCalculate():
CLSetKernelArg( cl_krn, 0, start );
CLSetKernelArg( cl_krn, 1, rates_total );
CLSetKernelArg( cl_krn, 2, time );
CLSetKernelArg( cl_krn, 3, high );
CLSetKernelArg( cl_krn, 4, low );
Getting the following error:
'time' - invalid array access ADX.mq5 285 31
'high' - invalid array access ADX.mq5 286 31
'low' - invalid array access ADX.mq5 287 31
I don't know why this problem is happening. I am not able to pass the arrays from OnCalculate().
Kindly help me - what can I do?
It is impossible to just reference an MQL5 array[] object here.
OpenCL starts a completely new code-execution eco-system, and the MQL5-side data has to get "transferred" correctly there and back...
Using a trivial mock-up GPU kernel that doubles the array it receives:
const string // by default some GPU doesn't support doubles
cl_SOURCE = "#pragma OPENCL EXTENSION cl_khr_fp64 : enable \r\n" // cl_khr_fp64 directive is used to enable work with doubles
" \r\n"
"__kernel void Test_GPU( __global double *data, \r\n" // [0]____GPU-kernel-side_CALL-SIGNATURE
" const int N, \r\n" // [1]____GPU-kernel-side_CALL-SIGNATURE
" const int N_arrays \r\n" // [2]____GPU-kernel-side_CALL-SIGNATURE
" ) \r\n"
"{ \r\n"
" uint kernel_index = get_global_id( 0 ); \r\n"
" if ( kernel_index > N_arrays ) return; \r\n"
" \r\n"
" uint local_start_offset = kernel_index * N; \r\n"
" for ( int i = 0; i < N; i++ ) \r\n"
" data[i+local_start_offset] *= 2.0; \r\n"
"} \r\n";
// AFTER FIRST TESTING THE OpenCL DEVICES & THEIR CAPABILITIES ... ( see prev. posts )
#define ARRAY_SIZE 100 // size of the array
#define TOTAL_ARRAYS 5 // total arrays
// ONE CAN:
//--- SET OpenCL-specific handles' holders
int cl_CONTEXT, // an OpenCL-Context handle
cl_PROGRAM, // an OpenCL-Program handle
cl_KERNEL, // an OpenCL Device-Kernel handle
cl_BUFFER; // an OpenCL-buffer handle
uint cl_offset[] = { 0 }; //--- prepare CLExecute() params
uint cl_work[] = { TOTAL_ARRAYS }; //--- global work size
double DataArray2[]; //--- global mapping-object for data aimed to reach the GPU
ArrayResize( DataArray2, //--- size it to fit data in
ARRAY_SIZE * TOTAL_ARRAYS
);
for ( int j = 0; j < TOTAL_ARRAYS; j++ ) //--- fill mapped-arrays with data
{ uint local_offset = j * ARRAY_SIZE; //--- set local start offset for j-th array
for ( int i = 0; i < ARRAY_SIZE; i++ ) //--- for j-th array
DataArray2[i+local_offset] = MathCos(i+j); //--- fill array with some data
}
The principal structure of MQL5 / OpenCL setup is similar to this:
//--- INIT OpenCL
if ( INVALID_HANDLE == ( cl_CONTEXT = CLContextCreate() ) )
{  Print( "EXC: CLContextCreate() error = ", GetLastError() );
   return( 1 ); // ---------------^ EXC/RET
}
//--- NEXT create OpenCL program
if ( INVALID_HANDLE == ( cl_PROGRAM = CLProgramCreate( cl_CONTEXT,
                                                       cl_SOURCE
                                                       )
                         )
     )
{  Print( "EXC: CLProgramCreate() error = ", GetLastError() );
   CLContextFree( cl_CONTEXT );
   return( 1 ); // ----------------^ EXC/RET
}
//--- NEXT create OpenCL kernel
if ( INVALID_HANDLE == ( cl_KERNEL = CLKernelCreate( cl_PROGRAM,
                                                     "Test_GPU"
                                                     )
                         )
     )
{  Print( "EXC: CLKernelCreate() error = ", GetLastError() );
   CLProgramFree( cl_PROGRAM );
   CLContextFree( cl_CONTEXT );
   return( 1 ); // --------------^ EXC/RET
}
//--- TRY: create an OpenCL cl_BUFFER object mapping
if ( INVALID_HANDLE == ( cl_BUFFER = CLBufferCreate( cl_CONTEXT,
                                                     (uint)( ARRAY_SIZE * TOTAL_ARRAYS * sizeof( double ) ),
                                                     CL_MEM_READ_WRITE
                                                     )
                         )
     )
{  Print( "EXC: CLBufferCreate() error = ", GetLastError() );
   CLKernelFree( cl_KERNEL );
   CLProgramFree( cl_PROGRAM );
   CLContextFree( cl_CONTEXT );
   return( 1 ); // ----------------^ EXC/RET
}
//--- NEXT: set OpenCL cl_KERNEL GPU-side-kernel call-parameters
CLSetKernelArgMem( cl_KERNEL, 0, cl_BUFFER ); // [0]____GPU-kernel-side_CALL-SIGNATURE
CLSetKernelArg( cl_KERNEL, 1, ARRAY_SIZE ); // [1]____GPU-kernel-side_CALL-SIGNATURE
CLSetKernelArg( cl_KERNEL, 2, TOTAL_ARRAYS ); // [2]____GPU-kernel-side_CALL-SIGNATURE
//--- NEXT: write data into to OpenCL cl_BUFFER mapping-object
CLBufferWrite( cl_BUFFER,
DataArray2
);
//--- MAY execute OpenCL kernel
CLExecute( cl_KERNEL, 1, cl_offset, cl_work );
//--- MAY read data back, from OpenCL cl_BUFFER mapping-object
CLBufferRead( cl_BUFFER, DataArray2 );
CLBufferFree( cl_BUFFER ); //--- FINALLY free OpenCL buffer cl_BUFFER mapping-object
CLKernelFree( cl_KERNEL ); //--- FINALLY free OpenCL kernel object
CLProgramFree( cl_PROGRAM ); //--- FINALLY free OpenCL programme object / handle
CLContextFree( cl_CONTEXT ); //--- FINALLY free OpenCL cl_CONTEXT object / handle

fopen auto complete changing directory

I'm working on a C program in Debian, and I need to access a directory whose name has numbers at the end that occasionally change. When accessing it from the command prompt I can tab-complete or use *; how can I do this from a C program using fopen or some other method?
pwm = fopen("/sys/devices/ocp.3/pwm_test_P8_19.15/duty // this is the changing directory
pwm = fopen("/sys/devices/ocp.3/pwm_t*/duty // this did not work
using stdio.h, stdlib.h, unistd.h
int k = 0;
char pwm_path[100];
FILE *pwm = NULL;

for (k = 14; k < 20; k++)
{
    sprintf( pwm_path, "/sys/devices/ocp.3/pwm_test_P8_19.%d/period", k );
    puts(pwm_path);                     // debug
    if (access( pwm_path, F_OK ) == 0)  // if the path exists, access() returns 0
    {
        //printf("File does exist, %d\n", k); // debug
        pwm = fopen( pwm_path, "w" );
        fseek(pwm, 0, SEEK_SET);
        fprintf(pwm, "20000000");       // pulse period in uS
        fflush(pwm);                    // flush buffered output
        break;                          // break out of the loop once found
    }
}

Segmentation fault while using MPI_File_open

I'm trying to read from a file in an MPI application. The cluster has 4 nodes with 12 cores in each node. I have tried running a basic program to compute ranks and that works. When I added MPI_File_open, it throws an exception at runtime:
BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES = EXIT CODE: 139
The cluster has MPICH2 installed and uses a Network File System. I have tried MPI_File_open with different parameters, like read-only mode, MPI_COMM_WORLD, etc.
Can I use MPI_File_open with a Network File System?
#include <stdio.h>
#include <unistd.h>
#include <mpi.h>

int main(int argc, char* argv[])
{
    int myrank = 0;
    int nprocs = 0;
    int i = 0;
    MPI_Comm icomm = MPI_COMM_WORLD;
    MPI_Status status;
    MPI_Info info;
    MPI_File *fh = NULL;
    int error = 0;

    MPI_Init(&argc, &argv);
    MPI_Barrier(MPI_COMM_WORLD);            // Wait for all processes to start
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs); // Get number of processes
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank); // Get own rank
    usleep(myrank * 100000);
    if ( myrank == 1 || myrank == 0 )
        printf("Hello from %d\r\n", myrank);

    if (myrank == 0)
    {
        error = MPI_File_open( MPI_COMM_SELF, "lw1.wei", MPI_MODE_UNIQUE_OPEN,
                               MPI_INFO_NULL, fh );
        if ( error )
        {
            printf("Error in opening file\r\n");
        }
        else
        {
            printf("File successfully opened\r\n");
        }
        MPI_File_close(fh);
    }
    MPI_Barrier(MPI_COMM_WORLD); //! Wait for all the processes to end
    MPI_Finalize();
    if ( myrank == 0 )
    {
        printf("Number of Processes %d\n\r", nprocs);
    }
    return 0;
}
You forgot to allocate an MPI_File object before opening the file. You may either change this line:
MPI_File *fh = NULL;
into:
MPI_File fh;
and open the file by passing fh's address to MPI_File_open(..., &fh). Or you may simply allocate the memory from the heap using malloc():
MPI_File *fh = malloc(sizeof(MPI_File));

Resources