MPI non-blocking send and receive ordering

On a 2-core system, for the following code
if(rank == 0)
{
    MPI_Isend(A) // to rank 1
    // Do something else.
    MPI_Isend(B) // to rank 1
    // Do something else.
    MPI_Wait(B is sent)
    MPI_Wait(A is sent)
}
else
{
    MPI_Irecv(buffer1) // Listen to rank 0
    // Do something else.
    MPI_Irecv(buffer2) // Listen to rank 0
    // Do something else.
    MPI_Wait(buffer2 is finished receiving)
    MPI_Wait(buffer1 is finished receiving)
}
Would rank 1 be guaranteed to receive A in buffer 1, and B in buffer 2?
Thank you!

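MPI messages are "non-overtaking": two messages from the same source to the same destination on the same communicator, both of which could match the same receive, cannot be matched in a different order from the one in which they were sent (blocking case) or initiated (non-blocking case). So here A ends up in buffer1 and B in buffer2, regardless of the order of the MPI_Wait calls. Of course, you can always set your mind at ease by specifying different tags.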
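To make the matching rule concrete, here is a toy model in plain C. It is not MPI code and makes no MPI calls; toy_msg, toy_match and ANY_TAG are made-up names for illustration, and the model assumes a single sender and a single communicator. Each posted receive simply claims the earliest not-yet-matched message whose tag fits.

```c
#include <assert.h>
#include <stddef.h>

#define ANY_TAG (-1)  /* hypothetical stand-in for MPI_ANY_TAG */

/* One message from a single sender on a single communicator. */
typedef struct {
    int tag;      /* tag given to the send */
    int payload;  /* stand-in for the message contents */
    int matched;  /* set once a receive has claimed the message */
} toy_msg;

/* sent[] is in initiation order; recv_tags[] is in posting order.
 * Each receive claims the EARLIEST unmatched message whose tag fits,
 * which models the non-overtaking rule. Returns the number of matches. */
static size_t toy_match(toy_msg *sent, size_t nsent,
                        const int *recv_tags, int *recv_buf, size_t nrecv)
{
    size_t matched = 0;
    assert(sent != NULL && recv_tags != NULL && recv_buf != NULL);
    for (size_t r = 0; r < nrecv; r++) {
        for (size_t s = 0; s < nsent; s++) {
            int fits = (recv_tags[r] == ANY_TAG) || (recv_tags[r] == sent[s].tag);
            if (!sent[s].matched && fits) {
                sent[s].matched = 1;
                recv_buf[r] = sent[s].payload;
                matched++;
                break;
            }
        }
    }
    return matched;
}
```

With two sends that share a tag, the first posted receive gets the first message and the second gets the second, which is the situation in the question. With distinct tags, the tag decides the pairing even if the receives are posted in the opposite order.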

Related

Lwip w/ Raw Api - Unefficiency Issue

I'm working with lwIP on an STM32H7 acting as a TCP client; I also tried an STM32F4. According to my design, I need to transmit a packet to the host continuously with a 1 ms period. The host also sends a packet with a 2 ms period. So I built the software design around these requirements.
But my code does not do the job on time. For example, my packet size is 40 bytes. Sometimes it sends 400 bytes to the host, sometimes 200 bytes, even if I use tcp_output(). Sometimes it waits 500-1000 ms without doing anything. When it does not send on time, the queue length is exceeded and I get ERR_MEM. But all I need here is to send 40 bytes of data on time. By the way, I measured the transmission time: it takes only 3-4 us at 100 Mbps. I also tried 10 Mbps, where it takes 40-50 us, but the result did not change.
I have the same trouble on the receive side. I get the packet in the recv callback function, but it also misses most packets if I send periodically from the host. Sometimes it receives on time, sometimes it does nothing.
I don't think there is a problem in my transmit function and receive callback, they are very simple, but I want to share them here.
uint8_t tcpClientTransmit(const unsigned char *p_data, uint8_t len)
{
    err_t err_ret;
    err_ret = tcp_write(m_tcpPcbClient, p_data, len, 0); // Queues up data to be sent.
    if(err_ret == ERR_OK)
    {
        if(tcp_output(m_tcpPcbClient) != ERR_OK) // Force to sent.
        {
            return 1;
        }
        else
        {
            return 0;
        }
    }
    else
    {
        return 1; //error occured
    }
}
uint8_t tcpReceivedCallback(void *arg, struct tcp_pcb *tpcb, struct pbuf *p, err_t err)
{
    if(p == NULL)
    {
        g_ethernetConnected = false;
        tcpClientDisconnect();
    }
    else
    {
        tcp_recved(tpcb, p->tot_len);
        pbuf_free(p);
        parseProcess((unsigned char*)p->payload);
    }
    return err_ret;
}
I've been stuck on this for days. I've seen various similar questions, but I could not find a solution in them, or they do not solve my problem. Thanks a lot.

i want to check timeout response

In general TCP communication there is usually a way to check whether the received data arrives within a certain time, but CAPL has no API for this. So I want to add this logic myself (this code is written in a network module, not a test module).
To explain the code below: if TcpReceive reports no error, the data is read into gTcpRxBuffer.
I want to add time-related logic to this part.
long TcpRecv( dword socket)
{
    int result = 0;
    result = TcpReceive( socket, gTcpRxBuffer, elcount( gTcpRxBuffer));
    if ( 0 != result)
    {
        gIpLastErr = IpGetLastSocketError( socket);
        if ( WSA_IO_PENDING != gIpLastErr)
        {
            IpGetLastSocketErrorAsString( socket, gIpLastErrStr, elcount( gIpLastErrStr));
            writelineex( 0, 2, "TcpReceive error (%d): %s", gIpLastErr, gIpLastErrStr);
        }
    }
    else
    {
        sysGetVariableString(sysvar::TCPIP::TcpData,gTcpRxBuffer,elcount(gTcpRxBuffer));
    }
    return result;
}

Begin Transmission and Receiving Byte using I2C, PSOC

I'm new to the PSoC board and I'm trying to read the x, y, z values from a digital compass, but I'm having a problem beginning the transmission with the compass itself.
I found an Arduino tutorial online here, but since PSoC doesn't have that library I can't duplicate the code.
I was also reading the HMC5883L datasheet here. I'm supposed to write bytes to the compass and then obtain the values, but I was unable to receive anything. All the values I receive are zero, which might be caused by reading from the wrong address.
Hoping for your answer soon.

PSoC is sorta tricky when you are first starting out with it. You need to carefully read the documentation of both the device you want to talk to and the I2C module itself.
The datasheet for the device you linked states this on page 18:
All bus transactions begin with the master device issuing the start sequence followed by the slave address byte. The address byte contains the slave address; the upper 7 bits (bits7-1), and the Least Significant bit (LSb). The LSb of the address byte designates if the operation is a read (LSb=1) or a write (LSb=0). At the 9th clock pulse, the receiving slave device will issue the ACK (or NACK). Following these bus events, the master will send data bytes for a write operation, or the slave will clock out data with a read operation. All bus transactions are terminated with the master issuing a stop sequence.
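As a small illustration of the address byte described in that passage, here is a sketch in C. The function name i2c_address_byte is made up for this example; the 7-bit address 0x1E is the one given in the HMC5883L datasheet, which is why the datasheet's examples show 0x3C for writes and 0x3D for reads.

```c
#include <stdint.h>

#define HMC5883L_ADDR_7BIT 0x1Eu  /* 7-bit slave address from the HMC5883L datasheet */

/* Build the byte sent right after the start condition:
 * the 7-bit address goes in bits 7-1, the R/W flag in bit 0
 * (1 = read, 0 = write). Hypothetical helper, not a PSoC API. */
static uint8_t i2c_address_byte(uint8_t addr7, int is_read)
{
    return (uint8_t)(((addr7 & 0x7Fu) << 1) | (is_read ? 1u : 0u));
}
```

For example, i2c_address_byte(HMC5883L_ADDR_7BIT, 0) gives 0x3C and i2c_address_byte(HMC5883L_ADDR_7BIT, 1) gives 0x3D.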
If you use the I2C_MasterWriteBuf function, it wraps all that stuff the HMC's datasheet states above: the start command, dealing with that ACK, the data handling, etc. The only thing you need to specify is how to transmit it.
If you refer to PSoC's I2C module datasheet, the MasterWriteBuf function takes in the device address, a pointer to the data you want to send, how many bytes you want to send, and a "mode". The docs list the various transfer modes:
I2C_MODE_COMPLETE_XFER Perform complete transfer from Start to Stop.
I2C_MODE_REPEAT_START Send Repeat Start instead of Start.
I2C_MODE_NO_STOP Execute transfer without a Stop
The I2C_MODE_COMPLETE_XFER transfer will send the start and stop command for you if I'm not mistaken.
You can also "bit-bang" this if you want by calling I2C_MasterSendStart, WriteByte, SendStop, etc. directly. But it's just easier to call their WriteBuf functions.
Pretty much you need to write your code as follows:
// fill in your data or pass in the buffer of data you want to write
// if this is contained in a function call. I'm basing this off of HMC's docs.
// The leading 0x3C/0x3D in the HMC examples is the address byte (7-bit
// address 0x1E plus the R/W bit). I2C_MasterWriteBuf builds that byte from
// its slave address argument, so it is not repeated in the buffer.
// HMC_SLAVE_ADDRESS is assumed to be defined as the 7-bit address, 0x1E.
uint8 writeBuffer[2];
uint8 readBuffer[6];
writeBuffer[0] = 0x00; // Configuration Register A
writeBuffer[1] = 0x70; // 8-average, 15 Hz, normal measurement
I2C_MasterWriteBuf(HMC_SLAVE_ADDRESS, writeBuffer, 2, I2C_MODE_COMPLETE_XFER);
while((I2C_MasterStatus() & I2C_MSTAT_WR_CMPLT) == 0u)
{
    // wait for operation to finish
}
writeBuffer[0] = 0x01; // Configuration Register B
writeBuffer[1] = 0xA0; // gain = 5
I2C_MasterWriteBuf(HMC_SLAVE_ADDRESS, writeBuffer, 2, I2C_MODE_COMPLETE_XFER);
// wait for operation to finish
writeBuffer[0] = 0x02; // Mode Register
writeBuffer[1] = 0x00; // continuous-measurement mode
I2C_MasterWriteBuf(HMC_SLAVE_ADDRESS, writeBuffer, 2, I2C_MODE_COMPLETE_XFER);
// wait for operation to finish
CyDelay(6); // docs state 6ms delay before you can start looping around to read
for(;;)
{
    writeBuffer[0] = 0x03; // point at the first data output register
    I2C_MasterWriteBuf(HMC_SLAVE_ADDRESS, writeBuffer, 1, I2C_MODE_COMPLETE_XFER);
    // wait for operation to finish
    // Docs don't state any different sort of bus transactions for reads.
    // I'm assuming it'll be the same as a write
    I2C_MasterReadBuf(HMC_SLAVE_ADDRESS, readBuffer, 6, I2C_MODE_COMPLETE_XFER);
    // wait for operation to finish, wait on I2C_MSTAT_RD_CMPLT instead of WR_CMPLT
    // You should have something in readBuffer to work with
    CyDelay(67); // docs state to wait 67ms before reading again
}
I just sorta wrote that off the top of my head. I have no idea if that'll work or not, but I think that should be a good place to start and try. They have I2C example projects to look at also I think.
Another thing to look at so the WriteBuf function doesn't just seem like some magical command, if you right-click on the MasterWriteBuf function and click on "Find Definition" (after you build the project) it'll show you what it's doing.
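Once the six data bytes are in readBuffer, they still have to be turned into signed axis values. A minimal sketch, assuming the bytes were read starting at data output register 0x03: the HMC5883L datasheet orders them X MSB, X LSB, Z MSB, Z LSB, Y MSB, Y LSB (note Z comes before Y), each pair being a 16-bit two's complement value. The names hmc_axes, be16 and hmc_decode are made up for this example.

```c
#include <stdint.h>

/* Decoded magnetometer sample; hypothetical type for illustration. */
typedef struct {
    int16_t x;
    int16_t y;
    int16_t z;
} hmc_axes;

/* Combine a big-endian byte pair into a signed 16-bit value. */
static int16_t be16(const uint8_t *p)
{
    return (int16_t)(uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

/* raw[] holds registers 0x03..0x08 in order: X, then Z, then Y. */
static hmc_axes hmc_decode(const uint8_t raw[6])
{
    hmc_axes a;
    a.x = be16(&raw[0]);
    a.z = be16(&raw[2]);
    a.y = be16(&raw[4]);
    return a;
}
```

In the read loop this would be called as hmc_decode(readBuffer) after the read has completed.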

The following are samples for I2C read and write operations on PSoC.
Simple write operation:
// Dummy data values to write
uint8 writebuffer[3] = {0x23, 0xEF, 0x0F};
uint8 I2C_MasterWrite(uint8 slaveAddr, uint8 nbytes)
{
    uint8 volatile status;
    status = I2C_MasterClearStatus();
    if(!(status & I2C_MSTAT_ERR_XFER))
    {
        status = I2C_MasterWriteBuf(slaveAddr, (uint8 *)writebuffer, nbytes, I2C_MODE_COMPLETE_XFER);
        if(status == I2C_MSTR_NO_ERROR)
        {
            /* wait for write complete and no error */
            do
            {
                status = I2C_MasterStatus();
            } while((status & (I2C_MSTAT_WR_CMPLT | I2C_MSTAT_ERR_XFER)) == 0u);
        }
        else
        {
            /* translate from I2C_MasterWriteBuf() error output to
             * I2C_MasterStatus() error output */
            status = I2C_MSTAT_ERR_XFER;
        }
    }
    return status;
}
Read operation:
void I2C_MasterRead(uint8 slaveaddress, uint8 nbytes)
{
    uint8 volatile status;
    status = I2C_MasterClearStatus();
    if(!(status & I2C_MSTAT_ERR_XFER))
    {
        /* Then do the read */
        status = I2C_MasterClearStatus();
        if(!(status & I2C_MSTAT_ERR_XFER))
        {
            status = I2C_MasterReadBuf(slaveaddress,
                                       (uint8 *)&(readbuffer),
                                       nbytes, I2C_MODE_COMPLETE_XFER);
            if(status == I2C_MSTR_NO_ERROR)
            {
                /* wait for reading complete and no error */
                do
                {
                    status = I2C_MasterStatus();
                } while((status & (I2C_MSTAT_RD_CMPLT | I2C_MSTAT_ERR_XFER)) == 0u);
                if(!(status & I2C_MSTAT_ERR_XFER))
                {
                    /* Decrement all RW bytes in the EZI2C buffer, by different values */
                    for(uint8 i = 0u; i < nbytes; i++)
                    {
                        readbuffer[i] -= (i + 1);
                    }
                }
            }
            else
            {
                /* translate from I2C_MasterReadBuf() error output to
                 * I2C_MasterStatus() error output */
                status = I2C_MSTAT_ERR_XFER;
            }
        }
    }
    if(status & I2C_MSTAT_ERR_XFER)
    {
        /* add error handler code here */
    }
}

MPI Send/Recv millions of messages

I have this loop over NT (millions of iterations) for procs greater than 0. On each iteration a message of 120 bytes is sent to proc 0, and proc 0 receives it (proc 0 runs the same loop over NT).
I want proc 0 to receive them in order so I can store them in the array nhdr1.
The problem is that proc 0 does not receive the messages properly, and I often have 0 values in the array nhdr1.
How can I modify the code so that the messages are received in the same order as they were sent?
[...]
if (rank == 0) {
    nhdr = malloc((unsigned long)15*sizeof(*nhdr));
    nhdr1 = malloc((unsigned long)NN*15*sizeof(*nhdr1));
    itr = 0;
    jnode = 1;
    for (l=0; l<NT; l++) {
        MPI_Recv(nhdr, 15, MPI_LONG, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
        if (l == status.MPI_TAG) {
            for (i=0; i<nkeys; i++)
                nhdr1[itr*15+i] = nhdr[i];
        }
        itr++;
        if (itr == NN) {
            ipos = (unsigned long)(jnode-1)*NN*15*sizeof(*nhdr1);
            fseek(ismfh, ipos, SEEK_SET);
            nwrite += fwrite(nhdr1, sizeof(*nhdr1), NN*15, ismfh);
            itr = 0;
            jnode++;
        }
    }
    free(nhdr);
    free(nhdr1);
} else {
    nhdr = malloc(15*sizeof(*nhdr));
    irecmin = (rank-1)*NN+1;
    irecmax = rank*NN;
    for (l=0; l<NT; l++) {
        if (jrec[l] >= irecmin && jrec[l] <= irecmax) {
            indx1 = (unsigned long)(jrec[l]-irecmin) * 15;
            for (i=0; i<15; i++)
                nhdr[i] = nhdr1[indx1+i]; // nhdr1 is allocated before for rank>0!
            MPI_Send(nhdr, 15, MPI_LONG, 0, l, MPI_COMM_WORLD);
        }
    }
    free(nhdr);
}

There is no way to guarantee that your messages will arrive on rank 0 in the same order they were sent from different ranks. For example, if you have a scenario like this (S1 means send message 1) :
rank 0 ----------------
rank 1 ---S1------S3---
rank 2 ------S2------S4
There is no guarantee that the messages will arrive at rank 0 in the order S1, S2, S3, S4. The only guarantee made by MPI is that the messages from each rank that are sent on the same communicator with the same tag (which you are doing) will arrive in the same order they were sent. This means that the resulting order could be:
S1, S2, S3, S4
Or it could be:
S1, S3, S2, S4
or:
S2, S1, S3, S4
...and so on.
For most applications, this doesn't really matter. The ordering that's important is the logical ordering, not the real time ordering. You might take another look at your application and make sure you can't relax your requirements a bit.

What do you mean by "messages are received in the same order as they were sent"?
In the code now, the messages ARE received in (roughly) the order that they are actually sent... but that order has nothing to do with the rank numbers, or really anything else. See @Wesley Bland's response for more on this.
If you mean "receive the messages in rank order", then there are a few options.
First, a collective like MPI_Gather or MPI_Gatherv would be an "obvious" choice to ensure that the data is ordered by the rank that produced it. This only works if each rank does the same number of iterations, and those iterations stay roughly sync'd.
Second, you could remove the MPI_ANY_SOURCE and post a set of MPI_Irecv calls with the buffers supplied "in order". When a message arrives, it will be in the correct buffer location "automatically". For each message that is received, a new MPI_Irecv could be posted with the correct recv buffer location supplied. Any unmatched MPI_Irecv calls would need to be canceled at the end of the job.

Keeping in mind that:
messages from a given rank are received in order, and
messages have the originating processor rank in the status structure (status.MPI_SOURCE) returned by MPI_Recv(),
you can use these two elements to properly place the received data into nhdr1.
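A minimal sketch of that bookkeeping in plain C, with no MPI calls. It assumes the destination array is laid out as one contiguous block of NN records per sending rank (rank 1 first, then rank 2, and so on) and 15 values per record as in the question; that layout is an assumption, and next_slot, HDR_LEN and NO_SLOT are made-up names. The per-sender counter relies on messages from one rank arriving in the order they were sent.

```c
#include <stddef.h>

#define HDR_LEN 15               /* values per record, as in the question */
#define NO_SLOT ((size_t)-1)     /* returned when the message cannot be placed */

/* source:         sending rank (>= 1), e.g. taken from status.MPI_SOURCE
 * nn:             records each sender contributes (NN in the question)
 * count_per_rank: count_per_rank[r-1] = records already stored for rank r
 * Returns the offset (in array elements) where the next record from
 * `source` belongs, and advances that sender's counter. */
static size_t next_slot(int source, size_t nn, size_t *count_per_rank)
{
    size_t sender;
    size_t record;

    if (source < 1)
        return NO_SLOT;
    sender = (size_t)(source - 1);
    if (count_per_rank[sender] >= nn)
        return NO_SLOT;          /* this sender's block is already full */
    record = sender * nn + count_per_rank[sender];
    count_per_rank[sender]++;
    return record * HDR_LEN;
}
```

In the receive loop this could be called with status.MPI_SOURCE after each MPI_Recv, and the 15 received values copied to the returned offset, so interleaving between different senders no longer matters.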

MPI call and receive in a nested loop

I have a nested loop, and from inside the loop I call MPI send to send a specific value to the receiver. The receiver then takes the data and in turn sends MPI messages to another set of CPUs. I used something like the code below, but it looks like there is a problem in the receive and I can't see where I went wrong: the machine goes into an infinite loop somewhere.
I am trying to make it work like this:
master CPU >> send to other CPUs >> send to slave CPUs
.
.
.
int currentCombinationsCount;
int mp;
if (rank == 0)
{
    for (int pr = 0; pr < combinationsSegmentSize; pr++)
    {
        int CblockBegin = CombinationsSegementsBegin[pr];
        int CblockEnd = CombinationsSegementsEnd [pr];
        currentCombinationsCount = numOfCombinationsEachLoop[pr];
        prossessNum = 1; //specify which processor we are sending to
        // now substitute and send to the main Processors
        for (mp = CblockBegin; mp <= CblockEnd; mp++)
        {
            MPI_Send(&mp , 1, MPI_INT , prossessNum, TAG, MPI_COMM_WORLD);
            prossessNum ++;
        }
    }//this loop goes through all the specified blocks for the combinations
} // end of rank 0
else if (rank > currentCombinationsCount)
{
    // here I want to put other receives that will take values from the else below
}
else
{
    MPI_Recv(&mp , 1, MPI_INT , 0, TAG, MPI_COMM_WORLD, &stat);
    // the code stuck here in infinite loop
}

You've only initialised currentCombinationsCount within the if(rank==0) branch so all other procs will see an uninitialised variable. That will result in undefined behaviour and the outcome depends on your compiler. Your program may crash or the value may be set to 0 or an undetermined value.
If you're lucky, the value may be set to 0 in which case your branch reduces to:
if (rank == 0) { /* rank == 0 will enter this */ }
else if (rank > 0) { /* all other procs enter this */ }
else { /* never entered! Recvs are never called to match the sends */ }
You therefore end up with sends that are not matched by any receives. Since MPI_Send is potentially blocking, the sending proc may stall indefinitely. With procs blocking on sends, it can certainly look as though "...the machine goes to infinite loop somewhere...".
If currentCombinationsCount is given an arbitrary value (instead of 0) then rank!=0 procs will enter arbitrary branches (with a higher chance of all entering the final else). You then end up with the second set of receives not being called, resulting in the same issue as above.
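The branch selection can be reproduced without MPI to see which ranks end up where for a given value. This is only a sketch of the question's if/else chain; pick_branch and the enum names are made up for illustration.

```c
/* Which arm of the question's if / else if / else a rank falls into. */
enum branch {
    BRANCH_MASTER,        /* rank == 0: does the sends */
    BRANCH_SECOND_LEVEL,  /* rank > currentCombinationsCount */
    BRANCH_RECEIVER       /* final else: the only arm that calls MPI_Recv */
};

static enum branch pick_branch(int rank, int currentCombinationsCount)
{
    if (rank == 0)
        return BRANCH_MASTER;
    else if (rank > currentCombinationsCount)
        return BRANCH_SECOND_LEVEL;
    else
        return BRANCH_RECEIVER;
}
```

With a count of 0, every non-zero rank lands in BRANCH_SECOND_LEVEL and nobody receives. With a count of 3, ranks 1 to 3 receive and rank 4 does not. Every rank therefore needs the same, properly initialised value before the branch; one possible way, not taken from the question, is to compute it on every rank or distribute it from rank 0 with MPI_Bcast.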
