byte aligning in serial communication - serial-port

So I am trying to define a communication protocol for serial communication, I want to be able to send 4 byte numbers to the device, but I'm unsure how to make sure that the device starts to pick it up on the right byte.
For instance if I want to send
0x1234abcd 0xabcd3f56 ...
how do I makes sure that the device doesn't start reading at the wrong spot and get the first word as:
0xabcdabcd
Is there a clever way of doing this? I thought of using a marker for the start of a message, but what if I want to send the number I choose as data?

Why not send a start-of-message byte followed by a length-of-data byte if you know how big the data is going to be?
Alternatively, do as other binary protocols and only send fixed sizes of packages with a fixed header. Say that you will only send 4 bytes, then you know that you'll have one or more bytes of header before the actual data content.
Edit: I think you're misunderstanding me. What I mean is that the client is supposed to always regard bytes as either header or data, not based on value but rather based on the position in the stream. Say you're sending four bytes of data, then one byte would be the header byte.
+-+-+-+-+-+
|H|D|D|D|D|
+-+-+-+-+-+
The client would then be a pretty basic state machine, along the lines of:
int state = READ_HEADER;
int nDataBytesRead = 0;
while (true) {
byte read = readInput();
if (state == READ_HEADER) {
// process the byte as a header byte
state = READ_DATA;
nDataBytesRead = 0;
} else {
// Process the byte as incoming data
++nDataBytesRead;
if (nDataBytesRead == 4)
{
state = READ_HEADER;
}
}
}
The thing about this setup is that what determines if the byte is a header byte is not the actual content of a byte, but rather the position in the stream. If you want to have a variable number of data bytes, add another byte to the header to indicate the number of data bytes following it. This way, it will not matter if you are sending the same value as the header in the data stream since your client will never interpret it as anything but data.

netstring
For this application, perhaps the relatively simple "netstring" format is adequate.
For example, the text "hello world!" encodes as:
12:hello world!,
The empty string encodes as the three characters:
0:,
which can be represented as the series of bytes
'0' ':' ','
The word 0x1234abcd in one netstring (using network byte order), followed by the word 0xabcd3f56 in another netstring, encodes as the series of bytes
'\n' '4' ':' 0x12 0x34 0xab 0xcd ',' '\n'
'\n' '4' ':' 0xab 0xcd 0x3f 0x56 ',' '\n'
(The newline character '\n' before and after each netstring is optional, but makes it easier to test and debug).
frame synchronization
how do I makes sure that the device doesn't start reading at the wrong spot
The general solution to the frame synchronization problem is to read into a temporary buffer, hoping that we have started reading at the right spot.
Later, we run some consistency checks on the message in the buffer.
If the message fails the check, something has gone wrong,
so we throw away the data in the buffer and start over.
(If it was an important message, we hope that the transmitter will re-send it).
For example, if the serial cable is plugged in halfway through the first netstring,
the receiver sees the byte string:
0xab 0xcd ',' '\n' '\n' '4' ':' 0xab 0xcd 0x3f 0x56 ',' '\n'
Because the receiver is smart enough to wait for the ':' before expecting the next byte to be valid data, the receiver is able to ignore the first partial message and then receive the second message correctly.
In some cases, you know ahead of time what the valid message length(s) will be;
that makes it even easier for the receiver to detect it has started reading at the wrong spot.
sending start-of-message marker as data
I thought of using a marker for the start of a message, but what if I want to send the number I choose as data?
After sending the netstring header, the transmitter sends the raw data as-is -- even if it happens to look like the start-of-message marker.
In the normal case, the reciever already has frame sync.
The netstring parser has already read the "length" and the ":" header,
so the netstring parser
puts the raw data bytes directly into the correct location in the buffer -- even if those data bytes happen to look like the ":" header byte or the "," footer byte.
pseudocode
// netstring parser for receiver
// WARNING: untested pseudocode
// 2012-06-23: David Cary releases this pseudocode as public domain.
const int max_message_length = 9;
char buffer[1 + max_message_length]; // do we need room for a trailing NULL ?
long int latest_commanded_speed = 0;
int data_bytes_read = 0;
int bytes_read = 0;
int state = WAITING_FOR_LENGTH;
reset_buffer()
bytes_read = 0; // reset buffer index to start-of-buffer
state = WAITING_FOR_LENGTH;
void check_for_incoming_byte()
if( inWaiting() ) // Has a new byte has come into the UART?
// If so, then deal with this new byte.
if( NEW_VALID_MESSAGE == state )
// oh dear. We had an unhandled valid message,
// and now another byte has come in.
reset_buffer();
char newbyte = read_serial(1); // pull out 1 new byte.
buffer[ bytes_read++ ] = newbyte; // and store it in the buffer.
if( max_message_length < bytes_read )
reset_buffer(); // reset: avoid buffer overflow
switch state:
WAITING_FOR_LENGTH:
// FIXME: currently only handles messages of 4 data bytes
if( '4' != newbyte )
reset_buffer(); // doesn't look like a valid header.
else
// otherwise, it looks good -- move to next state
state = WAITING_FOR_COLON;
WAITING_FOR_COLON:
if( ':' != newbyte )
reset_buffer(); // doesn't look like a valid header.
else
// otherwise, it looks good -- move to next state
state = WAITING_FOR_DATA;
data_bytes_read = 0;
WAITING_FOR_DATA:
// FIXME: currently only handles messages of 4 data bytes
data_bytes_read++;
if( 4 >= data_bytes_read )
state = WAITING_FOR_COMMA;
WAITING_FOR_COMMA:
if( ',' != newbyte )
reset_buffer(); // doesn't look like a valid message.
else
// otherwise, it looks good -- move to next state
state = NEW_VALID_MESSAGE;
void handle_message()
// FIXME: currently only handles messages of 4 data bytes
long int temp = 0;
temp = (temp << 8) | buffer[2];
temp = (temp << 8) | buffer[3];
temp = (temp << 8) | buffer[4];
temp = (temp << 8) | buffer[5];
reset_buffer();
latest_commanded_speed = temp;
print( "commanded speed has been set to: " & latest_commanded_speed );
}
void loop () # main loop, repeated forever
# then check to see if a byte has arrived yet
check_for_incoming_byte();
if( NEW_VALID_MESSAGE == state ) handle_message();
# While we're waiting for bytes to come in, do other main loop stuff.
do_other_main_loop_stuff();
more tips
When defining a serial communication protocol,
I find it makes testing and debugging much easier if the protocol always uses human-readable ASCII text characters, rather than any arbitrary binary values.
frame synchronization (again)
I thought of using a marker for the start of a message, but what if I want to send the number I choose as data?
We already covered the case where the reciever already has frame sync.
The case where the receiver does not yet have frame sync is pretty messy.
The simplest solution is for the transmitter to send a series of harmless bytes
(perhaps newlines or space characters),
the length of the maximum possible valid message,
as a preamble just before each netstring.
No matter what state the receiver is in when the serial cable is plugged in,
those harmless bytes eventually drive the receiver into the
"WAITING_FOR_LENGTH" state.
And then when the tranmitter sends the packet header (length followed by ":"),
the receiver correctly recognizes it as a packet header and has recovered frame sync.
(It's not really necessary for the transmitter to send that preamble before every packet.
Perhaps the transmitter could send it for 1 out of 20 packets; then the receiver is guaranteed to recover frame sync in 20 packets (usually less) after the serial cable is plugged in).
other protocols
Other systems use a simple Fletcher-32 checksum or something more complicated to detect many kinds of errors that the netstring format can't detect ( a, b ),
and can synchronize even without a preamble.
Many protocols use a special "start of packet" marker, and use a variety of "escaping" techniques to avoid actually sending a literal "start of packet" byte in the transmitted data, even if the real data we want to send happens to have that value.
( Consistent Overhead Byte Stuffing, bit stuffing, quoted-printable and other kinds of binary-to-text encoding, etc.).
Those protocols have the advantage that the reciever can be sure that when we see the "start of packet" marker, it is the actual start of packet (and not some data byte that coincidentally happens to have the same value).
This makes handling loss of synchronization much easier -- simply discard bytes until the next "start of packet" marker.
Many other formats, including the netstring format, allow any possible byte value to be transmitted as data.
So receivers have to be smarter about handling the start-of-header byte that might be an actual start-of-header, or might be a data byte -- but at least they don't have to deal with "escaping" or the surprisingly large buffer required, in the worst case, to hold a "fixed 64-byte data message" after escaping.
Choosing one approach really isn't any simpler than the other -- it just pushes the complexity to another place, as predicted by waterbed theory.
Would you mind skimming over the discussion of various ways of handling the start-of-header byte, including these two ways, at the Serial Programming Wikibook,
and editing that book to make it better?

Related

How to convert byte*payload to an int?

I am programming ESP8266thing dev board using arduino.
I have a value stored in byte*payload. I want to convert that value and store it into an int variable. I tried different methods but non of them is working fine. Can anyone suggest me a good method ? Thank You!!
How you do this depends entirely upon how you represented the value when you transmitted it via MQTT.
If you transmitted it in binary - for instance, you published the integer as series of bytes - then you also need to know the byte order and the number of bytes. Most likely it's least-significant-byte first (so if the integer in hex were 0x1234 it would be transmitted as two bytes - 0x34 followed by 0x12) and 32 bits.
If you're transmitting binary between two identical computers running similar software then you'll probably be fine (as long as that never changes), but if the computers differ or the software differs, you're dealing with representations of your integer that will be dependent on the platform you're using. Even using different languages on the two ends might matter - Python might represent an integer one way and C another, even if they're running on identical processors.
So if you transmit in binary you should really choose a machine-independent representation.
If you did transmit in binary and made no attempt at a machine-independent representation, the code would be something like:
byte *payload;
int payload_length;
int result;
if(payload_length < sizeof(int)) {
HANDLE THIS ERROR
} else {
result = *(int *)payload;
}
That checks to make sure there are enough bytes to represent a binary integer, and then uses a cast to retrieve the integer from the payload.
If you transmitted in binary in a machine-independent format then you'd need to do whatever transformation is necessary for the receiving architecture.
I don't really recommend transmitting in binary unless you know what you're doing and have good reasons for it. Most applications today will be fine transmitting as text - which you could say is the machine-independent representation.
The most likely alternative to transmitting in binary is in text - which can be a machine independent format. If you're transmitting an integer as text, your code would look something like this:
byte *payload;
int payload_length;
char payload_string[payload_length + 1];
int result;
memcpy(payload_string, payload, payload_length);
payload_string[payload_length] = '\0';
result = atoi(payload_string);
This code uses a temporary buffer to copy the payload into. We need to treat the payload like a C string, and C strings have an extra byte on the end - '\0' - which indicates end-of-string. There's no space for this in the payload and an end-of-string indicator may or may not have been sent as part of the payload, so we'll guarantee there's one by copying the payload and then adding one.
After that it's simple to call atoi() to convert the string to an integer.
Don't know if you found an answer yet, but I had the exact same issue and eventually came up with this:
payload[length] = '\0'; // Add a NULL to the end of the char* to make it a string.
int aNumber = atoi((char *)payload);
Pretty simple in the end!

How do I read gyro information from CleanFlight using MSP?

Using an Arduino, how would I go about getting attitude from the gyro in my flight controller using MultiWii Serial Protocol?
The following is based on just getting gyro attitude information, although it includes some information about using MSP in general. The example code referenced can be found here. The most important part of this that I couldn't find anywhere else (a friend had figured it out and let me in on the secret) is the Data section below.
MultiWii Serial Protocol
Firstly, let's look at how MSP works. I found this link (alternative link here) very useful in understanding this, but I'll summarize here.
There are three types of message that can be sent.
Command -- a message sent to the flight controller which has some information to be sent.
Request -- a message sent to the flight controller asking for some information to be returned.
Response -- a message sent by the flight controller with information responding to a request.
MSP messages have a specific structure. They have a header, size, type, data and checksum, in that order.
I found this diagram from here very helpful:
Header
The header is three bytes and contains the message start characters "$M" and a character showing which direction the message is going. "<" denotes going to the flight controller (command and request), ">" denotes coming from the flight controller (response).
Size
The fourth byte is the length (in bytes) of the data section. For example, if the data section had three INT 16 variables then the size byte would be 6.
Type
The type byte specifies what information is being sent in the message. You can find a list of types here. An example of this would be MSP_ATTITUDE which has type number of 108.
Data
The data is where all the information is sent. Request messages have no data in them. Commands and responses do, because they contain information. The types of data returned can again be found here.
The difficult part of the data section is that the bytes are reversed in order, and this is extremely poorly documented. So, for example, if I get the following two bytes in this order:
10011010
01001111
You would think that should become 10011010 01001111 but in fact does not. It would actually become 01001111 10011010.
In the example code, this operation is done as follows:
int16_t roll;
byte c; // The current byte we read in.
c = mspSerial.read(); // The first sent byte of the number.
roll = c; // Put the first sent byte into the second byte of the int 16.
c = mspSerial.read(); // The second sent byte of the number.
roll <<= 8; // Move the first sent byte into the first byte of the int16.
roll += c; // Put the second sent byte into the second byte of the int 16.
roll = (roll & 0xFF00) >> 8 | (roll & 0x00FF) << 8; // Reverse the order of bytes in the int 16.
Checksum
The final byte of an MSP message is the checksum. "The checksum is the XOR of size, type and payload bytes". For a request message the checksum is equal to the type.
Example
An example response message for an "MSP_ATTITUDE" request would be as follows:
00100100 -- '$' - Byte 1 of the header.
01001101 -- 'M' - Byte 2 of the header.
00111110 -- '>' - Byte 3 of the header.
00000110 -- '6' - The size byte.
01101100 -- '108' - The type number corresponding to "MSP_ATTITUDE".
11100010 -- The first sent byte of the roll INT16.
11111111 -- The second sent byte of the roll INT16.
00010010 -- The first sent byte of the pitch INT16.
00000000 -- The second sent byte of the pitch INT16.
11000010 -- The first sent byte of the yaw INT16.
00000000 -- The second sent byte of the yaw INT16.
10100111 -- The checksum byte.
Roll would become: 11111111 11100010 = -30.
Pitch would become: 00000000 00010010 = 18.
Yaw would become: 11000010 00000000 = 194.
As documented here the roll and pitch are in units of 1/10th of a degree. So the final values would be as follows:
Roll = -3.0
Pitch = 1.8
Yaw = 194
Setting up CleanFlight
To get these values the flight controller must be correctly configured to use MSP. I'll assume you've already got CleanFlight Configurator running.
You may want to use the main Serial connection for something else while your code is running, so we'll use the Soft Serial 2 port for this (pins 7 and 8 on the left of the board).
Go to the "Configuration" tab and scroll down to "Other Features". Make sure "SOFTSERIAL" and "TELEMETRY" are on. Save and Reboot.
Go to the "Ports" tab and set the "Data" column for "SOFTSERIAL2" to be active and set to 9600 (you could also use 19200 if you so desired, higher values may not work on the Arduino end). Save and Reboot.
The flight controller is now correctly configured.
Setting up the Arduino
To setup the Arduino, simply upload the example code to the Arduino. Connect Pin 12 on the Arduino to Pin 7 on the left-hand side of the Naze board, and Pin 11 on the Arduino to Pin 8 on the left-hand side of the Naze board.
Opening a Serial connection to the Arduino should now output Roll, Pitch and Yaw.
Using for other MSP communication
Although the code here is an example using MSP_ATTITUDE, the same theory applies to any of the MSP communications. The main differences would be a command message requiring data to be setup correctly (I haven't tested the code with that purpose in mind), and the readData function would need to have the switch statement modified depending on the data being received.

UDP API in Rust

1) API for send here returns Result<usize>. Why is that ? In my head, a UDP send is all or none. The return value seems to suggest that sending can succeed but entire data may not be written which makes me code like:
let mut bytes_written = 0;
while bytes_written < data.len() {
bytes_written += match udp_socket.send_to(&data[bytes_written..]) {
Ok(bytes_tx) => bytes_tx,
Err(_) => break,
}
}
Recently someone told me this is completely unnecessary. But I don't understand. If that were true why is the return not Result<()> instead, which is also what i was expecting ?
2)
For reads though I understand. I could give it a buffer of size 100 bytes but the datagram might only be 50 bytes long. So essentially I should utilise only read_buf[..size_read]. Here my question is what happens if the buffer size is 100 but the datagram size is say 150 bytes ? Will recv_from fill in only 100 bytes and return Ok(100, some_peer_addr) ? If i re-read will it fill in the remaining of the datagram ? What if another datagram of 50 bytes arrived before my second read ? Will i get just the remaining 50 bytes the second time and 50 bytes of new datagram the 3rd time or complete 100 bytes the 2nd time which also contains the new datagram ? Or will be an error and i will lose the 1st datagram on my initial read and never be able to recover it ?
The answer to both of these questions lies in the documentation of the respective BSD sockets functions, sendto() and recvfrom(). If you use some *nix system (OS X or Linux, for example), you can use man sendto and man recvfrom to find it.
1) sendto() man page is rather vague on this; Windows API page explicitly says that it is possible for the return value be less than len argument. See also this question. It looks like that this particular moment is somewhat under-documented. I think that it is probably safe to assume that the return value will always be equal either to len or to the error code. Problems may happen if the length of the data sent through sendto() exceeds the internal buffer size inside the OS kernel, but it seems that at least Windows will return an error in this case.
2) recvfrom() man page unambiguously states that the part of a datagram which does not fit into the buffer will be discarded:
The recvfrom() function shall return the length of the message
written to the buffer pointed to by the buffer argument. For
message-based sockets, such as SOCK_RAW, SOCK_DGRAM, and
SOCK_SEQPACKET, the entire message shall be read in a single
operation. If a message is too long to fit in the supplied buffer,
and MSG_PEEK is not set in the flags argument, the excess bytes shall
be discarded.
So yes, recv_from() will fill exactly 100 bytes, the rest will be discarded, and further calls to recv_from() will return new datagrams.
If you dig down, it just wrapping the C sendto function. This function returns number of bytes sent, so Rust just passes that on (while handling the -1 case and turning errno into actual errors).

recv function gives malformed data Winsock2 C++

In my simple TCP client server application, server send repetitively 1 kB message to the client and client send a reply acknowledgement (just send 'ACK') for each packet. Just think this scenario like client and server passing 1 kB messages here and there in a infinite loop.
I send the same message every time and the fist byte (first char) is always 1. But while testing this client and server application in the same machine for a long time, I noticed first character of some of the received messages are something else in the receive buffer and recv function also returned 1024 (1 kB). This is not happen frequently.
This is the how I receive.
char recvBuff[DEFAULT_BUFFER_SIZE];
int iResult = SOCKET_ERROR;
iResult = recv(curSocket, recvBuff, DEFAULT_BUFFER_SIZE, 0);
if (iResult == SOCKET_ERROR)
{
return iResult;
}
if (recvBuff[0] != 1)
{
//malformed receive
}
MessageHeader *q = (MessageHeader*)recvBuff;
message.header = *q; q++;
std::string temp((char*)q, message.header.fragmentSize);
message.message = temp;
Actually the problem is in constructing the temp string. It breaks since the correct fragment size not received. I tried to drop these kind of malformed data. But the problem is there is a gap between last successfully received fragment ID and first successfully received fragment ID after malformed receives. Any idea why these malformed receives happen?
You’re assuming that you’ve received a complete message when the recv() call completes. If this is a TCP connection (as opposed to UDP), it is byte-oriented, and that means that recv() will return whenever there are any bytes available.
Put more explicitly, there is no reason that doing
send (toServerSocket, someMessage, 1024, 0);
on the client side will cause
recv (fromClientSocket, myBuffer, 1024, 0);
to receive 1,024 bytes. It could just as well receive 27 bytes, with the remaining 997 coming from future calls to recv().
What’s happening in your program, then, is that you’re getting one of these short returns, and it’s causing your program to lose sync. with the message stream. How to fix it? Use recv() to read enough of your message that you know the length (or set a fixed length, though that’s inefficient in many cases). Then continue calling recv() into your buffer until you have read at least that many bytes. Note that you might read more bytes than the length of your message — that is, you may read some bytes that belong to the next message, so you will need to keep those in the buffer after processing the current message.

XoffLimit, XonLimit - what does it mean?

I am adjusting some parameters of RS-232 line. I ran accross these two parameters - XoffLimit, XonLimit. Wikipedia article suggets that this has to do with software flow control. Their values are XoffLimit = 1024, XonLimit = 1024. What is significance of this? What difference if I set XoffLimit = 1024, XonLimit = 2048?
Xon / Xoff is software flow control done by inserting escape characters in the data stream. I would doubt you need it and you probably shouldn't use it. Both sides would have to be configured to understand the escape chars.
But basically, when your input serial buffer reaches XonLimit, then the serial port would send an XOFF byte to make the other side stop sending. Then when your input buffer dropped down to XoffLimit, the XON byte would be sent to allow sender to resume.

Resources