How to convert a byte* payload to an int? - Arduino

I am programming an ESP8266 Thing dev board using Arduino.
I have a value stored in byte* payload. I want to convert that value and store it in an int variable. I have tried different methods but none of them works. Can anyone suggest a good method? Thank you!

How you do this depends entirely upon how you represented the value when you transmitted it via MQTT.
If you transmitted it in binary - for instance, you published the integer as a series of bytes - then you also need to know the byte order and the number of bytes. Most likely it's least-significant-byte first (so if the integer in hex were 0x1234 it would be transmitted as two bytes - 0x34 followed by 0x12) and 32 bits.
If you're transmitting binary between two identical computers running similar software then you'll probably be fine (as long as that never changes), but if the computers differ or the software differs, you're dealing with representations of your integer that will be dependent on the platform you're using. Even using different languages on the two ends might matter - Python might represent an integer one way and C another, even if they're running on identical processors.
So if you transmit in binary you should really choose a machine-independent representation.
If you did transmit in binary and made no attempt at a machine-independent representation, the code would be something like:
byte *payload;
int payload_length;
int result;

if (payload_length < sizeof(int)) {
    // handle this error: not enough bytes for an int
} else {
    result = *(int *)payload;
}
That checks to make sure there are enough bytes to represent a binary integer, and then uses a cast to retrieve the integer from the payload.
If you transmitted in binary in a machine-independent format then you'd need to do whatever transformation is necessary for the receiving architecture.
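For example, one common machine-independent convention is to send the four bytes of the integer most-significant byte first and reassemble them with shifts on the receiving side, which works regardless of either machine's native byte order. A minimal sketch along those lines (the 4-byte, big-endian choice is an assumption for illustration, not something stated in the question):
byte *payload;
int payload_length;
long result;

if (payload_length < 4) {
    // handle this error: not enough bytes were received
} else {
    // Reassemble most-significant byte first; shifting like this does
    // not depend on the receiving platform's native byte order.
    result = ((long)payload[0] << 24) |
             ((long)payload[1] << 16) |
             ((long)payload[2] << 8) |
              (long)payload[3];
}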
I don't really recommend transmitting in binary unless you know what you're doing and have good reasons for it. Most applications today will be fine transmitting as text - which you could say is the machine-independent representation.
The most likely alternative to transmitting in binary is text - which can be a machine-independent format. If you're transmitting an integer as text, your code would look something like this:
byte *payload;
int payload_length;
char payload_string[payload_length + 1];
int result;
memcpy(payload_string, payload, payload_length);
payload_string[payload_length] = '\0';
result = atoi(payload_string);
This code uses a temporary buffer to copy the payload into. We need to treat the payload like a C string, and C strings have an extra byte on the end - '\0' - which indicates end-of-string. There's no space for this in the payload and an end-of-string indicator may or may not have been sent as part of the payload, so we'll guarantee there's one by copying the payload and then adding one.
After that it's simple to call atoi() to convert the string to an integer.

Don't know if you found an answer yet, but I had the exact same issue and eventually came up with this:
payload[length] = '\0'; // Add a NULL to the end of the char* to make it a string.
int aNumber = atoi((char *)payload);
Pretty simple in the end!

Related

AES 128 decryption with ciphertext shorter than key

We are developing an application that has to work with data that is encrypted by LoraWan (https://www.lora-alliance.org)
We have already found the documentation of how they encrypt their data, and have been reading through it for the past few days (https://www.lora-alliance.org/sites/default/files/2018-04/lorawantm_specification_-v1.1.pdf) but currently still can't solve our problem.
We have to use AES 128-bit ECB decryption with zero-padding to decrypt the messages, but the problem is it's not working because the encrypted messages we are receiving are not long enough for AES 128 so the algorithm returns a "Data is not a complete block" exception on the last line.
An example key we receive is like this: D6740C0B8417FF1295D878B130784BC5 (not a real key). It is 32 characters long, so 32 bytes, but if we treat it as hexadecimal, it becomes 16 bytes long, which is what is needed for AES 128-bit. This is the code we use to convert the hex from a string:
public static string HextoString(string InputText)
{
    byte[] hex = Enumerable.Range(0, InputText.Length)
                           .Where(x => x % 2 == 0)
                           .Select(x => Convert.ToByte(InputText.Substring(x, 2), 16))
                           .ToArray();
    return System.Text.Encoding.ASCII.GetString(hex);
}
(A small thing to note about the above code: we are not sure which Encoding to use, as we could not find it in the Lora documentation and they have not told us, but depending on this small setting we could be messing up our decryption. We have tried all the usual options: ASCII, UTF-8, UTF-7, etc.)
An example message we receive is: d3 73 4c, which we are assuming is also hexadecimal. That is only 6 characters, or 3 bytes once decoded from hex, compared to the minimum of 16 bytes we'd need to match the key length.
This is the code for Aes 128 decrypt we are using:
private static string Aes128Decrypt(string cipherText, string key)
{
    string decrypted = null;
    var cipherPlainTextBytes = HexStringToByteArray(cipherText);
    //var cipherPlainTextBytes = ForcedZeroPadding(HexStringToByteArray(cipherText));
    var keyBytes = HexStringToByteArray(key);
    using (var aes = new AesCryptoServiceProvider())
    {
        aes.KeySize = 128;
        aes.Key = keyBytes;
        aes.Mode = CipherMode.ECB;
        aes.Padding = PaddingMode.Zeros;
        ICryptoTransform decryptor = aes.CreateDecryptor(aes.Key, aes.IV);
        using (MemoryStream ms = new MemoryStream(cipherPlainTextBytes, 0, cipherPlainTextBytes.Length))
        {
            using (CryptoStream cs = new CryptoStream(ms, decryptor, CryptoStreamMode.Read))
            {
                using (StreamReader sr = new StreamReader(cs))
                {
                    decrypted = sr.ReadToEnd();
                }
            }
        }
    }
    return decrypted;
}
So obviously this is going to return "Data is an incomplete block" at sr.ReadToEnd().
As you can see from the example, in that one commented-out line, we have also tried to "pad" the text to the correct size with a zero byte array of the correct length (16 minus the ciphertext length), in which case the algorithm runs fine, but it returns complete gibberish and not the original text.
We already have tried all of the modes of operation and messed around with padding modes as well. They are not providing us with anything but a cipherText, and a key for that text. No Initialization vector either, so we are assuming we are supposed to be generating that every time (but for ECB it isn't even needed iirc)
What's more, they are able to encrypt and decrypt their messages just fine. What is most puzzling about this is that I have been googling this for days now and I cannot find a SINGLE example on google where the CIPHERTEXT is shorter than the key during decryption.
Obviously I have found examples where the message being encrypted is shorter than what is needed, but that is what padding is for on the ENCRYPTION side (right?). So that when you then receive the padded message, you can tell the algorithm what padding mode was used to make it the correct length, so it can separate the padding from the message. But in all of those cases the received message during decryption is of the correct length.
So the question is - what are we doing wrong? is there some way to decrypt with ciphertexts that are shorter than the key? Or are they messing up somewhere by producing ciphers that are too short?
Thanks for any help.
In AES-ECB, the only valid ciphertext shorter than 16 bytes is the empty one. That 16-byte limit is the block (not key) size of AES, which happens to match the key size for AES-128.
Therefore, the question's
An example message we receive is: d3 73 4c
does not show an ECB-encrypted message (since a comment tells us it comes from a JSON message, it can't be raw bytes that merely happen to display as hex). And that's way too short to be a FRMPayload (per this comment) for a Join-Accept, since the spec says of the latter:
1625 The message is either 16 or 32 bytes long.
Could it be that whatever that JSON message contains is not a full FRMPayload, but a fragment of a packet, encoded as hexadecimal pairs with space separators? Until it is figured out how to build a FRMPayload, there's no point in trying to decipher it.
Update: If that mystery message is always 3 bytes, and if it is always the same for a given key (or available a single time per key), then per Maarten Bodewes's comment it might be a Key Check Value. The KCV is often the first 3 bytes of the encryption of the all-zero value with the key per the raw block cipher (equivalently: per ECB). Herbert Hanewinkel's javascript AES can work fully offline (which is necessary to not expose the key), and be used to manually validate an hypothesis. It tells that for the 16-byte key given in the question, a KCV would be cd15e1 (or c076fc per the variant in the next section).
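For reference, a KCV of that sort can be reproduced offline with any raw AES implementation. Here is a minimal C sketch using OpenSSL's legacy AES API (my own illustration, not part of the original answer; key_bytes is assumed to already hold the de-hexified 16-byte key):
#include <stdio.h>
#include <openssl/aes.h>

/* KCV = first three bytes of AES-ECB(key, one all-zero 16-byte block) */
void print_kcv(const unsigned char key_bytes[16])
{
    unsigned char zero_block[16] = {0};
    unsigned char out[16];
    AES_KEY aes_key;

    AES_set_encrypt_key(key_bytes, 128, &aes_key);
    AES_encrypt(zero_block, out, &aes_key);   /* one raw ECB block */
    printf("KCV: %02x%02x%02x\n", out[0], out[1], out[2]);
}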
Also: CreateDecryptor is used to craft a gizmo in charge of the ECB decryption. That's likely incorrect in the context of decryption of a LoraWan payload, which requires ECB encryption for decryption of some fields:
1626 Note: AES decrypt operation in ECB mode is used to encrypt the join-accept message so that the end-device can use an AES encrypt operation to decrypt the message. This way an end device only has to implement AES encrypt but not AES decrypt.
In the context of decryption of LoraWan packets, you want to communicate with the AES engine using byte arrays, not strings. Strings have an encoding, whereas LoraWan ciphertext and the corresponding plaintext do not. Others seem to have managed to coerce the nice .NET do-it-all crypto API into getting a low-level job done.
In the HextoString code, I vaguely get that the intention and perhaps the outcome is that hex becomes the original hex input as a byte array (fully rid of hexadecimal and other encoding sins; in which case the variable hex should be renamed to something along the lines of pure_bytes). But then I'm at a loss about System.Text.Encoding.ASCII.GetString(hex). I'd be surprised if it just created a byte string from a byte array, or turned the key back into hexadecimal for later feeding to HexStringToByteArray in Aes128Decrypt. Plus this makes me fear that any byte in [0x80..0xFF] might turn into 0x3F, which is not nice for the key, ciphertext, and corresponding LoraWan payload. These have no character encoding once de-hexified.
My conclusion is that if HexStringToByteArray does what its name suggests, and given the current interface of Aes128Decrypt, HextoString should simply remove whitespace (or is unneeded if HexStringToByteArray removes whitespace on the fly). But my recommendation is to change the interface to use byte arrays, not strings (see previous section).
As an aside: creating an ICryptoTransform object from its key is supposed to be performed once for multiple uses of the object.

C# BinaryReader/Writer equivalent in Java

I have a stream (hooked to an Azure blob) which contains strings and integers. The same stream is consumed by a .NET process as well.
In C# the writing and reading is done through the type-specific methods of the BinaryWriter and BinaryReader classes, e.g., BinaryWriter.Write("path1;path2") and BinaryReader.ReadString().
In Java, I couldn't find the relevant libraries to achieve the same. Most of the InputStream methods are capable of reading the whole line of the string.
If there are such libraries in Java, please share with me.
Most of the InputStream methods are capable of reading the whole line of the string.
None of the InputStream methods is capable of doing that.
What you're looking for is DataInputStream and DataOutputStream.
If you are trying to read in data generated from BinaryWriter in C# you are going to have to mess with this on the bit level. The data you actually want is prefixed with an integer to show the length of the data. You can read about how the prefix is generated here:
C# BinaryWriter length prefix - UTF7 encoding
It's worth mentioning that, from what I tested, the length is written "backwards". In my case the first two bytes of the file were 0xA0 0x54; converted to binary that is 10100000 01010100. The first byte starts with a 1, so it is not the last byte of the prefix. The second byte starts with a 0, so it is the last (or in this case most significant) byte of the length. So the resulting length prefix is 1010100 (taken from the last byte, removing the indicator bit that says it is the last byte), followed by 0100000 from the previous byte, which gives us 10101000100000, or 10784 bytes. The file I was dealing with was 10786 bytes, so with the two-byte prefix indicating the length this is correct.
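In other words, BinaryWriter.Write(string) prefixes the UTF-8 bytes with the length encoded as a base-128 varint: low 7 bits first, with the high bit of each byte indicating that another length byte follows. A rough C sketch of that decoding (assuming the stream has already been read into a buffer; the function name and error convention are illustrative only):
#include <stdint.h>
#include <stddef.h>

/* Decode a .NET-style 7-bit encoded length: low 7 bits come first and
 * a set high bit means another length byte follows.
 * Returns the number of prefix bytes consumed, or 0 on error. */
size_t decode_7bit_length(const uint8_t *buf, size_t buf_len, uint32_t *out_len)
{
    uint32_t value = 0;
    int shift = 0;
    size_t i = 0;

    while (i < buf_len && shift <= 28) {
        uint8_t b = buf[i++];
        value |= (uint32_t)(b & 0x7F) << shift;
        if ((b & 0x80) == 0) {      /* high bit clear: last prefix byte */
            *out_len = value;       /* the string bytes follow immediately */
            return i;
        }
        shift += 7;
    }
    return 0;                       /* truncated or malformed prefix */
}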

invalid pixel in Firefox because of content charset setting in Netty server

I am developing an HTTP server with Netty. On some occasions, the server must answer with a 1x1 transparent pixel. So I hard-coded a GIF transparent pixel in base64, and returned it with the following code:
String pixel_string= new String (Base64.decodeBase64("R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw=="));
HttpResponse response = new DefaultHttpResponse(HttpVersion.HTTP_1_1, HttpResponseStatus.OK);
response.setContent(ChannelBuffers.copiedBuffer(pixel_string, CharsetUtil.UTF_8));
EDIT : I also set the content-type :
response.setHeader(HttpHeaders.Names.CONTENT_TYPE,
"image/gif");
In Chrome, everything is fine. However, Firefox tells me that it cannot display the pixel (which is pretty bad for my app), as the pixel data is invalid.
After many investigations, I finally figured out a fix, by changing the charset to ISO-8859-1.
response.setContent(ChannelBuffers.copiedBuffer(
responseBuilder.pixel_string, CharsetUtil.ISO_8859_1));
I don't understand why it works, which makes me think that I may run into troubles in some cases. I tried to change the Firefox preferences (to have UTF8 as default), but it doesn't change much.
Why does Firefox accept the ISO-8859-1 encoding, and not UTF-8? Can I change that? Would someone have a clue about the origin of the issue and how to be sure that it will work whatever the user's settings?
Thanks
It's not Firefox that's accepting the encoding or not. It's your server.
When you do your base64 decode you produce a string that contains some characters... but what you really produced was bytes that you're then thinking of as characters somehow. Since a Java String is a container that holds a UTF-16 string, in practice what you're doing is taking each byte, treating it as a 16-bit integer and constructing the UTF-16 "string" made up of those code units.
But when you want to put all this on the network, you have to convert your string to bytes, and the argument to copiedBuffer says how to do that. If converting to UTF-8, any character that came from a byte that had the high bit set will end up getting encoded as a two-byte UTF-8 sequence. On the other hand, if converting to ISO-8859-1, the conversion just drops the high byte of each UTF-16 code unit (which in your case is always zero anyway).
So the conversion to ISO-8859-1 produces the actual byte array you got out of base64-decoding, while the conversion to UTF-8 produces... something else which may or may not actually make any sense depending on the exact byte values.
The copiedBuffer constructor you call is not appropriate for the type of data (binary) you are using. According to the JavaDoc of the Netty API, the one you are calling is:
Creates a new big-endian buffer whose content is the specified string
encoded in the specified charset.
Which means that your binary data is being "converted" to UTF-8 (which is meaningless). If you try to save the generated file and look at it with a hex editor, you'll probably see that it is corrupted.
Try with something like this (untested code):
static byte[] pixel_data = Base64.decodeBase64("R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==");
HttpResponse response = ...
response.setHeader(HttpHeaders.Names.CONTENT_TYPE, "image/gif");
response.setContent(ChannelBuffers.copiedBuffer(pixel_data));

Decrypting DUKPT Encrypted Track Data

As the title says, I am trying to decrypt DUKPT encrypted track data coming from a DUKPT enabled scanner.
I have the ANSI Standard (X9.24) for DUKPT and have successfully implemented the ability to generate the IPEK from the KSN and BDK. Furthermore, I have successfully implemented the ability to generate the Left and Right MAC Request and Response Keys by XORing the PIN Encryption Keys. Lastly, I am able to generate the EPB.
From here, I don't understand how to generate the MAC Request and Response from the L/R Keys that I have generated.
Lastly, once I get to that step, what comes next? When do I actually have the key that decrypts the track data sent by a DUKPT enabled device?
I am aware of the Thales Simulator and jPOS. My code is currently referencing the Thales Simulator to do all of its work. But, the file decryption process just isn't returning the expected data.
If anybody can offer some insight into decrypting track data, it would be much appreciated.
http://thalessim.codeplex.com/
http://jpos.org/
I spent too much time studying the horrible X9.24 spec and finally got both the encryption and decryption working with my vendor’s examples and marketing promptly decided to switch vendors. Since it is a standard, you would think that anybody’s implementation would be the same. I wish. Anyway, there are variations on how things are implemented. You have to study the fine print to make sure you are working things the same as your other side.
But that is not your question.
First, if you need to decrypt a data track from a credit card, you are probably interested in producing a key that will decrypt the data based upon the original super secret Base Derivation Key. That has nothing to do with the MAC generation and is only mentioned in passing in that dreadful spec. You need to generate the IPEK for that key serial number and device ID and repeatedly apply the "Non-reversible Key Generation Process" from the spec for each bit that is set in the counter part of the full key serial number from the HSM.
That part of my code looks like this: (Sorry for the long listing in a posting.)
/*
 * Bit "zero" set (this is a 21 bit register)(ANSI counts from the left)
 * This will be used to test each bit of the encryption counter
 * to decide when to find another key.
 */
testBit = 0x00100000;

/*
 * We have to "encrypt" the IPEK repeatedly to find the current key
 * (See Section A.3). Each time we encrypt (generate a new key),
 * we need to use all of the prior bits to the left of the current bit.
 * The Spec says we will have a maximum of ten bits set at any time
 * so we should not have to generate more than ten keys to find the
 * current encryption key.
 */
cumBits = 0;

/*
 * For each of the 21 possible key bits,
 * if it is set, we need to OR that bit into the cumulative bit
 * variable and set that as the KSN count and "encrypt" again.
 * The encryption we are using is the goofy ANSI Key Generation
 * subroutine from page 50.
 */
for (int ii = 0; ii < 21; ii++)
{
    if ((keyNumber & testBit) != 0)
    {
        char ksr[10];
        char eightByte[8] = {0};
        cumBits |= testBit;
        ksn.count = cumBits;       /* all bits processed to date */
        memcpy(ksr, &ksn, 10);     /* copy bit structure to char array */
        memcpy(crypt, &ksr[2], 8); /* copy bytes 2 through 9 */
        /*
         * Generate the new Key overwriting the old.
         * This will apply the "Non-reversible Key Generation Process"
         * to the lower 64 bits of the KSN.
         */
        keyGen(&key, &crypt, &key);
    }
    testBit >>= 1;
}
Where
keyNumber is the current counter from the ksn
ksn is an 80 bit structure that contains the 80 bit Key Serial Number from the HSM
crypt is a 64-bit block of data; I have it of type DES_cblock since I am using OpenSSL.
key is a 128 bit double DES_cblock structure.
The keyGen routine is almost verbatim from the “Non-reversible Key Generation Process” local subroutine on page 50 of the spec.
At the end of this, the key variable will contain the key that can be used for the decryption, almost. The dudes that wrote the spec added some “variant” behavior to the key to keep us on our toes. If the key is to be used for decrypting a data stream such as a credit card track, you will need to XOR bytes 5 and 13 with 0xFF and Triple DES encrypt the key with itself (ECB mode). My code looks like:
DOUBLE_KEY keyCopy;
char *p;
p=(char*)&key;
p[ 5]^=0xff;
p[13]^=0xff;
keyCopy=key;
des3(&keyCopy, (DES_cblock *)&key.left, &key.left);
des3(&keyCopy, (DES_cblock *)&key.right, &key.right);
If you are using this to decrypt a PIN block, you will need to XOR bytes 7 and 15 with 0xFF. (I am not 100% sure this should not be applied for the stream mode as well but my vendor is leaving it out.)
If it is a PIN block, it will be encrypted with 3-DES in ECB mode. If it is a data stream, it will be encrypted in CBC mode with a zero initialization vector.
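To illustrate that last case for track data, here is a minimal sketch using OpenSSL's legacy DES API: two-key triple DES in CBC mode with a zero initialization vector, keyed with the derived, variant-adjusted double-length key. The DOUBLE_KEY layout follows the listing above; the rest is my own assumption rather than code from the spec:
#include <string.h>
#include <openssl/des.h>

/* Decrypt DUKPT-encrypted track data: 2-key triple DES (EDE) in CBC
 * mode with a zero IV, using the variant-adjusted derived key. */
void decrypt_track(DOUBLE_KEY *key,
                   const unsigned char *cipher, unsigned char *plain, long len)
{
    DES_key_schedule ks1, ks2;
    DES_cblock iv;

    memset(iv, 0, sizeof(iv));               /* zero initialization vector */
    DES_set_key_unchecked((const_DES_cblock *)&key->left, &ks1);
    DES_set_key_unchecked((const_DES_cblock *)&key->right, &ks2);

    /* EDE: encrypt with ks1, decrypt with ks2, encrypt with ks1 again */
    DES_ede2_cbc_encrypt(cipher, plain, len, &ks1, &ks2, &iv, DES_DECRYPT);
}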
(Did I mention I don’t much care for the spec?) It is interesting to note that the encryption side could be used in a non-hardware, tamper resistant security module if the server side (above) remembers and rejects keys that have been used previously. The technology is pretty neat. The ANSI spec leaves something to be desired but the technology is all right.
Good luck.
/Bob Bryan
For data encryption, the variant is 0000000000FF0000.0000000000FF0000, so you need to XOR bytes 5 and 13 instead of 7 and 15. In addition, you need an additional 3DES self-encryption step of each key part (left and right).
Here is the relevant code in jPOS
https://github.com/jpos/jPOS/blob/master/jpos/src/main/java/org/jpos/security/jceadapter/JCESecurityModule.java#L1843-1856

byte aligning in serial communication

So I am trying to define a protocol for serial communication. I want to be able to send 4-byte numbers to the device, but I'm unsure how to make sure that the device starts picking it up at the right byte.
For instance if I want to send
0x1234abcd 0xabcd3f56 ...
how do I make sure that the device doesn't start reading at the wrong spot and get the first word as:
0xabcdabcd
Is there a clever way of doing this? I thought of using a marker for the start of a message, but what if I want to send the number I choose as data?
Why not send a start-of-message byte followed by a length-of-data byte if you know how big the data is going to be?
Alternatively, do as other binary protocols and only send fixed sizes of packages with a fixed header. Say that you will only send 4 bytes, then you know that you'll have one or more bytes of header before the actual data content.
Edit: I think you're misunderstanding me. What I mean is that the client is supposed to always regard bytes as either header or data, not based on value but rather based on the position in the stream. Say you're sending four bytes of data, then one byte would be the header byte.
+-+-+-+-+-+
|H|D|D|D|D|
+-+-+-+-+-+
The client would then be a pretty basic state machine, along the lines of:
int state = READ_HEADER;
int nDataBytesRead = 0;
while (true) {
    byte read = readInput();
    if (state == READ_HEADER) {
        // process the byte as a header byte
        state = READ_DATA;
        nDataBytesRead = 0;
    } else {
        // Process the byte as incoming data
        ++nDataBytesRead;
        if (nDataBytesRead == 4) {
            state = READ_HEADER;
        }
    }
}
The thing about this setup is that what determines if the byte is a header byte is not the actual content of a byte, but rather the position in the stream. If you want to have a variable number of data bytes, add another byte to the header to indicate the number of data bytes following it. This way, it will not matter if you are sending the same value as the header in the data stream since your client will never interpret it as anything but data.
netstring
For this application, perhaps the relatively simple "netstring" format is adequate.
For example, the text "hello world!" encodes as:
12:hello world!,
The empty string encodes as the three characters:
0:,
which can be represented as the series of bytes
'0' ':' ','
The word 0x1234abcd in one netstring (using network byte order), followed by the word 0xabcd3f56 in another netstring, encodes as the series of bytes
'\n' '4' ':' 0x12 0x34 0xab 0xcd ',' '\n'
'\n' '4' ':' 0xab 0xcd 0x3f 0x56 ',' '\n'
(The newline character '\n' before and after each netstring is optional, but makes it easier to test and debug).
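A minimal transmit-side sketch of that framing, assuming a send_byte() routine (an illustrative name) that writes one byte out the serial port:
/* Send a 32-bit word as a netstring: "4:", then the 4 data bytes in
 * network byte order (most-significant byte first), then ",", with
 * optional newlines around it to make debugging easier. */
extern void send_byte(unsigned char b);   /* assumed: writes one byte */

void send_word_netstring(unsigned long word)
{
    send_byte('\n');
    send_byte('4');               /* payload length, in ASCII */
    send_byte(':');
    send_byte((word >> 24) & 0xFF);
    send_byte((word >> 16) & 0xFF);
    send_byte((word >> 8) & 0xFF);
    send_byte(word & 0xFF);
    send_byte(',');               /* netstring terminator */
    send_byte('\n');
}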
frame synchronization
how do I make sure that the device doesn't start reading at the wrong spot
The general solution to the frame synchronization problem is to read into a temporary buffer, hoping that we have started reading at the right spot.
Later, we run some consistency checks on the message in the buffer.
If the message fails the check, something has gone wrong,
so we throw away the data in the buffer and start over.
(If it was an important message, we hope that the transmitter will re-send it).
For example, if the serial cable is plugged in halfway through the first netstring,
the receiver sees the byte string:
0xab 0xcd ',' '\n' '\n' '4' ':' 0xab 0xcd 0x3f 0x56 ',' '\n'
Because the receiver is smart enough to wait for the ':' before expecting the next byte to be valid data, the receiver is able to ignore the first partial message and then receive the second message correctly.
In some cases, you know ahead of time what the valid message length(s) will be;
that makes it even easier for the receiver to detect it has started reading at the wrong spot.
sending start-of-message marker as data
I thought of using a marker for the start of a message, but what if I want to send the number I choose as data?
After sending the netstring header, the transmitter sends the raw data as-is -- even if it happens to look like the start-of-message marker.
In the normal case, the receiver already has frame sync.
The netstring parser has already read the "length" and the ":" header, so it puts the raw data bytes directly into the correct location in the buffer -- even if those data bytes happen to look like the ":" header byte or the "," footer byte.
pseudocode
// netstring parser for receiver
// WARNING: untested pseudocode
// 2012-06-23: David Cary releases this pseudocode as public domain.

const int max_message_length = 9;
char buffer[1 + max_message_length]; // do we need room for a trailing NULL ?
long int latest_commanded_speed = 0;
int data_bytes_read = 0;
int bytes_read = 0;
int state = WAITING_FOR_LENGTH;

reset_buffer()
    bytes_read = 0; // reset buffer index to start-of-buffer
    state = WAITING_FOR_LENGTH;

void check_for_incoming_byte()
    if( inWaiting() ) // Has a new byte come into the UART?
        // If so, then deal with this new byte.
        if( NEW_VALID_MESSAGE == state )
            // oh dear. We had an unhandled valid message,
            // and now another byte has come in.
            reset_buffer();
        char newbyte = read_serial(1); // pull out 1 new byte.
        buffer[ bytes_read++ ] = newbyte; // and store it in the buffer.
        if( max_message_length < bytes_read )
            reset_buffer(); // reset: avoid buffer overflow
        switch state:
            WAITING_FOR_LENGTH:
                // FIXME: currently only handles messages of 4 data bytes
                if( '4' != newbyte )
                    reset_buffer(); // doesn't look like a valid header.
                else
                    // otherwise, it looks good -- move to next state
                    state = WAITING_FOR_COLON;
            WAITING_FOR_COLON:
                if( ':' != newbyte )
                    reset_buffer(); // doesn't look like a valid header.
                else
                    // otherwise, it looks good -- move to next state
                    state = WAITING_FOR_DATA;
                    data_bytes_read = 0;
            WAITING_FOR_DATA:
                // FIXME: currently only handles messages of 4 data bytes
                data_bytes_read++;
                if( 4 <= data_bytes_read ) // all 4 data bytes received
                    state = WAITING_FOR_COMMA;
            WAITING_FOR_COMMA:
                if( ',' != newbyte )
                    reset_buffer(); // doesn't look like a valid message.
                else
                    // otherwise, it looks good -- move to next state
                    state = NEW_VALID_MESSAGE;

void handle_message()
    // FIXME: currently only handles messages of 4 data bytes
    long int temp = 0;
    temp = (temp << 8) | buffer[2];
    temp = (temp << 8) | buffer[3];
    temp = (temp << 8) | buffer[4];
    temp = (temp << 8) | buffer[5];
    reset_buffer();
    latest_commanded_speed = temp;
    print( "commanded speed has been set to: " & latest_commanded_speed );

void loop () // main loop, repeated forever
    // check to see if a byte has arrived yet
    check_for_incoming_byte();
    if( NEW_VALID_MESSAGE == state ) handle_message();
    // While we're waiting for bytes to come in, do other main loop stuff.
    do_other_main_loop_stuff();
more tips
When defining a serial communication protocol,
I find it makes testing and debugging much easier if the protocol always uses human-readable ASCII text characters, rather than any arbitrary binary values.
frame synchronization (again)
I thought of using a marker for the start of a message, but what if I want to send the number I choose as data?
We already covered the case where the receiver already has frame sync.
The case where the receiver does not yet have frame sync is pretty messy.
The simplest solution is for the transmitter to send a series of harmless bytes
(perhaps newlines or space characters),
the length of the maximum possible valid message,
as a preamble just before each netstring.
No matter what state the receiver is in when the serial cable is plugged in,
those harmless bytes eventually drive the receiver into the
"WAITING_FOR_LENGTH" state.
And then when the transmitter sends the packet header (length followed by ":"),
the receiver correctly recognizes it as a packet header and has recovered frame sync.
(It's not really necessary for the transmitter to send that preamble before every packet.
Perhaps the transmitter could send it for 1 out of 20 packets; then the receiver is guaranteed to recover frame sync in 20 packets (usually less) after the serial cable is plugged in).
other protocols
Other systems use a simple Fletcher-32 checksum or something more complicated to detect many kinds of errors that the netstring format can't detect ( a, b ),
and can synchronize even without a preamble.
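For reference, a compact Fletcher-32 over 16-bit words looks roughly like this (a sketch of the textbook algorithm, not code taken from any particular protocol):
#include <stdint.h>
#include <stddef.h>

/* Fletcher-32 checksum over a buffer of 16-bit words
 * (length is given in 16-bit words, not bytes). */
uint32_t fletcher32(const uint16_t *data, size_t words)
{
    uint32_t sum1 = 0xFFFF, sum2 = 0xFFFF;

    while (words) {
        /* 359 is the largest block that cannot overflow 32-bit sums */
        size_t block = (words > 359) ? 359 : words;
        words -= block;
        do {
            sum1 += *data++;
            sum2 += sum1;
        } while (--block);
        sum1 = (sum1 & 0xFFFF) + (sum1 >> 16);  /* partial reduction mod 65535 */
        sum2 = (sum2 & 0xFFFF) + (sum2 >> 16);
    }
    sum1 = (sum1 & 0xFFFF) + (sum1 >> 16);      /* final reduction */
    sum2 = (sum2 & 0xFFFF) + (sum2 >> 16);
    return (sum2 << 16) | sum1;
}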
Many protocols use a special "start of packet" marker, and use a variety of "escaping" techniques to avoid actually sending a literal "start of packet" byte in the transmitted data, even if the real data we want to send happens to have that value.
( Consistent Overhead Byte Stuffing, bit stuffing, quoted-printable and other kinds of binary-to-text encoding, etc.).
Those protocols have the advantage that the receiver can be sure that when we see the "start of packet" marker, it is the actual start of a packet (and not some data byte that coincidentally happens to have the same value).
This makes handling loss of synchronization much easier -- simply discard bytes until the next "start of packet" marker.
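As a concrete example of such an escaping scheme, here is a sketch of a Consistent Overhead Byte Stuffing (COBS) encoder: it removes every 0x00 from the payload so that 0x00 can then be used as an unambiguous packet delimiter (my own illustration of the standard algorithm, not code from any of the protocols above):
#include <stdint.h>
#include <stddef.h>

/* COBS-encode len bytes from src into dst, eliminating all zero bytes.
 * dst must have room for len + len/254 + 1 bytes. Returns the encoded
 * length; a 0x00 delimiter can then be appended to mark end-of-packet. */
size_t cobs_encode(const uint8_t *src, size_t len, uint8_t *dst)
{
    size_t out = 1;       /* index of the next output byte */
    size_t code_pos = 0;  /* where the current code byte will be written */
    uint8_t code = 1;     /* distance from the code byte to the next zero */

    for (size_t i = 0; i < len; i++) {
        if (src[i] == 0) {
            dst[code_pos] = code;   /* finish this block at the zero */
            code_pos = out++;
            code = 1;
        } else {
            dst[out++] = src[i];
            code++;
            if (code == 0xFF) {     /* maximum block length reached */
                dst[code_pos] = code;
                code_pos = out++;
                code = 1;
            }
        }
    }
    dst[code_pos] = code;           /* close the final block */
    return out;
}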
Many other formats, including the netstring format, allow any possible byte value to be transmitted as data.
So receivers have to be smarter about handling the start-of-header byte that might be an actual start-of-header, or might be a data byte -- but at least they don't have to deal with "escaping" or the surprisingly large buffer required, in the worst case, to hold a "fixed 64-byte data message" after escaping.
Choosing one approach really isn't any simpler than the other -- it just pushes the complexity to another place, as predicted by waterbed theory.
Would you mind skimming over the discussion of various ways of handling the start-of-header byte, including these two ways, at the Serial Programming Wikibook,
and editing that book to make it better?
