Data Error Checking - serial-port

I've got a bit of an odd question. A friend of mine and I thought it would be funny to make a serial port kind of communication between computers using sound. Basically, computers emit a series of beeps to send data, and listen for beeps over a microphone to receive data. In short, the world's most annoying serial port. I have all of the basics worked out. I can filter out sounds of only one frequency and I have sent data from one computer to another. Although the transmission is fairly error free, being affected only by very loud noises, some issues still exist. My question is, what are some good ways to check the data for errors and, more importantly, recover from these errors.
My serial communication is very standard once you get past the fact it uses sound waves. I use one start bit, 8 data bits, and one stop bit in every frame. I have already considered Cyclic Redundancy Checks, and I plan to factor this into my error checking, but CRCs don't account for some of the more insidious issues. For example, consider sending two bytes of data. You send the first one, and it received correctly, but just after the stop bit of the first byte, and the start bit of the next, a large book falls on the floor, which the receiver interprets to be a start bit, now the true start bit is read as part of the data and the receiver could be reading garbage data for many bytes to come. Eventually, a pause in the data could get things back on track.
That isn't the worst of it though. Bits can be dropped too, and most error checking schemes I can think of rely on receiving a certain number of bytes. What happens when the receiver keeps waiting for bytes that may not come?
So, you can see the complexity of this question. If you can direct me to any resources, or just give me a few tips, I would greatly appreciate your help.

A CRC is just a part of the solution. You can check for bad data but then you have to do something about it. The transmitter has to re-send the data, it needs to be told to do that. A protocol.
The starting point is that you split up the data into packets. A common approach is a start byte that indicates the start of the packet, followed by a packet number, followed by a length byte that indicates the length of the packet. Followed by the data bytes and the CRC. The receiver sends an ACK or NAK back to indicate success.
This solves several problems:
you don't care about a bad start bit anymore, the pause you need to recover is always there
you start a timer when you receive the first bit or byte, declare failure when the timer expires before the entire packet is received
the packet number helps you recover from bad ACK/NAK returns. The transmitter times out and resends the packet, you can detect the duplicate
RFC 916 describes such a protocol in detail. I never heard of anybody actually implementing it (other than me). Works pretty well.

Related

What happens in rs232 if both sender and receiver are following odd parity and bit gets swapped?

I am studying USART, with the help of rs232 and max232 for communication.
I want to know if, in a scenario, sender and receiver are following odd parity and except the parity and start, stop bit rest bits gets swapped. So in this case how the receiver will get to know that data received by the receiver is wrong.Here,
Odd/Even parity is not particularly useful for exactly the reason you have identified - it detects only a subset of errors. In days when the number of gates that could fit on chip was far fewer, it had the advantage at least of requiring minimal logic to implement.
However even if you detect an error, what do you do about it? Normally a higher level packet based protocol is used where the packets have a more robust error check such as a CRC. In that case on an error, the receiver can request a resend of the erroneous packet.
At the word rather then packet level, it is possible to use a more sophisticated error checking mechanism bu using more bits for error checking and fewer for the data. This reduces the effective data rate further, and on a simple UART requires a software implementation. It is even possible to implement error-detection and correction at the word level, but this is seldom used for UART/USART comms.

LWIP: How exactly does the TCP_INTERVAL relate to the reception of ACK Messages?

I am trying to implement a data transfer from an embedded board to a PC. For this, I need to use low latency communication and I am bound to use Ethernet with TCP/IP.
Furthermore, I'm using the lwip stack.
First of all, I disabled nagle algorithm, because I have to send small packets of data (10 KB) and I want them to be sent as soon as possible, without waiting for intermediate ACKS.
The Wireshark Log shows me that this is working quite fine (the whole data is being sent to the PC in about 1msec).
After that, the PC takes about 200msec to send the last ACK (because the last Segment is not maximum size).
The problem is now, that on the embedded processor, it takes a very long time, until the lwip gives my application the message, that all of the data has been ACKED.
When I decrease the TCP_INTERVAL (to let's say 5), it speeds up greatly.
I am wondering, why lwip behaves like this? I would think that the Periodic-TCP-Tasks (which are being called according to the TCP_INTERVAL) have nothing to do with the Handling of the received frames (which is really another call in the main).
I hope I could state my problem somehow understandable, if not I would appreciate feedback, so I can improve my question!
Thanks!
EDIT:
After more debugging, I found out that the process of sending data results in the following function calls:
My main calls tcp_write(...)
tcp_tmr() is called multiple times (through the LwIP_Periodic_Handle() function). This happens seven times. During the eigth call:
tcp_output() is called. During this call, all segments which were added during the last tcp_write() call are sent by calling tcp_output_segment().
So now it is clear that if I reduce the TCP_INTERVAL, of course the data gets sent sooner, because the tcp_tmr() function is called more quickly.
but my question is still: Is this the normal behaviour? It seems a bit odd, that lwIP is waiting such a long time before actually sending the data.
Since Youre doing this My main calls tcp_write(...)
use tcp_output() immediately after tcp_write
or else use tcp_write() in tcp_recv callback

How do you read without specifying the length of a byte slice beforehand, with the net.TCPConn in golang?

I was trying to read some messages from a tcp connection with a redis client (a terminal just running redis-cli). However, the Read command for the net package requires me to give in a slice as an argument. Whenever I give a slice with no length, the connection crashes and the go program halts. I am not sure what length my byte messages need going to be before hand. So unless I specify some slice that is ridiculously large, this connection will always close, though this seems wasteful. I was wondering, is it possible to keep a connection without having to know the length of the message before hand? I would love a solution to my specific problem, but I feel that this question is more general. Why do I need to know the length before hand? Can't the library just give me a slice of the correct size?
Or what other solution do people suggest?
Not knowing the message size is precisely the reason you must specify the Read size (this goes for any networking library, not just Go). TCP is a stream protocol. As far as the TCP protocol is concerned, the message continues until the connection is closed.
If you know you're going to read until EOF, use ioutil.ReadAll
Calling Read isn't guaranteed to get you everything you're expecting. It may return less, it may return more, depending on how much data you've received. Libraries that do IO typically read and write though a "buffer"; you would have your "read buffer", which is a pre-allocated slice of bytes (up to 32k is common), and you re-use that slice each time you want to read from the network. This is why IO functions return number of bytes, so you know how much of the buffer was filled by the last operation. If the buffer was filled, or you're still expecting more data, you just call Read again.
A bit late but...
One of the questions was how to determine the message size. The answer given by JimB was that TCP is a streaming protocol, so there is no real end.
I believe this answer is incorrect. TCP divides up a bitstream into sequential packets. Each packet has an IP header and a TCP header See Wikipedia and here. The IP header of each packet contains a field for the length of that packet. You would have to do some math to subtract out the TCP header length to arrive at the actual data length.
In addition, the maximum length of a message can be specified in the TCP header.
Thus you can provide a buffer of sufficient length for your read operation. However, you have to read the packet header information first. You probably should not accept a TCP connection if the max message size is longer than you are willing to accept.
Normally the sender would terminate the connection with a fin packet (see 1) not an EOF character.
EOF in the read operation will most likely indicate that a package was not fully transmitted within the allotted time.

How does a device receiving data tell when data transmission stops?

I'm trying to understand asynchronous serial data transmission. I know that the transmitting device sends a start bit (e.g. 1) to the receiver to indicate that transmission has begun; then a stop bit (e.g. 0) afterwards to indicate that the transmission has ended.
What I don't understand: how does the receiving device know which bit is the stop bit? The stop bit is surely no different from the other bits of data. The only way I can think of is if the transmitting device stops sending bits for a significant gap, the receiving device would know that no more bits are forthcoming, and the last bit must have been a stop bit. But if that is the case, then why would a stop bit be required at all, rather than the receiving device simply waiting for a bit, and considering the transmission to be ended when the transmitting device doesn't send any more bits?
That becomes a question of protocol. start and stop bits only have meaning if the communicating devices agree on that meaning (e.g. a frame consists of a start bit, 8 data bits, and a stop bit). Similarly, how to denote when a particular communication is complete needs to be agreed between the participants (e.g. define one or more frames that denote message termination).So for a particular communication either a full frame is received and the listener keeps listening, a partial frame is received with no subsequent data transmission and the connection can be considered faulted after some duration, or a full frame is received and that frame denotes the end of the exchange.

Is there a good way to frame a protocol so data corruption can be detected in every case?

Background: I've spent a while working with a variety of device interfaces and have seen a lot of protocols, many serial and UDP in which data integrity is handled at the application protocol level. I've been seeking to improve my receive routine handling of protocols in general, and considering the "ideal" design of a protocol.
My question is: is there any protocol framing scheme out there that can definitively identify corrupt data in all cases? For example, consider the standard framing scheme of many protocols:
Field: Length in bytes
<SOH>: 1
<other framing information>: arbitrary, but fixed for a given protocol
<length>: 1 or 2
<data payload etc.>: based on length field (above)
<checksum/CRC>: 1 or 2
<ETX>: 1
For the vast majority of cases, this works fine. When you receive some data, you search for the SOH (or whatever your start byte sequence is), move forward a fixed number of bytes to your length field, and then move that number of bytes (plus or minus some fixed offset) to the end of the packet to your CRC, and if that checks out you know you have a valid packet. If you don't have enough bytes in your input buffer to find an SOH or to have a CRC based on the length field, then you wait until you receive enough to check the CRC. Disregarding CRC collisions (not much we can do about that), this guarantees that your packet is well formed and uncorrupted.
However, if the length field itself is corrupt and has a high value (which I'm running into), then you can't check the (corrupt) packet's CRC until you fill up your input buffer with enough bytes to meet the corrupt length field's requirement.
So is there a deterministic way to get around this, either in the receive handler or in the protocol design itself? I can set a maximum packet length or a timeout to flush my receive buffer in the receive handler, which should solve the problem on a practical level, but I'm still wondering if there's a "pure" theoretical solution that works for the general case and doesn't require setting implementation-specific maximum lengths or timeouts.
Thanks!
The reason why all protocols I know of, including those handling "streaming" data, chop up the datastream in smaller transmission units each with their own checks on board is exactly to avoid the problems you describe. Probably the fundamental flaw in your protocol design is that the blocks are too big.
The accepted answer of this SO question contains a good explanation and a link to a very interesting (but rather heavy on math) paper about this subject.
So in short, you should stick to smaller transmission units not only because of practical programming related arguments but also because of the message length's role in determining the security offered by your crc.
One way would be to encode the length parameter so that it would be easily detected to be corrupted, and save you from reading in the large buffer to check the CRC.
For example, the XModem protocol embeds an 8 bit packet number followed by it's one's complement.
It could mean doubling your length block size, but it's an option.

Resources