FCS collision in (Ethernet) frames - tcp

What happens if the data transmitted in an Ethernet Frame has been altered, but the FCS is still valid for this data? (FCS collision)
Is that the reason why some download sites provide MD5 / SHA sums for their downloads?

What will happen is that the frame will be considered valid by the data link layer and its payload will be passed to the upper layers. The upper layers might have their own error detection mechanisms (e.g. the TCP/IP checksums), so they might detect the error themselves and discard the data, but then again, they might not. If they don't, the application will actually receive faulty data.
The chances of that happening are rather low, but I'm sure it happens due to the sheer number of packets sent across networks.
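And yes, that is a large part of why download sites publish MD5/SHA sums: an application-level hash is far stronger than the 32-bit FCS plus the 16-bit TCP checksum, so it catches the rare corruption that slips through both. A minimal verification sketch in Python (the file name and the published digest are hypothetical placeholders):

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file incrementally so large downloads don't have to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Both values below are placeholders for illustration only.
published = "<sha256 value copied from the download page>"
actual = sha256_of_file("downloaded-image.iso")
print("OK" if actual == published else "Mismatch: re-download the file")
```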

Related

What's the difference between WebSocket fragmentation and TCP fragmentation?

I'm reading about WebSocket and I see that the protocol has data fragmentation (frames): a WebSocket message is composed of one or more frames. But isn't that what TCP already does (fragmentation of data)? I'm confused.
Fragmentation in the context of data transfer just means splitting the original data into smaller parts for transfer and combining these fragments later (for example at the recipient's side) to recreate the original data.
Fragmentation is often done if the underlying layer cannot handle larger messages or if larger messages would result in performance problems. Such problems might arise because it is more expensive if one large message is lost and needs to be repeated instead of only a small fragment. Or it can be a performance problem if the transfer of one large message would block the delivery of smaller messages. In this case it is useful to split the large message into fragments and deliver these fragments together with the other messages, so that the other messages don't have to wait for delivery until the large message is done.
Fragmentation of messages in WebSockets is just one of the many kinds of fragmentation which exist at various layers of data transport, like:
IP messages can be fragmented at the sender or some middlebox and get reassembled at the end.
TCP is a data stream. The various parts of the stream are transferred in different IP packets and get reassembled in the correct order at the recipient.
Application layer protocols like HTTP can have fragments too, for example the chunked Transfer-Encoding mode within HTTP or the fragments in WebSockets.
And at even higher layers there can be more fragments, like the spreading of a single large ZIP file into multiple parts onto floppy disks in former times or the accelerating of downloads by requesting different parts of the same file in parallel connections and combining these at the recipient.
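Whatever the layer, the core operation is the same: split, transfer, reassemble. A minimal, layer-agnostic sketch in Python (the 1400-byte fragment size is just an illustrative choice, roughly an Ethernet-sized payload):

```python
def fragment(data: bytes, max_size: int) -> list[bytes]:
    """Split the original data into fragments no larger than max_size."""
    return [data[i:i + max_size] for i in range(0, len(data), max_size)]

def reassemble(fragments: list[bytes]) -> bytes:
    """Recreate the original data by concatenating the fragments in order."""
    return b"".join(fragments)

original = bytes(100_000)            # 100 kB of dummy data
frags = fragment(original, 1400)     # illustrative fragment size
assert reassemble(frags) == original
```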
I love the detailed answer by Steffen Ullrich, but I wish to add a few specific details regarding the differences between raw TCP/IP and the added Websockets layer.
TCP/IP is a stream protocol, meaning the application receives the data in fragmented pieces as data becomes available, with no clear indication of the fragmented "packet boundaries" or of the original (non-fragmented) data structure.
The WebSocket protocol is a message-based protocol, meaning that the application will only receive the full WebSocket message once all the fragmented pieces have arrived and been put back together.
As a very simplified example:
TCP/IP: if a 50 MB file is sent using TCP, the application will probably receive one piece of the file at a time and will need to piece the file back together (possibly saving each piece to temporary disk storage).
Websocket: if a 50 MB file is sent using the WebSocket protocol, the application will receive the whole 50 MB in one message (and the storage of all of the data, in memory or on disk, will be dictated by the WebSocket layer, not the application layer).
Note that the WebSocket protocol is an additional layer over TCP/IP, so data is streamed over TCP/IP and the WebSocket layer puts the pieces back together before forwarding the original (whole) message.
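To make the stream-vs-message distinction concrete, here is a hedged sketch of what the application has to do on a raw TCP socket, using Python's standard socket API. With a WebSocket client library (for example the third-party websockets package) the equivalent receive call would hand over one complete message instead:

```python
import socket

def recv_exactly(sock: socket.socket, length: int) -> bytes:
    """TCP delivers a byte stream: recv() may return any number of bytes up to
    the requested size, so the application must loop until it has them all."""
    buf = bytearray()
    while len(buf) < length:
        piece = sock.recv(min(65536, length - len(buf)))
        if not piece:
            raise ConnectionError("peer closed the connection before the full payload arrived")
        buf.extend(piece)
    return bytes(buf)
```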
5.4. Fragmentation
A secondary use-case for fragmentation is for multiplexing, where it is not desirable for a large message on one logical channel to monopolize the output channel, so the multiplexing needs to be free to split the message into smaller fragments to better share the output channel. (Note that the multiplexing extension is not described in this document.)
Even though it's listed as a secondary reason, I'd say that's the primary reason for that fragmentation feature. Imagine you start sending a first message of 1 GB, and right after you start sending it you also send a second message of 1 KB. Framing allows the application to inject the second message in between individual frames of the first message, so the receiver does not need to wait for the whole 1 GB to be transferred and can receive and handle the 1 KB message right away.
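Here is a hedged toy sketch of that interleaving idea in Python. It is not the WebSocket wire format, just a (FIN flag, payload) model showing how framing a large message leaves room to slip a small message in between:

```python
def frames(message: bytes, frame_size: int):
    """Yield (is_final, payload) fragments of one logical message."""
    for i in range(0, len(message), frame_size):
        yield (i + frame_size >= len(message), message[i:i + frame_size])

big_message = bytes(1_000_000)          # stand-in for the "1 GB" message
small_message = b"B" * 1_024            # the "1 KB" message

wire = []                               # the order frames actually go out
for is_final, payload in frames(big_message, 64 * 1024):
    wire.append(("big", is_final, payload))
    if small_message is not None:       # inject the small message between two big frames
        wire.append(("small", True, small_message))
        small_message = None
```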

Schemes for streaming data with BLE GATT characteristics

The GATT architecture of BLE lends itself to small fixed pieces of data (20 bytes max per characteristic). But in some cases you end up wanting to “stream” some arbitrary length of data that is greater than 20 bytes. For example, a firmware upgrade, even if you know it's slow.
I’m curious what scheme others have used if any, to “stream” data (even if small and slow) over BLE characteristics.
I’ve used two different schemes to date:
One was to use a control characteristic, where the receiving device notified the sending device how much data it had received, and the sending device then used that to trigger the next write (I did this both with_response and without_response) on a different characteristic.
Another scheme I did recently was to chunk the data into 19-byte segments, where the first byte indicates the number of packets to follow; when it hits 0, that clues the receiver in that all of the recent updates can be concatenated and processed as a single packet.
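For reference, a minimal sketch of that second scheme (the 19-byte payload and the one-byte countdown are taken from the description above; everything else, including the 256-packet limit the single countdown byte implies, is an assumption of this sketch):

```python
def chunk_for_ble(data: bytes, payload_size: int = 19) -> list[bytes]:
    """Split data into payload_size-byte segments; the first byte of each packet
    counts down the packets still to come, and 0 marks the final packet.
    A one-byte countdown limits a transfer to 256 packets in this sketch."""
    segments = [data[i:i + payload_size] for i in range(0, len(data), payload_size)]
    assert len(segments) <= 256
    return [bytes([len(segments) - 1 - idx]) + seg for idx, seg in enumerate(segments)]

def reassemble_ble(packets: list[bytes]) -> bytes:
    """Concatenate the payloads once the countdown byte reaches zero."""
    assert packets[-1][0] == 0
    return b"".join(p[1:] for p in packets)
```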
The kind of answer I'm looking for, is an overview of how someone with experience has implemented a decent schema for doing this. And can justify why what they did is the best (or at least better) solution.
After some review of existing protocols, I ended up designing a protocol for over-the-air update of my BLE peripherals.
Design assumptions
1. we cannot predict stack behavior (protocol will be used with all our products, whatever the chip used and the vendor stack, either on peripheral side or on central side, potentially unknown yet),
2. use standard GATT service,
3. avoid L2CAP fragmentation,
4. assume packets get queued before TX,
5. assume there may be some dropped packets (even if stacks should not),
6. avoid unnecessary packet round-trips,
7. put code complexity on central side,
8. assume 4.2 enhancements are unavailable.
1 implies 2-5, 6 is a performance requirement, 7 is optimization, 8 is portability.
Overall design
After discovery of service and reading a few read-only characteristics to check compatibility of device with image to be uploaded, all upload takes place between two characteristics:
payload (write only, without response),
status (notifiable).
The whole firmware image is sent in chunks through the payload characteristic.
Payload is a 20-byte characteristic: 4-byte chunk offset, plus 16-byte data chunk.
Status notifications tell whether there is an error condition or not, and next expected payload chunk offset. This way, uploader can tell whether it may go on speculatively, sending its chunks from its own offset, or if it should resume from offset found in status notification.
Status updates are sent for two main reasons:
when all goes well (payloads flying in, in order), at a given rate (like 4Hz, not on every packet),
on error (out of order, after some time without payload received, etc.), with the same given rate (not on every erroneous packet either).
Receiver expects all chunks in order; it does no reordering. If a chunk is out of order, it gets dropped, and an error status notification is pushed.
When a status comes in, it implicitly acknowledges all chunks with smaller offsets.
Lastly, there is a transmit window on the sender side: a run of successful acknowledgements lets the sender enlarge its window (send more chunks ahead of the matching acknowledgement). The window is reduced if errors happen, since dropped chunks are probably caused by a queue overflow somewhere.
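A hedged sketch of the payload characteristic layout described above (4-byte chunk offset plus 16-byte data chunk in a 20-byte write); the little-endian byte order is my assumption, since the answer doesn't specify one:

```python
import struct

CHUNK_SIZE = 16   # 16-byte data chunk, preceded by a 4-byte offset (20 bytes total)

def build_payload(offset: int, chunk: bytes) -> bytes:
    """Pack one 20-byte value for the write-without-response payload characteristic."""
    assert len(chunk) == CHUNK_SIZE and offset % CHUNK_SIZE == 0
    return struct.pack("<I", offset) + chunk          # "<I" = little-endian (assumed)

def parse_payload(value: bytes) -> tuple[int, bytes]:
    """Peripheral side: recover the chunk offset and the 16 data bytes."""
    (offset,) = struct.unpack_from("<I", value)
    return offset, value[4:]
```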
Discussion
Using "one way" PDUs (write without response and notification) is to avoid 6. above, as ATT protocol explicitly tells acknowledged PDUs (write, indications) must not be pipelined (i.e. you may not send next PDU until you received response).
Status, containing the last received chunk, palliates 5.
To abide 2. and 3., payload is a 20-byte characteristic write. 4+16 has numerous advantages, one being the offset validation with a 16-byte chunk only involves shifts, another is that chunks are always page-aligned in target flash (better for 7.).
To cope with 4., more than one chunk is sent before receiving status update, speculating it will be correctly received.
This protocol has the following features:
it adapts to radio conditions,
it adapts to queues on sender side,
there is no status flooding from target,
queues are kept filled, which allows the whole central stack to use every possible TX opportunity.
Some parameters are out of this protocol:
central should enforce short connection interval (try to enforce it in the updater app);
slave PHY should be well-behaved with slave latency (YMMV, test your vendor's stack);
you should probably compress your payload to reduce transfer time.
Numbers
With:
15% compression,
a device connected with connectionInterval = 10ms,
a master PHY limiting every connection event to 4-5 TX packets,
average radio conditions.
I get 3.8 packets per connection event on average, i.e. ~6 kB/s of useful payload after packet loss, protocol overhead, etc.
This way, upload of a 60 kB image is done in less than 10 seconds, and the whole process (connection, discovery, transfer, image verification, decompression, flashing, reboot) takes under 20 seconds.
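Those numbers are consistent with each other: with 16 useful bytes per chunk and roughly 3.8 packets per 10 ms connection event, the useful throughput comes out around 6 kB/s, as a quick check shows:

```python
# Rough sanity check of the figures quoted above.
useful_bytes_per_packet = 16            # data chunk per 20-byte write
packets_per_event = 3.8
connection_interval_s = 0.010           # 10 ms

throughput = useful_bytes_per_packet * packets_per_event / connection_interval_s
print(throughput)                       # 6080.0 bytes/s, i.e. roughly 6 kB/s
```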
It depends a bit on what kind of central device you have.
Generally, Write Without Response is the way to stream data over BLE.
Packets being received out of order should not happen, since BLE's link layer never sends the next packet before the previous one has been acknowledged.
For Android it's very easy: just use Write Without Response to send all packets, one after another. Once you get the onCharacteristicWrite you send the next packet. That way Android automatically queues up the packets and it also has its own mechanism for flow control. When all its buffers are filled up, the onCharacteristicWrite will be called when there is space again.
iOS is not that smart, however. If you send a lot of Write Without Response packets and the internal buffers are full, iOS will silently drop new packets. There are two ways around this: either implement some (maybe complex) protocol where the peripheral notifies the status of the transmission, as in Nipos' answer, or, more easily, send every 10th packet or so as a Write With Response and the rest as Write Without Response. That way iOS will queue up all packets for you and not drop the Write Without Response packets. The only downside is that the Write With Response packets require one round-trip each. This scheme should nevertheless give you high throughput.
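A platform-agnostic sketch of that pacing trick; the two write callbacks are hypothetical stand-ins for whatever BLE API is actually in use:

```python
from typing import Callable, Iterable

def send_stream(packets: Iterable[bytes],
                write_with_response: Callable[[bytes], None],
                write_without_response: Callable[[bytes], None],
                ack_every: int = 10) -> None:
    """Send most packets as Write Without Response, but turn every ack_every-th
    packet into a Write With Response so the OS queues rather than silently
    drops the unacknowledged writes (the iOS behaviour described above)."""
    for i, packet in enumerate(packets, start=1):
        if i % ack_every == 0:
            write_with_response(packet)       # costs one round-trip
        else:
            write_without_response(packet)
```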

What are the chances of losing a UDP packet?

Okay, so I am programming for my networking course and I have to implement a project in Java using UDP. We are implementing an HTTP server and client along with a 'gremlin' function that corrupts packets with a specified probability. The HTTP server has to break a large file up into multiple segments at the application layer to be sent to the client over UDP. The client must reassemble the received segments at the application layer. What I am wondering however is, if UDP is by definition unreliable, why am I having to simulate unreliability here?
My first thought is that perhaps it's simply because my instructor is figuring in our case, both the client and the server will be run on the same machine and that the file will be transferred from one process to another 100% reliably even over UDP since it is between two processes on the same computer.
This led me first to question whether or not UDP could ever actually lose a packet, corrupt a packet, or deliver a packet out of order if the server and client were guaranteed to be two processes on the same physical machine, guaranteed to be routed strictly over localhost only such that it won't ever go out over the network.
I would also like to know, in general, for a given packet what is the rough probability that UDP will drop / corrupt / or deliver a packet out of order while being used to facilitate communication over the open internet between two hosts that are fairly geographically distant from one another (say something comparable to the route between the average broadband user in the US to one of Google's CDNs)? I'm mostly just trying to get a general idea of the conditions experienced when communicated over UDP, does it drop / corrupt / misorder something on the order of 25% of packets, or is it more like something on the order of 0.001% of packets?
Much appreciation to anyone who can shed some light on any of these questions for me.
Packet loss happens for multiple reasons. Primarily it is caused by errors on individual links and network congestion.
Packet loss due to errors on the link is very low, when links are working properly. Less than 0.01% is not unusual.
Packet loss due to congestion obviously depends on how busy the link is. If there is spare capacity along the entire path, this number will be 0%. But as the network gets busy, this number will increase. When flow control is done properly, this number will not get very high. A couple of lost packets is usually enough that somebody will reduce their transmission speed enough to stop packets getting lost due to congestion.
If packet loss ever reaches 1% something is wrong. That something could be a bug in how your congestion control algorithm responds to packet loss. If it keeps sending packets at the same rate, when the network is congested and losing packets, the packet loss can be pushed much higher, 99% packet loss is possible if software is misbehaving. But this depends on the types of links involved. Gigabit Ethernet uses backpressure to control the flow, so if the path from source to destination is a single Gigabit Ethernet segment, the sending application may simply be slowed down and never see actual packet loss.
For testing behaviour of software in case of packet loss, I would suggest two different simulations.
On each packet, drop it with a probability of 10% and transmit it with a probability of 90% (see the sketch after this list).
Transmit up to 100 packets per second or up to 100 KB per second, and drop the rest if the application would send more.
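A minimal Python sketch of the first simulation (the 10% drop rate comes from the list above; the optional byte-flip corruption mirrors the "gremlin" idea from the question, and its probability is an arbitrary choice):

```python
import random

def gremlin(packet: bytes, drop_prob: float = 0.10, corrupt_prob: float = 0.05):
    """Simulate an unreliable link: return None to drop the packet,
    otherwise possibly flip one byte before delivering it."""
    if random.random() < drop_prob:
        return None                                        # packet lost
    if packet and random.random() < corrupt_prob:
        i = random.randrange(len(packet))
        packet = packet[:i] + bytes([packet[i] ^ 0xFF]) + packet[i + 1:]
    return packet                                          # delivered (maybe corrupted)
```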
if UDP is by definition unreliable, why am I having to simulate unreliability here?
It is very useful to have a controlled mechanism to simulate worst case scenarios and how both your client and server can respond to them. The instructor will likely want you to demonstrate how robust the system can be.
You are also talking about payload validity here and not just packet loss.
This led me to question whether or not UDP could lose a packet, corrupt a packet, or deliver it out of order if the server and client were two processes on the same machine and it wasn't having to go out over the actual network.
It is obviously less likely over the loopback adapter, but this is not impossible.
I found a few forum posts on the topic here and here.
I am also wondering what the chances of actually losing a packet, having it corrupted, or having them delivered out of order in reality would usually be over the internet between two geographically distant hosts.
This question would probably need to be narrowed down a bit. There are several factors both application level (packet size and frequency) as well as limitations/traffic of routers and switches along the path.
I couldn't find any hard numbers on this but it seems to be fairly low... like sub 5%.
You may be interested in The Internet Traffic Report and possibly pages such as this.
I was spamming UDP packets over Wi-Fi to some Nanoleaf panels and my packet loss was roughly 1 in 7,000.
I think it depends on a ton of factors.

Data Link Layer and Transport Layer

What is the need for error control at the data link layer when the transport layer provides error control? What is the difference between the two error controls?
Transport layer data could be broken down to many data-link layer frames/packets.
So it is possible that even without any data-link errors the transport layer stream/packet may be corrupt. Edit: This is because a transport layer path is usually composed of many data-link layer hops, for example:
Host1 <----> switch1 <----> switch2 <----> Host2
If a packet was lost between switch1 and switch2, then there would be no errors recorded on the switch2 <----> Host2 link, but the corresponding transport layer stream would be corrupted.
On the other hand - once a data-link error is encountered it's possible to drop/restart the transport-layer transmission, without wasting resources.
This is because the data link layer deals exclusively with bit-level error detection and correction: it takes a frame the receiving computer already has in its possession and determines whether an error occurred in transmission and whether the data is intact or corrupt. However, there need to be additional controls in place to make sure the system knows that all the packets are arriving. This is called end-to-end error control and is the responsibility of the transport layer. The transport layer couldn't care less whether the data in the payload is good or bad; that's the data link layer's job. Transport only cares whether it is getting every packet that it is supposed to, and whether they are arriving in the right order. It is the transport layer that detects the absence or corruption of packets, including errors that occurred on the sending end before the data ever reached the data link layer.
For additional details, refer to
http://books.google.ca/books?id=9c1FpB8qZ8UC&pg=PA216&lpg=PA216&dq=why+error+detection+and+correction+both+in+transport+and+link+layer+?&source=bl&ots=RI7-DU8RO0&sig=0U5Z9AmKkx3m3TA71WfIe1uTeW0&hl=en&sa=X&ei=LbqPUsahOtDEqQHyvIHQCw&ved=0CDUQ6AEwAQ#v=onepage&q=why%20error%20detection%20and%20correction%20both%20in%20transport%20and%20link%20layer%20%3F&f=false
It really depends on the protocols rather than the layer, but assuming you mean TCP...
TCP's error detection is minimal and designed more as an integrity check than any kind of reliable error detection. The reason you don't see this in practice is that data-link layers such as Ethernet, PPP, Frame Relay, etc. have much, much more robust error detection algorithms, so there are virtually no transmission errors left for the TCP protocol to detect.
If you had a different transport layer protocol with robust error detection then you wouldn't strictly need it at lower levels. There is benefit, largely performance and resource use related, to discarding errors as low in the stack as possible.
Note that errors can creep in above the transport layer, due to RAM glitches etc., so if data is really, really important then you should include error checking right in your application.
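To illustrate the difference in strength the answers above allude to, here is a hedged sketch that computes the 16-bit Internet checksum (the RFC 1071-style sum used by TCP/UDP/IP) and a 32-bit CRC (the same CRC-32 used for Ethernet's FCS) over the same buffer; an application-level check could use either, or a cryptographic hash:

```python
import zlib

def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement sum in the style of RFC 1071 (as used by TCP/UDP/IP)."""
    if len(data) % 2:
        data += b"\x00"                               # pad to an even length
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)      # fold the carry back in
    return ~total & 0xFFFF

payload = b"example payload"
print(hex(internet_checksum(payload)))                # 16-bit transport-style checksum
print(hex(zlib.crc32(payload)))                       # 32-bit CRC, as in Ethernet's FCS
```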
Assuming the checksum was correct, this result meant that the data was damaged in transit. Furthermore, the damage took place not on the transmission links (where it would be caught by the CRC) but rather must have occurred in one of the intermediate systems (routers and bridges) or the sending or receiving hosts.
http://conferences.sigcomm.org/sigcomm/2000/conf/paper/sigcomm2000-9-1.pdf
Firstly, retransmission of packets from end-to-end is an expensive task and potentially takes a lot of time. Having these checks at each link reduces the "length" that the retransmitted packet has to travel. Consider a case when there is an error-prone link on the end-to-end path. This link will probably cause many packet drops, being that it is unreliable. If there wasn't any link layer reliability, the retransmission could only be handled by the transport layer protocol. Therefore, the malformed packet would have to reach its destination, send a NACK (or equivalent), and only then could the retransmission take place. On the other hand, if link layer has reliability built in the packet would be retransmitted immediately only on the unreliable link.
Secondly, link layer reliability relies on bit checks only, while transport layer reliability also utilizes sequencing and acknowledgments. Consider a case where the segment passed to the network layer needs to be fragmented because the MTU is too small. Link layer reliability will only check for the integrity of each individual fragment. If one fragment is lost, link layer may not raise an alarm. On the other hand, transport layer will because it expects all the fragmented packets.
Finally, link layer is not only carrying TCP and other transport layer protocols within its payload. Therefore, it is befitting to have reliability built in for protocols which do not have reliability built-in so that malformed payloads don't go up the stack.
In a noisy channel where the error rate is high, like wireless networks, the error correction is done at the datalink layer.
In robust networks where the error rate is low, like LANs, the error correction is done at the transport layer, so the retransmission cost is minimized.

How come ftp protocol produces transmission errors sometimes if the data is using TCP, which is checksummed?

Every once in a while, downloading (especially large) files through ftp will produce errors. I am guessing that's also partly the reason why all major sites are publishing external checksums along with their downloads.
How is this possible if ftp goes through TCP, which has checksum inbuilt and resends data if it is transmitted corruptly?
One could argue that this is due to the short length of the checksum in the TCP protocol (which is 16-bit, I think, or something like that), and that collisions are simply happening too often. But:
1) for this to be true, not only must there be a checksum collision, but the random network error must also modify both the checksum in the packet and the packet itself, so that the checksum is valid for the new packet... Even with a 16-bit checksum, is that so likely?
2) there are seemingly not many errors in, say, browsing the web, which also goes over TCP/IP.
FTP distinguishes between ASCII and BINARY data, and can modify the data stream accordingly, which is the most common reason I've encountered for corrupted FTP downloads. (The TCP checksums would be computed on the modified data, so nothing would appear amiss at the TCP level.)
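A tiny sketch of what an ASCII-mode transfer can do to binary data (the LF-to-CRLF rewrite shown here is one common translation; exact behaviour varies by client, server and platform). The PNG file signature is a handy example because it was designed to expose exactly this kind of line-ending mangling:

```python
binary_blob = b"\x89PNG\r\n\x1a\n"          # the 8-byte PNG file signature

# An ASCII-mode transfer rewrites line endings; here, bare LF -> CRLF.
corrupted = binary_blob.replace(b"\n", b"\r\n")

print(binary_blob != corrupted)              # True: the received file no longer matches
print(len(binary_blob), len(corrupted))      # even the size changes (8 vs 10 bytes)
```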
Next most common, I suppose, would be a transfer that gets truncated due to a timeout or other network error. In that case the TCP checksums would be locally correct, but the partially downloaded file is corrupt.
The FTP protocol is a bit firewall-unfriendly, since it can involve external hosts connecting back on unpredictable port numbers, but that usually manifests as an inability to transfer anything at all, rather than a corrupted download.
Apart from ASCII vs. BINARY issues, I can't think of a reason why FTP connections should be more susceptible to corrupted transfers. Maybe you just notice them more, because they tend to be things like binaries or compressed files that need to be bit-for-bit complete and correct, and if not you get a big ugly error message. One is much less likely to notice, say, a missing advertisement on a web page because the connection to the ad network timed out.
A 16-bit checksum isn't startlingly strong, especially when you consider the size of some FTP transfers, e.g. software downloads. However, there are CRCs and so forth at the lower layers which compensate.
I don't think I've had a corrupt FTP download this century myself.

Resources