Reliable UDP in C# - networking

There are several articles on the internet about how to make UDP reliable, but I have not been able to find one for C#. So maybe I can implement my own algorithm.
From researching on the internet, I believe UDP has two problems:
it does not ensure that all data reaches its destination;
data may reach its destination in a different order;
and maybe there is a third problem that I am missing in order to make it reliable.
If you are interested in knowing why I want to make UDP reliable and why I don't use TCP instead, take a look at this question. Believe me, I have been trying to do TCP hole punching for a long time.
Anyway, maybe there is already a library that I can use with C# that will enable me to do this. Because I have not been able to find one, I have been thinking about the following algorithm:
"Imagine there is computer A and computer B, and computer A is the one sending the file to computer B."
Here are the steps that I have been thinking of:
1) Computer A opens the file for reading; let's say it is 5000 bytes. That means computer A will have to send 5000 bytes to computer B, making sure no bytes are lost and that they arrive in the right order.
2) Computer A gets the first 500 bytes of the file and computes the hash of those bytes. So now computer A has two things: those 500 bytes and their hash. (The hash will use an efficient algorithm such as MD5, which is order-sensitive, so it also verifies that the data was received in the right order; that is, md5(1,2,3) != md5(2,1,3).)
3) Imagine the hash of those first 500 bytes comes out to be kj82lkdi930fi1.
4) Computer B should be listening for a hash and bytes.
5) Computer A sends the hash to computer B, along with the 500 bytes. As soon as it sends that, it starts waiting for a reply.
6) Computer B should now receive the hash and the bytes. Computer B performs the same MD5 algorithm on the received bytes. If that result is equal to the hash that was received, it replies back to A with {1,1,1,1,1,1}; otherwise it replies with {2,2,2,2,2,2,2}.
6.5) Let's assume computer B got the data in the right order, so it replies {1,1,1,1,1}. It also saves the hash in memory or in an array.
7) Computer A should be waiting for a response before sending the next 500 bytes. Let's say it receives {1,1,1}. Because it received a 1, it knows it can proceed and send the next 500 bytes with a new hash of those 500 bytes.
8) Computer A sends the next 500 bytes with their hash.
9) Let's pretend computer B did not receive the data, so it does not reply back to A. Computer B will still wait for bytes and a hash.
10) Since computer A has not received a {1,1,1,1,1} or a {2,2,2,2,2} for a reasonable amount of time, A will send the same bytes and hash again a second time.
11) Let's assume computer B receives the hash and the bytes, but the bytes arrive in a different order. When computer B calculates the hash of those bytes, it will not match the hash that was received. As a result, it will reply back with {2,2,2,2,2,2}.
12) If computer A receives the {2,2,2,2,2,2}, it will send the same bytes and hash again. If it did not receive the {2,2,2,2,2} for some reason, it will send the same bytes and hash after some period of time. Let's pretend computer A receives the {2,2,2,2,2}.
13) Computer A sends the same bytes and hash for the third time.
14) Computer B receives the hash and bytes in the right order. As a result, it replies {1,1,1,1,1,1} and saves that hash in memory (recall step 6.5).
15) Let's pretend computer A did not receive the {1,1,1,1} response from B. It will then send the same bytes for the fourth time.
16) Computer B checks the hash, and if it is equal to the last one that was accepted, it replies {1,1,1,1} again without writing those bytes to the file.
17) The algorithm continues like this until the file gets transferred.
...
I mean, there are obviously some other things I need to add to this algorithm, such as letting computer B know when the transfer is done, maybe checking for more errors, and handling what happens if computer A gets disconnected for a long time. But the main protocol will be something like the one I described.
So do you think I should start implementing this algorithm? Should I increase the chunk size and send more bytes each time, say 1000 instead of 500? There are lots of articles on the internet that tell you about several techniques, but very few of them give you a working example in the language you want. In this case I need it in C#.
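In case it helps anyone judge the idea, here is a minimal, untested sender-side sketch of the scheme above in C#. The 500-byte chunk size comes from the steps; the 3-second timeout, the single-byte reply (1 = ok, 2 = resend), and packing the 16-byte MD5 hash in front of the data in the same datagram are assumptions made purely for illustration.

using System;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Security.Cryptography;

class StopAndWaitSender
{
    const int ChunkSize = 500;

    static void SendFile(string path, IPEndPoint receiver)
    {
        byte[] file = File.ReadAllBytes(path);
        using (var udp = new UdpClient())
        using (var md5 = MD5.Create())
        {
            udp.Client.ReceiveTimeout = 3000;              // assumed 3 s wait for an ack
            for (int offset = 0; offset < file.Length; offset += ChunkSize)
            {
                int len = Math.Min(ChunkSize, file.Length - offset);
                byte[] chunk = new byte[len];
                Array.Copy(file, offset, chunk, 0, len);

                byte[] hash = md5.ComputeHash(chunk);      // 16 bytes, order-sensitive
                byte[] packet = new byte[hash.Length + len];
                hash.CopyTo(packet, 0);                    // hash first, then the data
                chunk.CopyTo(packet, hash.Length);

                bool acked = false;
                while (!acked)                             // retransmit until acknowledged
                {
                    udp.Send(packet, packet.Length, receiver);
                    try
                    {
                        IPEndPoint from = null;
                        byte[] reply = udp.Receive(ref from);
                        acked = reply.Length > 0 && reply[0] == 1;
                    }
                    catch (SocketException)                // timed out: resend the same packet
                    {
                    }
                }
            }
        }
    }
}

The receiver side would compute the same MD5 over the received bytes, compare it both to the hash in the packet and to the last accepted hash (so a retransmitted chunk is acknowledged again but not written twice, as in the duplicate check in the steps above), and reply with 1 or 2 accordingly.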

The third problem is that data can be corrupted when you receive it.
You can start by reading the TCP RFC just to understand how TCP makes communication reliable.
Having that knowledge, you can implement some of its techniques using UDP as transport.
Also take a look at this UDP network library: http://code.google.com/p/lidgren-network-gen3/
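For completeness, here is roughly what using Lidgren looks like; NetDeliveryMethod.ReliableOrdered asks the library to handle retransmission and ordering for you. The API names below are from my recollection of the gen3 branch, so treat them as approximate and check the project page:

using Lidgren.Network;

class LidgrenSketch
{
    static void Main()
    {
        var config = new NetPeerConfiguration("FileTransferApp"); // app identifier (assumed name)
        var client = new NetClient(config);
        client.Start();
        client.Connect("192.0.2.10", 14242);                      // example address and port

        // In a real program, wait until the connection status is Connected before sending.
        NetOutgoingMessage msg = client.CreateMessage();
        msg.Write("hello");                                       // Write() also has byte[] overloads
        client.SendMessage(msg, NetDeliveryMethod.ReliableOrdered);
        client.FlushSendQueue();
    }
}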

Related

Schemes for streaming data with BLE GATT characteristics

The GATT architecture of BLE lends itself to small, fixed pieces of data (20 bytes max per characteristic). But in some cases you end up wanting to “stream” some arbitrary length of data that is greater than 20 bytes, for example a firmware upgrade, even if you know it's slow.
I'm curious what schemes others have used, if any, to “stream” data (even if small and slow) over BLE characteristics.
I've used two different schemes to date:
One was to use a control characteristic, where the receiving device notified the sending device how much data it had received, and the sending device then used that to trigger the next write (I did both with_response and without_response) on a different characteristic.
Another scheme I used recently was to chunk the data into 19-byte segments, where the first byte indicates the number of packets to follow; when it hits 0, that tells the receiver that all of the recent updates can be concatenated and processed as a single packet.
The kind of answer I'm looking for is an overview of how someone with experience has implemented a decent scheme for doing this, and can justify why what they did is the best (or at least a better) solution.
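For what it's worth, the second scheme is short to express in code. Here is a small sketch, written in C# only to stay consistent with the rest of this page; the 19-byte segment size and the count-down header byte are exactly as described above, and the byte-sized counter means this only works for payloads of up to 256 segments.

using System;
using System.Collections.Generic;

static class GattChunker
{
    const int SegmentSize = 19;   // 20-byte characteristic minus 1 header byte

    // Splits a payload into segments, prefixing each with the number of
    // segments still to follow; 0 marks the final segment.
    public static List<byte[]> Chunk(byte[] payload)
    {
        int count = (payload.Length + SegmentSize - 1) / SegmentSize;
        var packets = new List<byte[]>(count);
        for (int i = 0; i < count; i++)
        {
            int offset = i * SegmentSize;
            int len = Math.Min(SegmentSize, payload.Length - offset);
            var packet = new byte[len + 1];
            packet[0] = (byte)(count - 1 - i);   // segments remaining after this one
            Array.Copy(payload, offset, packet, 1, len);
            packets.Add(packet);
        }
        return packets;
    }
}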
After some review of existing protocols, I ended up designing a protocol for over-the-air update of my BLE peripherals.
Design assumptions
we cannot predict stack behavior (protocol will be used with all our products, whatever the chip used and the vendor stack, either on peripheral side or on central side, potentially unknown yet),
use standard GATT service,
avoid L2CAP fragmentation,
assume packets get queued before TX,
assume there may be some dropped packets (even if stacks should not),
avoid unnecessary packet round-trips,
put code complexity on central side,
assume 4.2 enhancements are unavailable.
1 implies 2-5, 6 is a performance requirement, 7 is optimization, 8 is portability.
Overall design
After discovery of service and reading a few read-only characteristics to check compatibility of device with image to be uploaded, all upload takes place between two characteristics:
payload (write only, without response),
status (notifiable).
The whole firmware image is sent in chunks through the payload characteristic.
Payload is a 20-byte characteristic: 4-byte chunk offset, plus 16-byte data chunk.
Status notifications tell whether there is an error condition or not, and the next expected payload chunk offset. This way, the uploader can tell whether it may go on speculatively, sending chunks from its own offset, or whether it should resume from the offset found in the status notification.
Status updates are sent for two main reasons:
when all goes well (payloads flying in, in order), at a given rate (like 4Hz, not on every packet),
on error (out of order, after some time without payload received, etc.), with the same given rate (not on every erroneous packet either).
The receiver expects all chunks in order; it does no reordering. If a chunk is out of order, it gets dropped and an error status notification is pushed.
When a status comes in, it implicitly acknowledges all chunks with smaller offsets.
Lastly, there is a transmit window on the sender side: many successful acknowledgements in a row allow the sender to enlarge its window (send more chunks ahead of the matching acknowledgement), and the window is reduced when errors happen, since dropped chunks are probably due to a queue overflow somewhere.
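To make the payload format and the window behaviour concrete, here is a sender-side sketch (in C#, to match the rest of this page). The write call is a stand-in for whatever BLE API does the write-without-response, and the window bounds and little-endian offset encoding are assumptions of mine, not part of the protocol described above.

using System;

class OtaSender
{
    const int ChunkSize = 16;
    byte[] image;                   // firmware image to upload
    int window = 4;                 // speculative chunks in flight (assumed bounds: 1..32)
    uint nextOffset;                // next offset this sender will transmit
    uint expectedOffset;            // next offset the receiver expects (from status notifications)

    // 20-byte payload: 4-byte chunk offset followed by a 16-byte data chunk.
    byte[] BuildPayload(uint offset)
    {
        var payload = new byte[4 + ChunkSize];
        BitConverter.GetBytes(offset).CopyTo(payload, 0);             // assumed little-endian
        int len = Math.Min(ChunkSize, image.Length - (int)offset);    // last chunk may be short
        Array.Copy(image, (int)offset, payload, 4, len);
        return payload;
    }

    // Called on each status notification: error flag plus next expected offset.
    void OnStatus(bool error, uint offsetFromStatus)
    {
        expectedOffset = offsetFromStatus;            // acknowledges everything below it
        if (error)
        {
            window = Math.Max(1, window / 2);         // shrink the window, something overflowed
            nextOffset = offsetFromStatus;            // resume from where the receiver expects
        }
        else
        {
            window = Math.Min(32, window + 1);        // chunks are flowing, widen the window
        }
    }

    // Push as many chunks as the current window allows, speculatively.
    void PumpChunks(Action<byte[]> writeWithoutResponse)
    {
        while (nextOffset < image.Length &&
               nextOffset < expectedOffset + (uint)(window * ChunkSize))
        {
            writeWithoutResponse(BuildPayload(nextOffset));
            nextOffset += ChunkSize;
        }
    }
}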
Discussion
Using "one way" PDUs (write without response and notification) addresses 6. above, as the ATT protocol explicitly states that acknowledged PDUs (writes, indications) must not be pipelined (i.e. you may not send the next PDU until you have received the response).
Status, containing the last received chunk, mitigates 5.
To abide by 2. and 3., the payload is a 20-byte characteristic write. 4+16 has numerous advantages: one is that offset validation with a 16-byte chunk only involves shifts; another is that chunks are always page-aligned in the target flash (better for 7.).
To cope with 4., more than one chunk is sent before a status update is received, speculating that it will be correctly received.
This protocol has the following features:
it adapts to radio conditions,
it adapts to queues on sender side,
there is no status flooding from target,
queues are kept filled, which allows the whole central stack to use every possible TX opportunity.
Some parameters are out of this protocol:
central should enforce short connection interval (try to enforce it in the updater app);
slave PHY should be well-behaved with slave latency (YMMV, test your vendor's stack);
you should probably compress your payload to reduce transfer time.
Numbers
With:
15% compression,
a device connected with connectionInterval = 10ms,
a master PHY limiting every connection event to 4-5 TX packets,
average radio conditions.
I get 3.8 packets per connection event on average, i.e. ~6 kB/s of useful payload after packet loss, protocol overhead, etc.
This way, upload of a 60 kB image is done in less than 10 seconds, the whole process (connection, discovery, transfer, image verification, decompression, flashing, reboot) under 20 seconds.
It depends a bit on what kind of central device you have.
Generally, Write Without Response is the way to stream data over BLE.
Packets being received out of order should not happen, since BLE's link layer never sends the next packet before the previous one has been acknowledged.
For Android it's very easy: just use Write Without Response to send all packets, one after another. Once you get the onCharacteristicWrite callback, you send the next packet. That way Android automatically queues up the packets, and it also has its own mechanism for flow control: when all its buffers are filled up, onCharacteristicWrite will be called once there is space again.
iOS is not that smart, however. If you send a lot of Write Without Response packets and the internal buffers are full, iOS will silently drop new packets. There are two ways around this: either implement some (maybe complex) protocol where the peripheral notifies the status of the transmission, like Nipo's answer, or, more easily, send every 10th packet or so as a Write With Response and the rest as Write Without Response. That way iOS will queue up all the packets for you and not drop the Write Without Response packets. The only downside is that the Write With Response packets each require a round-trip. This scheme should nevertheless give you high throughput.
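The "every 10th packet with response" trick is only a few lines. The ICharacteristicWriter interface below is hypothetical, just a stand-in for your BLE stack's write calls; the ratio of 10 is the one suggested above.

using System.Collections.Generic;

// Hypothetical stand-in for the BLE stack's write operations.
interface ICharacteristicWriter
{
    void WriteWithoutResponse(byte[] packet);
    void WriteWithResponse(byte[] packet);   // incurs a round-trip before the next write
}

static class ThrottledStreamer
{
    // Send every 10th packet as Write With Response so iOS queues the rest
    // instead of silently dropping them when its buffers fill up.
    public static void Send(ICharacteristicWriter writer, IEnumerable<byte[]> packets)
    {
        int i = 0;
        foreach (var packet in packets)
        {
            if (++i % 10 == 0)
                writer.WriteWithResponse(packet);
            else
                writer.WriteWithoutResponse(packet);
        }
    }
}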

Reduce Storage, By Juggling Packets in Network?

I want to preface this by saying that I have a feeling this idea will not work the way I'm imagining, but I'm not sure why. It's likely I'm making some sort of false assumption about the way the internet works.
Let's say server A has a file of size 1024 kB. This file is split up into 1024 packets to be sent over a network. Server A sends these 1024 packets to server B via TCP. As soon as B receives a packet it sends it back to A, and vice versa. If any third party C makes a request to A for the data, A would make a copy of each packet it receives from B and send it to both C and B.
In this scheme, server A could delete its file once it has sent all the packets to B, freeing up disk space. Servers A and B only need to store a single packet at any given time, and the rest of the packets would be "juggled" in the network.
Is it really possible to store data by "juggling" it through a network? Am I underestimating the overhead of receiving and sending packets?
EDIT: This is assuming 100% network reliability, which is probably a completely unrealistic assumption.
A network does have some sort of "storage" capacity, and it is usually measured by the bandwidth-delay product. (Think of the network as a pipe: its storage capacity, the amount of data (water) that the pipe can hold, is the volume of that pipe, the area-length product.) If you use an unreliable protocol such as UDP to do the "cyclic sending", the extra data that exceeds this capacity will simply be lost. If you use TCP, then after filling up the "pipe", the sending will fill up the internal send buffer in the OS and then block.
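As a rough worked example of that bandwidth-delay product, using assumed numbers (a 100 Mbit/s path with a 50 ms round trip):

using System;

class BandwidthDelayProduct
{
    static void Main()
    {
        double bitsPerSecond = 100e6;   // assumed 100 Mbit/s path
        double rttSeconds = 0.050;      // assumed 50 ms round-trip time
        double bytesInFlight = bitsPerSecond / 8 * rttSeconds;
        Console.WriteLine(bytesInFlight);   // 625000 bytes, roughly 610 kB of "storage"
    }
}

So even a fast, fairly long path holds well under a megabyte in flight, which is far too little to "store" anything useful, before even considering reliability.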

Is using packets in a networking library a good idea at all?

I've been programming a library for both TCP and UDP networking and thought about using packets. Currently I've implemented a packet class which can be used like the C++ standard library's stream classes (it has << and >> for writing and reading data). I plan on sending the packets like so:
bytes 1-8 - uint64_t giving the size of the packet.
bytes 9 onward - contents of the packet.
But there's a problem. What if a malicious client sends a size measured in terabytes and random garbage as the filler? The server's memory fills up with the random garbage and it will freeze/crash.
Is it a good idea to let the server decide the maximum allowed size of the received packet?
Or should I discard packets and implement transferring data as streams (where reading/writing would be entirely decided by the user of the library)?
(PS: I'm not a native English speaker, so forgive my possibly hideous usage of the language.)
Yes, set a maximum allowed size on the server side. Set it so that the server won't freeze/crash, but not smaller. Predictable behaviour should be the highest goal.
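A sketch of that check on the receiving side, written in C# to match the rest of this page (the asker's library is C++); the 1 MB cap and the little-endian prefix are assumptions:

using System;
using System.IO;

static class PacketReader
{
    const ulong MaxPacketSize = 1024 * 1024;   // assumed 1 MB cap; tune per application

    // Read one length-prefixed packet from a stream, rejecting absurd sizes up front.
    public static byte[] ReadPacket(Stream stream)
    {
        byte[] header = ReadExactly(stream, 8);
        ulong size = BitConverter.ToUInt64(header, 0);   // assumes a little-endian prefix
        if (size == 0 || size > MaxPacketSize)
            throw new InvalidDataException($"Refusing packet of {size} bytes.");
        return ReadExactly(stream, (int)size);
    }

    static byte[] ReadExactly(Stream stream, int count)
    {
        var buffer = new byte[count];
        int read = 0;
        while (read < count)
        {
            int n = stream.Read(buffer, read, count - read);
            if (n == 0) throw new EndOfStreamException();
            read += n;
        }
        return buffer;
    }
}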

Nagle-Like Problem

So I have this real-time game, with a C++ server (using the SFML library) with Nagle disabled, and a client using AsyncSocket, which also disables Nagle. I'm sending 30 packets every second. There is no problem sending from the client to the server, but when sending from the server to the clients, some of the packets are being merged. For example, if I'm sending "a" and "b" in completely different packets, the client reads it as "ab". It only happens once in a while, but it causes a real problem in the game.
So what should I do? How can I solve it? Maybe it's something in the server? Maybe OS settings?
To be clear: I am NOT using Nagle, but I still have this problem. I disabled it in both client and server.
For example, if I'm sending "a" and "b" in completely different packets, the client reads it as "ab". It only happens once in a while, but it causes a real problem in the game.
I think you have lost sight of the fundamental nature of TCP: it is a stream protocol, not a packet protocol. TCP neither respects nor preserves the sender's data boundaries. To put it another way, TCP is free to combine (or split!) the "packets" you send and present them to the receiver any way it wants. The only restriction that TCP honors is this: if a byte is delivered, it will be delivered in the same order in which it was sent. (And nothing about Nagle changes this.)
So, if you invoke send (or write) on the server twice, sending these six bytes:
"packet" 1: A B C
"packet" 2: D E F
Your client side might recv (or read) any of these sequences of bytes:
ABC / DEF
ABCDEF
AB / CD / EF
If your application requires knowledge of the boundaries between the sender's writes, then it is your responsibility to preserve and transmit that information.
As others have said, there are many ways to go about that. You could, for example, send a newline after each quantum of information. This is (in part) how HTTP, FTP, and SMTP work.
You could send the packet length along with the data. The generalized form for this is called TLV, for "Type, Length, Value". Send a fixed-length type field, a fixed-length length field, and then an arbitrary-length value. This way you know when you have read the entire value and are ready for the next TLV.
You could arrange that every packet you send is identical in length.
I suppose there are other solutions, and I suppose that you can think of them on your own. But first you have to realize this: TCP can and will merge or break your application packets. You can rely upon the order of the bytes' delivery, but nothing else.
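To make the TLV idea above concrete, a minimal writer might look like this (C# for consistency with the rest of the page; the 1-byte type and 4-byte little-endian length fields are arbitrary choices, pick whatever fixed sizes suit your protocol):

using System;
using System.IO;

static class Tlv
{
    // Writes one Type-Length-Value record: fixed-size type, fixed-size length, then the value.
    public static void Write(Stream stream, byte type, byte[] value)
    {
        stream.WriteByte(type);                                    // 1-byte type (assumed)
        stream.Write(BitConverter.GetBytes(value.Length), 0, 4);   // 4-byte length (assumed little-endian)
        stream.Write(value, 0, value.Length);                      // the value itself
    }
}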
You have to disable Nagle in both peers. You might want to find a different protocol that's record-based such as SCTP.
EDIT2
Since you are asking for a protocol, here's how I would do it:
Define a header for the message. Let's say I would pick a 32-bit header.
Header:
MSG Length: 16b
Version: 8b
Type: 8b
Then the real message comes in, having MSG Length bytes.
So now that I have a format, how would I handle things?
Server
When I write a message, I prepend the control information (the length is the most important, really) and send the whole thing. Having NODELAY enabled or not makes no difference.
Client
I continuously receive stuff from the server, right? So I have to do some sort of read:
Read bytes from the server. Any amount can arrive. Keep reading until you've got at least 4 bytes.
Once you have these 4 bytes, interpret them as the header and extract the MSG Length
Keep reading until you've got at least MSG Length bytes. Now you've got your message and can process it
This works regardless of TCP options (such as NODELAY), MTU restrictions, etc.
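A client-side read loop for exactly that header might look like the following (C# for consistency with the rest of the page; treating the 16-bit length as big-endian is my assumption):

using System.IO;

static class MessageReader
{
    // Keep reading until the 4-byte header is complete, then until the full
    // MSG Length bytes have arrived, however TCP happened to split the stream.
    public static byte[] ReadMessage(Stream stream)
    {
        byte[] header = ReadExactly(stream, 4);
        int length = (header[0] << 8) | header[1];   // MSG Length: 16 bits (assumed big-endian)
        byte version = header[2];                    // Version: 8 bits, available if you need it
        byte type = header[3];                       // Type: 8 bits, available if you need it
        return ReadExactly(stream, length);          // the message body
    }

    static byte[] ReadExactly(Stream stream, int count)
    {
        var buffer = new byte[count];
        for (int read = 0; read < count; )
        {
            int n = stream.Read(buffer, read, count - read);
            if (n == 0) throw new EndOfStreamException();
            read += n;
        }
        return buffer;
    }
}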

Strange behavior using SO_SNDBUF on non-blocking TCP socket under windows

I'm trying to lower the send buffer size on my non-blocking TCP socket so that I can properly display an upload progress bar but I'm seeing some strange behavior.
I am creating a non-blocking TCP socket, setting SO_SNDBUF to 1024, verifying that it is set properly, and then connecting (I tried this both before and after the call to connect with no difference).
The problem is, when my app actually comes around and calls send (sending about 2 MB), rather than returning that around 1024 bytes were sent, the send call apparently accepts all the data and returns a sent value of 2 MB (exactly what I passed in). Everything operates properly (this is an HTTP PUT and I get a response, etc.), but what I end up displaying in my progress bar is the upload sitting at 100% for about 30 seconds and then the response coming in.
I have verified that if I stop before getting the response, the upload does not complete, so it's not like it just uploaded really fast and then the server stalled... Any ideas? Does Windows even look at this setting?
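The question is presumably using Winsock directly; for reference, the rough C# equivalent of that setup is below, kept in C# for consistency with the rest of the page. The 1024-byte SO_SNDBUF comes from the question, and as the answers below explain, you should not expect it to cap what send() accepts.

using System;
using System.Net.Sockets;

class SendBufferSketch
{
    static void Main()
    {
        var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        socket.SendBufferSize = 1024;                 // sets SO_SNDBUF
        socket.Blocking = false;                      // non-blocking, as in the question
        Console.WriteLine(socket.SendBufferSize);     // verify what was actually applied
        // A real program would now connect with ConnectAsync/BeginConnect (or handle the
        // WouldBlock error from Connect) and only then start sending.
    }
}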
Windows does look at this setting, but it does not work the way you expect it to.
When you're setting the size of those buffers, you're actually setting the size of the buffers on the actual NIC you're communicating with, thus determining the size of the packets that are going out.
What you need to know about Windows is that there is a buffer between your calling code and the actual NIC, and I'm not sure that you can control the size of that. What happens is that when you call the Send operation on your socket, you're dumping the data into that buffer, and the Windows kernel will then perform small, step-by-step sends on the NIC using the data in the buffer.
This means that the code will actually report 2 MB as being 'sent', but this just means that your 2 MB of data has been successfully written into the internal buffer; it does not mean/guarantee that the data has already been sent.
I've been working on similar projects with video streaming and TCP communications, and this information is available somewhere on the MSDN forums and TechNet, but it requires some really detailed searching on how it all actually works.
I observed the same thing on Windows, using a Java non-blocking channel.
According to http://support.microsoft.com/kb/214397
If necessary, Winsock can buffer significantly more than the SO_SNDBUF buffer size.
This makes sense; the send is initiated by a program on the local machine, which is presumed to be cooperative and not hostile. If the kernel has enough memory, there's no point in rejecting the send data; someone must buffer it anyway. (The receive buffer is for the remote program, which may be hostile.)
The kernel does have limits on this buffering of send data. I'm working with a server socket, and the kernel accepts at most 128K per send; not the 2 MB in your example, which is for a client socket.
Also according to the same article, the kernel only buffers two sends; the next non-blocking send should return immediately, reporting 0 bytes written. So if we only send a small amount of data each time, the program will be throttled by the receiving end, and your progress indicator will work nicely.
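In C#, that "small sends, real progress" idea might look like the sketch below; the 16 kB chunk size is an assumption, and on a non-blocking socket you would additionally have to handle SocketError.WouldBlock by waiting for writability and retrying.

using System;
using System.Net.Sockets;

static class ChunkedUpload
{
    // Send the data in small chunks so each completed Send() reflects data the
    // kernel has actually accepted, instead of handing it 2 MB in one call and
    // jumping straight to 100%.
    public static void Send(Socket socket, byte[] data, Action<double> reportProgress)
    {
        const int ChunkSize = 16 * 1024;   // assumed; small enough for a smooth bar
        int sent = 0;
        while (sent < data.Length)
        {
            int len = Math.Min(ChunkSize, data.Length - sent);
            sent += socket.Send(data, sent, len, SocketFlags.None);
            reportProgress((double)sent / data.Length);
        }
    }
}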
The setting does not affect anything on the NIC; it is the Kernel buffer that is affected. It defaults to 8k for both Send and Receive.
The reason for the behavior you are seeing is this: the send buffer size is NOT the limit of the amount you can send at one time; it is the "nominal" buffer size. It really only affects subsequent sends when there is still data in the buffer waiting to be sent.
For example:
Set the send buffer to 101 bytes
Send 10 bytes, it will be buffered
Send 10 more bytes, it will be buffered
...continue until the buffer has 100 bytes in it
Send 10 more bytes
At this point WinSock uses some logic to determine whether to accept the new 10 bytes (and make the buffer 110 bytes) or block. I don't recall the behavior exactly but it is on MSDN.
Send 10 more bytes
This last one will definitely block until some buffer space is available.
So, in essence, the send buffer is sizeable, and:
WinSock will always accept a send of almost any size if the buffer is empty
If the buffer has data and a write will overflow, there is some logic to determine whether to accept/reject
If the buffer is full or overflowed, it will not accept the new send
Sorry for the vagueness and lack of links; I'm in a bit of a hurry but happened to remember these details from a network product I wrote a while back.
