Stream instrumentation data losslessly through unreliable 4G - TCP

I have some data acquisition devices in industrial machinery that have 4G connectivity. Right now I have them stream the instrumentation data in real time to my server over raw TCP/IP. But this has some problems:
The machinery sometimes works in places with poor or no mobile connectivity. If there is no connectivity for too long, one of two things can happen: a) the machine gets shut down and the TCP/IP buffer is lost along with the instrumentation data, or b) the TCP/IP buffer overflows, with the same result.
The same as point 1, but on the server side: maintenance, or something failing on the server over the weekend when nobody will notice while the machinery is still ON and working. Then we lose data in the same way as point 1.
I have to manage authentication and the connections of all the clients on a single server TCP port. I have a temporary hack that works for the moment but isn't ideal. This is a separate problem, though, and not the reason for this question, so take it only as context.
So, I should code an application-layer acknowledgement where the server tells the client when a high-level message (not the individual TCP packets) has been received and processed, and on the client side keep a buffer written to disk, from which data is deleted as the server confirms it. This would solve points 1 & 2.
But I'm afraid I'm reinventing the wheel, or that I don't know the right tools, because this problem should be fairly common, yet I have failed to google it and can't find a library or tool that does this job.
What I was thinking of is a tool that, on the remote client, listens on a local TCP port for incoming data from the DAQ software; once it receives a message, it streams it to the server and writes it to the local disk. On the server, the tool receives the message and re-streams it over the local network to the final server, then notifies the client that it can delete the message from its disk buffer.
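For concreteness, here is a minimal sketch of what such an application-level acknowledgement could look like on the wire, assuming length-prefixed frames with a persistent sequence number (the framing and all names here are hypothetical illustrations, not an existing tool):

    #include <stdint.h>
    #include <string.h>
    #include <arpa/inet.h>   /* htonl / ntohl */

    /* Hypothetical wire format: [seq:4][len:4][payload:len], integers in
     * network byte order. The sequence number is persisted on the client
     * so it survives reboots. */
    static size_t build_frame(uint8_t *out, uint32_t seq,
                              const uint8_t *payload, uint32_t len)
    {
        uint32_t nseq = htonl(seq), nlen = htonl(len);
        memcpy(out, &nseq, 4);
        memcpy(out + 4, &nlen, 4);
        memcpy(out + 8, payload, len);
        return 8 + (size_t)len;
    }

    /* The server's reply is just the highest sequence number it has written
     * to the final destination; everything up to and including that number
     * can be removed from the client's disk buffer. */
    static uint32_t parse_ack(const uint8_t *ack)
    {
        uint32_t seq;
        memcpy(&seq, ack, 4);
        return ntohl(seq);
    }

The client would append each frame to its disk buffer before (or while) sending it, truncate the buffer only up to the last acknowledged sequence number, and resend everything beyond that number after a reconnect.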
So, the question is: does something like this already exist? I would prefer an already-compiled / language-agnostic solution, because I code in LabVIEW and I know there is nothing like this in its ecosystem, but I'm open to anything. If nothing like it exists, any advice on what to do / what to avoid when developing it myself?
Thanks for your time.

Related

TCP as connection protocol questions

I'm not sure if this is the correct place to ask, so forgive me if it isn't.
I'm writing computer monitoring software that needs to connect to a server. The server may send out relatively urgent messages, such as sounding or cancelling an alarm, and the client may send out data about the computer, such as screenshots. The data that the client sends isn't timing-critical, but shouldn't arrive more than two minutes late.
It is essential that the software not require port forwarding to be set up, and it is assumed that the internet connection will almost always go through a wireless router doing NAT.
My idea is to have a TCP connection initiated from the client, and use that to transfer data. Ideally, I would have no data being sent when it is not needed, but I believe this to be impossible. Would sending the equivalent of a ping every now and again keep the connection alive, and what sort of bandwidth would it use if this program was running all the time on the computer? In addition, would it be possible to reduce the header size for these keep-alives?
Before I start designing the communication and programming, is this plan for connection flawed? Are there better alternatives?
Thanks!
1) You do not need to send 'ping' data to keep the connection alive; the TCP stack can do this for you if you enable the SO_KEEPALIVE socket option (the default probe interval is typically two hours). One reason for sending your own 'ping' data would be to detect a connection close on the client side - typically you only find out something has gone wrong when you try to read from or write to the socket. There may be ways to change various time-outs so you can detect this condition faster (a sketch follows after these points).
2) In general, while TCP provides a stream-oriented, error-free channel, it makes no guarantees about timeliness; on the internet it is even more unpredictable.
3) For applications such as this (I hope you are making it for ethical purposes) - I would tend to use TCP, since you don't want a situation where the client receives the packet that raises an alarm but misses the one that turns it off again.
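On point 1, a minimal sketch of enabling keep-alive on a connected socket; the Linux-specific options for tightening the probe timing are assumptions about the platform, not something the question specifies:

    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    /* Enable TCP keep-alive on an already-connected socket and, on Linux,
     * tighten the probe timing so a dead peer is noticed in minutes rather
     * than the default two hours. */
    static int enable_keepalive(int fd)
    {
        int on = 1;
        if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
            return -1;
    #ifdef TCP_KEEPIDLE
        int idle  = 60;   /* seconds of inactivity before the first probe */
        int intvl = 10;   /* seconds between probes */
        int cnt   = 5;    /* failed probes before the connection is declared dead */
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle));
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl));
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt));
    #endif
        return 0;
    }

With stock settings a dead peer can go unnoticed for hours, which is why tightening the timers (or sending a small application-level heartbeat) is usually worthwhile for alarm-style traffic.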

How to test your client server program

I'm making a multiplayer game, and I often want to test whether it works correctly over the global network, because sometimes it only works locally. How can I do that without sending my client to a friend to test it?
If you want to test it for the "global network", you have to test it that way. There are multiple things that can go wrong which are not an issue on a local network. Just off the top of my head:
latency (important for a game)
NAT (common cause of problems depending on your game architecture - more so if P2P)
security
connection errors (Wifi/3G intermittent loss)
There are many aspects of networking that are often subtly different when you're talking over the internet rather than running on either localhost or a local network.
All manner of delays can occur, and this can throw out some poor assumptions in your code.
The TCP flow can be affected by TCP's flow control (when the TCP Window fills up the sender will stop sending and report the fact to you (or maybe not report it, if you're using async APIs)).
TCP reads that return 'complete messages' on localhost and your own network may start to return pieces of a message (a read-loop sketch follows these points).
UDP datagrams may go missing and never arrive or may arrive multiple times or in any sequence.
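On the partial-read point above, a minimal sketch of what the 'message accumulation' side might look like, assuming a simple 4-byte length prefix in front of each message (the framing is an illustrative assumption, not part of the question):

    #include <stdint.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <arpa/inet.h>

    /* Keep calling recv() and buffering until a whole length-prefixed
     * message has arrived; any single recv() may return only part of it. */
    static ssize_t read_message(int fd, uint8_t *buf, size_t cap)
    {
        uint8_t hdr[4];
        size_t got = 0;
        while (got < sizeof(hdr)) {                 /* read the 4-byte length */
            ssize_t n = recv(fd, hdr + got, sizeof(hdr) - got, 0);
            if (n <= 0) return -1;                  /* error or peer closed */
            got += (size_t)n;
        }
        uint32_t len;
        memcpy(&len, hdr, 4);
        len = ntohl(len);
        if (len > cap) return -1;                   /* message too large */
        got = 0;
        while (got < len) {                         /* read the payload */
            ssize_t n = recv(fd, buf + got, len - got, 0);
            if (n <= 0) return -1;
            got += (size_t)n;
        }
        return (ssize_t)len;
    }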
So you're right to think that you need to test your code for these edge cases which rarely show up on your own network. You're also right to think that simply sending a client to a friend, or running it on a remote machine, is not enough.
One approach is to build a dedicated test client which sends known game play and checks that it gets expected responses (how hard this is depends on your protocol and your game). Once you have that working you then have the test client deliberately send data in such a way that the items above are tested. So, if you're using UDP, you might put some code in your test client so that it sometimes doesn't bother to send a UDP datagram at all. The client should think it sent it. The networking layer simply ditches it. This tests your UDP protocol for missing datagrams. Then send some datagrams multiple times, then send some out of sequence, etc. For TCP add delays, break logical "messages" into separate network sends with large delays between them; ideally send each distinct message type as a sequence of single bytes to check that the server's 'message accumulation code' works correctly.
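As a concrete illustration of the "networking layer simply ditches it" idea, here is a hedged sketch of a test-client wrapper around sendto() that randomly drops or duplicates datagrams; the percentages are arbitrary assumptions:

    #include <stdlib.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    /* Test-client wrapper: behaves like sendto(), but randomly "loses" a
     * datagram (returns success without sending) or sends it twice, so the
     * code under test sees the kinds of loss/duplication real UDP allows. */
    static ssize_t lossy_sendto(int fd, const void *buf, size_t len, int flags,
                                const struct sockaddr *to, socklen_t tolen)
    {
        int r = rand() % 100;
        if (r < 10)                       /* ~10%: pretend the datagram was sent */
            return (ssize_t)len;
        if (r < 20)                       /* ~10%: deliver it twice */
            sendto(fd, buf, len, flags, to, tolen);
        return sendto(fd, buf, len, flags, to, tolen);
    }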
Once you have done this you need to do the same for your client code, perhaps by adding a "fuzzing" option to your server's network code to do the same kind of thing...
Personally I tend to try and take a step back and do as much of this as possible in dedicated 'unit tests' (I know that some people will say that these aren't unit tests, call them what you like, just write them!). These tests exercise your networking layer using real networking (talking to a dummy server/client that the test creates) and validate the horrible edge cases.

Why doesn't using UDP for video-on-demand cause cross-talk?

While reading the assignment questions in "Data Communication and Networking" by Behrouz Forouzan, one of the questions asked whether using UDP for file transfer has any adverse effects, keeping the process-crash phenomenon in mind.
The solution said that if a process A asks a server X for a file's contents and crashes soon after the request, and another process B comes up on the same port on the same machine (giving it the same socket address) and sends a request to the same server for a different file, but that request is lost, then the server knows neither that process A crashed nor that B's request was lost, and hence it sends the contents of the file asked for by A to B.
Why doesn't this problem occur in a video-on-demand service like YouTube or the like?
One of the closest answers I got is this, but it doesn't seem to address my problem:
When is it appropriate to use UDP instead of TCP?
UPDATE: For people who would like to read the question as given in the book, I found an online version of the relevant part; please have a look at the 8th question in the PDF:
http://ceng334.cankaya.edu.tr/uploads/files/file/network%20sample.pdf
In theory the problem could happen but in real life? Not a chance.
Let's say a user wants to stream a video from Youtube with a browser.
Browser must crash - realistically does not happen too often.
New browser instance takes the exact same source UDP port - virtually never happens.
The user decides to look at a different video - makes no sense.
While all this happens, server side does not time out - I don't think so.
This is like arguing that TCP should be used because a packet might get dropped on the wire when two computers are connected back to back with one meter Ethernet cable.

What happens when a TCP/UDP server is publishing faster than the client is consuming?

I am trying to get a handle on what happens when a server publishes (over tcp, udp, etc.) faster than a client can consume the data.
Within a program I understand that if a queue sits between the producer and the consumer, it will start to get larger. If there is no queue, then the producer simply won't be able to produce anything new, until the consumer can consume (I know there may be many more variations).
I am not clear on what happens when data leaves the server (which may be a different process, machine or data center) and is sent to the client. If the client simply can't respond to the incoming data fast enough, assuming the server and the consumer are very loosely coupled, what happens to the in-flight data?
Where can I read to get details on this topic? Do I just have to read the low level details of TCP/UDP?
Thanks
With TCP there's a TCP Window which is used for flow control. TCP only allows a certain amount of data to remain unacknowledged at a time. If a server is producing data faster than a client is consuming it, then the amount of unacknowledged data will increase until the TCP window is 'full'; at this point the sending TCP stack will wait and will not send any more data until the client acknowledges some of the data that is pending.
With UDP there's no such flow control system; it's unreliable after all. The UDP stacks on both client and server are allowed to drop datagrams if they feel like it, as are all routers between them. If you send more datagrams than the link can deliver to the client or if the link delivers more datagrams than your client code can receive then some of them will get thrown away. The server and client code will likely never know unless you have built some form of reliable protocol over basic UDP. Though actually you may find that datagrams are NOT thrown away by the network stack and that the NIC drivers simply chew up all available non-paged pool and eventually crash the system (see this blog posting for more details).
Back with TCP, how your server code deals with the TCP Window becoming full depends on whether you are using blocking I/O, non-blocking I/O or async I/O.
If you are using blocking I/O then your send calls will block and your server will slow down; effectively your server is now in lock step with your client. It can't send more data until the client has received the pending data.
If the server is using non-blocking I/O then you'll likely get an error return that tells you that the call would have blocked; you can do other things, but your server will need to retry the send later...
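For example, a minimal sketch of handling that "would have blocked" return on a non-blocking POSIX socket (the socket is assumed to already be in non-blocking mode, and the retry/queueing strategy is left to the caller):

    #include <errno.h>
    #include <sys/socket.h>

    /* Try to send; if the TCP window (and socket send buffer) is full the
     * call fails with EAGAIN/EWOULDBLOCK and the caller must keep the
     * unsent tail and retry once the socket becomes writable again. */
    static int try_send(int fd, const char *data, size_t len, size_t *sent)
    {
        while (*sent < len) {
            ssize_t n = send(fd, data + *sent, len - *sent, 0);
            if (n >= 0) {
                *sent += (size_t)n;
                continue;
            }
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                return 0;              /* window full: wait for POLLOUT, retry */
            return -1;                 /* real error */
        }
        return 1;                      /* everything sent */
    }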
If you're using async I/O then things may be more complex. With async I/O using I/O Completion Ports on Windows, for example, you won't notice anything different at all. Your overlapped sends will still be accepted just fine but you might notice that they are taking longer to complete. The overlapped sends are being queued on your server machine and are using memory for your overlapped buffers and probably using up 'non-paged pool' as well. If you keep issuing overlapped sends then you run the risk of exhausting non-paged pool memory or using a potentially unbounded amount of memory as I/O buffers. Therefore with async I/O and servers that COULD generate data faster than their clients can consume it you should write your own flow control code that you drive using the completions from your writes. I have written about this problem on my blog here and here and my server framework provides code which deals with it automatically for you.
As far as the data 'in flight' is concerned the TCP stacks in both peers will ensure that the data arrives as expected (i.e. in order and with nothing missing), they'll do this by resending data as and when required.
TCP has a feature called flow control.
As part of the TCP protocol, the client tells the server how much more data can be sent without filling up the buffer. If the buffer fills up, the client tells the server that it can't send more data yet. Once the buffer is emptied out a bit, the client tells the server it can start sending data again. (This also applies to when the client is sending data to the server).
UDP on the other hand is completely different. UDP itself does not do anything like this and will start dropping data if it is coming in faster than the process can handle. It would be up to the application to add logic to the application protocol if it can't lose data (i.e. if it requires a 'reliable' data stream).
If you really want to understand TCP, you pretty much need to read an implementation in conjunction with the RFC; real TCP implementations are not exactly as specified. For example, Linux has a 'memory pressure' concept which protects against running out of the kernel's (rather small) pool of DMA memory, and also prevents one socket running any others out of buffer space.
The server can't be faster than the client for a long time. After it has been faster than the client for a while, the system where it is hosted will block it when it writes on the socket (writes can block on a full buffer just as reads can block on an empty buffer).
With TCP, this cannot happen.
In case of UDP, packets will be lost.
The TCP Wikipedia article shows the TCP header format, which is where the window size and acknowledgment sequence number are kept. The rest of the fields and the description there should give a good overview of how transmission throttling works. RFC 793 specifies the basic operations; pages 41 and 42 detail the flow control.

Using NetConnection and URLStream to send/receive data at high frequency

I'm writing a Comet-like app using Flex on the client and my own hand-written server.
I need to be able to send short bursts of data from the client at quite a high frequency (e.g. of the order of 10ms between sends).
I also need the server to push short bursts of data at a similarly high frequency.
I'm using NetConnection.call() to send the data to the server, and URLStream (with chunked encoding) to push the data from the server to the client.
What I've found is that the data isn't being sent/received as soon as it's available. For example, in IE, it seems the data is sent every 200ms rather than as soon as NetConnection.call() is called. Similarly, URLStream isn't making the data available as soon as the server is sending it.
Judging by the difference in behaviour between the browsers, it seems as though the Flash Player (version 10) is relying on the host browser to do all the comms. Can anyone confirm this? Update: This is very likely as only the host browser would know about the proxy settings that might be set.
I've tried using the Socket class and there's no problem with speed there: it works perfectly. However, I'd like to be able to use HTTP-based (port 80) connections so that my app can run in heavily fire-walled environments (I tried using a Socket over port 80, but that has its problems).
Incidentally, all development/testing has been done on an internal LAN, so bandwidth/latency is not an issue.
Update: The data being sent/received is in small packets and doesn't need to be in any particular format. For example, I might need to send a short array of Numbers, and this could either be encoded in AMF (e.g. via NetConnection.call()) or could be put into GET parameters (e.g. using sendToURL()). The main point of my question is really to see whether anyone else has experienced the same problem in calling NetConnection/URLStream frequently, and whether there is a workaround (it's also possible that the fault lies with my server code of course, rather than Flash).
Thanks.
Turns out the problem had nothing to do with Flash/Flex or any of the host browsers. The problem was in my server code (written in C++ on Linux), and without access to my source code the cause is hard to find (so I couldn't have hoped for an answer from this forum).
Still - thank you everyone who chipped in.
It was only after looking carefully at the output shown in Wireshark that I noticed the problem, which was twofold:
Nagle's algorithm
I was sending replies in multiple packets by calling write() multiple times (e.g. once for the HTTP response header, and again for the HTTP response body). Because of Nagle's algorithm, the server's TCP/IP stack was waiting for an ACK for the first packet before sending the second, but the client's delayed-ACK mechanism held that ACK back for up to 200ms, so the server took at least 200ms to send the full HTTP response.
The solution is to use send() with the flag MSG_MORE until all the logically connected blocks are written. I could also have used writev() or setsockopt() with TCP_CORK, but it suited my existing code better to use send().
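For context, a rough sketch of that approach on Linux: the header is sent with MSG_MORE so the kernel holds it and coalesces it with the body instead of emitting a small packet and then stalling on the delayed ACK (the buffer names are placeholders, and short writes are glossed over for brevity):

    #include <sys/socket.h>
    #include <sys/types.h>

    /* Send an HTTP response header and body as one wire segment where
     * possible: MSG_MORE tells the kernel more data is coming, so it holds
     * the header instead of pushing a small packet and waiting for an ACK.
     * A real server would also handle partial sends. */
    static int send_response(int fd,
                             const char *hdr, size_t hdr_len,
                             const char *body, size_t body_len)
    {
        if (send(fd, hdr, hdr_len, MSG_MORE) != (ssize_t)hdr_len)
            return -1;
        if (send(fd, body, body_len, 0) != (ssize_t)body_len)
            return -1;
        return 0;
    }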
Chunk-encoded streams
I'm using a never-ending HTTP response with chunk encoding to push data back to the client. Nagle's algorithm needs to be turned off here because even if each chunk is written as one packet (using MSG_MORE), the client OS TCP/IP stack will still wait up to 200ms before sending back an ACK, and the server can't push a subsequent chunk until it gets that ACK.
The solution here is to ask the server not to wait for an ACK for each sent packet before sending the next packet, and this is done by calling setsockopt() with the TCP_NODELAY flag.
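That call is roughly the following minimal sketch (error handling trimmed):

    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    /* Disable Nagle's algorithm on the connection used for the chunked
     * stream, so each chunk is pushed immediately instead of waiting for
     * the previous packet to be acknowledged. */
    static int disable_nagle(int fd)
    {
        int on = 1;
        return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
    }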
The above solutions only work on Linux and aren't POSIX-compliant (I think), but that isn't a problem for me.
I'm almost 100% sure the player relies on the browser for such communications. Can't find an official page stating so atm, but check this out for example:
Applications hosting the Flash Player ActiveX control or Flash Player plug-in can use the EnforceLocalSecurity and DisableLocalSecurity API calls to control security settings.
Which I think somehow implies the idea. Also, I've suffered some network related bugs on FF/IE only which again points out to the player using each browser for networking (otherwise there wouldn't be such differences).
And regarding your latency problem, I think that if speed is critical, your best bet is sockets. You have some work to do, but seems possible, check out the docs again:
This error occurs in SWF content. Dispatched if a call to Socket.connect() attempts to connect either to a server outside the caller's security sandbox or to a port lower than 1024. You can work around either problem by using a cross-domain policy file on the server.
HTH,
Juan
