What's the delay for in TCP/UDP? - networking

HELP PLEASE! I have an application that needs processing as close to real-time as possible, and I keep running into an unusual delay issue with both TCP and UDP. The delay occurs like clockwork and is always the same length of time (mostly 15 to 16 ms). It occurs when transmitting to any machine (even the local machine) and on any network (we have two).
A quick rundown of the problem:
I am always using Winsock in C++, compiled in VS 2008 Pro, but I have written several programs to send and receive in various ways using both TCP and UDP. I always use an intermediate program (running locally or remotely), written in various languages (MATLAB, C#, C++), to forward the information from one program to the other. Both Winsock programs run on the same machine, so they display timestamps for Tx and Rx from the same clock. I keep seeing a pattern where a burst of packets gets transmitted and then there is a delay of around 15 to 16 milliseconds before the next burst, despite no delay being programmed in. Sometimes it may be 15 to 16 ms between each packet instead of between bursts of packets. Other times (rarely) I will see a delay of a different length, such as ~47 ms. I always seem to receive the packets back within a millisecond of them being transmitted, though, with the same pattern of delay exhibited between the transmitted bursts.
I have a suspicion that Winsock or the NIC is buffering packets before each transmit, but I haven't found any proof. I have a Gigabit connection to one network that gets various levels of traffic, but I also experience the same thing when running the intermediate program on a cluster that has a private network with no traffic (from users, at least) and a 2 Gigabit connection. I even experience this delay when running the intermediate program locally alongside the sending and receiving programs.

I figured out the problem this morning while rewriting the server in Java. The resolution of my Windows system clock is between 15 and 16 milliseconds. That means packets showing the same transmit timestamp are actually being sent at different times within a roughly 16-millisecond window, but my timestamps only advance every 15 to 16 milliseconds, so they appear identical.
I came here to answer my question and I saw the response about raising the priority of my program. So I started all three programs, went into task manager, raised all three to "real time" priority (which no other process was at) and ran them. I got the same 15 to 16 millisecond intervals.
Thanks for the responses though.

There is always buffering involved, and it varies between hardware, drivers, OS, etc. The packet schedulers also play a big role.
If you want "hard real-time" guarantees, you probably should stay away from Windows...

What you're probably seeing is a scheduler delay - your application is waiting for other process(es) to finish their timeslice and give up the CPU. Standard timeslices on multiprocessor Windows range from 15 ms to 180 ms.
You could try raising the priority of your application/thread.
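For reference, a minimal Win32 sketch of doing this programmatically rather than via Task Manager (a sketch only, assuming a single-threaded sender; REALTIME_PRIORITY_CLASS needs administrator rights, so HIGH_PRIORITY_CLASS is used here):

    // Sketch: raise the priority of the current process and thread so the
    // send/receive loop is scheduled ahead of normal-priority work.
    #include <windows.h>
    #include <cstdio>

    int main()
    {
        if (!SetPriorityClass(GetCurrentProcess(), HIGH_PRIORITY_CLASS))
            std::printf("SetPriorityClass failed: %lu\n", GetLastError());

        if (!SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_HIGHEST))
            std::printf("SetThreadPriority failed: %lu\n", GetLastError());

        // ... open the socket and run the send/receive loop here ...
        return 0;
    }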

Oh yeah, I know what you mean. Windows and its buffers... try adjusting the values of SO_SNDBUF on the sender side and SO_RCVBUF on the receiver side. Also, check the networking hardware involved (routers, switches, media gateways) - eliminate as many devices as possible between the machines to avoid latency.
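A minimal sketch of adjusting those options with setsockopt (assuming sock is an already-created Winsock SOCKET; the 1 MB size is purely illustrative, not a recommendation):

    // Sketch: enlarge the socket send and receive buffers. Call on the sender
    // for SO_SNDBUF and on the receiver for SO_RCVBUF (or both, as here).
    #include <winsock2.h>
    #pragma comment(lib, "ws2_32.lib")

    bool SetSocketBuffers(SOCKET sock)
    {
        int bufSize = 1 << 20;  // 1 MB, purely illustrative

        if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF,
                       reinterpret_cast<const char*>(&bufSize),
                       sizeof(bufSize)) == SOCKET_ERROR)
            return false;

        return setsockopt(sock, SOL_SOCKET, SO_RCVBUF,
                          reinterpret_cast<const char*>(&bufSize),
                          sizeof(bufSize)) != SOCKET_ERROR;
    }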

I ran into the same problem.
But in my case, I was using GetTickCount() to get the current system time; unfortunately, it always has a resolution of 15-16 ms.
When I use QueryPerformanceCounter instead of GetTickCount(), everything is all right.
In fact, the TCP socket receives data evenly, not in one batch every 15 ms.
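A minimal sketch of the difference between the two timers (Win32; the exact step size depends on the system timer interval, but it is typically ~15.6 ms for GetTickCount):

    // Sketch: GetTickCount() advances in system-timer steps (~15.6 ms),
    // while QueryPerformanceCounter() has sub-microsecond resolution.
    #include <windows.h>
    #include <cstdio>

    int main()
    {
        LARGE_INTEGER freq, t0, t1;
        QueryPerformanceFrequency(&freq);   // counts per second

        DWORD tick0 = GetTickCount();
        QueryPerformanceCounter(&t0);

        Sleep(5);                           // stand-in for ~5 ms of work

        DWORD tick1 = GetTickCount();
        QueryPerformanceCounter(&t1);

        double qpcMs = 1000.0 * (t1.QuadPart - t0.QuadPart) / freq.QuadPart;
        // GetTickCount often reports 0 or ~16 ms here; QPC reports ~5 ms.
        std::printf("GetTickCount delta: %lu ms, QPC delta: %.3f ms\n",
                    tick1 - tick0, qpcMs);
        return 0;
    }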

Related

Multiple IoT devices communicating with a server asynchronously via TCP

I want multiple IoT devices (Say 50) communicating to a server directly asynchronously via TCP. Assume all of them have a heartbeat pulse every 30 seconds and may drop off and reconnect at variable times.
Can anyone advise me on the best way to make sure no data is dropped or blocked when multiple devices are communicating simultaneously?
TCP by itself ensures no data loss during the communication between a client and a server. It does that by the use of sequence numbers and ACK messages.
Technically, before the actual data transfer happens, a TCP connection is created between the client (which can be an IoT device, or any other device) and the server. Then the data is split into multiple packets and sent over the network through that connection. All TCP mechanisms like flow control, error detection, congestion control, and many others take place once the data starts to flow.
The wiki page for TCP is a pretty good start if you want to learn more about how it works.
Apart from that, as long as your server has enough capacity to support the flow of requests coming from the devices, then everything should work (at least in theory).
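To make the capacity point a bit more concrete, here is a minimal sketch of one common server pattern for this kind of workload: a single thread multiplexing all device connections with select(), so fifty heartbeating clients never block one another. (POSIX sockets for brevity, hypothetical port 9000; the Winsock version is nearly identical. Real code would add error handling, timeouts, and per-device state.)

    // Sketch: single-threaded TCP server multiplexing many device connections
    // with select(). Each readable socket is drained; a closed socket means
    // the device dropped off and will reconnect on its own later.
    #include <sys/select.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <unistd.h>
    #include <set>

    int main()
    {
        int listener = socket(AF_INET, SOCK_STREAM, 0);
        sockaddr_in addr{};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = INADDR_ANY;
        addr.sin_port = htons(9000);        // hypothetical port
        bind(listener, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
        listen(listener, SOMAXCONN);

        std::set<int> clients;
        for (;;)
        {
            fd_set readfds;
            FD_ZERO(&readfds);
            FD_SET(listener, &readfds);
            int maxfd = listener;
            for (int fd : clients) { FD_SET(fd, &readfds); if (fd > maxfd) maxfd = fd; }

            select(maxfd + 1, &readfds, nullptr, nullptr, nullptr);

            if (FD_ISSET(listener, &readfds))
                clients.insert(accept(listener, nullptr, nullptr));   // new device

            for (auto it = clients.begin(); it != clients.end(); )
            {
                if (FD_ISSET(*it, &readfds))
                {
                    char buf[256];
                    ssize_t n = recv(*it, buf, sizeof(buf), 0);
                    if (n <= 0) { close(*it); it = clients.erase(it); continue; }
                    // ... parse the heartbeat / payload in buf[0..n) ...
                }
                ++it;
            }
        }
    }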
I don't think you are asking the right question. There is no way to make sure that no data is dropped or blocked. Networks do not always work (that is why the word "work" is in "network", to convince you otherwise).
The right question is: how do I make my distributed system as available and reliable as possible? The answer involves viewing interruption and congestion as part of normal operation, and building your software appropriately.
There is a timeless USENIX/ACM(?) paper from the late 70s or early 80s that invigorated the notion that end-to-end protocols are much more effective than over-featured middle-to-middle protocols, and that most middle-to-middle guarantees amount to best effort. If you rely on those guarantees, you are bound to fail. Sorry, I cannot find the reference right now, but it is widely cited.

Performance of Pipes vs TCP/IP for a lot of requests with small data

We have a system with two microcontrollers, and we are developing a simulation environment. In the real system the microcontrollers communicate through UARTs, sending small amounts of data back and forth every millisecond (asking for variable values, sending commands, etc.).
In the simulation, each microcontroller is a different application, and we must find a way for them to communicate. I have tried TCP/IP before, but I notice that from time to time there are very long delays - for example, it works fine, and then after 20 messages there is a one-second delay. I assume this is because TCP/IP is not designed to send small packets of ~10 bytes of data every millisecond. I have found some information about the issue but nothing conclusive on how to avoid the problem.
I was wondering if named pipes would work better. Will there be a huge performance hit when using named pipes to communicate ~10 bytes of data back and forth every ms? Should we look into another alternative? In the past the company has used virtual serial ports, but that requires special third-party software and is too cumbersome to set up.
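One hedged note on the TCP delays described above: a common suspect for sporadic pauses with tiny, frequent messages is Nagle's algorithm coalescing small writes (often interacting badly with delayed ACKs). Whether that is what is happening here would have to be verified, but disabling it is a cheap experiment before switching transports. A minimal Winsock sketch, assuming sock is an already-connected TCP SOCKET:

    // Sketch: disable Nagle's algorithm so ~10-byte writes are sent
    // immediately instead of being coalesced into larger segments.
    #include <winsock2.h>
    #include <ws2tcpip.h>
    #pragma comment(lib, "ws2_32.lib")

    bool DisableNagle(SOCKET sock)
    {
        int flag = 1;
        return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY,
                          reinterpret_cast<const char*>(&flag),
                          sizeof(flag)) != SOCKET_ERROR;
    }

Named pipes are also worth benchmarking for the local-only case, since they bypass the network stack entirely.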

RxAndroidBle multiple connections

Background:
I am using the RxAndroidBle library and have a requirement to connect to multiple devices at a time as quickly as possible and start communicating. I used RxBluetoothKit for iOS, and have started to use RxAndroidBle on my Pixel 2. This has worked as expected, and I could establish connections to 6-8 devices, as required, in a few hundred milliseconds. However, broadening my testing to phones such as the Samsung S8 and Nexus 6P, it seems that establishing a single connection can now take upwards of 5-6 seconds instead of 50-60 millis. I will assume for the moment that the disparity lies in the vendor-specific BT implementations. Ultimately, this means that connecting to, e.g., 5 devices now takes 30 seconds instead of < 1 second.
Question:
From what I understand from the documentation and other questions asked, RxAndroidBle queues all scanning, connecting, and communication requests and executes them serially to be safe and maintain stability given the variety of Bluetooth implementations in the Android ecosystem. However, is there currently a way to execute the requests (namely, connecting) in parallel to accept this risk and potentially cut my total time to establish multiple connections down to whichever device takes the longest to connect?
And side question: are there any ideas to diagnose what could possibly be taking 5 seconds to establish a connection with a device? Or do we simply need to accept that some phones will take that long in some instances?
However, is there currently a way to execute the requests (namely, connecting) in parallel to accept this risk and potentially cut my total time to establish multiple connections down to whichever device takes the longest to connect?
Yes. You may try to establish connections using autoConnect=true, which would prevent locking the queue for longer than a few milliseconds. The last connection should be started with autoConnect=false to kick off a scan. Some stack implementations handle this quite well, but your mileage may vary.
And side question: are there any ideas to diagnose what could possibly be taking 5 seconds to establish a connection with a device?
You can check the Bluetooth HCI snoop log. Also, you may try using a BLE sniffer (e.g. an nRF51 Development Kit) to check what is actually happening "on air".
Or do we simply need to accept that some phones will take that long in some instances?
This is also an option, since there is usually little one can do about connection time. In my experience, BLE stack/firmware implementations differ wildly from each other.

Network Time Protocol vs Real Time Clock

After doing some research related to the Network Time Protocol (NTP), I was wondering what a real use of it could be. From my knowledge, almost all devices have a real-time clock that keeps updating itself even when the machine is shut down. Then, when the machine boots up, the software clock takes its value from this hardware clock (the real-time clock).
What is the point of taking the time from a server and thereby exposing the machine to some kind of time attack? Is the goal to keep a set of hosts strictly synchronized so that their times do not differ too much (and how much, in reality, is this "too much")? Also: if a host is configured for NTP, is its software clock still initialized from the real-time clock and then simply corrected according to the received NTP packets, or not?

network packet loss delay bandwidth simulation

So I have come across many network simulators, and most of them are either on permanently (as in you have to close the program to turn it off) or have a start and stop button. Now, I have been looking for a network simulator for Windows that can simulate packet loss, delay, and bandwidth speeds of my choice, and that can also be cycled on or off on a set schedule. I found one such program called FnLag, but the problem (for me at least) is that it is not free.
Basically, the shorter version of my question is: does anyone know of a free network simulator for Windows or Linux that can simulate packet loss, delay, and bandwidth control, with separate downstream and upstream rules and TCP/UDP selection, and that can be cycled on and off or has a burst/pulse feature? I can elaborate further if necessary.
There is a very good open-source solution for network simulation called WANem.
WANem is a wide area network emulator. It supports various features such as bandwidth limitation, latency, packet loss, and network disconnection, among other wide area network characteristics.
Here is the url: http://wanem.sourceforge.net
