My realtime network receiving time differs a lot, can anyone help?

I wrote a program that uses TCP/IP sockets to send commands to a device and receive data back from it. The data size is around 200 KB to 600 KB. The computer is directly connected to the device over a 100 Mbit/s network. I found that the packets always arrive at the computer at the full 100 Mbit/s line rate (I have debugging information on the unit, and I also verified this with network monitoring software), but the total receive time varies a lot, from 40 ms to 250 ms, even when the size is the same. I have a receive buffer of about 700 KB and a receive window of 8092 bytes, and changing the window size does not change anything. The behavior also differs between computers, but on the same computer the problem is very consistent: for example, receiving 300 KB takes 40 ms on computer A but may take 200 ms on another computer.
I have disabled the firewall, the antivirus, and all network protocols other than TCP/IP. Can any experts give me some hints?

I've found the answer to this question. The problem is caused by Nagle's algorithm interacting with delayed ACK: whether an even or odd number of full packets precedes the last, partial packet determines whether that final packet gets stalled.
See this link, which is very informative:
http://www.stuartcheshire.org/papers/NagleDelayedAck/
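
For anyone who lands here with the same symptom: the usual workaround is to disable Nagle's algorithm on the socket with the TCP_NODELAY option, so small writes are not held back waiting for the peer's delayed ACK. A minimal Python sketch; the device address and the command bytes are placeholders, not the asker's actual protocol:

```python
import socket

DEVICE_ADDR = ("192.168.0.10", 5000)  # hypothetical device IP and port

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Disable Nagle's algorithm: send small segments immediately instead of
# waiting for outstanding data to be ACKed (which, combined with the
# receiver's delayed ACK, causes the intermittent ~200 ms stalls).
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
sock.connect(DEVICE_ADDR)

sock.sendall(b"READ_DATA\n")  # placeholder command

# Read the 200-600 KB response; here we read until the device closes
# the connection (real framing is protocol-specific).
chunks = []
while True:
    chunk = sock.recv(65536)
    if not chunk:
        break
    chunks.append(chunk)
data = b"".join(chunks)
sock.close()
```

As the linked article explains, whether the stall appears depends on whether an odd or even number of segments precedes the final partial one, which is why the delay looks stable on one machine yet different on another.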

Related

Problem with sending data from Hololens 2 over wifi - lagging periodically every 5 minutes

I am trying to continuously send small UDP packets (containing only a couple of bytes) from a HoloLens 2 to a computer over wifi. In my case it is necessary to transfer this data at a stable rate of 60 fps, without any dropouts. This works as expected, but I have encountered a strange problem: the incoming data starts "lagging" heavily every 5 minutes. The lagging lasts for a couple of seconds and then the stream reverts to being stable and continuous. Unfortunately this is unacceptable for my use case, and I am hoping to get rid of this issue. This gif shows the lagging of incoming UDP data in real time (each datagram contains a frame number that is visualized in the graph).
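
For context, the traffic described above boils down to a loop like the following. This is only an illustrative sketch, not the asker's actual code: the receiver address and the frame-counter payload are assumptions based on the description.

```python
import socket
import struct
import time

RECEIVER = ("192.168.1.2", 9000)   # hypothetical laptop address on the local wifi
FRAME_INTERVAL = 1.0 / 60.0        # one datagram per frame at 60 fps

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

frame = 0
next_send = time.perf_counter()
while True:
    # Each datagram carries only a frame counter, which the receiver graphs.
    sock.sendto(struct.pack("<I", frame), RECEIVER)
    frame += 1
    # Sleep until the next 60 Hz tick to keep the send rate stable.
    next_send += FRAME_INTERVAL
    delay = next_send - time.perf_counter()
    if delay > 0:
        time.sleep(delay)
```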
I have been trying to analyze this problem for quite some time, but so far I haven't figured out what is causing it. Nevertheless, I can offer these conclusions from my tests:
The wifi signal isn't the problem. I have a wifi access point placed directly next to the HoloLens 2, providing an excellent signal. The whole setup consists of my laptop, which is connected directly to a Cisco access point (over ethernet), and the HoloLens 2, which is connected to its wifi. There is no internet access on this network; it is used just for local networking. Network interference isn't playing a role in this problem either, as I have tried this at various locations with the same results. I have also tried adding one more computer to this wifi to see if I could get a stable stream from it; both the HoloLens 2 and this computer were streaming similar data to my laptop. During the phases where the HoloLens data was lagging, the data from this computer was stable without any issue.
The application itself running on the HoloLens 2 has a stable rate of 60 fps without any frame drops. I ran tests with a simple app created in Unity and also with a native UWP app (both behaved the same in terms of networking). I have also observed the same results with different socket implementations; both System.Net.Sockets and Windows.Networking.Sockets performed the same.
Interestingly, this problem doesn't occur when I connect to the HoloLens 2 directly over ethernet. I successfully tested this with a USB-C to ethernet adapter, which provided stable data transfer (without any lagging). Therefore it is clear that the lagging is present only when using the wireless network adapter on the HoloLens 2.
I am definitely no network specialist, but I have examined the packets coming in to my laptop's network interface with Wireshark, and there wasn't any "suspicious" activity happening during these lagging periods. Out of curiosity I also captured packets directly on the Cisco access point and created the following graphs to visualize what was happening there.
This graph shows the arrival time of all packets (the y axis represents time in seconds, the x axis is the packet id). The middle area is clearly suffering from the mentioned lagging.
This graph shows only the filtered packets that belong to my UDP communication; all other packets are displayed with a time of 0 in order to visually identify packets used for different (unknown) communication. You can clearly tell that the majority of the other requests happened exactly in those problematic areas. Unfortunately I don't know what those requests are, as the capture output from the Cisco access point seemed rather cryptic to me. All I could tell is that the majority of those requests were sent from the HoloLens 2 to the Cisco access point.
Based on these observations I would say that every 5 minutes the HoloLens 2 does something in the background that causes these network problems, and it affects only the wifi network interface. Even though I haven't tried holographic remoting or spectator view, I believe they must suffer from the same lagging. Does anyone know how to resolve this problem? Since the HoloLens 2 doesn't give control over the firewall or services, I am feeling kind of stuck here and was hoping that some HoloLens developer could help me. Thanks.

What happens when ethernet reception buffer is full

I have quite a newbie question: assume that I have two devices communicating via Ethernet (TCP/IP) at 100 Mbps. On one side, I will be feeding the device with data to transmit. On the other side, I will be consuming the received data. I am able to choose an adequate buffer size for both devices.
And now my question is: if the data consumption rate on the second device is slower than the data feeding rate on the first one, what will happen?
I found some posts talking about an overrun counter.
Is there anything in Ethernet communication indicating that a device is momentarily busy and can't receive new packets, so that I can pause the transmission until the receiving device is ready?
Can someone point me to a document or documents that explain this issue in detail? I didn't find any.
Thank you in advance.
The Ethernet protocol runs on the MAC controller chip. The MAC has two separate rings, an RX ring (for ingress packets) and a TX ring (for egress packets), which means it is full-duplex in nature. The RX/TX rings also have on-chip FIFOs, but the rings themselves hold PDUs in host-memory buffers. I have covered a bit of this functionality in a related post.
Now, congestion can happen, but again, RX and TX are two different paths, and congestion will be due to one of the following conditions:
Queueing/dequeueing of rx-buffers/tx-buffers is not fast enough compared to the line rate. This happens when the CPU is busy and does not honor interrupts fast enough.
Host memory is slow (e.g. DRAM rather than SRAM), or there is not enough memory (due to a memory leak).
Intermediate processing of the buffers takes too long.
Now, about the peer device: back-pressure can be handled within a standalone system, and when that happens, we usually tail-drop packets. This is agnostic to the peer device; if the peer device is slow, that is that device's problem.
The definition of overrun is: the number of times the receiver hardware was unable to hand received data to a hardware buffer because the input rate exceeded the receiver's ability to handle the data.
I recommend picking up any MAC controller's datasheet (e.g. an Intel Ethernet controller); it will get all your questions covered. So will reading the device driver for any MAC controller.
TCP/IP is the upper-layer stack that sits inside the kernel (it can live in user space as well), whereas the Ethernet (ARPA) protocol runs inside the MAC controller hardware. If you understand this, you will understand the difference between routers and switches (where there is no TCP/IP stack).
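
To see the TCP side of the question in action: when the consumer is slower than the producer, the receiver's advertised window shrinks toward zero and the sender's send call eventually blocks. A self-contained Python sketch of that back-pressure; the port, chunk size, and consumption rate are arbitrary:

```python
import socket
import threading
import time

HOST, PORT = "127.0.0.1", 50007  # arbitrary loopback port

def slow_consumer():
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((HOST, PORT))
    srv.listen(1)
    conn, _ = srv.accept()
    while True:
        data = conn.recv(4096)   # drain only 4 KB per second:
        if not data:             # much slower than the sender produces
            break
        time.sleep(1.0)

threading.Thread(target=slow_consumer, daemon=True).start()
time.sleep(0.2)  # let the listener start

snd = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
snd.connect((HOST, PORT))
payload = b"x" * 65536
for i in range(50):  # interrupt with Ctrl-C once the stalls are visible
    t0 = time.perf_counter()
    # Once the peer's receive buffer and our send buffer are full,
    # sendall() blocks here: that is TCP flow control (back-pressure).
    snd.sendall(payload)
    print(f"chunk {i}: sendall took {time.perf_counter() - t0:.3f}s")
```

The first few chunks return instantly (they just fill kernel buffers); after that, each sendall() stalls until the consumer frees space. At the Ethernet layer, as the answer above notes, an overwhelmed receiver simply drops frames and the overrun counter increments.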

Detect faulty physical links with ping

I have a question regarding detecting a physical problem on a link with ping.
If a fiber or cable has a problem and generates some CRC errors on frames (visible in the switch or router interface statistics), it's possible that all pings pass, because of the small default ICMP packet size and the statistically lower chance of hitting an error. First, can you confirm this?
My second question: if I ping with a large size like 65000 bytes, one ping will generate approximately 65000 / 1500 (the MTU) ≈ 43 frames as IP fragments. Since an IP packet is normally lost entirely if even one of its fragments is lost, the probability of packet loss with a large ping is clearly higher. Is this assumption true?
The overall question is: with large pings, can we more easily detect a physical problem on a link?
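
To put numbers on that intuition: if each bit is corrupted independently with probability p (the bit error rate), a frame of n bits passes its CRC with probability (1-p)^n, and a fragmented ping succeeds only if every fragment survives. A quick sketch of the arithmetic; the BER value is purely illustrative:

```python
BER = 1e-7  # illustrative bit error rate for a marginal link

def frame_ok(size_bytes: float, ber: float = BER) -> float:
    """Probability that a single frame of size_bytes passes its CRC."""
    return (1 - ber) ** (size_bytes * 8)

# Default small ping: one 64-byte frame.
small = frame_ok(64)

# 65000-byte ping: about 44 fragments (~1480 bytes of payload each after
# the 20-byte IP header); the ping fails if ANY fragment is corrupted.
large = frame_ok(1500) ** 44

print(f"64-byte ping success:    {small:.6f}")  # ~0.999949
print(f"65000-byte ping success: {large:.6f}")  # ~0.948570
```

So yes: at this illustrative BER, small pings almost never fail, while a 65000-byte ping is lost about 5% of the time, which is far easier to spot.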
A link problem is a layer 1 or 2 problem. Ping is a layer 3 tool; if you use it for diagnosis you might get completely unexpected results. Port counters are much more precise for diagnosing link problems.
That said, it's quite possible that packet loss for small ping packets is low while real traffic is impacted more severely.
In addition to cable problems - which you'll need to repair - and statistically random packet loss, there are also some configuration problems that can lead to CRC errors.
Most common in 10/100 Mbit networks is a duplex mismatch, where one side uses half-duplex (HDX) transmission with CSMA/CD while the other uses full-duplex (FDX). Once real data is transmitted, the HDX side will detect collisions, late collisions, and possibly jabber, while the FDX side will detect FCS errors. Throughput is very low, but ping with its low bandwidth usually works.
Duplex mismatches happen most often when one side is forced to full duplex, thus deactivating auto-negotiation, while the other side defaults to half duplex.

RTP video issue related to Jitter and packet loss depending on odd network status

I'm a newbie software developer working on SIP/RTP VoIP software.
I am using the UDP protocol, and the video codec is H.264.
Since I am new to this VoIP area, I am quite confused and suffering a lot from painful network issues.
I would like to ask the experts something about networking, specifically RTP/RTCP issues involving jitter and packet loss.
After SIP successfully creates the media session, I run into a QoS issue.
The problems I am facing are shown below.
Wifi network (latency: 11.1 ms, download speed: 14.9 Mbps, upload speed: 3.27 Mbps):
http://www.youtube.com/watch?v=epm01c6IT5Q&feature=youtu.be
3G network (latency: 26.4 ms, download speed: 1.94 Mbps, upload speed: 2.42 Mbps):
http://www.youtube.com/watch?v=-iG156_wdQE&feature=youtu.be
As you can see, over 3G, which has lower upload and download speeds and unstable latency, the video quality (including the green-coloured artifacts) and the video delay are better than over wifi.
Using the 3G network, which is slower than wifi, I always get a better user experience than over wifi.
I haven't analysed the RTP/RTCP packets deeply, but what I can tell is this:
In the problem situation, when wifi was used for the application, jitter was strangely high, and packet loss was obviously high as well.
To sum up:
As you can see, the video quality is better when I use the 3G network, which is slower than wifi.
When wifi is used, jitter and packet loss are obviously high, as I can see by analysing packets with Wireshark on the receiver side.
In the morning the video problems (green pixels, video delay) were much more serious, but as time went on, in the afternoon and at night, the problem recovered a bit.
As far as I know, this would be related to network bandwidth and network congestion.
I am not sure that this is the proper diagnosis, and I also need a solution.
I'm sorry that I don't have enough background information yet.
Thanks.
You are going to have to look at the RTCP or RTCP-XR messages to see what is going wrong. If that fails, then, as the other posts have said, you will need to use Wireshark to determine what the issue is.
There is most likely a network-layer issue causing this, so do what you can to test your connection to the other side. A traceroute might be a good place to start, to see the difference between how the 3G traffic routes versus the wifi.
Wifi can have many issues related to jitter and packet loss that your cell network might not, depending on your signal strength (and other things). If you can test with a hard-wired connection, you can rule out wifi as an issue; if you still have issues then, it has to be network/ISP related. If a hard-wired connection resolves your issues, then you know it was the wifi and you can troubleshoot accordingly.
The green is most likely an artifact of the jitter/packet loss. Normally in the US, for voice, a ptime of 20 ms is used, which means that audio packets (and video, if used) are sent every 0.02 seconds. If your jitter is higher than 20 ms, or you have high packet loss or burst packet loss, you will likely see and hear distortions, because packets either arrive out of order and are dropped, or are lost outright. The green screen is just one of many artifacts you could see, depending on the app you are using. I work mostly with audio, so I am sorry I can't be more helpful with the exact meaning of that artifact.
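
If you want to quantify what Wireshark is showing you, RFC 3550 (section 6.4.1) defines the interarrival jitter as a running average of the variation in packet transit time, updated with gain 1/16. A small Python sketch of that estimator; the sample timestamps below are made up:

```python
def rtp_jitter(arrivals, send_times):
    """RFC 3550 interarrival jitter estimate, in seconds.

    arrivals:   receiver wall-clock arrival times, in seconds
    send_times: sender RTP timestamps converted to seconds
                (RTP units divided by the clock rate, e.g. 90000 Hz for video)
    """
    jitter = 0.0
    for i in range(1, len(arrivals)):
        # D(i-1, i): change in relative transit time between two packets.
        d = (arrivals[i] - arrivals[i - 1]) - (send_times[i] - send_times[i - 1])
        # Running estimate with gain 1/16, per RFC 3550 section 6.4.1.
        jitter += (abs(d) - jitter) / 16.0
    return jitter

# Made-up example: packets sent every 20 ms (ptime 20) arriving irregularly.
sent    = [0.00, 0.02, 0.04, 0.06, 0.08]
arrived = [0.100, 0.125, 0.138, 0.165, 0.181]
print(f"estimated jitter: {rtp_jitter(arrived, sent) * 1000:.2f} ms")
```

Comparing this estimate against the 20 ms ptime (and against the receiver's jitter buffer depth) tells you whether the distortion is explained by jitter alone or by outright loss.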

how can you count the number of packet losses in a file transfer?

One of my networking course projects has to do with the 802.11 protocol.
My partner and I thought about exploring the "hidden terminal" problem by simulating it.
We've set up a private network. We have 2 wireless terminals that will attempt to send a file to a 3rd terminal that is connected to the router via ethernet. RTS/CTS will be disabled.
To compare results, we'd like to measure the number of packet collisions that occurred during the transfer, so as to conclude that they are due to RTS being disabled.
We've read that it is impossible to measure packet collisions directly, as a collision is basically noise. We'll have to make do with counting the packets that didn't receive an ACK - basically, the number of retransmissions.
How can we do that?
I suggested that instead of sending a file, we could make the 2 wireless terminals ping the 3rd terminal continually. The ping utility automatically counts the ping packets that didn't receive a "pong". Do you think it's a viable approach?
Thank you very much.
No, you'll get incorrect results. Ping is an application, i.e. it works at the application (highest) layer of the network. The 802.11 protocol operates at the MAC layer; there are at least 2 layers separating ping from 802.11. Whatever retransmissions happen at the MAC layer are hidden by the layers above it. You'll see a failure in ping only if all the retransmissions initiated by the lower layers have failed.
You need to work at the same layer you're investigating - in your case, the MAC layer. You can use a sniffer (google for it) to get the statistics you want.
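
As a concrete way to get those statistics: with a wireless card in monitor mode you can count data frames that have the 802.11 Retry bit set in the frame control field, which is exactly the retransmission count you are after. A sketch using scapy; the interface name is an assumption, and putting the card into monitor mode is platform-specific:

```python
# Requires scapy (pip install scapy) and root privileges.
from scapy.all import sniff
from scapy.layers.dot11 import Dot11

total = 0
retries = 0

def count_retry(pkt):
    global total, retries
    if pkt.haslayer(Dot11):
        dot11 = pkt[Dot11]
        if dot11.type == 2:           # type 2 = 802.11 data frames
            total += 1
            if dot11.FCfield & 0x08:  # 0x08 = Retry bit in the frame control field
                retries += 1

# Capture for 60 seconds while the file transfer is running; "wlan0mon"
# is a hypothetical monitor-mode interface name.
sniff(iface="wlan0mon", prn=count_retry, timeout=60)
print(f"data frames: {total}, retransmitted: {retries}")
```

Run the file transfer with RTS/CTS enabled and then disabled and compare the retry ratios; the difference is your measure of hidden-terminal collisions.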
