I'm developing a client/server-based system using two GPRS modems (Siemens TC65). The application sends relatively small frames (128 bytes each) from the client to the server and vice versa. Initially I used a UDP connection (using UDPDatagramConnection), but then I decided to change to a TCP connection (using SocketConnection and ServerSocketConnection) and compared the delay between the two.
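To illustrate the setup, here is a minimal sketch of how one round trip can be timed with each connection type on the TC65's Java ME (IMP-NG) stack. The host, port, and class name are placeholders, and this is simplified compared to the real test code:

```java
import java.io.InputStream;
import java.io.OutputStream;
import javax.microedition.io.Connector;
import javax.microedition.io.Datagram;
import javax.microedition.io.SocketConnection;
import javax.microedition.io.UDPDatagramConnection;

// Rough RTT measurement for one 128-byte frame; "example.org:5000" is a placeholder.
public class RttProbe {

    public static long tcpRoundTrip(byte[] frame) throws Exception {
        SocketConnection sc = (SocketConnection) Connector.open("socket://example.org:5000");
        OutputStream out = sc.openOutputStream();
        InputStream in = sc.openInputStream();
        byte[] reply = new byte[frame.length];
        long t0 = System.currentTimeMillis();
        out.write(frame);
        out.flush();
        int got = 0;                         // TCP is a byte stream: loop until the whole frame is back
        while (got < reply.length) {
            int n = in.read(reply, got, reply.length - got);
            if (n < 0) break;
            got += n;
        }
        long rtt = System.currentTimeMillis() - t0;
        sc.close();
        return rtt;
    }

    public static long udpRoundTrip(byte[] frame) throws Exception {
        UDPDatagramConnection dc = (UDPDatagramConnection) Connector.open("datagram://example.org:5000");
        Datagram request = dc.newDatagram(frame, frame.length);
        Datagram reply = dc.newDatagram(new byte[frame.length], frame.length);
        long t0 = System.currentTimeMillis();
        dc.send(request);
        dc.receive(reply);                   // blocks until one datagram arrives (or forever if it was lost)
        long rtt = System.currentTimeMillis() - t0;
        dc.close();
        return rtt;
    }
}
```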
I did 40 tests, around 4 seconds apart from each other, measuring the round-trip time with the exact same application (just changing the connection method) at the same time of day to ensure the traffic was similar. Surprisingly, I got the following results:
I was expecting UDP to be faster, but TCP is on average two times faster than UDP. I'm having trouble justifying this. I read threads like this one:
UDP vs TCP, how much faster is it?
and they helped, but I'm not sure Nagle's algorithm has anything to do with it, since I waited for every frame to arrive before sending the next one.
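If it mattered, MIDP's SocketConnection exposes a DELAY socket option that is documented to disable the Nagle (small-write) delay. A minimal sketch, assuming the TC65's stack honours it (the URL is a placeholder):

```java
import javax.microedition.io.Connector;
import javax.microedition.io.SocketConnection;

public class NoNagle {
    public static SocketConnection open(String url) throws java.io.IOException {
        SocketConnection sc = (SocketConnection) Connector.open(url);
        // 0 disables the small-write (Nagle) delay, so each 128-byte frame goes out immediately
        sc.setSocketOption(SocketConnection.DELAY, 0);
        return sc;
    }
}
```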
I would appreciate any tips justifying these results.
Also, is there any influence from running the connection over GPRS?
I need to write a receiver that receives a market data feed via WebSocket. One of my colleagues said that we need to reset the connection every hour because TCP data is buffered. I don't quite understand. Is it true that TCP connection quality will deteriorate as time passes?
This isn't just false, it's complete hogwash.
Network connections do not "degrade." If you are ever seeing degradation, something between the two endpoints is having problems or the application is poorly written.
I have seen many cases where network protocols (TCP/IP, for instance) are blamed for memory leaks and other bad programming issues.
Is it true that TCP connection quality will deteriorate as time passes?
This is false.
One of my colleagues said that we need to reset the connection every hour because TCP data is buffered.
Your colleague is mistaken. Yes, TCP data is buffered, which is perfectly fine under normal conditions. TCP has flow control, so if the receiver isn't reading the data correctly, or is not reading the data fast enough, the buffer could fill up over time and block the sender until buffer space is cleared. That would be a problem with the way the receiver is coded, not a problem with TCP itself.
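As an illustration (the names here are hypothetical, not your feed's actual code), a receiver whose only job is to drain the socket and hand parsing off to another thread will not let the receive buffer fill up, so flow control never throttles the sender:

```java
import java.io.InputStream;
import java.net.Socket;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch: one thread does nothing but read, so the TCP receive buffer stays drained.
public class FeedReader implements Runnable {
    private final Socket socket;
    private final BlockingQueue<byte[]> queue = new ArrayBlockingQueue<byte[]>(10_000);

    public FeedReader(Socket socket) { this.socket = socket; }

    public void run() {
        byte[] buf = new byte[64 * 1024];
        try (InputStream in = socket.getInputStream()) {
            int n;
            while ((n = in.read(buf)) != -1) {
                byte[] chunk = new byte[n];
                System.arraycopy(buf, 0, chunk, 0, n);
                queue.put(chunk);            // parsing happens on another thread
            }
        } catch (Exception e) {
            // reconnect / logging policy is application-specific
        }
    }

    public BlockingQueue<byte[]> messages() { return queue; }
}
```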
Is it true that TCP connection quality will deteriorate as time passes?
Absolutely not. Given a stable network and properly written code at both ends, a TCP connection can happily stay alive for days, weeks, even years without problem.
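If the real concern is long idle periods being silently dropped by a firewall or NAT somewhere on the path, the usual answer is TCP keep-alive and/or an application-level heartbeat, not hourly reconnects. A minimal sketch (host, port, and timeout are placeholders):

```java
import java.net.InetSocketAddress;
import java.net.Socket;

public class LongLivedConnection {
    public static Socket open(String host, int port) throws java.io.IOException {
        Socket s = new Socket();
        s.connect(new InetSocketAddress(host, port), 10_000);
        // Ask the OS to probe an idle connection periodically; the probe interval is an OS
        // setting (often two hours by default), so many feeds also send their own heartbeats.
        s.setKeepAlive(true);
        return s;
    }
}
```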
It's said that one of the advantages of HTTP/2 over HTTP/1 is that HTTP/2 has streams of data. There can be up to 256 different streams in ONE TCP/IP connection. In HTTP/1, however, there can be up to 6 parallel connections. It's an improvement that HTTP/2 enables reading data from 256 resources, but I still think that 6 connections (in HTTP/1) have better throughput than one TCP/IP connection (in HTTP/2). Still, HTTP/2 is considered faster than HTTP/1. So... what don't I understand correctly?
6 physical connections would have more throughput than one physical connection, all else being equal.
However, the same doesn't apply to 6 different TCP/IP connections between the same computers, as these are virtual connections (assuming you don't have two network cards). The limiting factor is usually the latency and bandwidth of your internet connection rather than the TCP/IP protocol itself.
In fact, due to the way TCP connections are created and handled, it's actually much more efficient to have one TCP/IP connection. This is because of the cost of the initial connection (the three-way TCP handshake, the HTTPS handshake, and the fact that TCP connections use a process called Slow Start to slowly build up capacity to the maximum speed the network can handle), but also because of the ongoing upkeep of the connection (the Slow Start process happens again periodically unless the connection is fully utilised all the time, which is much more likely to happen with one connection that is used for everything than when your requests are split across 6 connections).
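To put a rough number on the Slow Start cost (illustrative figures, not from the question): assume an initial congestion window of about 10 segments, roughly 14.6 kB, that doubles each round trip, and a path worth 5 Mbit/s at 100 ms RTT (a bandwidth-delay product of about 62.5 kB). The time spent ramping up is then roughly

$$ n \approx \log_2\!\left(\frac{62.5\ \text{kB}}{14.6\ \text{kB}}\right) \approx 2.1, $$

so each of the 6 cold HTTP/1.1 connections pays those 2 to 3 extra round trips separately, while a single busy HTTP/2 connection pays the cost once and then stays warm.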
Also, HTTP/1.1 only allows one request in flight at a time, so the connection cannot be used until the response is returned (ignoring pipelining, which is not really supported at all in HTTP/1.1). This not only limits the usefulness of the 6 connections, but also means that the connections are more likely to be underused, which, given the issues with underused connections in TCP mentioned above, means they are likely to be slower as they throttle back and have to go through the Slow Start process again to build up to maximum capacity. HTTP/2, however, allows those 256 streams' worth of requests to be in flight at the same time, which is both better than just 6 connections and allows true multiplexing.
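As an illustration of what that multiplexing looks like from the application side, here is a minimal sketch using Java 11's java.net.http client, which negotiates HTTP/2 when the server supports it (the host and paths are placeholders):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

public class Http2Demo {
    public static void main(String[] args) {
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_2)   // falls back to HTTP/1.1 if the server cannot do h2
                .build();

        // All of these requests can be in flight at once as streams on a single TCP connection.
        List<String> paths = List.of("/style.css", "/app.js", "/logo.png", "/api/data");
        List<CompletableFuture<HttpResponse<String>>> responses = paths.stream()
                .map(p -> HttpRequest.newBuilder(URI.create("https://example.com" + p)).build())
                .map(req -> client.sendAsync(req, HttpResponse.BodyHandlers.ofString()))
                .collect(Collectors.toList());

        responses.forEach(f -> System.out.println(f.join().statusCode()));
    }
}
```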
If you want to know more, Ilya Grigorik has written an excellent book on the subject called High Performance Browser Networking, which is even available online for free.
In our client/server-based online game project, we use TCP for network transmission. We use Libevent, with a bufferevent for each connection to handle the network I/O automatically.
It worked well before, but a lagging problem has come to the surface recently. When I do some stress testing to make the network busier, the latency becomes extremely high, several seconds or more. The server sinks into a confusing state:
the average CPU usage decreased (0%-60%-0%-60% repeating; waiting for something?)
the net traffic decreased (nethogs)
the clients connected to the server are still alive (netstat & tcpdump)
It looks like something magically slowed the whole system down, but new connections to the server still responded promptly.
When I changed the protocol to UDP, it works well in the same situation: no obvious latency, and the system runs fast. Net traffic is around 3M/s.
The project is running on an intranet. I also tested the max download speed: nearly 18M/s.
I studied part of Libevent's header files and documentation and tried to set up a rate limit on all connections. It made some improvement, but did not completely resolve the problem, even though I tried several different configurations. Here are my parameters: read_rate 163840, read_burst 163840, write_rate 163840, write_burst 163840, tick_len 500ms.
Thank you for your help!
TCP = Transmission Control Protocol. It responds to packet loss by retransmitting unacknowledged packets after a delay. In the case of repeated loss, it will back off exponentially. Take a look at this network capture of an attempt to open a connection to a host that is not responding:
It sends the initial SYN, and then after not getting an ack for 1s it tries again. After not getting an ack it then sends another after ~2s, then ~4s, then ~8s, and so on. So you can see that you can get some serious latency in the face of repeated packet loss.
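As a quick back-of-the-envelope (assuming the 1 s initial timeout from the capture above, doubling on each retry), the total time spent waiting after $n$ unanswered transmissions is

$$ T(n) = \sum_{k=0}^{n-1} 2^k\,\text{s} = \left(2^{n}-1\right)\,\text{s}, \qquad T(5) = 31\ \text{s}, $$

so after only five unanswered copies of a packet you have already waited over half a minute.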
Since you said you were deliberately stressing the network, and that the CPU usage is inconsistent, one possible explanation is that TCP is waiting to retransmit lost packets.
The best way to see what is going on is to get a network capture of what is actually transmitted. If your hosts are connected to a single switch, you can "span" a port of interest to the port of another host where you can make the capture.
If your switch isn't capable of this, or if you don't have administrative control of the switch, then you will have to get the capture from one of the hosts involved in your online game. The disadvantage of this is that taking the capture may alter what happens, and it doesn't see what is actually on the wire. For example, you might have TCP segmentation offload enabled for your interface, in which case the capture will see large packets that will be broken up by the network interface.
I would suggest installing wireshark to analyse the network capture (which you can do in real time by using wireshark to do the capture as well). Any time you are working with a networked system I would recommend using wireshark so that you have some visibility into what is actually happening on the network. The first filter I would suggest you use is tcp.analysis.flags which will show you packets suggestive of problems.
I would also suggest turning off the rate limiting first to try to see what is going on (rate limiting adds another reason not to send packets, which is probably going to make it harder to diagnose what is going on). Also, 500ms might be a longish tick_len depending on how your game operates. If your burst configuration allows the rate to be used up in 100ms, you will end up waiting 400ms before you can transmit again. The IO Graph is a very helpful feature of Wireshark in this regard. It can help you see transmission rates, although the default tick interval and unit are not very helpful for this. Here is an example of a bursty flow being rate limited to 200 Mbit/s:
Note that the tick interval is 1ms and the unit is bits/tick, which makes the top of the chart 1 Gbit/s, the speed of the interface in question.
Because of the geographic distance between server and client, network latency can vary a lot. So I want to get the "pure" request processing time of the service without network latency.
I want to estimate network latency as the TCP connect time. As far as I understand, this time depends a lot on the network.
The main idea is to compute:
TCP connect time,
TCP first-packet receive time,
"pure" service time = TCP first-packet receive (waiting) time − TCP connect time.
I divide the TCP connect time by 2 because there are in fact 2 request/responses (the 3-way handshake).
I have two questions:
Should I compute the receive time of all TCP packets instead of only the first packet?
Is this method okay in general?
PS: As a tool I use Erlang's gen_tcp. I can show the code.
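For illustration only (shown in Java rather than my Erlang code; the host, port, and request payload are placeholders), the measurement is roughly:

```java
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;

public class ServiceTimeProbe {
    public static void main(String[] args) throws Exception {
        String host = "example.org";           // placeholder
        int port = 80;
        byte[] request = "GET / HTTP/1.0\r\nHost: example.org\r\n\r\n".getBytes("US-ASCII");

        Socket s = new Socket();
        long t0 = System.nanoTime();
        s.connect(new InetSocketAddress(host, port), 5_000);
        long connectNs = System.nanoTime() - t0;       // TCP connect (handshake) time: the network-only baseline

        OutputStream out = s.getOutputStream();
        out.write(request);
        out.flush();
        long t1 = System.nanoTime();
        s.getInputStream().read();                     // blocks until the first response byte arrives
        long firstByteNs = System.nanoTime() - t1;

        // The estimate from the question: time to first byte minus the network-only baseline.
        long pureServiceNs = firstByteNs - connectNs;
        System.out.printf("connect=%.1f ms, firstByte=%.1f ms, pureService=%.1f ms%n",
                connectNs / 1e6, firstByteNs / 1e6, pureServiceNs / 1e6);
        s.close();
    }
}
```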
If at all, I guess the "pure" service time = TCP first packet receive − TCP connecting. You have written it the other way round.
A possible answer to your first question: you should ideally compute at least some sort of average by considering the pure service time of many packets rather than just the first packet.
Ideally it would also include worst-case, average-case, and best-case service times.
For the second question to be answered, we would need to know why you need the pure service time only. Since it is a network application, network latencies (connection time, etc.) should also be included in the "response time", not just the pure service time. That is my view based on the given information.
I have worked on a similar question when working for a network performance monitoring vendor in the past.
IMHO, there are a certain number of questions to be asked before proceeding:
Connection time and latency: if you base your network latency metric on the connection time, beware that it takes into account 3 steps: the client sends a TCP SYN, the server responds with a TCP SYN-ACK, and the client responds with a final ACK to set up the TCP connection. This means that the CT is equivalent to 1.5 RTT (round-trip time). This validates taking the first two steps of the TCP setup process into account, as you mention.
Taking into account later TCP exchanges: while this at first sounds like a great idea for continuing to evaluate network latency over the course of the session, it becomes a lot more tricky. Here is why: 1. Not all packets have to be acknowledged (RFC 1122, or https://en.wikipedia.org/wiki/TCP_delayed_acknowledgment), which will generate false measurements when it occurs, so you will need a heuristic to take these out of your calculations. 2. Not all systems consider acknowledging packets a high-priority task. This means that some high values will pollute your network latency data and simply reflect the level of load of the server, for example.
So if you use only the first (and reliable) measurement, you may miss some network delay variation (especially in apps using long-lasting TCP sessions).
Okay, so I am programming for my networking course and I have to implement a project in Java using UDP. We are implementing an HTTP server and client along with a 'gremlin' function that corrupts packets with a specified probability. The HTTP server has to break a large file up into multiple segments at the application layer to be sent to the client over UDP. The client must reassemble the received segments at the application layer. What I am wondering, however, is: if UDP is by definition unreliable, why am I having to simulate unreliability here?
My first thought is that perhaps it's simply because my instructor figures that in our case both the client and the server will be run on the same machine, and that the file will be transferred from one process to another 100% reliably even over UDP, since it is between two processes on the same computer.
This led me first to question whether or not UDP could ever actually lose a packet, corrupt a packet, or deliver a packet out of order if the server and client were guaranteed to be two processes on the same physical machine, guaranteed to be routed strictly over localhost only such that it won't ever go out over the network.
I would also like to know, in general, for a given packet, what is the rough probability that UDP will drop, corrupt, or deliver a packet out of order while being used to facilitate communication over the open internet between two hosts that are fairly geographically distant from one another (say something comparable to the route between the average broadband user in the US and one of Google's CDNs)? I'm mostly just trying to get a general idea of the conditions experienced when communicating over UDP: does it drop / corrupt / misorder something on the order of 25% of packets, or is it more like something on the order of 0.001% of packets?
Much appreciation to anyone who can shed some light on any of these questions for me.
Packet loss happens for multiple reasons. Primarily it is caused by errors on individual links and network congestion.
Packet loss due to errors on the link is very low, when links are working properly. Less than 0.01% is not unusual.
Packet loss due to congestion obviously depends on how busy the link is. If there is spare capacity along the entire path, this number will be 0%. But as the network gets busy, this number will increase. When flow control is done properly, this number will not get very high. A couple of lost packets is usually enough that somebody will reduce their transmission speed enough to stop packets getting lost due to congestion.
If packet loss ever reaches 1% something is wrong. That something could be a bug in how your congestion control algorithm responds to packet loss. If it keeps sending packets at the same rate, when the network is congested and losing packets, the packet loss can be pushed much higher, 99% packet loss is possible if software is misbehaving. But this depends on the types of links involved. Gigabit Ethernet uses backpressure to control the flow, so if the path from source to destination is a single Gigabit Ethernet segment, the sending application may simply be slowed down and never see actual packet loss.
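To get a feel for why even one percent loss matters so much, a common rule of thumb for standard TCP congestion avoidance is the Mathis et al. approximation (the MSS, RTT, and loss rate p below are purely illustrative numbers):

$$ \text{throughput} \approx \frac{MSS}{RTT}\cdot\frac{1.22}{\sqrt{p}} = \frac{1460\ \text{B}}{50\ \text{ms}}\cdot\frac{1.22}{\sqrt{0.01}} \approx 356\ \text{kB/s} \approx 2.9\ \text{Mbit/s}, $$

no matter how fast the underlying link is.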
For testing the behaviour of software in case of packet loss, I would suggest two different simulations (a sketch of the first follows below).
On each packet, drop it with a probability of 10% and transmit it with a probability of 90%.
Transmit up to 100 packets per second or up to 100KB per second, and drop the rest if the application would send more.
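A minimal sketch of the first simulation in Java (the class and method names are illustrative; your course's 'gremlin' presumably does something similar on the sending side):

```java
import java.util.Random;

// Drops a UDP payload with probability pDrop; otherwise occasionally flips one bit
// before it is handed to DatagramSocket.send(), to exercise checksum/reassembly code.
public final class Gremlin {
    private static final Random RNG = new Random();

    /** Returns null to mean "pretend this packet was lost". */
    public static byte[] mangle(byte[] payload, double pDrop, double pCorrupt) {
        if (RNG.nextDouble() < pDrop) {
            return null;                                // simulated loss
        }
        byte[] copy = payload.clone();
        if (copy.length > 0 && RNG.nextDouble() < pCorrupt) {
            int i = RNG.nextInt(copy.length);
            copy[i] ^= (byte) (1 << RNG.nextInt(8));    // simulated single-bit error
        }
        return copy;
    }
}
```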
if UDP is by definition unreliable, why am I having to simulate unreliability here?
It is very useful to have a controlled mechanism to simulate worst-case scenarios and to see how both your client and server respond to them. The instructor will likely want you to demonstrate how robust the system can be.
You are also talking about payload validity here and not just packet loss.
This led me to question whether or not UDP could lose a packet, corrupt a packet, or deliver it out of order if the server and client were two processes on the same machine and it wasn't having to go out over the actual network.
It is obviously less likely over the loopback adapter, but this is not impossible.
I found a few forum posts on the topic here and here.
I am also wondering what the chances of actually losing a packet, having it corrupted, or having them delivered out of order in reality would usually be over the internet between two geographically distant hosts.
This question would probably need to be narrowed down a bit. There are several factors, both at the application level (packet size and frequency) and in the limitations/traffic of the routers and switches along the path.
I couldn't find any hard numbers on this but it seems to be fairly low... like sub 5%.
You may be interested in The Internet Traffic Report and possibly pages such as this.
I was spamming UDP packets over Wi-Fi to some Nanoleaf panels and my packet loss was roughly 1/7000.
I think it depends on a ton of factors.