Delay between TCP ACK and next packet

I have a simple HTML form with a file upload:
<form action="...." enctype="multipart/form-data" method="post">
<input type="file" name="d">
<input type="submit" value="Send">
</form>
I noticed a strange delay when uploading and debugged it with Wireshark. There is a strange 2-second delay between the ACK for the first packet and the second packet being sent.
Any idea why this is happening?

This is a delay introduced by the application, not by the TCP stack. You can see that the application initially pushed 577 bytes of data, probably an HTTP header, and received an ACK for it after nearly 200 ms, which matches the TCP delayed-ACK timeout and is absolutely fine. The 2 seconds after that is the time the application takes to read the file from the filesystem. Once the file has been read, you can see that it sends bigger blocks of data filling the entire TCP segment length, i.e. 1514 bytes.
You need to check the filesystem, or the way the application reads the file, since that is what actually causes the delay.
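To rule the filesystem in or out, a quick timing test helps. The sketch below is not from the original answer; it just times a plain chunked read of the file being uploaded (the path is a placeholder) to see whether the disk read alone accounts for the roughly 2-second gap.
import time

# Hedged sketch: time a chunked read of the upload file to see whether
# the filesystem read itself is the source of the ~2 s delay.
# "/path/to/uploaded/file" is a placeholder, not from the original post.
path = "/path/to/uploaded/file"
chunk_size = 64 * 1024

start = time.monotonic()
total = 0
with open(path, "rb") as f:
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)

print(f"read {total} bytes in {time.monotonic() - start:.3f} s")
If this read is fast, the delay is more likely in how the uploading application schedules its reads than in the filesystem itself.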

Related

Is the integrity of file uploads/downloads guaranteed by TCP/HTTPS?

Assume I want to upload a file to a web server. Maybe even a rather big file (e.g. 30 MB). It's done with a typical file upload form (see minimal example below).
Now networks are not perfect. I see these types of errors as being possible:
bit flips can happen
packets can get lost
the order in which packets arrive might not be the order in which they were sent
a packet could be received twice
Reading the TCP wiki article, I see
At the lower levels of the protocol stack, due to network congestion, traffic load balancing, or unpredictable network behaviour, IP packets may be lost, duplicated, or delivered out of order. TCP detects these problems, requests re-transmission of lost data, rearranges out-of-order data and even helps minimize network congestion to reduce the occurrence of the other problems. If the data still remains undelivered, the source is notified of this failure. Once the TCP receiver has reassembled the sequence of octets originally transmitted, it passes them to the receiving application. Thus, TCP abstracts the application's communication from the underlying networking details.
Reading that, the only reasons I can see why a downloaded file might be broken are that (1) something went wrong after it was downloaded or (2) the connection was interrupted.
Am I missing something? Why do sites that offer Linux images often also provide an MD5 hash? Is the integrity of a file upload/download over HTTPS (and thus also over TCP) guaranteed or not?
Minimal File Upload Example
HTML:
<!DOCTYPE html>
<html>
<head><title>Upload a file</title></head>
<body>
<form method="post" enctype="multipart/form-data">
<input name="file" type="file" />
<input type="submit"/>
</form>
</body>
</html>
Python/Flask:
"""
Prerequesites:
$ pip install flask
$ mkdir uploads
"""
import os
from flask import Flask, flash, request, redirect, url_for
from werkzeug.utils import secure_filename
app = Flask(__name__)
app.config["UPLOAD_FOLDER"] = "uploads"
#app.route("/", methods=["GET", "POST"])
def upload_file():
if request.method == "POST":
# check if the post request has the file part
if "file" not in request.files:
flash("No file part")
return redirect(request.url)
file = request.files["file"]
# if user does not select file, browser also
# submit an empty part without filename
if file.filename == "":
flash("No selected file")
return redirect(request.url)
filename = secure_filename(file.filename)
file.save(os.path.join(app.config["UPLOAD_FOLDER"], filename))
return redirect(url_for("upload_file", filename=filename))
else:
return """<!DOCTYPE html>
<html>
<head><title>Upload a file</title></head>
<body>
<form method="post" enctype="multipart/form-data">
<input name="file" type="file" />
<input type="submit"/>
</form>
</body>
</html>"""
return "upload handled"
if __name__ == "__main__":
app.run()
Is the integrity of file uploads/downloads guaranteed by TCP/HTTPS?
In short: No. But it is better with HTTPS than with plain TCP.
TCP has only very weak error detection, so it will likely detect simple bit flips and discard (and resend) the corrupted packet, but it will not detect more complex errors. HTTPS, though, has (through the TLS layer) pretty solid integrity protection, and undetected data corruption in transit is essentially impossible.
TCP also has robust detection and handling of duplicates and reordering. TLS (in HTTPS) has an even more robust detection of this kind of data corruption.
But it gets murky when the TCP connection simply closes early, for example if a server crashes. TCP itself has no notion of message boundaries, so a connection close is often used as an end-of-message indicator. This is true for FTP data connections, for example, but it can also be true for HTTP (and thus HTTPS). While HTTP usually has a length indicator (a Content-Length header, or explicit chunk sizes with Transfer-Encoding: chunked), it also defines the end of the TCP connection as an end of message. Clients vary in their behavior when the connection ends before the declared end of message: some treat the data as corrupted, others assume a broken server (i.e. a wrong length declaration) and treat the connection close as a valid end of message.
In theory, TLS (in HTTPS) has a clear end-of-TLS-message signal (the TLS shutdown), which might help in detecting an early connection close. In practice, though, implementations might simply close the underlying socket without this explicit TLS shutdown, so one unfortunately cannot fully rely on it.
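As a concrete illustration of that truncation problem (my own sketch, not part of the answer), a client can at least compare the declared Content-Length with the number of bytes it actually received; the URL below is a placeholder.
import urllib.request

# Hedged sketch: detect an early connection close by comparing the
# declared Content-Length with the bytes actually received.
# The URL is a placeholder.
url = "https://example.com/big-file.iso"

with urllib.request.urlopen(url) as resp:
    declared = resp.headers.get("Content-Length")
    body = resp.read()

if declared is not None and len(body) != int(declared):
    raise IOError(f"truncated download: got {len(body)} of {declared} bytes")
Depending on the HTTP client library and version, some truncations may already be reported as errors, but the explicit check makes the idea clear.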
Why do sites that offer Linux images often also provide an MD5 hash?
There is also another point of failure: the file might have been corrupted before it is even downloaded. Download sites often have several mirrors, and the corruption might happen when sending the file to a download mirror, or even when sending the file to the download master. Having a strong checksum alongside the download helps to detect such errors, as long as the checksum was created at the origin of the download and thus before the data corruption.
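For completeness, this is roughly what such a published hash is for (my own sketch; the file name and digest below are placeholders): hash the downloaded file locally and compare it with the value published at the origin.
import hashlib

# Hedged sketch: verify a downloaded file against a published digest.
# File name and expected digest are placeholders.
path = "downloaded.iso"
expected_md5 = "d41d8cd98f00b204e9800998ecf8427e"

h = hashlib.md5()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1024 * 1024), b""):
        h.update(chunk)

if h.hexdigest() != expected_md5:
    raise ValueError("checksum mismatch: the file is corrupted")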

How to make JMeter's BinaryTCPClientImpl accept ACK?

I use JMeter's BinaryTCPClientImpl to send a command of a custom protocol. By design this command doesn't produce a response from the application that receives it, so the only response that comes back to JMeter is the TCP ACK frame. JMeter's TCP Sampler doesn't see the ACK as a response, and thus a read timeout occurs, followed by closure of the socket. According to the documentation I have tried setting the end-of-line byte value to greater than 128 to turn off the end-of-stream check, but nothing has changed with respect to the read timeout.
Is there a way to make BinaryTCPClientImpl accept TCP ACK as a valid response without implementing a custom sampler?
Actually, the ACK packet is part of the sending stage of TCP communication, so if you have sent your request successfully, your requirement was already met.
By default, BinaryTCPClientImpl has no ability to skip waiting for a response. So you will have to implement your own BinaryTCPClientImplNoResp class, deriving from BinaryTCPClientImpl and overriding its read() method. I know of no better way to achieve your goal.

Why does TCP buffer data on the receiver side?

In most descriptions of the TCP PUSH function, it is mentioned that the PUSH feature not only requires the sender to send the data immediately (without waiting for its buffer to fill), but also requires that the data be pushed to the receiving application on the receiver side, without being buffered.
What I don't understand is why TCP would buffer data on the receiving side at all. After all, TCP segments travel in IP datagrams, which are processed in their entirety (i.e. the IP layer delivers only an entire segment to the TCP layer, after doing any necessary reassembly of fragments of the IP datagram that carried the segment). So why would the receiving TCP layer wait to deliver this data to its application? One case could be that the application is not reading the data at that point in time. But if that is the case, then forcibly pushing the data to the application is not possible anyway. Thus, my question is: why does the PUSH feature need to dictate anything about receiver-side behavior? Given that an application is reading data at the time a segment arrives, that segment should be delivered to the application straightaway anyway.
Can anyone please help resolve my doubt?
TCP must buffer received data because it doesn't know when the application is actually going to read the data, and it has already told the sender how much it is willing to receive (the advertised "window"). All of this data gets stored in the receive window until it is read out by the application.
Once the application reads the data, the data is dropped from the receive window and the size reported back to the sender with the next ACK increases. If this window did not exist, the sender would have to hold off sending until the receiver told it to go ahead, which it could not do until the application issued a read. That would add a full round-trip delay worth of latency to every read call, if not more.
Most modern implementations also use this buffer to hold out-of-order packets, so that the sender can retransmit only the lost ones rather than everything after them as well.
The PSH bit is not generally acted upon. Yes, implementations send it, but it typically doesn't change the behavior of the receiving end.
Note that, although the other comments are correct (the PSH bit doesn't impact application behaviour much at all in most implementations), it is still used by TCP to determine ACK behaviour. Specifically, when the PSH bit is set, the receiving TCP will ACK immediately instead of using delayed ACKs. Minor detail ;)
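To make the receive-buffer explanation above concrete, here is a small local demonstration (my own sketch, standard library only): the sender's data is ACKed and sits in the kernel's receive buffer until the application finally calls recv().
import socket
import time

# Hedged local demo: data sent over a loopback TCP connection is buffered
# by the receiving kernel until the application calls recv().
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))   # let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))
conn, _ = server.accept()

client.sendall(b"hello receive buffer")  # ACKed and buffered by the kernel
time.sleep(2)                            # the application is "not reading" yet

# Only now does the application read; the kernel has held the data all along.
print(conn.recv(4096))                   # b'hello receive buffer'

for s in (conn, client, server):
    s.close()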

How does a TCP connection terminate if one of the machines dies?

Suppose a TCP connection is established between two hosts (A and B), host A has sent 5 octets to host B, and then host B crashes (for an unknown reason).
Host A will wait for acknowledgements and, on not getting them, will resend the octets and also reduce the sender window size.
This will repeat a couple of times until the window size shrinks to zero because of the packet loss. My question is, what will happen next?
In this case, TCP eventually times out waiting for the ACKs and returns an error to the application. The application has to read/recv from the TCP socket to learn about that error; a subsequent write/send call will fail as well. Up to the point where TCP determines that the connection is gone, write/send calls will not fail; they will succeed as seen from the application, or block if the socket buffer is full.
In the case where host B vanishes after it has sent its ACKs, host A will not learn about that until it sends something to B, which will eventually also time out, or result in an ICMP error. (Typically the first write/send call will not fail, as TCP will not fail the connection immediately, and keep in mind that write/send calls do not wait for ACKs before they complete.)
Note also that retransmission does not reduce the window size.
Please follow this link.
A very simple answer to your question, in my view, is: the connection will time out and will be closed. Another possibility is that some ICMP error might be generated because of the unresponsive machine.
Also, if the crashed machine comes online again, then the procedure described in the link above will be observed.
It depends on the OS implementation. In short, it will wait for the ACK and resend packets until it times out; then your connection will be torn down. To see exactly what happens on Linux, look here; other OSes follow a similar algorithm.
In your case, a FIN will be generated (by the surviving node) and the connection will eventually migrate to the CLOSED state. If you keep grep-ing the netstat output for the destination IP address, you will watch the migration from the ESTABLISHED state to TIME_WAIT, after which the entry finally disappears.
In your case this will happen because TCP keeps a timer for the ACK of each packet it has sent. This timer is not very long, so detection will happen pretty quickly.
However, if machine B dies after A gets the ACK and A doesn't send anything after that, then the above timer can't detect the event; instead another timer (called the idle timeout) will detect the condition and the connection will be closed then. This timeout period is high by default. But normally this is not the case: machine A will try to send something in between and will detect the error condition in the send path.
In short, TCP is smart enough to close the connection by itself (and let the application know about it) except in one case (the idle timeout, which by default is very high).
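If that idle-timeout case matters to you, TCP keepalive is the usual knob. The sketch below is my own and uses the Linux-specific option names exposed by Python's socket module (other platforms differ); the peer address is a placeholder.
import socket

# Hedged, Linux-specific sketch: enable TCP keepalive so a dead peer is
# detected even when the connection is otherwise idle.
# The peer address is a placeholder.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)   # idle seconds before first probe
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)  # seconds between probes
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)     # failed probes before giving up
sock.connect(("192.0.2.10", 9000))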
In normal cases, each side terminates its end of the connection by sending a special segment with the FIN (finish) bit set.
The device receiving this FIN responds with an acknowledgement of the FIN to indicate that it has been received.
The connection as a whole is not considered terminated until both devices have completed the shutdown procedure by sending a FIN and receiving an acknowledgement.

TCP Window size libnids

My intent is to write an application-layer process on top of libnids. The reason for using the libnids API is that it can emulate the Linux kernel's TCP functionality. Libnids returns hlf->count_new, which is the number of bytes received since the last invocation of the TCP callback function. However, the tcp_callback is called every time a new packet comes in, so hlf->count_new contains a single TCP segment.
However, the application layer is supposed to receive the TCP window buffer, not separate TCP segments.
Is there any way to get the data of the TCP window (and not of a single TCP segment)? In other words, to make libnids deliver the TCP window buffer data.
thanks in advance!
You have a misunderstanding. The TCP window is designed to control the amount of data in flight. Application reads do not always trigger TCP window changes. So the information you seek is not available in the place you are looking.
Consider, for example, if the window is 128KB and eight bytes have been sent. The receiving TCP stack must acknowledge those eight bytes regardless of whether the application reads them or not, otherwise the TCP connection will time out. Now imagine the application reads a single byte. It would be pointless for the TCP stack to enlarge the window by one byte -- and if window scaling is in use, it can't do that even if it wants to.
And then what? If four seconds later the application reads another single byte, adjust the window again? What would be the point?
The purpose of the window is to control data flow between the two TCP stacks, prevent the buffers from growing infinitely, and control the amount of data 'in flight'. It only indirectly reflects what the application has read from the TCP stack.
It is also strange that you would even want this. Even if you could tell what had been read by the application, of what possible use would that be to you?

Resources