RST, ACK after sending a huge portion of data - TCP

I have an IMAP server (Dovecot), on which I am trying to create 1,200 mailboxes (for performance testing). The server successfully creates the mailboxes.
After this operation I want to list all the created folders. The server starts sending data, but after some time (nearly 1 second) the CLIENT sends RST, ACK to the server in response to the server's IMAP response listing the created folders.
Here is my Wireshark dump snippet:
IMAP: Src Port: imap (143), Dst Port: 56794 (56794), Seq: 29186, Ack: 20533, Len: 24
IMAP: Src Port: 56794 (56794), Dst Port: imap (143), Seq: 20533, Ack: 29210, Len: 15
IMAP: Src Port: imap (143), Dst Port: 56794 (56794), Seq: 29210, Ack: 20548, Len: 16384
TCP: 56794 > imap [ACK] Seq=20548 Ack=45594 Win=49408 Len=0 TSV=3940902 TSER=3940902
IMAP: Src Port: imap (143), Dst Port: 56794 (56794), Seq: 45594, Ack: 20548, Len: 16384
TCP: 56794 > imap [RST, ACK] Seq=20548 Ack=61978 Win=49408 Len=0 TSV=3940902 TSER=3940902
Edit: Well, I think I figured out why the RST flag is sent by the client. The reason is that the server exceeds the MTU of my loopback interface. I checked the same scenario against a sample MINA server - and everything is OK there, i.e. large packets are split by the TCP/IP stack. So Dovecot can't manage packets wisely. But I have my own IMAP server (based on MINA), and the problem still persists there!
So why does the TCP/IP stack manage outgoing packets wisely (splitting them according to the MTU) only for some applications and not for others?

Your assumption about what causes a TCP reset to be sent is incorrect. If you've exceeded the MTU, that's not managed by TCP. It's managed at the IP layer, and an ICMP "fragmentation needed" message is sent back to the host that sent the oversized packet, which should then either send smaller packets or have them fragmented at the IP layer. Based on the information you've shared, that is not what is happening in your case.
Regarding the loopback interface: this traffic shouldn't go anywhere near the loopback interface - isn't this between two separate devices?
Sadly, your trace file still doesn't offer any insight into why this packet -
IMAP: Src Port: imap (143), Dst Port: 56794 (56794), Seq: 45594, Ack: 20548, Len: 16384
causes a TCP reset. There's really nothing further that I can deduce from this information.
TCP has an option called Maximum Segment Size (MSS), which is similar but not the same thing :) The TCP/IP stack is independent of applications and does not apply different settings to each application; it's system-wide.
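If you want to check what MSS the stack actually negotiated for a given connection, you can ask the kernel directly. A minimal sketch (Python on Linux, using the standard TCP_MAXSEG socket option; the host and port below are placeholders for your IMAP server):

    import socket

    # Minimal sketch: connect to the IMAP server and ask the kernel which
    # Maximum Segment Size it negotiated for this particular connection.
    # The value comes from the TCP/IP stack, not from the application.
    HOST, PORT = "127.0.0.1", 143   # placeholders - adjust to your server

    with socket.create_connection((HOST, PORT)) as s:
        mss = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG)
        print("negotiated MSS for this connection:", mss, "bytes")

Whatever value is reported applies to any application using that path; the stack does not negotiate a different MSS for Dovecot than for a MINA-based server.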
Edit: Looking at your packet capture, there's nothing indicative of an MTU issue. There's no ICMP traffic anywhere, so I suspect that isn't the problem. If there were an MTU issue, it should have shown up in the previous response as well, because both LIST responses from the IMAP server are identical in size, and there's no issue with the window sizes.
The only thing I can see relates to the first element of the final response (before the RST), where part of the reply looks malformed (see the attachment). Something is going wrong in the IMAP application and the data it's replying with is malformed - compare it with the bottom two responses, which are consistent with all the other LIST responses in the pcap.

Related

TCP handshake involving more than two ports

I have an application deployed on a Kubernetes cluster. I am using Istio/Envoy in this deployment to control inbound/outbound traffic. I have collected some TCP packets using tcpdump to investigate an issue.
To my understanding, a TCP handshake should only involve a single 5-tuple (src-IP, src-port, dst-IP, dst-port, protocol).
For example
IP: 198.168.1.100 Port: 52312 ----SYN----> IP: 198.168.1.101 Port: 80
IP: 198.168.1.100 Port: 52312 <--SYN ACK-- IP: 198.168.1.101 Port: 80
IP: 198.168.1.100 Port: 52312 ----ACK----> IP: 198.168.1.101 Port: 80
But in the packets I collected, what I don't understand is this:
10.X.X.X 127.0.0.1 TCP 76 33500 → 15001 [SYN] Seq=3333992218
X.X.X.X 10.X.X.X TCP 76 80 → 33500 [SYN, ACK] Seq=2228273021 Ack=3333992219
10.X.X.X 127.0.0.1 TCP 68 33500 → 15001 [ACK] Seq=3333992219 Ack=2228273022
Notice that the SYN, ACK was returned from port 80. At first I thought there could be missing packets and that there were actually two handshakes, but looking at the sequence and acknowledgment numbers, it seems to be a single handshake.
If this is a single handshake, how would you explain this? Is there a technique that does the TCP handshake differently?
According to this blog:
This is a single handshake, but there are 2 separate connections: the first goes to the Envoy sidecar, and then the Envoy sidecar, acting as a middleman, sends it on to your pod.
So this is the magic: the connection is not established between client and server directly, but split into 2 separate connections:
connection between client and sidecar
connection between sidecar and server
These two connections are handshaked independently, so even if the latter fails, the former can still be successful (as the sketch below illustrates).
Actual view of the two sides: a middleman sits between client and server
If you're looking for more information about port 15001 itself, you can visit the Istio documentation.
It is explained in more detail here.
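To make the two-independent-connections point concrete, here is a minimal middleman sketch (plain Python, not Envoy; the listen port and upstream address are illustrative only). It accepts the client connection, which completes handshake #1, and only then attempts handshake #2 toward the real server, so the first handshake can succeed even if the second one fails:

    import socket

    # Minimal sketch of a "middleman" (sidecar-like) proxy: the client-side
    # handshake and the upstream handshake are two independent TCP connections.
    LISTEN_PORT = 15001                  # illustrative, like Envoy's inbound port
    UPSTREAM = ("127.0.0.1", 80)         # illustrative "real" server address

    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("0.0.0.0", LISTEN_PORT))
    srv.listen()

    client, addr = srv.accept()          # handshake #1 (client <-> middleman) is complete here
    print("client connected from", addr)

    try:
        upstream = socket.create_connection(UPSTREAM, timeout=3)   # handshake #2 (middleman <-> server)
        print("upstream connected")
    except OSError as exc:
        # Handshake #2 failed, yet handshake #1 above still completed successfully.
        print("upstream connection failed:", exc)
        client.close()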

How does Wireshark identify a TCP packet's protocol as HTTP?

A port number equal to 80 is obviously not a sufficient condition. Is it a necessary condition that Wireshark has found a request or response message in the application-layer payload?
I'm not sure this is a full answer, but here is what I know regarding Wireshark's identification of HTTP packets (all items below are dissected as HTTP):
TCP port 80
TCP or UDP ports 8080, 8008, 591
TCP traffic (on any port) that contains a CRLF-terminated line which begins or ends with the string "HTTP/1.1" (see the sketch after this list)
SSDP (Simple Service Discovery Protocol) in TCP or UDP port 1900
DAAP (Apple's Digital Audio Access Protocol) in TCP port 3689
IPP (Internet Printing Protocol) in TCP port 631
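To illustrate the CRLF heuristic from the list above (this is not Wireshark's actual dissector code, only a sketch of the rule as described): a payload is treated as HTTP if its first CRLF-terminated line begins or ends with "HTTP/1.1":

    def looks_like_http(payload: bytes) -> bool:
        # Rough sketch of the heuristic described above (not Wireshark's real code):
        # the first line must end with CRLF and begin or end with "HTTP/1.1".
        end = payload.find(b"\r\n")
        if end == -1:                    # no CRLF-terminated line at all
            return False
        first_line = payload[:end]
        return first_line.startswith(b"HTTP/1.1") or first_line.endswith(b"HTTP/1.1")

    print(looks_like_http(b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"))  # True (request line)
    print(looks_like_http(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"))           # True (status line)
    print(looks_like_http(b'{"id": 1}\n'))                                            # False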

TCP client does not send ACK while handshaking

My testing environment
client
IP 192.168.0.2/24
gateway 192.168.0.1
server
IP 192.168.0.1/24
HTTP service running on port 80
When I try to get a web page hosted on the server, everything works fine.
Then I wrote a kernel module using netfilter on the server, which changes the destination IP to 192.168.0.1 if the original destination IP is 192.168.1.1, and changes the source IP to 192.168.1.1 if the original source IP is 192.168.0.1. In other words, I'm just making the server pretend to be 192.168.1.1 for the client. (The IP header checksum and TCP checksum are updated properly.)
I use a web browser (Chrome, Firefox, ...) on the client to visit 192.168.1.1 and capture the packets on the client; the results look like this:
192.168.0.2:someport_1 -> 192.168.1.1:80 [SYN]
192.168.1.1:80 -> 192.168.0.2:someport_1 [SYN, ACK]
192.168.0.2:someport_2 -> 192.168.1.1:80 [SYN]
192.168.1.1:80 -> 192.168.0.2:someport_2 [SYN, ACK]
192.168.0.2:someport_3 -> 192.168.1.1:80 [SYN]
192.168.1.1:80 -> 192.168.0.2:someport_3 [SYN, ACK]
I don't know why the client never sends the last ACK of the TCP handshake. Any ideas?
Edit1:
Now I think the browser didn't get the [SYN, ACK] packet from the server even though Wireshark can see it, so maybe the OS (Windows 7) dropped the [SYN, ACK] packet from the server. The question then becomes: why would Windows drop a correct [SYN, ACK] packet?
You said the IP checksum is OK, but what about the TCP checksum, which is computed over a pseudo-header that includes the source and destination IPs?
I've made three mistakes.
The first is that the skb can be non-linear, which causes the checksum obtained from csum_partial() to be incorrect.
The second is that I used csum_tcpudp_magic() to get the checksum but forgot to change skb->ip_summed, so the NIC used my correct checksum as the partial pseudo-header checksum and recalculated it, leaving an incorrect checksum in the packet.
The third is that my Wireshark seems to be set to ignore the TCP checksum and always shows packets with a wrong checksum as good ones, while tcpdump does report the incorrect checksums.
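To see why rewriting the IP addresses invalidates the TCP checksum, here is a small self-contained sketch of the IPv4 TCP checksum calculation. The checksum covers a pseudo-header containing the source and destination addresses, so the same segment gets a different checksum once an address changes (the segment and addresses below are illustrative):

    import socket
    import struct

    def ones_complement_sum16(data: bytes) -> int:
        # RFC 1071 style 16-bit one's-complement sum.
        if len(data) % 2:
            data += b"\x00"                            # pad to an even length
        total = 0
        for (word,) in struct.iter_unpack("!H", data):
            total += word
            total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
        return total

    def tcp_checksum(src_ip: str, dst_ip: str, tcp_segment: bytes) -> int:
        # TCP checksum over pseudo-header + segment (checksum field zeroed in the segment).
        pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
                  + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(tcp_segment)))
        return (~ones_complement_sum16(pseudo + tcp_segment)) & 0xFFFF

    # Illustrative 20-byte SYN segment with the checksum field set to zero.
    segment = struct.pack("!HHIIBBHHH", 12345, 80, 0, 0, 5 << 4, 0x02, 64240, 0, 0)

    # Same segment, different source IP -> different checksum, because the
    # pseudo-header (and therefore the checksum) includes the IP addresses.
    print(hex(tcp_checksum("192.168.0.1", "192.168.0.2", segment)))
    print(hex(tcp_checksum("192.168.1.1", "192.168.0.2", segment)))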

Forward a TCP connection whose first byte is '{' to port 3333, otherwise to port 80 - possible with iptables?

Port 80 accepts two different protocols: HTTP and Stratum. The latter is a line-based protocol that always starts with '{'. If the client connects to port 80 and sends something like 'GET / HTTP/1.0...', forward the connection to port 8000; if it sends '{"id": 1,...', forward it to port 3333. Is it possible to do this with iptables? Thanks!
I don't think you can do that with iptables.
The problem is that, by the time you can detect the first byte of the TCP payload, a connection has already been established between source:port and server:80.
Forwarding the packets mid-connection will result in them being rejected, because the TCP stack behind :8000 or :3333 never saw the SYN/SYN-ACK packets that establish the connection.
You'll need something listening on port :80 that, based on the very first byte received, opens a connection to port :8000 or :3333 and relays the contents. That something must also relay the web server's/Stratum server's replies back toward the connection initiator.
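As a rough sketch of what that "something" could look like (illustrative only; the backend addresses and the thread-per-connection approach are my own choices, not part of the question), a small listener can peek at the first byte and then relay bytes in both directions:

    import socket
    import threading

    # Illustrative sketch: listen on :80, peek at the first payload byte, and
    # relay the whole connection to :3333 (Stratum, first byte '{') or :8000 (HTTP).
    LISTEN = ("0.0.0.0", 80)
    STRATUM_BACKEND = ("127.0.0.1", 3333)
    HTTP_BACKEND = ("127.0.0.1", 8000)

    def pump(src: socket.socket, dst: socket.socket) -> None:
        # Copy bytes from src to dst until src closes.
        try:
            while data := src.recv(4096):
                dst.sendall(data)
        except OSError:
            pass
        finally:
            dst.close()

    def handle(client: socket.socket) -> None:
        first = client.recv(1, socket.MSG_PEEK)        # look at the first byte without consuming it
        backend = STRATUM_BACKEND if first == b"{" else HTTP_BACKEND
        upstream = socket.create_connection(backend)
        threading.Thread(target=pump, args=(client, upstream), daemon=True).start()
        threading.Thread(target=pump, args=(upstream, client), daemon=True).start()

    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(LISTEN)
    srv.listen()
    while True:
        conn, _ = srv.accept()
        threading.Thread(target=handle, args=(conn,), daemon=True).start()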

Firefox IPv6 connection failed while TCP layer connected

I am trying to connect to an HTTP server via an IPv6 link-local address from Windows XP SP3 with Firefox 6.
Although connecting by the server's IPv4 address worked well, IPv6 failed with a "connection failed" error.
In Wireshark, the observed sequence is:
direction protocol port transmission
1. client -> server: tcp 1061-> 80 [syn]
2. server -> client: tcp 80->1061 [syn, ack]
3. client -> server: tcp 1061->80 [ack]
4. client -> server: http [get /]
5. server -> client: http [200 OK]
The 5th transmission includes the requested HTML file.
But the browser shows connection failed.
It seems the TCP layer received the messages but cannot deliver them to the HTTP layer or the browser.
I disabled the firewall, and the result is the same.
Can someone give a clue or hint to pursue?
Thank you.
I suspect that packet 5 does not contain the whole response.
Problems like this are usually caused by broken Path MTU Discovery. If there is a tunnel in the path, then the MTU is probably smaller than 1500 bytes, e.g. 1480 bytes. All packets smaller than 1480 bytes get through. When the server sends a 1500-byte packet, it is too big for the tunnel. The tunnel router sends back a Packet-too-big ICMP error, and the server then resends the data in 1480-byte chunks. If the ICMP error is never generated, or a firewall blocks it, the server never learns that it should send smaller packets; it keeps sending large packets, and they never arrive...
Most of the time such problems are caused by misconfigured firewalls. Sometimes it's broken hardware or software.
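If you have shell access to a Linux host and suspect a Path MTU black hole, one way to see which path MTU the kernel has currently learned for a destination is to read the IP_MTU option on a connected socket. A hedged, Linux-only sketch (not specific to the IPv6 link-local setup in the question; the numeric constants come from <linux/in.h> and are defined as fallbacks in case this Python build does not export them):

    import socket

    # Linux-only sketch: ask the kernel which path MTU it currently believes
    # applies to a given destination.
    IP_MTU = getattr(socket, "IP_MTU", 14)
    IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
    IP_PMTUDISC_DO = getattr(socket, "IP_PMTUDISC_DO", 2)

    HOST, PORT = "192.0.2.10", 80       # placeholder destination

    with socket.create_connection((HOST, PORT)) as s:
        # Force "don't fragment" so the kernel relies on Path MTU Discovery.
        s.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
        print("path MTU currently cached for this route:",
              s.getsockopt(socket.IPPROTO_IP, IP_MTU))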
