Why retransmision occured 16 seconds later? - tcp

I use jmeter to test a tomcat web application running behind f5's load balancer.
From the pcap file I captured in jmeter node I see that there is a duplicate ack and then
16 seconds later the client retransmit the packet and do nothing within 20 seconds.
As tomcat's default socket timeout is set as 20 seconds the client received Http error.
My question is:
What could be the reason that make client's TCP stack retransmit after 16 seconds and later do nothing within 20 seconds? Is it too busy? Is there a way to find out the reason?

Related

Scrapy - Set TCP Connect Timeout

I'm trying to scrape a website via Scrapy. However, the website is extremely slow at times and it takes almost 15-20 seconds to respond at first request in browser. Anyways, sometimes, when I try to crawl the website using Scrapy, I keep getting TCP Timeout error. Even though the website opens just fine on my browser. Here's the message:
2017-09-05 17:34:41 [scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying <GET http://www.hosane.com/result/spec
ialList> (failed 16 times): TCP connection timed out: 10060: A connection attempt failed because the connected party di
d not properly respond after a period of time, or established connection failed because connected host has failed to re
spond..
I have even overridden the USER_AGENT setting for testing.
I don't think DOWNLOAD_TIMEOUT setting works in this case, since it defaults to 180 seconds, and Scrapy doesn't even take 20-30 seconds before giving a TCP timeout error.
Any idea what is causing this issue? Is there a way to set TCP timeout in Scrapy?
A TCP connection timed out can happen before the Scrapy-specified DOWNLOAD_TIMEOUT because the actual initial TCP connect timeout is defined by the OS, usually in terms of TCP SYN packet retransmissions.
By default on my Linux box, I have 6 retransmissions:
cat /proc/sys/net/ipv4/tcp_syn_retries
6
which, in practice, for Scrapy too, means 0 + 1 + 2 + 4 + 8 + 16 + 32 (+64) = 127 seconds before receiveing a twisted.internet.error.TCPTimedOutError: TCP connection timed out: 110: Connection timed out. from Twisted. (That's the initial trial, then exponential backoff between each retry and not receiving a reply after the 6th retry.)
If I set /proc/sys/net/ipv4/tcp_syn_retries to 8 for example, I can verify that I receive this instead:
User timeout caused connection failure: Getting http://www.hosane.com/result/specialList took longer than 180.0 seconds.
That's because 0+1+2+4+8+16+32+64+128(+256) > 180.
10060: A connection attempt failed... seems to be a Windows socket error code. If you want to change the TCP connection timeout to something at least the DOWNLOAD_TIMEOUT, you'll need to change the TCP SYN retry count. (I don't know how to do it on your system, but Google is your friend.)

What happens to a waiting WebSocket connection on a TCP level when server is busy (blocked)

I am load testing my WebSocket Tornado server, running on Ubuntu Server 14.04.
I am playing with a big client machine loading 60,000 users, 150 a second (that's what my small server can comfortably take). Client is a RedHat machine. When a load test suite finishes, I have to wait a few seconds to be able to rerun.
Within these few seconds, my websocket server is handling closing of the 60,000 connections. I can see it in my graphite dashboard (the server logs every connect and and disconnect information there).
I am also logging relevant outputs of the netstat -s and ss -s commands to my graphite dashboard. When the test suite finishes, I can immediately see tcp established seconds dropping from 60,000 to ~0. Other socket states (closed, timewait, synrecv, orphaned) remain constant, very low. My client's sockets go to timewait for a short period and then this number goes to 0 too. When I immediately rerun the suite, and all the tcp sockets on both ends are free, but the server has not finished processing of the previous closing batch yet, I see no changes on the tcp socket level until the server is finished processing and starts accepting new connections again.
My question is - where is the information about the sockets waiting to be established stored (RedHat and Ubuntu)? No counter/queue length that I am tracking shows this.
Thanks in advance.

After HTTP Keep Alive timeout new HTTP request arrives

Our application is ExtJS 3.4 based application we are frequently getting "Communication Failure" error on UI , we have our application deployed on different domain but on some domain we get this very frequently .
Without HTTP Keep Alive we are not getting that error. :
But in different scenarios for 1 sec and 5 sec we get it quite frequently.
We have observed on Wireshark was due to high RTT (Round Trip Time) the request were taking more time than expected.
There were inconsistency in packet flow the scenario was :
If keep alive was 5 sec :
When a request is successfully served it will return 200 OK(success response) and timeout parameter of 5 sec (where server tries to say to client that server will wait for 5 sec before closing this connection).
Now as soon as 5 sec of time is elapsed Server sends a FIN Packet(Finish packet which is to close connection is sent from server to client which is browser in our case).
Now here is the catch the time taken by ACK (Acknowledge Packet) from client to close connection is high ( high RTT).
Now server has initiated close but due to high RTT before the connection is closed client sends a new HTTP request(for eg ExampleABC.do request) before server receives ACK for FINISH from client.
Because of which server was not able to handle that request since it has initiated connection close.
Setting 1 sec as keep alive meant we are reducing time the server will wait to close the connection since we wanted after 1 sec one connection is to be closed and fresh connection is setup for new request to avoid unwanted request coming after 5 sec .
Thanks in advance
This is my first post please correct me if needed.
Sorry for bad English :)
Image for communication failure :
We solved this issue by synchronizing browser timeout and server timeouts.
The fix was to make sure the TCP keepalive time and browser coincide or come at same time, causing the TCP connection to completely drop.

In-practice ideal timeout length for idle HTTP connections

In an embedded device, What is a practical amount time to allow idle HTTP connections to stay open?
I know back in the olden days of the net circa 1999 internet chat rooms would sometimes just hold the connection open and send replies as they came in. Idle timeouts and session length of HTTP connections needed to be longer in those days...
How about today with ajax and such?
REASONING: I am writing a transparent proxy for an embedded system with low memory. I am looking for ways to prevent DoS attacks.
My guess would be 3 minutes, or 1 minute. The system has extremely limited RAM and it's okay if it breaks rare and unpopular sites.
In the old days (about 2000), an idle timeout was up to 5 minutes standard. These days it tends to be 5 seconds to 50 seconds. Apache's default is 5 seconds. With some special apps defaulting to 120 seconds.
So my assumption is, that with AJAX, long held-open HTTP connections are no longer needed.
How about allowing idle HTTP connections to remain open unless another communication request comes in? If a connection is open and no one else is trying to communicate, the open connection won't hurt anything. If someone else does try to communicate, send a FIN+ACK to the first connection and open the second. Many http clients will attempt to receive multiple files using the same connection if possible, but can reconnect between files if necessary.

Relation between HTTP Keep Alive duration and TCP timeout duration

I am trying to understand the relation between TCP/IP and HTTP timeout values. Are these two timeout values different or same? Most Web servers allow users to set the HTTP Keep Alive timeout value through some configuration. How is this value used by the Web servers? is this value just set on the underlying TCP/IP socket i.e is the HTTP Keep Alive timeout and TCP/IP Keep Alive Timeout same? or are they treated differently?
My understanding is (maybe incorrect):
The Web server uses the default timeout on the underlying TCP socket (i.e. indefinite) regardless of the configured HTTP Keep Alive timeout and creates a Worker thread that counts down the specified HTTP timeout interval. When the Worker thread hits zero, it closes the connection.
EDIT:
My question is about the relation or difference between the two timeout durations i.e. what will happen when HTTP keep-alive timeout duration and the timeout on the Socket (SO_TIMEOUT) which the Web server uses is different? should I even worry about these two being same or not?
An open TCP socket does not require any communication whatsoever between the two parties (let's call them Alice and Bob) unless actual data is being sent. If Alice has received acknowledgments for all the data she's sent to Bob, there's no way she can distinguish among the following cases:
Bob has been unplugged, or is otherwise inaccessible to Alice.
Bob has been rebooted, or otherwise forgotten about the open TCP socket he'd established with Alice.
Bob is connected to Alice, and knows he has an open connection, but doesn't have anything he wants to say.
If Alice hasn't heard from Bob in awhile and wants to distinguish among the above conditions, she can resend her last byte of data, wrapped in a suitable TCP frame to be recognizable as a retransmission, essentially pretending she hasn't heard the acknowledgment. If Bob is unplugged, she'll hear nothing back, even if she repeatedly sends the packet over a period of many seconds. If Bob has rebooted or forgotten the connection, he will immediately respond saying the connection is invalid. If Bob is happy with the connection and simply has nothing to say, he'll respond with an acknowledgment of the retransmission.
The Timeout indicates how long Alice is willing to wait for a response when she sends a packet which demands a reply. The Keepalive time indicates how much time she should allow to lapse before she retransmits her last bit of data and demands an acknowledgment. If Bob goes missing, the sum of the Keepalive and Timeout values will indicate the worst-case time between Alice receiving her last bit of data and her deciding that Bob is dead.
They're two separate mechanisms; the name is a coincidence.
HTTP keep-alive (also known as persistent connections) is keeping the TCP socket open so that another request can be made without setting up a new connection.
TCP keep-alive is a periodic check to make sure that the connection is still up and functioning. It's often used to assure that a NAT box (e.g., a DSL router) doesn't "forget" the mapping between an internal and external ip/port.
KeepAliveTimeout Directive
Description: Amount of time the server will wait for subsequent
requests on a persistent connection Syntax: KeepAliveTimeout seconds
Default: KeepAliveTimeout 15 Context: server config, virtual host
Status: Core Module: core The number of seconds Apache will wait for a
subsequent request before closing the connection. Once a request has
been received, the timeout value specified by the Timeout directive
applies.
Setting KeepAliveTimeout to a high value may cause performance
problems in heavily loaded servers. The higher the timeout, the more
server processes will be kept occupied waiting on connections with
idle clients.
In a name-based virtual host context, the value of the first defined
virtual host (the default host) in a set of NameVirtualHost will be
used. The other values will be ignored.
TimeOut Directive
Description: Amount of time the server will wait for certain events
before failing a request Syntax: TimeOut seconds Default: TimeOut 300
Context: server config, virtual host Status: Core Module: core The
TimeOut directive currently defines the amount of time Apache will
wait for three things:
The total amount of time it takes to receive a GET request. The amount
of time between receipt of TCP packets on a POST or PUT request. The
amount of time between ACKs on transmissions of TCP packets in
responses. We plan on making these separately configurable at some
point down the road. The timer used to default to 1200 before 1.2, but
has been lowered to 300 which is still far more than necessary in most
situations. It is not set any lower by default because there may still
be odd places in the code where the timer is not reset when a packet
is sent.

Resources