Forcing request timeout on nginx?

Is there a way to tell nginx to timeout a request after a certain length of time, regardless of whether data continues to flow through or not?
I know send_timeout will kill it if no data is picked up by the client after so long, but if the client is, for instance, using "curl --limit-rate 1" or something, is there a setting that says "no matter what else is going on, time out the request after (x) seconds"?
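For reference, the send_timeout I mentioned looks something like this (a minimal sketch; the value is illustrative). Since it only counts the interval between two successive writes to the client, a client trickling data at 1 byte/s resets it every time:

location / {
    # counts the interval between two successive writes to the client,
    # not the total request time, so "curl --limit-rate 1" keeps it alive
    send_timeout 10s;
}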

Related

Timeout vs no response from server, how can I separate these?

This question is regarding a bot of mine whose primary focus is scraping.
The path is mapped out correctly and it does what it needs to do.
Rate limits have been tested and I am certain they are not a factor; when and where we did hit them, we received actual responses.
However, the webpage(s) I am trying to scrape seem to have built in a kind of weird/unfamiliar security measure, something I haven't come across before. So here I am, wondering how it works and how to deal with it appropriately.
While the scraper/bot is doing its thing, sending requests and getting responses, at random times it will encounter what I suspect is a security measure: there is simply no response from the server, not a 4xx error or anything at all.
At first sight the proxies just appear dead, but that's not it, because they are not. The proxies work just fine, and I can browse the page through them manually with no issues.
The server just stops giving responses.
Now to find a workaround for this, I would need to be able to tell the difference between a timeout (for my proxies) and a no response. They appear the same, but are not.
Does anyone have insight into this problem? Maybe there is a clever way to separate the two that I am not aware of.
Now to find a workaround for this, I would need to be able to tell the difference between a timeout (for my proxies) and a no response. They appear the same, but are not.
A timeout is when the server does not respond within a specific time. No response means that the server either closes the connection before the timeout occurs, or that it will close the connection after the timeout has occurred without sending anything back.
The first case can easily be detected, because the connection is closed before the timeout. If instead you want to detect whether the server will close the connection without a response only after your current timeout, your only option is to extend the timeout. There is nothing coming from the server which will indicate that it is going to close the connection without a response at some future time.
And since your only connection is with the proxy, there is no real way to detect whether the problem is at the proxy or the server. Your only hope might be to set your timeout waiting for the proxy larger than the timeout the proxy has waiting for the server. This way you'll maybe get a response from the proxy indicating that the connection to the server timed out.
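To make this concrete, here is a minimal sketch assuming the bot uses Python's requests library (the URL and proxy address are hypothetical). A connection closed before the timeout raises a different exception than the timeout itself:

import requests

proxies = {"http": "http://10.0.0.1:3128", "https": "http://10.0.0.1:3128"}
try:
    # separate connect/read timeouts: 5 s to reach the proxy, 30 s for data
    r = requests.get("http://example.com/page", proxies=proxies, timeout=(5, 30))
except requests.exceptions.ConnectTimeout:
    print("never reached the proxy")                # proxy-side problem
except requests.exceptions.ConnectionError:
    print("connection closed before the timeout")  # the detectable "no response"
except requests.exceptions.ReadTimeout:
    print("connected, but no data within 30 s")    # timeout, or a silent server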
They appear the same, but are not.
They are the same. There is no difference. A read timeout means that data didn't arrive within the timeout period. For whatever reason. TCP doesn't know, and can't tell you. At the C level, recv() returned -1 with errno == EAGAIN/EWOULDBLOCK. That's all the information there is.
What you are asking is tantamount to 'data didn't arrive: where didn't it arrive from?' It's not a meaningful question.
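The same thing seen at the socket level in Python; all the caller ever observes is the timeout itself (host and timeout value are hypothetical):

import socket

s = socket.create_connection(("example.com", 80), timeout=10)
s.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
try:
    data = s.recv(4096)                      # blocks until data or timeout
    if not data:
        print("peer closed the connection")  # an orderly close is visible
except socket.timeout:
    # the equivalent of recv() returning -1 with EAGAIN/EWOULDBLOCK:
    # no data arrived, and there is no information about why
    print("read timed out; the cause is unknowable here")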

xbuf_frurl timeout=0

I am trying to use xbuf_frurl to start a worker to do some post-processing.
The worker will finish the job without returning anything.
Thus, the original script can respond to the client faster.
So I tried setting timeout=0 ms in xbuf_frurl, hoping that it would return immediately, run the rest of the code, and return 200.
xbuf_frurl(&buf, "localhost", 80, HTTP_GET, "/postprocessing", 0, 0); // timeout = 0 ms
xbuf_ncat(reply, buf.ptr, buf.len); // append whatever came back to the reply
But it does not seem to time out immediately, since the buffer is not null.
Is there any better way to do so?
The xbuf_frurl() timeout is aimed at failing faster when the target host is not accepting the TCP connection.
So, to answer your question, the timeout will have no influence on the reply time since the server is always online.
I am not sure that querying a servlet from another servlet (or itself) is the best way to do what you want, but I can't give you a better strategy if I don't know what your goal is (the mysterious 'postprocessing' can surely be triggered more efficiently than by using an HTTP connection).
If you want to use an HTTP connection, you can have the servlet send a minimal answer by using an 'incorrect' HTTP status code in the 1-99 range: this is used to prevent G-WAN from adding missing HTTP headers, for example to send naked JSON payloads.
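A minimal sketch of such a servlet, assuming the standard G-WAN servlet entry point and get_reply() (the payload is hypothetical):

#include "gwan.h"

int main(int argc, char *argv[])
{
    xbuf_t *reply = get_reply(argv);     // the servlet's reply buffer
    xbuf_cat(reply, "{\"done\":true}");  // naked JSON, no HTTP headers added
    return 1; // a status in the 1-99 range tells G-WAN to send the buffer as-is
}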

Nginx not failing over when used for load balancing

We're using Nginx to load balance between two upstream app servers and we'd like to be able to take one or the other down when we deploy to it. We're finding that when we shut one down, Nginx is not failing over to the other. It keeps sending requests and logging errors.
Our upstream directive has the form:
upstream app_servers {
    server 10.100.100.100:8080;
    server 10.100.100.200:8080;
}
Our understanding from reading the Nginx docs is that we don't need to explicitly specify "max_fails" or "fail_timeout" because they have reasonable defaults (i.e. max_fails of 1).
Any idea what we might be missing here?
Thanks much.
As per the documentation...
max_fails = NUMBER - number of unsuccessful attempts at communicating
with the server within the time period (assigned by parameter
fail_timeout) after which it is considered inoperative. If not set,
the number of attempts is one. A value of 0 turns off this check. What
is considered a failure is defined by proxy_next_upstream or
fastcgi_next_upstream (except http_404 errors which do not count
towards max_fails).
As per the documentation, a failure is defined by proxy_next_upstream or fastcgi_next_upstream.
It keeps sending requests and logging errors.
Please check the log to see what type of errors were logged; if it is not one of the defaults (error or timeout), then you may want to define it explicitly in proxy_next_upstream or fastcgi_next_upstream.
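For example, a minimal sketch (directive values are illustrative, not the poster's actual config):

upstream app_servers {
    server 10.100.100.100:8080 max_fails=1 fail_timeout=10s;
    server 10.100.100.200:8080 max_fails=1 fail_timeout=10s;
}

server {
    location / {
        # also fail over on 5xx responses, not only on connection
        # errors and timeouts (the defaults)
        proxy_next_upstream error timeout http_500 http_502 http_503;
        proxy_pass http://app_servers;
    }
}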

HTTP 504 timeout after exactly 120 seconds

I have a server application which runs in the Amazon EC2 cloud. From my client (the browser) I make an HTTP request which uploads a file to the server, which then processes the file. If there is a lot of processing (large file), the server always times out with a 504 backend continuation error after exactly 120 seconds. Though I get this error, the server continues to process the request and completes it (verified by checking the database), but I cannot see the final result on my client because of the timeout.
I am clueless as to why this is happening. Has anyone faced a similar 504 timeout? Is there some intermediate proxy server not under my control which is timing out?
I have a similar problem and in my case I believe it is due to the connection between the Elastic Load Balancer (ELB) and the EC2 instance.
For a long-term solution I will go with the 303 Status response + back-end processing suggested by james.garriss below.
For short-term solution it may be possible for Amazon support to increase the ELB timeout (see their response in https://forums.aws.amazon.com/thread.jspa?messageID=491594&#491594). Unfortunately there doesn't seem to be any way to change the timeout yourself through either API or console.
[Update] AWS now does allow you to update the idle timeout, either through the console, the CLI, or .ebextensions configuration. See http://docs.aws.amazon.com/ElasticLoadBalancing/latest/DeveloperGuide/config-idle-timeout.html (thanks @Daniel Patz for the update)
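A sketch of the CLI route for a classic ELB (the load balancer name and timeout value are hypothetical):

aws elb modify-load-balancer-attributes \
    --load-balancer-name my-load-balancer \
    --load-balancer-attributes "{\"ConnectionSettings\":{\"IdleTimeout\":300}}"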
Assuming that the correct status code is being returned, the problem is that an intermediate proxy is timing out. "The server, while acting as a gateway or proxy, did not receive a timely response from the upstream server specified by the URI." (http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.5) It most likely indicates that the origin server is having some sort of issue (i.e., taking a long time to process your request), so it's not responding quickly.
Perhaps the best solution is to re-craft your server app so that it responds with a "303 See Other" status code; then your client can retrieve the result at a later point, once the server is done processing and has created the final result.
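A minimal sketch of that pattern, assuming a Python/Flask back end (endpoint names and the in-memory job store are hypothetical):

import uuid
from threading import Thread
from flask import Flask, jsonify, redirect

app = Flask(__name__)
jobs = {}  # job_id -> result, or None while still processing

def process(job_id):
    # ... the long-running work goes here ...
    jobs[job_id] = {"status": "done"}

@app.route("/upload", methods=["POST"])
def upload():
    job_id = uuid.uuid4().hex
    jobs[job_id] = None
    Thread(target=process, args=(job_id,)).start()
    # reply immediately, well before any 120 s proxy timeout can fire
    return redirect(f"/result/{job_id}", code=303)

@app.route("/result/<job_id>")
def result(job_id):
    done = jobs.get(job_id)
    # 202 Accepted while the job is still running; the client retries later
    return (jsonify(done), 200) if done else ("", 202)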
Edit: Another idea is to re-craft your server app so that it responds with a "413 Request Entity Too Large" status code when the request entity size is too large. This will get rid of the error, though it may make your app less useful if it can only process "small" files.
Other possible solutions:
Increase timeout value of the proxy (if it's under your control)
Make your request to a different server (if there's another, faster server with the same app)
Make your request differently (if possible) such that you are sending less data at a time
It is possible that the browser times out during the script execution.

Long-running HTTP connection never gets response back

I am making an HTTP request which ends up taking more than 8 minutes. For me, this long-running request works fine; I am able to get a response back in my browser without any issues. (I am located on the same network as the server.)
However, for some users, the browser never returns any response. (Note: when the same HTTP request executes in 1 minute, these users are able to see the response without any problem.)
These users happen to be on another network. And there probably is a firewall or two between their location and the server.
I can see in their Fiddler trace that the request is just sitting there, waiting for a response.
Right now I am assuming that a firewall is killing the idle HTTP connection, but I am not sure.
If you have any idea why the response never gets back, or why the connection never breaks, it would be really helpful.
Also: is it possible to fix this issue by writing an applet that somehow manages to keep sending a dummy signal to the server, even after having sent (flushed) the request to the server?
The users might be behind a connection-tracking firewall/NAT gateway. Such gateways tend to drop the TCP connection when nothing has happened for a period of time. In a custom protocol you could send some kind of heartbeat messages to keep the TCP connection alive, but with HTTP you don't have proper control over that connection, nor does HTTP facilitate what's needed to keep a TCP connection "alive".
The usual way to handle long-running jobs initiated by an HTTP request is to fire off that job in the background, send a proper response back to the client immediately, and have an applet/ajax request poll the status of that job, returning the result when it's done.
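A minimal sketch of the client side of that pattern in Python (the URL, polling interval, and status codes are hypothetical, mirroring the 303 + polling sketch earlier in this document):

import time
import requests

# kick off the job; the server answers fast with a redirect to a status URL
resp = requests.post("http://example.com/upload", data=b"...",
                     allow_redirects=False)
status_url = resp.headers["Location"]

while True:
    r = requests.get("http://example.com" + status_url)
    if r.status_code == 200:   # finished: the result is in the body
        print(r.json())
        break
    time.sleep(5)              # still running (e.g. 202): try again later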
If you need a quick fix, see if you can control any timeouts on the gateways between the server and the user.
Have you considered that the users might be using a browser which has an HTTP timeout that causes the browser to stop waiting for a response after a certain amount of time?
http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/overview.html
http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/usingkeepalive.html
If you are using a Linux machine, try:
# cat /proc/sys/net/ipv4/tcp_keepalive_time      (seconds of idle time before the first keep-alive probe)
7200
# cat /proc/sys/net/ipv4/tcp_keepalive_intvl     (seconds between probes)
75
# cat /proc/sys/net/ipv4/tcp_keepalive_probes    (unanswered probes before the connection is dropped)
9
# echo 1500 > /proc/sys/net/ipv4/tcp_keepalive_time
# echo 500 > /proc/sys/net/ipv4/tcp_keepalive_intvl
# echo 20 > /proc/sys/net/ipv4/tcp_keepalive_probes
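Note that these sysctls only affect sockets that have keep-alive enabled. A minimal sketch of a client opting in and overriding the defaults per socket (Linux-only options; the host is hypothetical):

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)     # enable keep-alive
# Linux-specific per-socket overrides of the /proc defaults above
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)   # idle seconds before first probe
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)  # seconds between probes
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)     # probes before giving up
s.connect(("example.com", 80))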
