I have a small Python web application running on nginx with unicorn. The web application refreshes its page automatically every minute.
Every day, at around the same hour, the browser reports a 504 Gateway Time-out error and the application stops refreshing.
I checked it with both Chrome and Firefox on two different client machines and two different server machines, and found that it happens almost every day at the same time (a different time for each web server).
The weird thing is that when I look at the web server access log, I can identify these calls and they are reported with a 200 OK status code.
Could it be that the browser reports a different error code than the server due to connection issues? Any ideas on how I should keep investigating?
We found out that our server indeed had a maintenance procedure which blocked access to it. Although the server finished the request after a while, the browser "gave up" and returned a timeout error. Once the maintenance procedure was canceled, the issue was resolved.
Yes - the server is able to serve the page, so it logs 200, but the client cannot finish the connection.
It could be that part of your infrastructure (a firewall?) is choosing to update or something similar, although the odds of this happening at the exact time of your request are slim unless it's a long-running request or a gateway outage.
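To see how both views can be true at once, here is a minimal, self-contained sketch in Python. It is not the poster's setup: the handler, the 0.5 s stall and the 0.1 s client timeout are made-up stand-ins for the maintenance window and the browser/gateway timeout. The point is only that the server can record 200 for a request the client has already abandoned.

```python
import http.server
import socket
import threading
import time
import urllib.error
import urllib.request

SERVER_LOG = []            # status codes the server believes it served
DONE = threading.Event()   # set once the handler has finished


class SlowHandler(http.server.BaseHTTPRequestHandler):
    delay = 0.5  # hypothetical stall standing in for the maintenance job

    def do_GET(self):
        time.sleep(self.delay)
        body = b"ok"
        try:
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        except OSError:
            pass  # the client has already hung up
        SERVER_LOG.append(200)  # the server-side record still says 200
        DONE.set()

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet


def fetch(url, timeout):
    """Return the status the client sees, or a marker if it gave up."""
    # Bypass any proxy configured in the environment; this is a local demo.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status
    except (socket.timeout, urllib.error.URLError):
        return "client timeout"


def demo(delay=0.5, client_timeout=0.1):
    SERVER_LOG.clear()
    DONE.clear()
    SlowHandler.delay = delay
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), SlowHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_address[1]
    client_view = fetch(url, client_timeout)
    DONE.wait(timeout=5)  # let the server finish its side of the request
    server.shutdown()
    server.server_close()
    return client_view, list(SERVER_LOG)
```

Calling `demo()` with the defaults should show the client giving up while the server log still holds a 200; calling it with a generous client timeout should show both sides agreeing.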
My web app sits behind Nginx. Occasionally, loading my web page takes more than 10 seconds. I used Chrome DevTools to track the timing, and it looks like this:
The weird thing is that when the page loads slowly, the initial connection time is always 11 seconds. After this slow request, subsequent loads of the same page are very fast.
What could be causing this?
P.S. If this is caused by a resource limitation on my server, can I see some errors/warnings in some system log?
The initial connection refers to the time taken to perform the initial TCP handshake and negotiate SSL/TLS (where applicable). The slowness could be caused by congestion, where the server has hit a limit and can't respond to new connections while existing ones are pending. You could look into some performance enhancements in your Nginx configuration.
Use the dig command to check your domain name resolution.
If the answer section returns multiple records, check that each of those IPs is valid.
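One hedged way to run the same check from Python: list every address the resolver returns and time a bare TCP handshake to each. A record that fails, or only answers after several seconds, would be consistent with an occasional slow "initial connection". The function names and the 3 s timeout are my own choices, and the default port 443 assumes HTTPS.

```python
import socket
import time


def resolve_all(host, port=443):
    """Return the distinct IP addresses the resolver hands back for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def time_connect(ip, port, timeout=3.0):
    """Time a bare TCP handshake to one address; None means it failed."""
    start = time.monotonic()
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


def check_host(host, port=443):
    """Map each resolved address to its handshake time (or None)."""
    return {ip: time_connect(ip, port) for ip in resolve_all(host, port)}
```

For example, `check_host("example.com")` returns a dict keyed by IP; any `None` or unusually large value points at an address worth investigating.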
You should eliminate references to files that do not exist. I had the same issue at a customer site, where an image returning 404 caused the problem because it delayed the loading of other files.
I have a server application which runs in the Amazon EC2 cloud. From my client (the browser) I make an HTTP request which uploads a file to the server, which then processes the file. If there is a lot of processing (large file), the server always times out with a 504 backend continuation error after exactly 120 seconds. Although I get this error, the server continues to process the request and completes it (verified by checking the database), but I cannot see the final result on my client because of the timeout.
I am clueless as to why this is happening. Has anyone faced a similar 504 timeout? Is there some intermediate proxy server not in my control which is timing out?
I have a similar problem and in my case I believe it is due to the connection between the Elastic Load Balancer (ELB) and the EC2 instance.
For a long-term solution I will go with the 303 Status response + back-end processing suggested by james.garriss above.
For a short-term solution it may be possible for Amazon support to increase the ELB timeout (see their response in https://forums.aws.amazon.com/thread.jspa?messageID=491594). Unfortunately there doesn't seem to be any way to change the timeout yourself through either the API or the console.
[Update] AWS now does allow you to update the idle timeout either through console, CLI or .ebextensions configuration. See http://docs.aws.amazon.com/ElasticLoadBalancing/latest/DeveloperGuide/config-idle-timeout.html (thanks #Daniel Patz for the update)
Assuming that the correct status code is being returned, the problem is that an intermediate proxy is timing out. "The server, while acting as a gateway or proxy, did not receive a timely response from the upstream server specified by the URI." (http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.5) It most likely indicates that the origin server is having some sort of issue (i.e., taking a long time to process your request), so it's not responding quickly.
Perhaps the best solution is to re-craft your server app so that it responds with a "303 See Other" status code; then your client can retrieve the data at a later point, once the server is done processing and has created the final result.
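A framework-free sketch of that idea, under assumptions: `process_file`, the `/jobs/<id>` URL scheme and the in-memory `JOBS` dict are all hypothetical, and a real deployment would want a proper job queue and persistent storage. The upload handler answers at once with 303 and a `Location`, and the client fetches that location later.

```python
import threading
import time
import uuid

JOBS = {}                     # job id -> {"state": ..., "result": ...}
JOBS_LOCK = threading.Lock()  # handlers and workers share JOBS


def process_file(data):
    """Hypothetical stand-in for the slow processing step."""
    time.sleep(0.05)
    return len(data)


def _run(job_id, data):
    result = process_file(data)
    with JOBS_LOCK:
        JOBS[job_id] = {"state": "done", "result": result}


def handle_upload(data):
    """Start the work in the background and answer immediately."""
    job_id = uuid.uuid4().hex
    with JOBS_LOCK:
        JOBS[job_id] = {"state": "pending", "result": None}
    threading.Thread(target=_run, args=(job_id, data), daemon=True).start()
    # 303 See Other points the client at the resource to fetch later.
    return 303, {"Location": "/jobs/" + job_id}


def handle_status(job_id):
    """What a GET on the Location above would return."""
    with JOBS_LOCK:
        job = JOBS.get(job_id)
    if job is None:
        return 404, {}
    return 200, dict(job)
```

Because the upload request returns right away, no proxy between the client and the server has to hold a connection open for the full processing time.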
Edit: Another idea is to re-craft your server app so that it responds with a "413 Request Entity Too Large" status code when the request entity is too large. This will get rid of the error, though it may make your app less useful if it can only process "small" files.
Other possible solutions:
Increase timeout value of the proxy (if it's under your control)
Make your request to a different server (if there's another, faster server with the same app)
Make your request differently (if possible) such that you are sending less data at a time
It is possible that the browser times out during script execution.
I have an issue with a long-running page. The ASP.NET page takes about 20 minutes to be generated and served to the browser.
The server successfully completes the response (according to logs and the ASP.NET web trace) and, I assume, sends it to the browser. However, the browser never receives the page. It (both IE8 and Firefox 3) keeps spinning and spinning (I've let it run for several hours, and nothing happens).
This issue only appears on the shared host server. When I run the same app on dev machine or internal server, everything works fine.
I've tried fiddler and packet sniffing and it looks like server doesn't send anything back. It doesn't even send keep-alive packets. Yet both browsers I've tried don't time out after pre-defined timeout period (1 hour in IE I believe, not sure what it is in Firefox).
The last packet server sends back is ACK to the POST from the browser.
I've tried this from different client machines, to ensure it's not a broken configuration on my machine.
How can I further diagnose this problem? Why doesn't the browser time out, even though there are no keep-alive packets?
p.s. server is Windows 2003, so IIS6. It used to work fine on shared hosting, but they've changed something (when they moved to new location) and it broke. Trying to figure out what.
p.p.s. I know I can change page design to avoid page taking this long to get served. I will do this, but I would also like to find the cause of this problem. I'd like to stay focused on this issue and avoid possible alternative designs for the page (using AJAX or whatever else).
Check the server's connection timeout (on the Web Site properties page).
A better approach would be to send the request, start the calculation on the server, and serve a page with a Javascript timer that keeps sending requests to itself. Upon post-back, this page checks whether the server process has completed. While the process is still running, it responds with another timer. Once it has completed, it redirects to the results.
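The client side of that loop can be sketched as follows. The answer describes a JavaScript timer in the page; this is the same idea written in Python purely for illustration. `check` is a hypothetical callable that asks the server whether the job has finished and returns None while it is still running.

```python
import time


def poll_until_done(check, interval=5.0, timeout=1800.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns something other than None.

    Each call is a short request of its own, so no single connection
    has to stay open for the whole computation.
    """
    deadline = clock() + timeout
    while True:
        result = check()
        if result is not None:
            return result  # e.g. the URL of the finished report
        if clock() >= deadline:
            raise TimeoutError("job did not finish in time")
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting; the 5 s interval and 30 minute limit are arbitrary defaults.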
Would you clarify how you fixed this problem? It seems I have the same one. I'm getting .docx report and both browsers (IE10 and Fx37) indefinitely wait for a response (keep on spinning).
But it works great in VS2012's IIS Express and on my localhost IIS7.
Although server has IIS6.1
And when it works on localhost, it's just several minutes to get a report.
I have an IIS7 web application which primarily serves web service requests. As part of our solution we have two web servers and a load balancer, and the load balancer requests a page from each of its load balanced boxes periodically. The page the load balancer loads is named "Health.aspx", it does not have a code behind, and the entire contents of the file Health.aspx is:
OK
However we are observing occasional 400 errors from the load balancer when requesting this page throughout the day which causes the load balancer to eject the machine from its rotation for a period of time.
While there are potentially a number of things which could be causing a problem here I wanted to start at the web boxes themselves and determine whether a nearly empty *.aspx page with no code behind could cause periodic problems.
You should create Health.html and just use that instead. That will also give you some idea of whether it's ASP.NET giving you the 400 problems or IIS.
"400 errors" covers a wide range of potential issues, from a simple 404 Not Found to 400 Bad Request, 401 Unauthorized, and 403 Forbidden.
Some of these could be caused by bad network cables, nics, lousy load balancer, overloaded server(s), etc.
Incidentally, no, a web page with no code behind is not going to throw that. Also, I'd probably do a little more in Health.aspx to verify that the server is actually functioning, like running a calculation or making a simple database request. After all, IIS could simply be caching the file.
Turns out that the problem was just the sheer number of requests and an untimely response time (or really request time - our client connections are very slow) for handling the requests. Requests for Health.aspx were getting caught up with slow client connections and the default MaxConcurrentRequestsPerCPU of 12 was artificially limiting our actual number of requests/second. Increasing this number to 100 - based on carefully testing our hardware against load for our particular application - solved the problem.
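A back-of-the-envelope version of why a concurrency cap limits throughput (Little's law: throughput = requests in flight / time per request). The 4 CPUs and the 3 s average request time below are invented numbers for illustration, not measurements from this system.

```python
def max_requests_per_second(limit_per_cpu, cpu_count, avg_request_seconds):
    """Upper bound on throughput when at most limit_per_cpu * cpu_count
    requests can run at once and each one takes avg_request_seconds."""
    if avg_request_seconds <= 0:
        raise ValueError("avg_request_seconds must be positive")
    return limit_per_cpu * cpu_count / avg_request_seconds


# Hypothetical box: 4 CPUs, slow clients keep each request open ~3 s.
before = max_requests_per_second(12, 4, 3.0)   # cap of 12 per CPU
after = max_requests_per_second(100, 4, 3.0)   # cap raised to 100 per CPU
```

With these made-up numbers the ceiling is 16 requests/second at a cap of 12, so a quick health check can end up waiting behind slow requests even though the CPUs are mostly idle.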
Any thoughts on why I might be getting tons of "hangs" when trying to download a file via HTTP, based on the following?
Server is IIS 6
File being downloaded is a binary file, rather than a web page
Several clients hang, including TrueUpdate and FlexNet web updating packages, as well as custom .NET app that just does basic HttpWebRequest/HttpWebResponse logic and downloads using a response stream
IIS log file signature when success is 200 0 0 (sc-status sc-substatus sc-win32-status)
For failure, error signature is 200 0 64
sc-win32-status of 64 is "the specified network name is no longer available"
I can point firefox at the URL and download successfully every time (perhaps some retry logic is happening under the hood)
At this point, it seems like either there's something funky with my server that it's throwing these errors, or that this is just normal network behavior and I need to use (or write) a client that is more resilient to the failures.
Any thoughts?
Perhaps your issue was a low level networking issue with the ISP as you speculated in your reply comment. I am experiencing a similar problem with IIS and some mysterious 200 0 64 lines appearing in the log file, which is how I found this post. For the record, this is my understanding of sc-win32-status=64; I hope someone can correct me if I'm wrong.
sc-win32-status 64 means "The specified network name is no longer available."
After IIS has sent the final response to the client, it waits for an ACK message from the client.
Sometimes clients will reset the connection instead of sending the final ACK back to the server. This is not a graceful connection close, so IIS logs the "64" code to indicate an interruption.
Many clients will reset the connection when they are done with it, to free up the socket instead of leaving it in TIME_WAIT/CLOSE_WAIT.
Proxies may have a tendency to do this more often than individual clients.
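To see how often this happens, one option is to tally the (sc-status, sc-win32-status) pairs in the log. This sketch assumes the W3C extended log format with a `#Fields:` header line and space-separated values; the sample lines are invented.

```python
from collections import Counter


def win32_status_counts(lines):
    """Count (sc-status, sc-win32-status) pairs in W3C-format log lines."""
    fields = []
    counts = Counter()
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns for the rows that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        counts[(row.get("sc-status"), row.get("sc-win32-status"))] += 1
    return counts


SAMPLE = [
    "#Fields: date time cs-method cs-uri-stem sc-status sc-substatus sc-win32-status",
    "2015-01-01 10:00:00 GET /file.bin 200 0 0",
    "2015-01-01 10:00:05 GET /file.bin 200 0 64",
    "2015-01-01 10:00:09 GET /file.bin 200 0 64",
]
counts = win32_status_counts(SAMPLE)
```

Comparing the ("200", "64") count per client IP or user agent may help tell a few misbehaving clients or proxies apart from a server-wide problem.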
I've spent two weeks investigating this issue. For me I had the scenario in which intermittent random requests were being prematurely terminated. This was resulting in IIS logs with status code 200, but with a win32-status of 64.
Our infrastructure includes two Windows IIS servers behind two NetScaler load balancers in HA mode.
In my particular case, the problem was that the NetScaler had a feature called "Integrated Caching" turned on (http://support.citrix.com/proddocs/topic/ns-optimization-10-5-map/ns-IC-gen-wrapper-10-con.html).
After disabling this feature, the request interruptions ceased and the site operated normally. I'm not sure how or why this was causing a problem, but there it is.
If you use a proxy or a load balancer, do some investigation of what features they have turned on. For me the cause was something between the client and the server interrupting the requests.
I hope that this explanation will at least save someone else's time.
Check the headers from the server, especially Content-Type and Content-Length. It's possible that your clients don't recognize the format of the binary file and hang while waiting for bytes that never come, or that they close the underlying TCP connection, which may cause IIS to log the win32 status 64.
Spent three days on this.
It was the timeout that was set to 4 seconds (curl php request).
Solution was to increase the timeout setting:
//curl_setopt($ch, CURLOPT_TIMEOUT, 4); // times out after 4s
curl_setopt($ch, CURLOPT_TIMEOUT, 60); // times out after 60s
You will have to use Wireshark or Network Monitor to gather more data on this problem, I think.
I suggest you put Fiddler in between your server and your download client. This should reveal the differences between Firefox and the other clients.
Description of all sc-win32-status codes for reference
https://learn.microsoft.com/en-us/windows/win32/debug/system-error-codes--0-499-
ERROR_NETNAME_DELETED
64 (0x40)
The specified network name is no longer available.