Increasing 504 timeout error - nginx

Is there any way I can increase the 504 gateway timeout? If so, how, and where is the file I need to change located? I am using nginx on CentOS 6.

Depending on the kind of gateway you have, you should use something like:
proxy_read_timeout 600s;
Check docs: http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_read_timeout
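For example, a minimal sketch of where the directive would go in a reverse-proxy setup (the backend address and server_name here are placeholders, not taken from the question):

server {
    listen 80;
    server_name example.com;

    location / {
        proxy_pass http://127.0.0.1:8080;   # placeholder upstream
        proxy_read_timeout 600s;            # allow the upstream up to 10 minutes to respond
    }
}

After changing it, reload nginx (for example with nginx -s reload) so the new timeout takes effect.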

If it is a fastcgi timeout error, then you need to increase the fastcgi_read_timeout.
# /etc/nginx/conf.d/example.com.conf
server {
    location ~ \.(php)$ {
        fastcgi_pass unix:/var/run/php74-example.com.sock;
        fastcgi_read_timeout 300s;
    }
}
Error log showing the upstream timing out:
# tail -f example.com.error.log
2020/12/29 14:51:42 [error] 30922#30922:
*9494 upstream timed out (110: Connection timed out) while reading response header from upstream,
...
upstream: "fastcgi://unix:/var/run/php74-example.com.sock",
...
From the nginx manual:
Default: fastcgi_read_timeout 60s;
Context: http, server, location
http://nginx.org/en/docs/http/ngx_http_fastcgi_module.html#fastcgi_read_timeout
Result of calling a script that runs for longer than 60 seconds, as seen in Chrome DevTools: one screenshot with the default 60s timeout, one with fastcgi_read_timeout 300s.

Related

Nginx facing problem when proxy_pass contains underscores

I have an Nginx server which redirects requests to external sites.
The location configuration is below:
location ~* ^\/h\/([a-zA-Z0-9.-]+)\/(.*) {
    proxy_connect_timeout 20s;
    set $remote_host $1;
    proxy_set_header Host $remote_host;
    proxy_pass http://$remote_host/$2;
}
So, the server receives a request like:
http://localhost/h/test.com/testinguri.txt
and the proxy pass will be http://test.com/testinguri.txt
Everything works fine until we use two underscores, at which point the server responds with 502:
curl -I http://localhost/h/test.com/testing_ur_i.txt
HTTP/1.1 502 Bad Gateway
Server: nginx/1.10.3
Content-Type: text/html
Content-Length: 173
Connection: keep-alive
The error log with debug mode:
2020/11/05 00:00:11 [error] 40859#40859: *66328156 upstream prematurely closed connection while reading response header from upstream, client: ****, server: _, request: "GET /h/test.com/testing_ur_i.txt HTTP/1.1", upstream:
The request fails in less than one second, so it is not a timeout problem, and the remote site works correctly. (I changed the real domain name to test.com for security reasons.)
Thank you
The given error actually says that the connection was closed by your backend (test.com in your example).
Please try adding this to your location section:
proxy_read_timeout 300s;
proxy_connect_timeout 75s;
https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_read_timeout
So, it wasn't related to the use of underscores.
The problem was the HTTP version supported by the remote site!
Setting the proxied HTTP version to 1.1 fixed the issue:
proxy_http_version 1.1;
Thank you.
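For reference, the asker's location block with that fix applied would look roughly like this (assembled from the snippets above, nothing else changed):

location ~* ^\/h\/([a-zA-Z0-9.-]+)\/(.*) {
    proxy_connect_timeout 20s;
    set $remote_host $1;
    proxy_set_header Host $remote_host;
    proxy_http_version 1.1;            # the fix: talk HTTP/1.1 to the remote site
    proxy_pass http://$remote_host/$2;
}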

Performance issues with .NET Core API and nginx as reverse proxy

Description:
We use nginx as a reverse proxy that splits the load across multiple backend servers running a GraphQL interface built with HotChocolate (.NET Core 3.1). The GraphQL interface then triggers an Elasticsearch call (using the official NEST library).
Problem:
We start the system with 4 backend servers and one nginx reverse proxy and load test it with JMeter. That works absolutely great. It also performs well when we kill the 4th pod.
The problem only appears when we have two (or one) pods left: nginx starts to return only errors and no longer splits the load across the remaining servers.
What we tried:
We thought that the Elasticsearch query was performing badly, which in turn could block the backend. But when we execute the query against Elasticsearch directly, we get much higher performance, so that should not be the problem.
Our second approach was that the graphql.net library was bogus, so we replaced it with HotChocolate, which had no effect at all.
When we replace the GraphQL interface with a REST API interface, it IS suddenly working?!
We played around with the nginx config but couldn't find settings that actually fixed it.
We replaced the nginx with traefik, but also the same result.
What we discovered is that as soon as we kill pod three, the number of ESTABLISHED connections on the reverse proxy suddenly doubles (without any additional incoming traffic). => Maybe it is waiting for those connections to time out and is blocking any additional incoming ones?!
We very much appreciate any help.
Thank you very much.
If you want/need to know anything else, please let me know!
Update: I made some changes to the code. It's working better now, but when scaling from one pod to two we see a performance drop of about 20-30 seconds. Why is that? How can we improve it?
We get this error from nginx:
[warn] 21#21: *9717 upstream server temporarily disabled while reading response header from upstream, client: 172.20.0.1, server: dummy.prod.com, request: "POST /graphql HTTP/1.1", upstream: "http://172.20.0.5:5009/graphql", host: "dummy.prod.com:81"
nginx_1 | 2020/09/14 06:01:24 [error] 21#21: *9717 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 172.20.0.1, server: dummy.prod.com, request: "POST /graphql HTTP/1.1", upstream: "http://172.20.0.5:5009/graphql", host: "dummy.prod.com:81"
nginx_1 | 172.20.0.1 - - [14/Sep/2020:06:01:24 +0000] "POST /graphql HTTP/1.1" 504 167 "-" "Apache-HttpClient/4.5.12 (Java/14)" "-"
nginx config:
upstream dummy {
    server dummy0:5009 max_fails=3 fail_timeout=30s;
    server dummy1:5009 max_fails=3 fail_timeout=30s;
    server dummy3:5009 max_fails=3 fail_timeout=30s;
    server dummy2:5009 max_fails=3 fail_timeout=30s;
}

server {
    listen 80;
    server_name dummy.prod.com;

    location / {
        proxy_connect_timeout 180s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
        proxy_buffering off;
        proxy_buffer_size 128k;
        proxy_buffers 4 128k;
        proxy_max_temp_file_size 1024m;
        proxy_request_buffering off;
        proxy_http_version 1.1;
        proxy_cookie_domain off;
        proxy_cookie_path off;
        # In case of errors try the next upstream server before returning an error
        proxy_next_upstream error timeout;
        proxy_next_upstream_timeout 0;
        proxy_next_upstream_tries 3;
        proxy_pass http://dummy;
    }
}
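For context, here is how the failover-related directives in that config interact (a commented restatement of the values above, not a proposed fix):

upstream dummy {
    # After 3 failed attempts within 30s, a server is marked unavailable
    # and is skipped for the following 30s before nginx tries it again.
    server dummy0:5009 max_fails=3 fail_timeout=30s;
    # ... same for the other backends
}

location / {
    # On an error or timeout, retry the request on another upstream server,
    # limited to 3 tries, with no overall time limit on the retries.
    proxy_next_upstream error timeout;
    proxy_next_upstream_timeout 0;
    proxy_next_upstream_tries 3;
    proxy_pass http://dummy;
}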

nginx on WSL2 takes minutes to load the page after a few requests

I read and tried the following:
https://stackoverflow.com/a/46286973/8068675 listen = 127.0.0.1:9000;
https://stackoverflow.com/a/50615652/8068675 disable buffering
https://github.com/microsoft/WSL/issues/393#issuecomment-442498519 disable buffering + different config
But none of these fixed the issue.
Issue
From Windows, when I browse my website located in WSL2 through http://myproject.test, https://myproject.test, or 127.0.0.1, the first 2-3 requests are fast (<100 ms). Then the next requests take exactly 60000 ms (1 minute) to be received, when they are not blocked entirely.
Configuration on Windows 10
Firewall disabled
127.0.0.1 myproject.test added to C:\Windows\System32\drivers\etc\hosts
mkcert installed
WSL 2 installed with Ubuntu 20.04 on it
Configuration on WSL 2
Ubuntu 20.04
nginx 1.18
mysql 8.0
php-fpm 7.4
Project
Laravel
location: /home/clement/projects/myproject/
certs (generated with mkcert): /home/clement/projects/certs/
owner: clement:www-data
permissions: 777 (only for test and development purposes)
/etc/nginx/sites-available/myproject.test
server {
    listen 80;
    listen [::]:80;
    listen 443 ssl;
    listen [::]:443 ssl;

    ssl_certificate /home/clement/projects/certs/myproject.test.pem;
    ssl_certificate_key /home/clement/projects/certs/myproject.test-key.pem;

    client_max_body_size 108M;
    access_log /var/log/nginx/application.access.log;

    server_name myproject.test;
    root /home/clement/projects/myproject/public;
    index index.php;

    if (!-e $request_filename) {
        rewrite ^.*$ /index.php last;
    }

    location ~ \.php$ {
        fastcgi_buffering off;
        fastcgi_pass unix:/var/run/php/php7.4-fpm.sock;
        fastcgi_index index.php;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_param PHP_VALUE "error_log=/var/log/nginx/application_php_errors.log";
        include fastcgi_params;
    }
}
I have the same issue using fastcgi_pass 127.0.0.1:9000; or fastcgi_pass unix:/var/run/php/php7.4-fpm.sock;
When I run php -S localhost:8080 or php artisan serve in the project, everything works fine.
Edit with log
This is the log I'm getting from nginx, but even with this information I still cannot find any resource that fixes the issue.
2020/08/07 23:06:30 [error] 1987#1987: *6 upstream timed out (110: Connection timed out) while reading upstream, client: 127.0.0.1, server: myproject.test, request: "GET /_debugbar/assets/javascript?v=1588748787 HTTP/1.1", upstream: "fastcgi://unix:/run/php/php7.4-fpm.sock:", host: "myproject.test", referrer: "http://myproject.test/"
Or using IP
2020/08/08 01:43:01 [error] 4080#4080: *4 upstream timed out (110: Connection timed out) while reading upstream, client: 127.0.0.1, server: myproject.test, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "myproject.test"
I finally found the issue. Although I had followed the instructions to install WSL 2, it was actually running WSL 1.
In PowerShell I ran wsl -l -v and got this result:
  NAME            STATE      VERSION
  Ubuntu-20.04    Stopped    1
After updating the kernel, I could change the version to 2 with the command
wsl --set-version Ubuntu-20.04 2
and now everything works well.

Random "502 Error Bad Gateway" in Amazon Red Hat (Not Ubuntu) - Nginx + PHP-FPM

First of all, I already searched for the 502 error on Stack Overflow. There are a lot of threads, but the difference this time is that the error appears without a pattern and it's not on Ubuntu.
Everything works perfectly, but about once a week my site shows: 502 Bad Gateway.
After this first error, every connection starts showing this message. Restarting MySQL + PHP-FPM + Nginx + Varnish doesn't work.
I have to clone this instance and make another one to get my site up again (it is hosted on Amazon EC2).
The Nginx log shows this line again and again:
[error] 16773#0: *7034 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 127.0.0.1
There is nothing in the MySQL or Varnish logs, but PHP-FPM shows these kinds of lines:
WARNING: [pool www] child 18978, script '/var/www/mysite.com/index.php' (request: "GET /index.php") executing too slow (10.303579 sec), logging
WARNING: [pool www] child 18978, script '/var/www/mysite.com/index.php' (request: "GET /index.php") execution timed out (16.971086 sec), terminating
Inside the PHP-FPM slow log it was showing:
[pool www] pid 20401
script_filename = /var/www/mysite.com/index.php
w3_require_once() /var/www/mysite.com/wp-content/plugins/w3-total-cache/inc/define.php:1478
(Inside the file "define.php" at line number 1478, it has this line of code: require_once $path;)
I thought the problem was with the W3 Total Cache plugin, so I removed W3 Total Cache.
About 5 days later it happened again, with this error in the PHP-FPM slow log:
script_filename = /var/www/mysite.com/index.php
wpcf7_load_modules() /var/www/mysite.com/wp-content/plugins/contact-form-7/includes/functions.php:283
(Inside the file "functions.php" at line number 283, it has this line of code: include_once $file;)
The other day, the first error occurred in another part:
script_filename = /var/www/mysite.com/wp-cron.php
curl_exec() /var/www/mysite.com/wp-includes/class-http.php:1510
And again in a different part of the code:
[pool www] pid 20509
script_filename = /var/www/mysite.com/index.php
mysql_query() /var/www/mysite.com/wp-includes/wp-db.php:1655
CPU, RAM... everything is stable when this error occurs (less than 20% usage).
I tried everything, but nothing worked:
Moved to a better server (CPU and RAM)
Decreased the timeouts in Nginx, PHP-FPM, MySQL (my page loads quickly, so I decreased the timeouts to kill any outlier process)
Changed the number of PHP-FPM spare servers
Changed a lot of configuration in Nginx and PHP-FPM
I know that there is a bug with PHP-FPM and Ubuntu that could cause this error, but I don't think that bug exists on Amazon instances (Red Hat). (And I don't want to switch PHP-FPM from TCP to Unix sockets, because I've read that sockets don't work well under heavy load.)
This has been happening about once a week for the last 5 months. I'm desperate.
I got to the point that I even put Nginx and PHP-FPM in Linux's crontab to restart these services every day, but that didn't work either.
Does anyone have any suggestions on how I can solve this problem? Anything will help!!
Server:
Amazon c3.large (2 cores and 3.75 GB RAM)
Linux Amazon Red Hat 4.8.2, 64-bit
PHP-FPM:
listen = 127.0.0.1:9000
listen.allowed_clients = 127.0.0.1
listen.mode = 0664
pm = ondemand
pm.max_children = 480
pm.start_servers = 140
pm.min_spare_servers = 140
pm.max_spare_servers = 250
pm.max_requests = 50
request_terminate_timeout = 15s
request_slowlog_timeout = 10s
php_admin_flag[log_errors] = on
Nginx:
worker_processes 2;

events {
    worker_connections 2048;
    multi_accept on;
    use epoll;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    access_log off;
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    types_hash_max_size 2048;
    server_tokens off;
    client_max_body_size 8m;
    reset_timedout_connection on;
    index index.php index.html index.htm;
    keepalive_timeout 1;
    proxy_connect_timeout 30s;
    proxy_send_timeout 30s;
    proxy_read_timeout 30s;
    fastcgi_send_timeout 30s;
    fastcgi_read_timeout 30s;

    server {
        listen 127.0.0.1:8080;

        location ~ \.php$ {
            try_files $uri =404;
            include fastcgi_params;
            fastcgi_index index.php;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            fastcgi_keep_conn on;
            fastcgi_pass 127.0.0.1:9000;
            fastcgi_param HTTP_HOST $host;
        }
    }
}
I would start by tuning some configuration parameters.
PHP-FPM
I think that your pm values are somewhat off, a bit higher than I've normally seen configured on servers around your specs... but you say that memory consumption is normal, so that's kind of weird.
Anyway... for pm.max_children = 480, considering that by default WordPress raises the memory limit to 40 MB, you could end up using roughly 19 GB of memory (480 × 40 MB), so you definitely want to lower that.
Check the fourth part on this post for more info about that: http://www.if-not-true-then-false.com/2011/nginx-and-php-fpm-configuration-and-optimizing-tips-and-tricks/
If you're using... let's say 512 MB for nginx, MySQL, Varnish and other services, you would have about 3328 MB left for php-fpm... divided by 40 MB per process, pm.max_children should be about 80... but even 80 is very high.
It's probable that you can also lower the values of pm.start_servers, pm.min_spare_servers and pm.max_spare_servers. I prefer to keep them low and only increase them if necessary.
For pm.max_requests you should keep the default of 500 to avoid server respawns. I think it's only advisable to lower it if you suspect memory leaks.
Nginx
Change keepalive_timeout to 60 to make better use of keep-alive.
Other than that, I think everything looks normal.
I had this issue on Ubuntu, but request_terminate_timeout in PHP-FPM and fastcgi_send_timeout + fastcgi_read_timeout in nginx were enough to get rid of it.
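On the nginx side that means something along these lines in the PHP location block (a sketch with illustrative values; pick timeouts slightly above your slowest legitimate request):

location ~ \.php$ {
    fastcgi_pass 127.0.0.1:9000;
    fastcgi_send_timeout 60s;   # how long nginx waits while sending the request to PHP-FPM
    fastcgi_read_timeout 60s;   # how long nginx waits for PHP-FPM to produce a response
}

The matching limit on the PHP-FPM side is request_terminate_timeout in the pool configuration; keeping the two in the same ballpark avoids one side killing requests the other would still accept.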
I hope you can fix it!

PHP-FPM - upstream prematurely closed connection while reading response header

I already saw this same question - upstream prematurely closed connection while reading response header from upstream, client
But as Jhilke Dai said, it was not solved at all, and I agree.
I got the same exact error on an nginx + PHP-FPM installation. Current software versions: nginx 1.2.8, PHP 5.4.13 (cli) on FreeBSD 9.1. I have somewhat isolated this error and am sure it happens when trying to import large files (larger than 3 MB) into MySQL via phpMyAdmin. I also noticed that the backend closes the connection when a 30-second limit is reached.
The nginx error log throws this:
[error] 49927#0: *196 upstream prematurely closed connection while reading response header from upstream, client: 7X.XX.X.6X, server: domain.com, request: "POST /php3/import.php HTTP/1.1", upstream: "fastcgi://unix:/tmp/php5-fpm.sock2:", host: "domain.com", referrer: "http://domain.com/phpmyadmin/db_import.php?db=testdb&server=1&token=9ee45779dd53c45b7300545dd3113fed"
My php.ini limits were raised accordingly:
upload_max_filesize = 200M
default_socket_timeout = 60
max_execution_time = 600
max_input_time = 600
The related my.cnf limit:
max_allowed_packet = 512M
FastCGI limits:
location ~ \.php$ {
    # fastcgi_split_path_info ^(.+\.php)(.*)$;
    fastcgi_pass unix:/tmp/php5-fpm.sock2;
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_param SCRIPT_NAME $fastcgi_script_name;
    fastcgi_intercept_errors on;
    fastcgi_ignore_client_abort on;
    fastcgi_connect_timeout 60s;
    fastcgi_send_timeout 200s;
    fastcgi_read_timeout 200s;
    fastcgi_buffer_size 128k;
    fastcgi_buffers 8 256k;
    fastcgi_busy_buffers_size 256k;
    fastcgi_temp_file_write_size 256k;
}
I tried changing the FastCGI timeouts as well as the buffer sizes; that did not help.
The PHP error log doesn't show the problem; I enabled all notices and warnings and found nothing useful.
I also tried disabling APC, with no effect.
I had this same issue: I got 502 Bad Gateway frequently and randomly on my development machine (OSX + nginx + php-fpm), and solved it by changing some parameters in /usr/local/etc/php/5.6/php-fpm.conf.
I had these settings:
pm = dynamic
pm.max_children = 10
pm.start_servers = 3
pm.max_spare_servers = 5
... and changed them to:
pm = dynamic
pm.max_children = 10
pm.start_servers = 10
pm.max_spare_servers = 10
... and then restarted the php-fpm service.
These settings are based on what I found here: https://bugs.php.net/bug.php?id=63395
How long does your script take to run? Try setting HUGE timeouts in both PHP and Nginx and monitor your system during the request. Then tune your values to optimise performance.
Also, lower the log level in PHP-FPM; maybe there is some warning, info or debug trace that can give you some information.
Finally, be careful with the number of children and processes available in PHP-FPM. Maybe Nginx is starving, waiting for a PHP-FPM child to become available.
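As a concrete sketch of the "huge timeouts" diagnostic step on the nginx side (deliberately oversized, temporary values just to observe how long the import really runs; the socket path is the one from the question):

location ~ \.php$ {
    fastcgi_pass unix:/tmp/php5-fpm.sock2;
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    # temporary, deliberately huge values for diagnosis only
    fastcgi_send_timeout 3600s;
    fastcgi_read_timeout 3600s;
}

Remember that PHP itself (max_execution_time) and PHP-FPM (request_terminate_timeout) can still terminate the script first, so those limits have to be raised for the test as well.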
