PHP 5.5, NGINX and Memcached - 502 Error - nginx

I'm having a problem with Memcached pools. I will try to add all the context of the error to see if you guys can help me with this.
Context:
PHP 5.5.17 (cgi-fcgi) (built: Sep 24 2014 20:38:04)
php-pecl-memcache-3.0.8-2.fc17.remi.5.5.x86_64
nginx version:
nginx/1.0.15
My problem:
I create a connection to memcached and save several keys, on just one server at first, something like this:
$_memcache = new Memcache;
$_memcache->addServer("127.0.0.1", "11211", true, 50, 3600, 45);
Suppose I add several keys on that server. I can read them back without any problem: when I load my site and my code requests the keys, it gets them.
Now the problem. With those keys already saved and working, I added another memcached server to the pool, like this:
$_memcache = new Memcache;
$_memcache->addServer("10.0.0.2", "11211", true, 50, 3600, 45);
$_memcache->addServer("10.0.0.3", "11211", true, 50, 3600, 45);
But before refreshing the site to run my code and fetch the keys stored on the first server, I stopped memcached on server number 1 (10.0.0.2). I then refreshed the site and received a 502 error (Bad Gateway).
The error I see in the NGINX log is:
[error] 9364#0: *329504 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: XX.XX.XXX.XXX, server: _, request: "GET / HTTP/1.1", upstream: "fastcgi://unix:/run/php-fastcgi/sock:", host: "www.myhost.com"
So why am I getting that error? My only theory is that, because the connection is persistent, it is for some reason not closed properly when I stop the memcached server. But that only happens when the pool has more than one server. If I use just that one server and stop it, I don't see the error.
Is it possible that PHP 5.5 has a bug with the FastCGI socket that I don't know about?
NOTE: This problem did not happen on the previous PHP version, 5.3. It started after the upgrade.
NOTE: Without persistent connections it seems to work, but this site has huge traffic and would not handle that many open connections. I tested this in a dev environment.
Any help or any suggestions are more than welcome.
Thanks in advance!

Related

GCP deployment with nginx - uwsgi - flask fails

I have a very simple Flask app deployed on GKE and exposed via a Google external load balancer. I'm getting random 502 responses from the backend-service. (I added custom headers on both the backend-service and nginx to identify the source, and I can see the backend-service's header but not nginx's.)
The setup is:
LB -> backend-service -> neg -> pod (nginx -> uwsgi) where pod is the application built using flask and deployed via uwsgi and nginx.
The scenario is to handle image uploads in a simple, secured way. The sender sends me a token with the upload request.
My Flask app:
receives the request and checks the sent token via another service using "requests".
If the token is valid, it proceeds to handle the image and returns 200.
If the token is not valid, it stops and sends back a 401 response.
At first I was suspicious of the mix of 200 and 401 responses, so I changed all responses to 200. After a number of the expected responses, the server starts responding with 502 and keeps sending it. Some of the messages at the very beginning succeed.
The nginx error log contains the line below:
2023/02/08 18:22:29 [error] 10#10: *145 readv() failed (104: Connection reset by peer) while reading upstream, client: 35.191.17.139, server: _, request: "POST /api/v1/imageUpload/image HTTP/1.1", upstream: "uwsgi://127.0.0.1:21270", host: "example-host.com"
My uwsgi.ini file is as below:
[uwsgi]
socket = 127.0.0.1:21270
master
processes = 8
threads = 1
buffer-size = 32768
stats = 127.0.0.1:21290
log-maxsize = 104857600
logdate
log-reopen
log-x-forwarded-for
uid = image_processor
gid = image_processor
need-app
chdir = /server/
wsgi-file = image_processor_application.py
callable = app
py-auto-reload = 1
pidfile = /tmp/uwsgi-imgproc-py.pid
My nginx.conf is as below:
location ~ ^/api/ {
    client_max_body_size 15M;
    include uwsgi_params;
    uwsgi_pass 127.0.0.1:21270;
}
Lastly, my app has a healthcheck method with simple JSON response. It does no extra stuff and simply returns. This never fails as explained above.
Edit: my nginx access logs in the pod show the response as 401 while the client receives 502.
For those who face the same issue: the problem was reading (or rather not reading) the POST data.
nginx expects the POST data to be read by the proxied app, in our case uwsgi. But in my logic there were cases where I returned the response without reading it.
Setting uwsgi's post-buffering option solved the issue:
post-buffering = %(16 * 1024 * 1024)
This is the answer that led me to the solution:
https://stackoverflow.com/a/26765936/631965
Nginx uwsgi (104: Connection reset by peer) while reading response header from upstream
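As an alternative to buffering in uwsgi, the unread body can also be consumed in the app itself. The following is only a minimal sketch of that idea, not what the poster did: `drain_request_body` and `DrainBodyMiddleware` are hypothetical names, and it assumes the server's `wsgi.input` returns an empty read once the body is exhausted.

```python
import io


def drain_request_body(environ, chunk_size=64 * 1024):
    """Read and discard the remaining request body; return the bytes drained."""
    stream = environ.get("wsgi.input")
    if stream is None:
        return 0
    try:
        remaining = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        remaining = 0
    drained = 0
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:  # body already consumed (or client went away)
            break
        drained += len(chunk)
        remaining -= len(chunk)
    return drained


class DrainBodyMiddleware:
    """Hypothetical WSGI wrapper: consume any unread body after the app has
    produced its response, so early returns (e.g. a 401) do not leave
    upload data unread on the socket."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        try:
            return self.app(environ, start_response)
        finally:
            drain_request_body(environ)


# Demo with an in-memory body standing in for a real upload.
demo_environ = {"wsgi.input": io.BytesIO(b"x" * 1000), "CONTENT_LENGTH": "1000"}
print(drain_request_body(demo_environ))
```

With Flask this would typically be wired in as `app.wsgi_app = DrainBodyMiddleware(app.wsgi_app)`. The post-buffering setting above is simpler and is what actually fixed the issue here; draining in the app mostly makes sense if you cannot change the uwsgi config.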

Passenger Looking for Permissions on a Missing File

Linux installation of Phusion Passenger(R) 6.0.14 on nginx/1.14.1. I'm not able to load a site (on a simplified nginx.conf with the conf.d/passenger.conf and conf.d/site1.conf includes). The error:
2022/08/03 19:24:34 [alert] 55601#0: *3 Error opening '/home/user3/sites/Passengerfile.json' for reading: Permission denied (errno=13);
This error means that the Nginx worker process (PID 55601, running as UID 992) does not have permission to access this file.
Please read this page to learn how to fix this problem: https://www.phusionpassenger.com/library/admin/nginx/troubleshooting/?a=upon-accessing-the-web-app-nginx-reports-a-permission-denied-error;
Extra info, client: 192.168.1.4, server: domain1.com, request: "GET / HTTP/1.1", host: "server_f.local"
I don't even know what to ask, other than how can I get this to work? That file doesn't exist. I've restarted nginx many times, and this is the only feedback that I get. I've checked over that page, which tells me to look at the error log it's reported in already. The host and server are correct, and I am on my LAN.

How can I stop nginx falling over when openresty throws runtime error deploying cert

We are using openresty and the lua-resty-auto-ssl package to generate certificates from Let's Encrypt, but lately the server keeps falling over. I'm guessing it's triggered when a certificate tries to auto-renew, as generating a certificate for the first time works fine. The error we are seeing is:
2019/05/12 08:25:24 [error] 2623#2623: *1024227 lua entry thread aborted: runtime error: ...sty/luajit/share/lua/5.1/resty/auto-ssl/servers/hook.lua:40: assertion failed!
stack traceback:
coroutine 0:
[C]: in function 'assert'
...sty/luajit/share/lua/5.1/resty/auto-ssl/servers/hook.lua:40: in function 'server'
.../local/openresty/luajit/share/lua/5.1/resty/auto-ssl.lua:99: in function 'hook_server'
content_by_lua(nginx.conf:194):2: in function <content_by_lua(nginx.conf:194):1>, client: 127.0.0.1, server: , request: "POST /deploy-cert HTTP/1.1", host: "127.0.0.1:8999"
From what I can see in the error, it is failing an assertion when trying to deploy the cert, which could be any of 4 things:
assert(params["domain"])
assert(params["fullchain"])
assert(params["privkey"])
assert(params["expiry"])
I'm a bit stuck as to what I can do; it's no good having the server drop out while in use. That's the last error reported before the server goes offline, so I'm guessing that's the cause, but I'm not 100% sure.
Is there anywhere I can look to find out more about what causes the crash? I'm new to nginx/openresty, so I'm fumbling my way around a bit. Has anyone come across a similar issue?
Wrap it all in a function and call it with pcall or xpcall and add some logic to deal with the error.

NGINX (Operation not permitted) while reading upstream

I have NGINX working as a cache engine and can confirm that pages are being cached as well as being served from the cache. But the error logs are getting filled with this error:
2018/01/19 15:47:19 [crit] 107040#107040: *26 chmod() "/etc/nginx/cache/nginx3/c0/1d/61/ddd044c02503927401358a6d72611dc0.0000000007" failed (1: Operation not permitted) while reading upstream, client: xx.xx.xx.xx, server: *.---.com, request: "GET /support/applications/ HTTP/1.1", upstream: "http://xx.xx.xx.xx:80/support/applications/", host: "---.com"
I'm not really sure what the source of this error could be since NGINX is working. Are these errors that can be safely ignored?
It looks like you are using nginx proxy caching, but nginx does not have permission to manipulate files in its cache directory. You will need to get the ownership/permissions correct on the cache directory.
Not explained in the original question is that the mounted storage is an Azure file share. So in the fstab entry I had to include gid= and uid= for the desired owner. This removed the need for chown, and chmod became unnecessary as well. It got rid of the chmod() error but introduced another.
I then started getting permission errors on rename(). At this point I scrapped what I was doing, moved to a different type of Azure storage (specifically a Disk attached to the VM) and all these problems went away.
So I'm offering this as an answer but realistically, the problem was not solved.
We noticed the same problem. Following the guide from Microsoft (https://learn.microsoft.com/en-us/azure/aks/azure-files-dynamic-pv#create-a-storage-class) seems to have fixed it.
In our case nginx was running its worker processes as a different user, so we needed to find that user's uid and gid and use them in the StorageClass definition.
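If it helps, looking up those ids can be scripted. This is just a rough sketch: `mount_owner_options` is a made-up helper name, and the account name you pass in is an assumption, since the worker user depends on the `user` directive in your nginx.conf.

```python
import pwd


def mount_owner_options(username):
    """Return a 'uid=N,gid=M' string for a local account.

    Raises KeyError if the account does not exist on this machine.
    """
    entry = pwd.getpwnam(username)
    return f"uid={entry.pw_uid},gid={entry.pw_gid}"


if __name__ == "__main__":
    # "root" is used only because it exists everywhere; substitute the
    # account your nginx workers actually run as.
    print(mount_owner_options("root"))
```

The resulting uid= and gid= values are the ones that go into the fstab entry or the StorageClass mount options mentioned above.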

Plesk - 502 bad gateway frequently

I've been facing this error for a few days, maybe weeks. I've tried all kinds of solutions from around the internet with no success, so:
I have a dedicated server with Plesk 12 and Ubuntu 14.04 LTS.
The error appears daily, 1-2 times per day, and I have to restart the whole server. After that all websites work properly until the error appears again.
It first appeared after I added two more domains to the server. I suspected high server load, but no: RAM is never close to full and CPU load peaks at 20%.
PS: The error appears only on the websites, so I can still access the Plesk Panel and reboot the server.
NGINX error log (/var/log/nginx/error.log):
[error] 1406#0: *119083 connect() failed (111: Connection refused) while connecting to upstream, client: IP, server: , request: "GET http://www.example.com/ HTTP/1.1", upstream: "http://SERVER_IP:7080/", host: "www.example.com"
This is the first time I'm using Plesk and I don't really know what to do or check.
PS: Can this be caused by a cron job service?
Cron log which is worrying me:
254 postfix/master[1664]: warning: master_wakeup_timer_event: service pickup(public/pickup): Connection refused
"(111: Connection refused) while connecting to upstream"
This error means that something is going wrong with Apache or PHP-FPM (if you are using it). Check the Apache logs of the site where it happens: /var/www/vhosts/examples.com/logs/error_log
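To confirm whether the upstream is actually down when the 502 shows up, a quick TCP check against the upstream port can help. A minimal sketch, assuming plain TCP reachability is a good enough signal; `upstream_accepts_connections` is a hypothetical helper, and the host you pass in is whatever stands behind SERVER_IP in the log line.

```python
import socket


def upstream_accepts_connections(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds, else False.

    A refused connection (errno 111 in the nginx log) surfaces here as an
    OSError subclass, as do timeouts and unreachable hosts.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 7080 is the upstream port from the nginx error above; the loopback
    # address is an assumption for running the check on the server itself.
    print(upstream_accepts_connections("127.0.0.1", 7080))
```

If this returns False while the sites are failing, the process behind port 7080 has stopped listening, and its own logs are the next place to look.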

Resources