Why are there 3 processes for nginx? - nginx

My OS is Ubuntu. I ran ps -aux | grep nginx and found 3 nginx processes, so my question is: why are there 3 processes for nginx? It seems one process is owned by root and the other two by www-data:
root 7833 0.0 0.0 126092 1476 ? Ss 12:32 0:00 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
www-data 7834 0.0 0.0 126504 3124 ? S 12:32 0:00 nginx: worker process
www-data 7835 0.0 0.1 126504 5068 ? S 12:32 0:00 nginx: worker process

The process that is being run as root is the master NGINX process.
The two others are worker processes.
When the NGINX service starts, the master process is the first one to launch.
It spawns the worker processes that actually handle the connections.
The master process runs as root so that it can do things like bind to privileged network ports and read TLS certificates/keys during configuration load.
The worker processes run with dropped privileges, since they only need to be able to read website files.
The number of worker processes is controlled with the worker_processes configuration directive. The default value is 1, which means that on a system with the default config you will see a total of 2 processes (1 master and 1 worker).
The more worker processes you have, the more connections your web server can handle on a multi-core system.
E.g. if you have a 4-core CPU, setting worker_processes 4; makes sure all cores are used to handle connections, which will improve performance on a busy website.
Alternatively, you can just set worker_processes auto;, which has NGINX detect the number of logical CPUs and set the number of workers accordingly.
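For example, a minimal sketch of the relevant nginx.conf lines (the values shown are illustrative, not taken from the question above):

# workers run as this unprivileged user; the master process stays root
user www-data;

# size the worker pool to the number of logical CPUs detected at startup
worker_processes auto;

After editing, nginx -t checks the config and nginx -s reload applies it; ps should then show one worker process per CPU core.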

The root process is necessary for nginx to access the network and files on your system.
The other two processes are set in your config file. Look there and you will see a setting for that, typically tied to the number of cores in your server's processor. More worker processes means more capacity to handle requests as visitor traffic to your server increases.
It's possible (I do not recall) that two worker processes is a default setting.

Related

Understanding Docker container resource usage

I have server running Ubuntu 16.04 with Docker 17.03.0-ce running an Nginx container. That server also has ConfigServer Security & Firewall installed. Shortly after starting the Nginx container I start receiving emails about "Excessive resource usage" with the following details:
Time: Fri Mar 24 00:06:02 2017 -0400
Account: systemd-timesync
Resource: Process Time
Exceeded: 1820 > 1800 (seconds)
Executable: /usr/sbin/nginx
Command Line: nginx: worker process
PID: 2302 (Parent PID:2077)
Killed: No
I fully understand that I can add exe:/usr/sbin/nginx to csf.pignore to stop these email alerts but I would like to understand a few things first.
Why is the "systemd-timesync" account being reported? That does not seem to have anything to do with Docker.
Why does the host machine seem to be reporting the excessive resource usage (the extended process time) when that is something running in the container?
Why do other Docker containers that are not running Nginx not result in excessive resource usage emails?
I'm sure there are other questions but basically, why is this being reported the way it is being reported?
I can at least answer the first two questions:
Unlike real VMs, Docker containers are simply a collection of processes run under the host system's kernel. They just have a different view of certain system resources, including their own file hierarchy, their own PID namespace and their own /etc/passwd file. As a result, they will still show up if you run ps aux on the host machine.
The nginx container's /etc/passwd includes a user 'nginx' with UID 104 that runs the nginx worker process. However, in the host's /etc/passwd, UID 104 might belong to a completely different user, such as systemd-timesync.
As a result, if you run ps aux | grep nginx in the container, you might see
nginx 7 0.0 0.0 32152 2816 ? S 11:20 0:00 nginx: worker process
while on the host, you see
systemd-timesync 22004 0.0 0.0 32152 2816 ? S 13:20 0:00 nginx: worker process
even though both are the same process (also note the different PID namespaces; in containers, PIDs are counted from 1 again).
As a result, container processes will still be subject to ConfigServer's resource monitoring, but they might show up with random, or even non-existent user accounts.
As to why nginx triggers the emails and other containers don't, I can only assume that nginx is the only one of your containers that crosses ConfigServer's resource thresholds.

Declaring more than 2 upstreams in the nginx config causes nginx to take 80 seconds to restart

I have 4 upstream blocks in my nginx config that I'm using depending on the incoming request's scheme or the geo location of the requesting client.
Every time I have to restart nginx it takes around 80 seconds to complete. If I only have 3 upstreams declared it takes about 40 seconds, and with 2 upstreams it restarts pretty much immediately, like it normally does.
Reloads take 1/2 the time (40 seconds with 4 upstreams, 20 seconds with 3 upstreams).
There are no errors logged in the nginx error log, even at debug log level, and if I run /usr/sbin/nginx -t it says the test is successful, but it takes as long as a reload does.
Nginx resolves the IPs of all upstreams at (re)start. Check your DNS.
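To illustrate (the hostnames and addresses below are made up), every named server in an upstream block is looked up when the configuration is loaded, so a slow or broken resolver adds a delay per hostname, while literal IPs are not looked up at all:

upstream backend {
    # each hostname is resolved once at startup/reload;
    # a slow resolver delays nginx for every entry here
    server app1.example.com:8080;
    server app2.example.com:8080;
}

upstream backend_by_ip {
    # literal addresses skip DNS entirely
    server 192.0.2.10:8080;
    server 192.0.2.11:8080;
}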

Problems running flask app on uwsgi / nginx

I have created a Flask app and up to this point have been using the default Flask development server to create/test it. Now I want to deploy it to a server. I am using uWSGI and nginx, though I am pretty new to both. I know there are a lot of guides and questions about similar things, but I couldn't find a solution after looking through as much as I could understand.
The following is from my uwsgi log:
machine: x86_64
clock source: unix
detected number of CPU cores: 1
current working directory: /home/ben/flask/MLS-Flask
detected binary path: /home/ben/flask/MLS-Flask/mls-flask-ve/bin/uwsgi
!!! no internal routing support, rebuild with pcre support !!!
*** WARNING: you are running uWSGI without its master process manager ***
your processes number limit is 1024
your memory page size is 4096 bytes
detected max file descriptor number: 1024
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to UNIX address /home/ben/flask/MLS-Flask/mls_uwsgi.sock fd 3
Python version: 3.3.3 (default, Dec 30 2013, 16:29:41) [GCC 4.4.7 20120313 (Red Hat 4.4.7-4)]
Set PythonHome to /home/ben/flask/MLS-Flask/mls-flask-ve
*** Python threads support is disabled. You can enable it with --enable-threads ***
Python main interpreter initialized at 0x11755d0
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 72760 bytes (71 KB) for 1 cores
*** Operational MODE: single process ***
added /home/ben/flask/MLS-Flask/ to pythonpath.
WSGI app 0 (mountpoint='') ready in 0 seconds on interpreter 0x11755d0 pid: 2926 (default app)
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI worker 1 (and the only) (pid: 2926, cores: 1)
I am assuming uWSGI is at least running? I am fairly new to this, so I am not quite sure what the problem is.
My nginx config is:
server {
    listen 8080;
    charset utf-8;

    location / { try_files $uri @app; }

    location @app {
        include uwsgi_params;
        uwsgi_pass unix:/home/ben/flask/MLS-Flask/mls_uwsgi.sock;
    }
}
My uwsgi ini is:
[uwsgi]
uid = nginx
gid = nginx
base = /home/ben/flask/MLS-Flask
home = %(base)/mls-flask-ve
pythonpath = %(base)
chdir = /home/ben/flask/MLS-Flask
module = runp
#socket file's location
socket = /home/ben/flask/MLS-Flask/mls_uwsgi.sock
#permissions for the socket file
chmod-socket = 666
#variable that holds a flask application inside the module imported
callable = app
#location of log file
logto = /var/log/uwsgi/%n.log
and the file that the uwsgi ini runs is my Flask app:
from app import app

if __name__ == "__main__":
    app.run(debug=False, port=8080)
I may have some extraneous stuff in my uwsgi ini or nginx config, but I am not sure whether those are necessarily the problem. Can anyone see any reason why this might not be working? I am currently getting a 502 Bad Gateway error on localhost:8080, so I am guessing it has something to do with my Flask app, uwsgi ini, or socket.
I appreciate any help.
It turned out my nginx user didn't have access to the socket because the / and /home/ directories were owned by the root user and group. I ended up giving full access to the owner and group all the way from the / directory down to the socket (this is probably not the safest solution security-wise, but I can refine it further after I get everything working).
I had the same problem:
Always check socket permissions by using ls -lhtr
Try putting the socket in a folder such as /run/myapp/, e.g. /run/myapp/mysock.sock
Create an empty sock file in that folder: vi mysock.sock
Set the permissions of this empty file so that the user and group stated in the service have full access: chown user:group /run/myapp/mysock.sock

Phusion Passenger Standalone seems to be on but nothing appears in browser

I ssh to the dev box where I am supposed to set up Redmine. Or rather, downgrade Redmine. In January I was asked to upgrade Redmine from 1.2 to 2.2, but the plugins we wanted did not work with 2.2. So now I'm being asked to set up Redmine 1.3.3. We figure we can upgrade from 1.2 to 1.3.3.
In January I had trouble getting Passenger to work with Nginx. This was on a CentOS box. I tried several installs of Nginx. I'm left with different error logs:
This:
whereis nginx.conf
gives me:
nginx: /etc/nginx
but I don't think that is in use.
This:
find / -name error.log
gives me:
/opt/nginx/logs/error.log
/var/log/nginx/error.log
When I tried to start Passenger again I was told something was already running on port 80. But if I did "passenger stop" I was told that passenger was not running.
So I did:
passenger start -p 81
If I run netstat I see something is listening on port 81:
netstat
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 localhost:81 localhost:42967 ESTABLISHED
tcp 0 0 10.0.1.253:ssh 10.0.1.91:51874 ESTABLISHED
tcp 0 0 10.0.1.253:ssh 10.0.1.91:62993 ESTABLISHED
tcp 0 0 10.0.1.253:ssh 10.0.1.91:62905 ESTABLISHED
tcp 0 0 10.0.1.253:ssh 10.0.1.91:50886 ESTABLISHED
tcp 0 0 localhost:81 localhost:42966 TIME_WAIT
tcp 0 0 10.0.1.253:ssh 10.0.1.91:62992 ESTABLISHED
tcp 0 0 localhost:42967 localhost:81 ESTABLISHED
but if I point my browser here:
http: // 10.0.1.253:81 /
(StackOverFlow does not want me to publish the IP address, so I have to malform it. There is no harm here as it is an internal IP that no one outside my company could reach.)
All I get in Google Chrome is "Oops! Google Chrome could not connect to 10.0.1.253:81".
I started Phusion Passenger at the command line, and the output is verbose, and I expect to see any error messages in the terminal. But I'm not seeing anything. It's as if my browser request is not being heard, even though netstat seems to indicate the app is listening on port 81.
A lot of other things could be wrong with this app (I still need to reverse migrate the database schema) but I'm not seeing any of the error messages that I expect to see. Actually, I'm not seeing any error messages, which is very odd.
UPDATE:
If I do this:
ps aux | grep nginx
I get:
root 20643 0.0 0.0 103244 832 pts/8 S+ 17:17 0:00 grep nginx
root 23968 0.0 0.0 29920 740 ? Ss Feb13 0:00 nginx: master process /var/lib/passenger-standalone/3.0.19-x86_64-ruby1.9.3-linux-gcc4.4.6-1002/nginx-1.2.6/sbin/nginx -c /tmp/passenger-standalone.23917/config -p /tmp/passenger-standalone.23917/
nobody 23969 0.0 0.0 30588 2276 ? S Feb13 0:34 nginx: worker process
I tried to cat the file /tmp/passenger-standalone.23917/config but it does not seem to exist.
I also killed every session of "screen" and every terminal window where Phusion Passenger might be running, but clearly, looking at ps aux, it looks like something is running.
Could Nginx still be running even though Passenger was killed?
This:
ps aux | grep phusion
brings back nothing
and this:
ps aux | grep passenger
Only brings back the line with nginx.
If I do this:
service nginx stop
I get:
nginx: unrecognized service
and:
service nginx start
gives me:
nginx: unrecognized service
This is a CentOS machine, so if I had Nginx installed normally, this would work.
The answer is here - Issue Uploading Files from Rails app hosted on Elastic Beanstalk
You probably have /etc/cron.daily/tmpwatch removing the /tmp/passenger-standalone* files every day, and causing you all this grief.

PHP-FPM using 40% CPU for a Single Request

(I googled and searched this forum for hours, found some topics, but none of them worked for me)
I'm using Wordpress with: Varnish + Nginx + PHP-FPM + APC + W3 Total Cache + PageSpeed.
As I'm using Varnish, the first time I request www.mysite.com it uses just 10% of CPU. The second time, it is served from cache. The problem is when passing a request parameter in the URL.
For just 1 request (www.mysite.com?1=1) it shows in top:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7609 nginx 20 0 438m 41m 28m S 11.6 7.0 0:00.35 php-fpm
7606 nginx 20 0 437m 39m 26m S 10.3 6.7 0:00.31 php-fpm
After the page is fully loaded, the processes above are still active. After 2 seconds, they are replaced by another 2 php-fpm processes (below), which are active for 3 seconds.
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7665 nginx 20 0 444m 47m 28m S 20.9 7.9 0:00.69 php-fpm
7668 nginx 20 0 444m 46m 28m R 20.9 7.9 0:00.63 php-fpm
40% CPU usage just for 1 request not cached!
Strange things:
CPU usage is higher after the page was loaded
When I purged the cache (W3 and Varnish), it take just 10% of CPU to load a not cached page
This high CPU usage only happens when passing a request parameter or in the WordPress admin
When I make 10 requests (pressing the F5 key 10x), the server stops serving and the php-fpm log shows:
WARNING: [pool www] server reached max_children setting (10), consider raising it
I raised that value to 20, same problem.
I'm using pm=ondemand (pm.max_children=10 and pm.max_requests=500).
Initially I was using pm=dynamic (pm.max_children=10, pm.start_servers=1, pm.min_spare_servers=1, pm.max_spare_servers=2, pm.max_requests=500) and the same problem occurred.
Could anyone help, please? Any help would be appreciated!
PS:
APC is ON (98% Hits, 2% Misses)
Server is Amazon Micro (613MB RAM)
PHP 5.3.26 (fpm-fcgi)
Linux version 3.4.48-45.46.amzn1.x86_64 Red Hat 4.6.3-2 (I think it's based on CentOS 5)
First, reduce the stack of caches. Why use Varnish, which serves pages from memory, when you're already using W3 Total Cache, which serves from memory as well?
W3 Total Cache is CPU intensive! It does not just cache items but also compresses, minifies and merges files on the fly.
You have a total of 512MB of memory on your machine, which is not a lot, and your CPU power is less than a modern smartphone has. Memory access is extremely slow compared to a dedicated root server because of the Xen virtualization layer; that's why less is more.
Make sure W3 Total Cache is properly set up so it actually caches items, then warm up your cache and you should be fine.
Have a look at Google's nginx PageSpeed module (https://github.com/pagespeed/ngx_pagespeed); it can do the same thing W3 Total Cache does, just much more efficiently, because it happens in the web server, not in PHP.
Nginx can also serve pages directly from memcached: http://www.kingletas.com/2012/08/full-page-cache-with-nginx-and-memcache.html (example article, might need some more investigation).
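A rough sketch of that approach, assuming memcached listens on 127.0.0.1:11211 and that whatever primes the cache uses $uri as the key (both assumptions, not taken from the linked article):

location / {
    set            $memcached_key $uri;
    memcached_pass 127.0.0.1:11211;
    default_type   text/html;
    # on a cache miss, fall back to the normal PHP handling
    error_page     404 502 504 = @php;
}

location @php {
    # hypothetical fallback; point this at your existing PHP-FPM setup
    fastcgi_pass unix:/var/run/php-fpm.sock;
    include      fastcgi_params;
}

Whether this helps depends on whether pages can be keyed purely by URI; anything cookie-dependent still has to go to PHP.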
Problem solved!
For those who are having the same problem:
Check your Varnish configuration;
Check your WordPress plugins;
1) In my case, TTL was not configured in Varnish, so nothing was being cached.
This config worked for me:
sub vcl_fetch {
    if (!(req.url ~ "wp-(login|admin)")) {
        unset beresp.http.set-cookie;
        set beresp.ttl = 48h;
    }
}
2) The high CPU usage AFTER the page loads was caused by a WordPress plugin called "Scroll Triggered Box".
It was doing some AJAX after the page had loaded. I disabled that plugin and the high load stopped.
There are two factors at play here:
You are using a micro instance, which has a burstable CPU profile. It can burst up to 2 ECUs and then be limited to much less than 1 (some estimates put this at around 0.1 - 0.2 ECUs).
While you are logged in as an admin, WordPress caching plugins often bypass or reduce caching. W3 Total Cache should allow you to change this if you want caching on all the time.
