How does one set worker_rlimit_nofile to a higher number, and what's the maximum it can be or is recommended to be?
I'm trying to follow the following advice:
The second biggest limitation that most people run into is also
related to your OS. Open up a shell, su to the user nginx runs as and
then run the command ulimit -a. Those values are all limitations
nginx cannot exceed. In many default systems the open files value is
rather limited, on a system I just checked it was set to 1024. If
nginx runs into a situation where it hits this limit it will log the
error (24: Too many open files) and return an error to the client.
Naturally nginx can handle a lot more than 1024 files and chances are
your OS can as well. You can safely increase this value.
To do this you can either set the limit with ulimit or you can use
worker_rlimit_nofile to define your desired open file descriptor
limit.
From: https://blog.martinfjordvald.com/2011/04/optimizing-nginx-for-high-traffic-loads/
worker_rlimit_nofile = worker_connections * 2 file descriptors
Each worker connection opens 2 file descriptors (1 for the upstream connection, 1 for the downstream connection).
When setting the worker_rlimit_nofile parameter, you should consider both worker_connections and worker_processes. You may want to check your OS's file descriptor limits first using ulimit -Hn and ulimit -Sn, which give you the per-user hard and soft limits respectively. You can change the system-wide limit using sysctl:
sudo sysctl -w fs.file-max=$VAL
where $VAL is the number you would like to set. Then, you can verify using:
cat /proc/sys/fs/file-max
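Note that sysctl -w only changes the value until the next reboot. If you need it to persist, a common approach (a sketch, using 65535 purely as an example value) is to append the setting to /etc/sysctl.conf and reload it:

echo "fs.file-max = 65535" | sudo tee -a /etc/sysctl.conf   # append the setting (example value)
sudo sysctl -p                                              # reload /etc/sysctl.conf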
If you are automating the configuration, an easy rule of thumb is:
worker_rlimit_nofile = worker_connections * 2
worker_processes is set to 1 by default; you can set it to a number less than or equal to the number of cores on your server, which you can check with:
grep -c ^processor /proc/cpuinfo
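Putting the pieces together, a minimal nginx.conf sketch (the numbers are examples only, not recommendations):

# main context of nginx.conf (sketch, example values)
worker_processes auto;          # or an explicit number <= CPU cores
worker_rlimit_nofile 2048;      # >= worker_connections * 2

events {
    worker_connections 1024;    # per-worker connection limit
}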
EDIT:
The latest versions of nginx set worker_processes auto; by default, which sets it to the number of processors available on the machine. Hence, it's important to know why you would really want to change it.
Normally, setting it to the highest value or to all available processors doesn't improve performance beyond a certain point: you will likely get the same performance with 24 processors as with 32. Some kernel/TCP-stack parameters can also help mitigate bottlenecks.
And in microservice deployments (Kubernetes), it's very important to consider the pod's resource requests/limits when setting these values.
To check how many worker processes nginx has spawned, you can run ps -lfC nginx. E.g. on my machine I got the following; since my machine has 12 processors, nginx spawned 12 worker processes.
$ ps -lfC nginx
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
5 S root 70488 1 0 80 0 - 14332 - Jan15 ? 00:00:00 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
5 S www-data 70489 70488 0 80 0 - 14526 - Jan15 ? 00:08:24 nginx: worker process
5 S www-data 70490 70488 0 80 0 - 14525 - Jan15 ? 00:08:41 nginx: worker process
5 S www-data 70491 70488 0 80 0 - 14450 - Jan15 ? 00:08:49 nginx: worker process
5 S www-data 70492 70488 0 80 0 - 14433 - Jan15 ? 00:08:37 nginx: worker process
5 S www-data 70493 70488 0 80 0 - 14447 - Jan15 ? 00:08:44 nginx: worker process
5 S www-data 70494 70488 0 80 0 - 14433 - Jan15 ? 00:08:46 nginx: worker process
5 S www-data 70495 70488 0 80 0 - 14433 - Jan15 ? 00:08:34 nginx: worker process
5 S www-data 70496 70488 0 80 0 - 14433 - Jan15 ? 00:08:31 nginx: worker process
5 S www-data 70498 70488 0 80 0 - 14433 - Jan15 ? 00:08:46 nginx: worker process
5 S www-data 70499 70488 0 80 0 - 14449 - Jan15 ? 00:08:50 nginx: worker process
5 S www-data 70500 70488 0 80 0 - 14433 - Jan15 ? 00:08:39 nginx: worker process
5 S www-data 70501 70488 0 80 0 - 14433 - Jan15 ? 00:08:41 nginx: worker process
To print the exact count, you can filter by UID (e.g. in my setup the UID is www-data, which is configured in nginx.conf as user www-data;):
$ ps -lfC nginx | awk '/nginx:/ && /www-data/{count++} END{print count}'
12
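Alternatively, pgrep can count the workers directly (a sketch, assuming the workers run as www-data):

pgrep -c -u www-data -f 'nginx: worker'   # -c counts matches, -u filters by user, -f matches the full command line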
In Kubernetes, nginx spawns worker processes depending on the pod's resource request by default.
E.g. if you have the following in your deployment:
resources:
  requests:
    memory: 2048Mi
    cpu: 2000m
Then nginx will spawn 2 worker processes (2000 millicpu = 2 CPUs).
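To double-check inside the pod (a sketch; <pod> is a placeholder, and it assumes ps is available in the container image):

kubectl exec <pod> -- ps -C nginx -o pid,user,cmd   # list the nginx master and worker processes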
Related
My OS is Ubuntu. I ran ps -aux | grep nginx and found 3 nginx processes, so my question is: why are there 3 processes for nginx? It seems one process is run by root and the other two by www-data:
root 7833 0.0 0.0 126092 1476 ? Ss 12:32 0:00 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
www-data 7834 0.0 0.0 126504 3124 ? S 12:32 0:00 nginx: worker process
www-data 7835 0.0 0.1 126504 5068 ? S 12:32 0:00 nginx: worker process
The process that is being run as root is the master NGINX process.
The two others are worker processes.
When the NGINX service launches, the master process is the first one to start.
It spawns the worker processes that actually handle the connections.
The master process runs as root in order to do things like binding to privileged network ports and reading TLS certificates/keys during configuration load.
The worker processes run with dropped privileges, as they only need to be able to read website files.
The number of worker processes can be controlled with the worker_processes configuration directive. The default value is 1, which means on a system with the default config you will see a total of 2 processes (1 master and 1 worker).
The more worker processes you have, the more connections your web server can handle on a multi-core system.
E.g. if you have a 4-core CPU, setting worker_processes 4; makes sure that all cores are used to handle connections, which will improve performance on a busy website.
Moreover, you can just set worker_processes auto;. That has NGINX determine the number of logical CPU units and set the number of workers accordingly.
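For reference, the directives mentioned above live in the main context of nginx.conf, e.g. (a sketch):

user www-data;            # the unprivileged user the workers run as
worker_processes auto;    # or an explicit value such as worker_processes 4; on a 4-core CPU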
The root process is necessary for nginx to access the network and files on your system.
The other two processes are set in your config file. Look there and you will see a setting for that, which depends on the number of cores in your server's processor. More worker processes means more capacity to handle requests as visitor traffic to your server increases.
It's possible (I do not recall) that two processes is a default setting.
I am new to Datadog and NGiNX. I noticed, when I was creating a monitor for some integrations, that several of the integrations were labeled as misconfigured. My guess is someone clicked the install button but didn't finish the remaining integration steps. I started to work with NGiNX and quickly hit a roadblock.
I verified it was built with the http stub status module:
$ nginx -V 2>&1| grep -o http_stub_status_module
http_stub_status_module
The NGiNX install is under a different directory than usual
and the configuration file is under
/<dir>/parts/nginx/conf
I created the status.conf file there.
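It's the usual stub_status snippet from the integration steps, along these lines (a sketch; the listen port and allow/deny rules are assumptions, adjust them to match your Datadog nginx.d/conf.yaml):

server {
    listen 81;
    server_name localhost;

    location /nginx_status {
        stub_status on;      # expose basic nginx metrics for the Datadog agent
        allow 127.0.0.1;     # only allow the local agent
        deny all;
    }
}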
When I reload NGINX I get a failure. I don't understand what it means or how to proceed from here.
nginx: [error] open() "/<dir>/parts/nginx/logs/nginx.pid" failed (2: No such file or directory)
There is a logs directory with nothing in it.
ps -ef|grep nginx
user 35958 88952 0 May24 ? 00:00:43 nginx: worker process
user 35959 88952 0 May24 ? 00:00:48 nginx: worker process
root 88952 1 0 Feb21 ? 00:00:00 nginx: master process <dir>/parts/nginx/sbin/nginx -c <dir>/etc/nginx/balancer.conf -g pid <dir>/var/nginx-balancer.pid; lock_file /<dir>/var/nginx-balancer.lock; error_log <dir>/var/logs/nginx-balancer-error.log;
user 109169 63043 0 13:13 pts/0 00:00:00 grep --color=auto nginx
I think the issue is that our install doesn't seem to be following the same defaults as the instructions and I'm pretty sure I'm not doing this correctly.
If anyone has any insights that would be great!
Chris
I'm kind of new to setting up a production machine, and I don't get why I'm not seeing the default index page for nginx on my EC2 machine. It's installed and running on my server, but when I try to access it, the page keeps loading and shows nothing, just a blank page. I'm trying to access it through the public IP (35.160.22.104) and through the public DNS (ec2-35-160-22-104.us-west-2.compute.amazonaws.com). Both do the same. What am I doing wrong?
UPDATE:
I realized that when restarting the nginx service, it didn't show the "ok" message. So I took a look at error.log:
[ 2016-12-12 17:16:11.2439 709/7f3eebc93780 age/Cor/CoreMain.cpp:967 ]: Passenger core shutdown finished
2016/12/12 17:16:12 [info] 782#782: Using 32768KiB of shared memory for push module in /etc/nginx/nginx.conf:71
[ 2016-12-12 17:16:12.2742 791/7fb0c37a0780 age/Wat/WatchdogMain.cpp:1291 ]: Starting Passenger watchdog...
[ 2016-12-12 17:16:12.2820 794/7fe4d238b780 age/Cor/CoreMain.cpp:982 ]: Starting Passenger core...
[ 2016-12-12 17:16:12.2820 794/7fe4d238b780 age/Cor/CoreMain.cpp:235 ]: Passenger core running in multi-application mode.
[ 2016-12-12 17:16:12.2832 794/7fe4d238b780 age/Cor/CoreMain.cpp:732 ]: Passenger core online, PID 794
[ 2016-12-12 17:16:12.2911 799/7f06bb50a780 age/Ust/UstRouterMain.cpp:529 ]: Starting Passenger UstRouter...
[ 2016-12-12 17:16:12.2916 799/7f06bb50a780 age/Ust/UstRouterMain.cpp:342 ]: Passenger UstRouter online, PID 799
Anyway, it doesn't look like an error, just normal log output.
UPDATE 2:
Nginx is running:
root 810 1 0 17:16 ? 00:00:00 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
www-data 815 810 0 17:16 ? 00:00:00 nginx: worker process
ubuntu 853 32300 0 17:44 pts/0 00:00:00 grep --color=auto nginx
And when I do curl localhost, it returns the HTML as expected!
UPDATE 3:
When I run systemctl status nginx, I get the following error:
Dec 12 17:54:48 ip-172-31-40-156 systemd[1]: nginx.service: Failed to read PID from file /run/nginx.pid: Invalid argument
Trying to figure out what it is
UPDATE 4:
Ran the command nmap 35.160.22.104 -Pn PORT STATE SERVICE 22/tcp and got the output:
Starting Nmap 7.01 ( https://nmap.org ) at 2016-12-12 18:05 UTC
Failed to resolve "PORT".
Failed to resolve "STATE".
Failed to resolve "SERVICE".
Unable to split netmask from target expression: "22/tcp"
Nmap scan report for ec2-35-160-22-104.us-west-2.compute.amazonaws.com (35.160.22.104)
Host is up (0.0015s latency).
Not shown: 999 filtered ports
PORT STATE SERVICE
22/tcp open ssh
UPDATE 5:
Output for netstat -tuanp | grep 80:
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN -
tcp6 0 0 :::80 :::* LISTEN -
Your EC2 instance has a security group associated with it.
Go to the AWS console: EC2 -> Instances -> click on your instance -> at the bottom, 'Description' -> Security Group. Click on the name and you will be redirected to EC2 -> Network and Security. Click on 'Edit inbound rules' and add a rule:
Type: HTTP
Save, and that should be fine!
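If you prefer the command line, the same inbound rule can be added with the AWS CLI (a sketch; sg-xxxxxxxx is a placeholder for your security group ID):

aws ec2 authorize-security-group-ingress \
    --group-id sg-xxxxxxxx \
    --protocol tcp \
    --port 80 \
    --cidr 0.0.0.0/0    # allow HTTP from anywhere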
I ssh'd to the dev box where I am supposed to set up Redmine. Or rather, downgrade Redmine. In January I was asked to upgrade Redmine from 1.2 to 2.2, but the plugins we wanted did not work with 2.2. So now I'm being asked to set up Redmine 1.3.3. We figure we can upgrade from 1.2 to 1.3.3.
In January I had trouble getting Passenger to work with Nginx. This was on a CentOS box. I tried several installs of Nginx. I'm left with different error logs:
This:
whereis nginx.conf
gives me:
nginx: /etc/nginx
but I don't think that is in use.
This:
find / -name error.log
gives me:
/opt/nginx/logs/error.log
/var/log/nginx/error.log
When I tried to start Passenger again I was told something was already running on port 80. But if I did "passenger stop" I was told that passenger was not running.
So I did:
passenger start -p 81
If I run netstat I see something is listening on port 81:
netstat
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 localhost:81 localhost:42967 ESTABLISHED
tcp 0 0 10.0.1.253:ssh 10.0.1.91:51874 ESTABLISHED
tcp 0 0 10.0.1.253:ssh 10.0.1.91:62993 ESTABLISHED
tcp 0 0 10.0.1.253:ssh 10.0.1.91:62905 ESTABLISHED
tcp 0 0 10.0.1.253:ssh 10.0.1.91:50886 ESTABLISHED
tcp 0 0 localhost:81 localhost:42966 TIME_WAIT
tcp 0 0 10.0.1.253:ssh 10.0.1.91:62992 ESTABLISHED
tcp 0 0 localhost:42967 localhost:81 ESTABLISHED
but if I point my browser here:
http: // 10.0.1.253:81 /
(StackOverFlow does not want me to publish the IP address, so I have to malform it. There is no harm here as it is an internal IP that no one outside my company could reach.)
All I get in Google Chrome is "Oops! Google Chrome could not connect to 10.0.1.253:81".
I started Phusion Passenger at the command line, and the output is verbose, and I expect to see any error messages in the terminal. But I'm not seeing anything. It's as if my browser request is not being heard, even though netstat seems to indicate the app is listening on port 81.
A lot of other things could be wrong with this app (I still need to reverse migrate the database schema) but I'm not seeing any of the error messages that I expect to see. Actually, I'm not seeing any error messages, which is very odd.
UPDATE:
If I do this:
ps aux | grep nginx
I get:
root 20643 0.0 0.0 103244 832 pts/8 S+ 17:17 0:00 grep nginx
root 23968 0.0 0.0 29920 740 ? Ss Feb13 0:00 nginx: master process /var/lib/passenger-standalone/3.0.19-x86_64-ruby1.9.3-linux-gcc4.4.6-1002/nginx-1.2.6/sbin/nginx -c /tmp/passenger-standalone.23917/config -p /tmp/passenger-standalone.23917/
nobody 23969 0.0 0.0 30588 2276 ? S Feb13 0:34 nginx: worker process
I tried to cat the file /tmp/passenger-standalone.23917/config but it does not seem to exist.
I also killed every session of "screen" and every terminal window where Phusion Passenger might be running, but clearly, looking at ps aux, it looks like something is running.
Could Nginx still be running even if Passenger has been killed?
This:
ps aux | grep phusion
brings back nothing
and this:
ps aux | grep passenger
Only brings back the line with nginx.
If I do this:
service nginx stop
I get:
nginx: unrecognized service
and:
service nginx start
gives me:
nginx: unrecognized service
This is a CentOS machine, so if I had Nginx installed normally, this would work.
The answer is here - Issue Uploading Files from Rails app hosted on Elastic Beanstalk
You probably have /etc/cron.daily/tmpwatch removing the /tmp/passenger-standalone* files every day, and causing you all this grief.
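If that's what is happening, one option (a sketch; it assumes your tmpwatch version supports the -X/--exclude-pattern flag, so check the man page) is to exclude the Passenger files in /etc/cron.daily/tmpwatch:

# in /etc/cron.daily/tmpwatch: skip Passenger Standalone's runtime files (sketch)
/usr/sbin/tmpwatch -X '/tmp/passenger-standalone*' 10d /tmp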
This problem is perplexing me, because I seem to be following everything within the docs that would allow for a graceful restart.
I am running uWSGI in Emperor mode, with a bunch of vassals. When I try to do a graceful restart of one of the vassals, I receive an nginx 502 Bad Gateway response for about half a second. Here's some information:
One of my vassal .ini file:
[uwsgi]
master = true
processes = 2
home = /var/www/.virtualenvs/www.mysite.com
socket = /tmp/uwsgi.sock.myapp
pidfile = /tmp/uwsgi.pid.myapp
module = myapp
pythonpath = /var/www/www.mysite.com/mysite
logto = /var/log/uwsgi/myapp.log
chmod-socket = 666
vacuum = true
gid = www-data
uid = www-data
Then, I want to gracefully restart this process:
kill -HUP `cat /tmp/uwsgi.pid.myapp`
The output from the vassal log file looks alright (I think?)
...gracefully killing workers...
Gracefully killing worker 1 (pid: 29957)...
Gracefully killing worker 2 (pid: 29958)...
binary reloading uWSGI...
chdir() to /var/www/www.mysite.com/vassals
closing all non-uwsgi socket fds > 2 (max_fd = 1024)...
found fd 3 mapped to socket 0 (/tmp/uwsgi.sock.kilroy)
running /var/www/.virtualenvs/www.mysite.com/bin/uwsgi
*** has_emperor mode detected (fd: 15) ***
[uWSGI] getting INI configuration from kilroy.ini
open("/var/log/uwsgi/kilroy.log"): Permission denied [utils.c line 250]
unlink(): Operation not permitted [uwsgi.c line 998]
*** Starting uWSGI 1.2.3 (64bit) on [Fri Jun 8 09:15:10 2012] ***
compiled with version: 4.6.3 on 01 June 2012 09:56:19
detected number of CPU cores: 2
current working directory: /var/www/www.mysite.com/vassals
writing pidfile to /tmp/uwsgi.pid.kilroy
detected binary path: /var/www/.virtualenvs/www.mysite.com/bin/uwsgi
setgid() to 33
setuid() to 33
your memory page size is 4096 bytes
detected max file descriptor number: 1024
lock engine: pthread robust mutexes
uwsgi socket 0 bound to UNIX address /tmp/uwsgi.sock.kilroy fd 3
Python version: 2.7.3 (default, Apr 20 2012, 23:04:22) [GCC 4.6.3]
Set PythonHome to /var/www/.virtualenvs/www.mysite.com
*** Python threads support is disabled. You can enable it with --enable-threads ***
Python main interpreter initialized at 0x19e3e90
your server socket listen backlog is limited to 100 connections
*** Operational MODE: preforking ***
added /var/www/www.mysite.com/gapadventures/ to pythonpath.
WSGI app 0 (mountpoint='') ready in 0 seconds on interpreter 0x19e3e90 pid: 30041 (default app)
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 30041)
spawned uWSGI worker 1 (pid: 30042, cores: 1)
spawned uWSGI worker 2 (pid: 30043, cores: 1)
But when I try to access the site quickly after this, my nginx log gets this result:
2012/06/08 09:44:43 [error] 5885#0: *873 connect() to unix:///tmp/uwsgi.sock.kilroy failed (111: Connection refused) while connecting to upstream, client: 10.100.50.137, server: mydomain.com, request: "GET /favicon.ico HTTP/1.1", upstream: "uwsgi://unix:///tmp/uwsgi.sock.kilroy:", host: "mydomain.com"
This happens for about half a second after sending the signal, so this is clearly not very graceful.
Any advice? Thanks so much!
Correct the socket paths in your nginx config and in uWSGI; the socket paths have to be identical.
You currently have:
unix:///tmp/uwsgi.sock.kilroy
or
/tmp/uwsgi.sock.myapp
You need:
nginx:
unix:/tmp/uwsgi.sock.myapp
and uwsgi:
socket = /tmp/uwsgi.sock.myapp
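In other words, both sides must reference exactly the same path. On the nginx side it would look something like this (a sketch; the location block and the uwsgi_params include are assumptions based on a typical setup):

location / {
    include uwsgi_params;
    uwsgi_pass unix:/tmp/uwsgi.sock.myapp;   # must match the "socket" line in the vassal .ini
}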