NGINX microcache - nginx

I'm using D7 with perusio's config; so far, so good.
What I want to understand is how the microcache works internally and what the configuration parameters in this line mean:
fastcgi_cache_path /var/cache/nginx/microcache levels=1:2
keys_zone=microcache:5M max_size=1G inactive=2h
loader_threshold=2592000000 loader_sleep=1 loader_files=100000;
I don't know, for example, whether max_size=1G (which I assume is the maximum size of the entire cache) refers to 1G of RAM, which could be a problem since my VPS only has 1G of RAM.
loader_files? Maybe the number of files that can be cached?
loader_sleep? Is that the one second for microcaching? That sounds logical, but in my testing the microcache's maximum lifetime is about 15 seconds.
And I have no idea about the other parameters. Please help me understand this line.
Thanks.
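For reference, the nginx documentation describes those parameters roughly as follows (annotated breakdown; the comments are explanatory and not meant to be pasted back into the config):

fastcgi_cache_path /var/cache/nginx/microcache  # on-disk location of the cached responses
    levels=1:2                    # two-level directory hashing under that path
    keys_zone=microcache:5M       # shared-memory zone (RAM) holding cache keys and metadata, about 5 MB
    max_size=1G                   # upper bound for the on-disk cache, not RAM
    inactive=2h                   # entries not accessed for 2 hours are evicted
    loader_threshold=2592000000   # max duration (ms) of one cache-loader iteration after a restart
    loader_sleep=1                # pause (ms) between cache-loader iterations
    loader_files=100000;          # max number of files indexed per loader iteration

So max_size caps disk usage, keys_zone is the part that lives in RAM, and the loader_* values only affect the background process that re-indexes the cache directory after nginx restarts; the microcache lifetime itself is typically set elsewhere (fastcgi_cache_valid), which is why it is unrelated to loader_sleep.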

Related

Mysterious MariaDB 10.4.1 RAM usage

I have upgraded from MariaDB 10.1.36 to 10.4.8 and I can see mysteriously increasing RAM usage on the new version. I also edited innodb_buffer_pool_size and it seems to make no difference whether it is set to 15M or 4G; RAM usage just keeps slowly increasing. After a while it eats all the RAM, the OOM killer kills MariaDB, and the cycle repeats.
My server has 8GB of RAM and usage grows by roughly 60-150MB per day. That is not terrible on its own, but I have around 150 database servers, so it is a huge problem.
I can temporarily fix the problem by restarting MariaDB, but then it starts all over again.
Info about database server:
databases: 200+
tables: 28200 (141 per database)
average active connections: 100-200
size of stored data: 100-350GB
cpu: 4
ram: 8GB
there is my config:
server-id=101
datadir=/opt/mysql/
socket=/var/lib/mysql/mysql.sock
tmpdir=/tmp/
gtid-ignore-duplicates=True
log_bin=mysql-bin
expire_logs_days=4
wait_timeout=360
thread_cache_size=16
sql_mode="ALLOW_INVALID_DATES"
long_query_time=0.8
slow_query_log=1
slow_query_log_file=/opt/log/slow.log
log_output=TABLE
userstat = 1
user=mysql
symbolic-links=0
binlog_format=STATEMENT
default_storage_engine=InnoDB
slave_skip_errors=1062,1396,1690
innodb_autoinc_lock_mode=2
innodb_buffer_pool_size=4G
innodb_buffer_pool_instances=5
innodb_log_file_size=1G
innodb_log_buffer_size=196M
innodb_flush_log_at_trx_commit=1
innodb_thread_concurrency=24
innodb_file_per_table
innodb_write_io_threads=24
innodb_read_io_threads=24
innodb_adaptive_flushing=1
innodb_purge_threads=5
innodb_adaptive_hash_index=64
innodb_flush_neighbors=0
innodb_flush_method=O_DIRECT
innodb_io_capacity=10000
innodb_io_capacity_max=16000
innodb_lru_scan_depth=1024
innodb_sort_buffer_size=32M
innodb_ft_cache_size=70M
innodb_ft_total_cache_size=1G
innodb_lock_wait_timeout=300
slave_parallel_threads=5
slave_parallel_mode=optimistic
slave_parallel_max_queued=10000000
log_slave_updates=on
performance_schema=on
skip-name-resolve
max_allowed_packet = 512M
query_cache_type=0
query_cache_size = 0
query_cache_limit = 1M
query_cache_min_res_unit=1K
max_connections = 1500
table_open_cache=64K
innodb_open_files=64K
table_definition_cache=64K
open_files_limit=1020000
collation-server = utf8_general_ci
character-set-server = utf8
log-error=/opt/log/error.log
pid-file=/var/run/mysqld/mysqld.pid
malloc-lib=/usr/lib64/libjemalloc.so.1
I solved it! The problem is the memory allocation library.
If you run this SQL query:
SHOW VARIABLES LIKE 'version_malloc_library';
You should get the value "jemalloc". If you only get "system", you may have problems.
To change that, you need to edit (or create) a .conf file in this directory:
/etc/systemd/system/mariadb.service.d/
There, add this line:
Environment="LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1"
(the library file may live in a different folder on your system)
Then you must restart mysqld:
service mysqld stop && systemctl daemon-reload && service mysqld start
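For example, a complete drop-in file might look like this (the file name is arbitrary, and the exact unit name and library path depend on your distribution):

# /etc/systemd/system/mariadb.service.d/jemalloc.conf
[Service]
Environment="LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1"

The [Service] header is required in a systemd drop-in; without it the Environment= line is ignored.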
You got carried away increasing values in my.cnf.
Many of the caches grow until hitting their limit, hence the memory growth you experienced.
What is the value from SHOW GLOBAL STATUS LIKE 'Max_used_connections';? Having a large max_connections accentuates several of the other values; lower it.
But perhaps the really bad one(s) involve table caches -- which have units of tables, not bytes. Crank these down a lot:
table_open_cache=64K
innodb_open_files=64K
table_definition_cache=64K
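As a rough illustration only (these numbers are guesses meant to show the order of magnitude, not tuned recommendations), values much closer to the defaults would look like:

table_open_cache=4000
innodb_open_files=4000
table_definition_cache=2000
max_connections=300   # pick something a bit above your observed Max_used_connections

Each open table and cached table definition consumes memory, so caches sized in the tens of thousands can quietly grow the process footprint over time.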
I have exactly the same problem. Is it due to a bad configuration, or is it a bug in the new version?
MariaDB 10.1 was updated to 10.3 at the same time I upgraded Debian 9 to Debian 10. I tried to solve the problem with MariaDB 10.4, but nothing changed.
I want to downgrade, but I think that requires dumping all the databases and restoring them, which means hours without service.
I don't think Debian 10 has anything to do with the issue.
Please read my previous comments about alternative memory allocators...
(memory-usage graphs were attached here comparing the behaviour when jemalloc is used versus when the default memory allocator is used)
Try with tcmalloc:
Environment="LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libtcmalloc_minimal.so.4.5.3.1"

EC2 with WordPress - running out of space every day (no space left on device) - can't start Apache

I've been having the strangest problem for days now. I took over a company's WordPress website that was originally developed by someone else; the codebase is a mess, but I was able to go over it and make sure it at least works.
The database is huge (70MB) and the site has a lot of plugin dependencies.
However, the site now generally works without issues, and I'm hosting it on EC2 with a Bitnami WordPress stack.
The weird thing, though, is that every day (this morning, for instance) I check the site and it's down:
Service Unavailable
The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.
Additionally, a 503 Service Unavailable error was encountered while trying to use an ErrorDocument to handle the request.
When logging into the server with ssh and trying to restart apache I get this:
Failed to unmonitor apache: write /var/lib/gonit/state: no space left on device
Syntax OK
/opt/bitnami/apache2/scripts/ctl.sh : apache not running
Syntax OK
(98)Address already in use: AH00072: make_sock: could not bind to address [::]:80
(98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:80
no listening sockets available, shutting down
AH00015: Unable to open logs
/opt/bitnami/apache2/scripts/ctl.sh : httpd could not be started
Failed to monitor apache: write /var/lib/gonit/state: no space left on device
This has now happened three times in three days, even though I restored the server from a snapshot onto a 200GB volume (for testing purposes), and all the site files, including uploads, only take up 5GB.
The site is running on an EC2 instance (t2.medium) with a 200GB volume now, and this morning I can't restart Apache. Yesterday evening, after restoring from the snapshot, the site worked well and normally; it was actually even fast.
I don't know where to start investigating here. What could cause the server to run out of disk space in one night?
Thanks,
Matt
Also, one of the weirdest things: I reset everything yesterday evening from an EC2 snapshot onto a 200GB volume and attached it to the instance. Everything was working fine. I made some changes to the files, deleted some plugins, and updated some settings.
And it seems that is all gone now. And I'm using an Elastic IP, so I couldn't have connected to the wrong instance or anything like that.
Bitnami Engineer here. You will probably need to resize the disk of your instance, but you can investigate those issues later; these commands will show the directories containing a large number of files, and their sizes:
cd /opt/bitnami
sudo find . -type f | cut -d "/" -f 2 | sort | uniq -c | sort -n
du -h -d 1
If MySQL is the service that is taking up the most space, you can try adding this line under the [mysqld] block of the /opt/bitnami/mysql/my.cnf configuration file:
expire_logs_days = 7
That will force MySQL to purge the server's old binary logs after 7 days. You will need to restart MySQL after that:
sudo /opt/bitnami/ctlscript.sh restart mysql
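If the binary logs turn out to be the culprit and you need disk space back immediately, you can also purge them manually from a MySQL prompt instead of waiting for the expiry setting to kick in (standard MySQL syntax; be careful not to purge logs that a replica still needs):

PURGE BINARY LOGS BEFORE NOW() - INTERVAL 7 DAY;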
More information here:
https://community.bitnami.com/t/something-taking-up-space-and-growing/64532/7
What you need to do is increase the size of the partition on the disk and the size of the file system on that partition. Even though you increased the volume size, those figures stayed unchanged, and creating another volume from a snapshot would not help either.
Check how to do it here: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/recognize-expanded-volume-linux.html
Your df result shows
Filesystem     1K-blocks     Used  Available Use% Mounted on
udev             2014560        0    2014560   0% /dev
tmpfs             404496     5872     398624   2% /run
/dev/xvda1      20263528 20119048     128096 100% /
tmpfs            2022464        0    2022464   0% /dev/shm
tmpfs               5120        0       5120   0% /run/lock
tmpfs            2022464        0    2022464   0% /sys/fs/cgroup
/dev/loop0         18432    18432          0 100% /snap/amazon-ssm-agent/1480
/dev/loop1         91264    91264          0 100% /snap/core/7713
/dev/loop2         12928    12928          0 100% /snap/amazon-ssm-agent/295
/dev/loop3         91264    91264          0 100% /snap/core/7917
tmpfs             404496        0     404496   0% /run/user/1000
where the root volume /dev/xvda1 has only 20GB and is shown as 100% full, not 200GB as you mentioned.
When you increase the volume size while the instance is running, the change is not applied automatically. On your EC2 instance, you have to apply the volume change as follows:
sudo resize2fs /dev/xvda1
Then check the size of the volume with df -h and you will see that the size is now 200GB.
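If df -h still reports the old size after running resize2fs, the partition itself may not have been extended yet. On a setup matching the df output above, the full sequence would look roughly like this (growpart comes from the cloud-utils/cloud-guest-utils package; device names assumed from your df output):

lsblk                       # confirm the disk is 200G while xvda1 is still ~20G
sudo growpart /dev/xvda 1   # grow partition 1 to fill the disk
sudo resize2fs /dev/xvda1   # grow the ext4 filesystem to the new partition size
df -h                       # the root filesystem should now show ~200GB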

Alternate post_max_size and upload_max_filesize for different subdomains and/or directories php.ini nginx fastcgi php-fpm

I have nginx and php5-fpm working wonderfully, and www.mysite.com, this.mysite.com, and that.mysite.com all go to different directories elegantly. I have been working on a site for uploading files, and I'd like the maximum file size to be 10 GB. For this to work, I have to tell php.ini that post_max_size and upload_max_filesize are 10240 MB instead of the default 2 MB.
I am well aware of the security implications. I would therefore like those php.ini values of 10240 MB to apply ONLY to one or both of:
upload.mysite.com
and/or
www.mysite.com/upload
One option is to also install Apache to listen on a different port, do some redirect/rewrite magic, and have mod_php's php.ini file with the 10240 MB values handle only the uploads site. I'd prefer to not do that.
WITHOUT having a separate web server handle requests to my upload page, how can I accomplish this in a single instance of nginx?
Use client_max_body_size and set it to the desired value in the relevant server blocks. nginx will drop the request outright if the request body exceeds the size specified by this directive; note that you won't receive any POST data in that case.
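A minimal sketch of that, assuming server blocks similar to your existing ones (the 10g value matches your 10 GB requirement; client_max_body_size can be set at server or location level):

# only the upload vhost accepts huge request bodies
server {
    server_name upload.mysite.com;
    client_max_body_size 10g;
    ...
}

# or restrict it to a single path on the main site
server {
    server_name www.mysite.com;
    location /upload {
        client_max_body_size 10g;
    }
}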
With php-fpm you can have several pools running, one for each website.
Then you can alter any php.ini setting inside the pool configuration (but be careful: you cannot use the MB or G shorthand), using this syntax inside the pool:
; UPLOAD
php_admin_flag[file_uploads] = 1
php_admin_value[upload_tmp_dir] = "/some/path/var/tmp"
; Maximum allowed size for uploaded files: 10240 MB * 1024 * 1024 = 10737418240 bytes
php_value[upload_max_filesize] = "10737418240"
php_admin_value[max_input_time] = (...)
php_admin_value[post_max_size] = (...)
This is very close to the syntax available for Apache virtual hosts; PHP configuration has never been confined to php.ini files.
As you can see, you can use either php_value or php_admin_value. The big difference is that a setting made with php_value can later be altered by the application with ini_set(), so you could use a low value for upload_max_filesize and raise it in the application only in the upload script.
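A hedged sketch of what a dedicated pool for the upload vhost could look like (pool name, socket path, user, and process-manager numbers are illustrative, not taken from your setup):

; /etc/php5/fpm/pool.d/upload.conf (path varies by distribution)
[upload]
user = www-data
group = www-data
listen = /var/run/php5-fpm-upload.sock
pm = dynamic
pm.max_children = 5
pm.start_servers = 1
pm.min_spare_servers = 1
pm.max_spare_servers = 2
; 10 GB expressed in bytes, since shorthand suffixes are not used here
php_admin_value[upload_max_filesize] = 10737418240
php_admin_value[post_max_size] = 10737418240

In the nginx server block for upload.mysite.com you would then point fastcgi_pass at unix:/var/run/php5-fpm-upload.sock instead of the socket used by the other sites, so only that vhost gets the large limits.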

PHP-FPM using 40% CPU for a Single Request

(I googled and searched this forum for hours, found some topics, but none of them worked for me)
I'm using Wordpress with: Varnish + Nginx + PHP-FPM + APC + W3 Total Cache + PageSpeed.
Since I'm using Varnish, the first time I request www.mysite.com it uses just 10% of the CPU; the second time, the page is served from the cache. The problem is when I pass a request parameter in the URL.
For just one request (www.mysite.com?1=1), top shows:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7609 nginx 20 0 438m 41m 28m S 11.6 7.0 0:00.35 php-fpm
7606 nginx 20 0 437m 39m 26m S 10.3 6.7 0:00.31 php-fpm
After the page is fully loaded, the processes above are still active. After about 2 seconds they are replaced by another two php-fpm processes (below), which stay active for about 3 seconds.
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7665 nginx 20 0 444m 47m 28m S 20.9 7.9 0:00.69 php-fpm
7668 nginx 20 0 444m 46m 28m R 20.9 7.9 0:00.63 php-fpm
40% CPU usage for just one uncached request!
Strange things:
CPU usage is higher after the page has loaded
When I purge the cache (W3 and Varnish), it takes just 10% of the CPU to load an uncached page
The high CPU usage only happens when passing a request parameter or in the WordPress admin
When I make 10 requests (pressing F5 ten times), the server stops serving and this appears in the php-fpm log:
WARNING: [pool www] server reached max_children setting (10), consider raising it
I raised that value to 20, same problem.
I'm using pm=ondemand (pm.max_children=10 and pm.max_requests=500).
Initially I was using pm=dynamic (pm.max_children=10, pm.start_servers=1, pm.min_spare_servers=1, pm.max_spare_servers=2, pm.max_requests=500) and had the same problem.
Could anyone help, please? Any help would be appreciated!
PS:
APC is ON (98% Hits, 2% Misses)
Server is Amazon Micro (613MB RAM)
PHP 5.3.26 (fpm-fcgi)
Linux version 3.4.48-45.46.amzn1.x86_64 Red Hat 4.6.3-2 (I think it's based on CentOS 5)
First, reduce the stack of caches. Why use Varnish, which serves pages from memory, when you're already using W3 Total Cache, which serves from memory as well?
W3 Total Cache is CPU intensive! It does not just cache items; it also compresses, minifies, and merges files on the fly.
You've got a total of 512MB of memory on your machine, which is not a lot, and your CPU power is less than a modern smartphone's. Memory access is extremely slow compared to a dedicated server because of the Xen virtualization layer; that's why less is more.
Make sure W3 Total Cache is properly set up so it actually caches items, then warm up your cache and you should be fine.
Have a look at Google's nginx PageSpeed module (https://github.com/pagespeed/ngx_pagespeed); it can do the same things W3 Total Cache does, just much more efficiently, because it happens in the web server, not in PHP.
Nginx can also serve pages directly from memcached: http://www.kingletas.com/2012/08/full-page-cache-with-nginx-and-memcache.html (example article, might need some more investigation).
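A minimal sketch of that memcached approach (the key scheme, socket paths, and the fallback location are assumptions; a real setup also needs something to populate memcached with rendered pages, as the linked article explains):

location / {
    set            $memcached_key "$host$request_uri";
    memcached_pass 127.0.0.1:11211;
    default_type   text/html;
    # fall back to PHP when the page is not in memcached
    error_page     404 502 504 = @php;
}

location @php {
    include        fastcgi_params;
    fastcgi_param  SCRIPT_FILENAME $document_root/index.php;
    fastcgi_pass   unix:/var/run/php-fpm.sock;
}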
Problem solved!
For those who are having the same problem:
Check your Varnish configuration;
Check your WordPress plugins;
1) In my case, TTL was not configured in Varnish, so nothing was being cached.
This config worked for me:
sub vcl_fetch {
    if (!(req.url ~ "wp-(login|admin)")) {
        unset beresp.http.set-cookie;
        set beresp.ttl = 48h;
    }
}
2) The high CPU usage AFTER the page loads was caused by a WordPress plugin called "Scroll Triggered Box".
It was firing AJAX requests after the page had loaded. I disabled that plugin and the high load stopped.
There are two factors at play here:
You are using a micro instance, which has a burstable CPU profile. It can burst up to 2 ECUs and is then limited to much less than 1 (some estimates put this at around 0.1-0.2 ECUs).
While you are logged in as an admin, WordPress caching plugins often bypass or reduce caching. W3 Total Cache should allow you to change this if you want caching on all the time.

nginx large static file serving slow on AWS EBS-backed server

I'm trying to figure out the correct tuning for nginx on an AWS server that is wholly backed by EBS. The basic issue is that when downloading a ~100MB static file, I'm seeing consistent download rates of ~60K/s. If I use scp to copy the same file from the AWS server, I'm seeing rates of ~1MB/s. (So, I'm not sure EBS even comes into play here).
Initially, I was running nginx with basically the out-of-the-box configuration (for CentOS 6.x). But in an attempt to speed things up, I've played around with various tuning parameters to no avail -- the speed has remained basically the same.
Here is the relevant fragment from my config as it stands at this moment:
location /download {
    root /var/www/yada/update;
    disable_symlinks off;
    autoindex on;

    # Transfer tuning follows
    aio on;
    directio 4m;
    output_buffers 1 128k;
}
Initially, these tuning settings were:
sendfile on;
tcp_nopush on;
tcp_nodelay on;
Note, I'm not trying to optimize for a large amount of traffic. There is likely only a single client ever downloading at any given time. The AWS server is a 'micro' instance with 617MB of memory. Regardless, the fact that scp can download at ~1MB/s leads me to believe that HTTP should be able to match or beat that throughput.
Any help is appreciated.
[Update]
Additional information. Running a 'top' command while a download is running, I get:
top - 07:37:33 up 11 days, 1:56, 1 user, load average: 0.00, 0.01, 0.05
Tasks: 63 total, 1 running, 62 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
and 'iostat' shows:
Linux 3.2.38-5.48.amzn1.x86_64 04/03/2013 _x86_64_ (1 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
0.02 0.00 0.03 0.03 0.02 99.89
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
xvdap1 0.23 2.66 8.59 2544324 8224920
Have you considered turning sendfile on? sendfile lets nginx hand the transfer of static files directly to the kernel (the sendfile() system call), so it should be faster than any other option.
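A minimal sketch of that suggestion applied to the location block from the question (note that when directio is set, files larger than its threshold bypass sendfile anyway, so the aio/directio experiment is dropped here):

location /download {
    root /var/www/yada/update;
    autoindex on;

    sendfile   on;
    tcp_nopush on;   # send the response headers and the start of the file together
}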
By default scp will be much faster than your HTTP connection. I have a suggestion for you: if you are serving static files, I prefer to use S3 with CloudFront, which makes this faster. It is very difficult to achieve better performance when what you are doing is a plain file transfer.
Given that things work well on the same machine, you are getting throttled. First check your usage policy with AWS; perhaps it's in the fine print. Alternatively, try different ISPs. If they all give you 60KB/s, you know it's AWS.
