nginx upload client_max_body_size issue

I'm running nginx/ruby-on-rails and I have a simple multipart form to upload files.
Everything works fine until I decide to restrict the maximum size of files I want uploaded.
To do that, I set the nginx client_max_body_size to 1m (1 MB) and expect an HTTP 413 (Request Entity Too Large) status in response when that limit is exceeded.
The problem is that when I upload a 1.2 MB file, instead of displaying the HTTP 413 error page, the browser hangs a bit and then dies with a "Connection was reset while the page was loading" message.
I've tried just about every option that nginx offers, but nothing seems to work. Does anyone have any ideas about this?
Here's my nginx.conf:
worker_processes 1;
timer_resolution 1000ms;
events {
    worker_connections 1024;
}
http {
    passenger_root /the_passenger_root;
    passenger_ruby /the_ruby;
    include mime.types;
    default_type application/octet-stream;
    sendfile on;
    keepalive_timeout 65;
    server {
        listen 80;
        server_name www.x.com;
        client_max_body_size 1M;
        passenger_use_global_queue on;
        root /the_root;
        passenger_enabled on;
        error_page 404 /404.html;
        error_page 413 /413.html;
    }
}
Thanks.
**Edit**
Environment/UA: Windows XP/Firefox 3.6.13

nginx "fails fast" when the client informs it that it's going to send a body larger than the client_max_body_size by sending a 413 response and closing the connection.
Most clients don't read responses until the entire request body is sent. Because nginx closes the connection, the client sends data to the closed socket, causing a TCP RST.
If your HTTP client supports it, the best way to handle this is to send an Expect: 100-Continue header. Nginx supports this correctly as of 1.2.7, and will reply with a 413 Request Entity Too Large response rather than 100 Continue if Content-Length exceeds the maximum body size.

Does your upload die at the very end, at around 99%, before crashing? The client body and buffer settings are key, because nginx must buffer incoming data. The client_body_* directives control how nginx handles the bulk flow of binary data from multipart form clients into your app's logic.
With the clean setting, nginx writes the incoming request body to a temporary file and deletes that file from disk once the request has been processed, which keeps memory consumption down.
Set client_body_in_file_only to clean and adjust the buffers to suit your client_max_body_size. The original question's config already had sendfile on; increase the timeouts too. I use the settings below to fix this, placed in the http, server, or location context as appropriate for your config.
client_body_in_file_only clean;
client_body_buffer_size 32K;
client_max_body_size 300M;
sendfile on;
send_timeout 300s;

From the documentation:
It is necessary to keep in mind that the browsers do not know how to correctly show this error.
I suspect this is what's happening. If you inspect the HTTP to-and-fro using tools such as Firebug or Live HTTP Headers (both Firefox extensions), you'll be able to see what's really going on.

Related

Nginx: cache Brotli compressed proxied upstream responses

I enabled Brotli compression in Nginx for a dynamically generated, but rarely changing resource.
My expectation was that when Nginx caches upstream responses, it will also cache the compression result. Thus, I assumed the CPU cost of enabling Brotli would be negligible. Instead, I see a performance impact, confirmed by perf top to be related to Brotli.
I verified that caching of the upstream responses works. However, Nginx stores only the uncompressed upstream responses in its cache. Because of that, it has to run the costly Brotli compression for each request. That is the problem.
There are sources (relating to gzip compression) that recommend either compressing in the upstream or, if that is not an option, creating a second Nginx to proxy the request through, which takes the role of the upstream and does the compression. Neither solution is very elegant.
Is there a way to make Nginx cache not only the uncompressed upstream responses, but the result of the compression as well?
Maybe I am overlooking something. Here is a simplified config:
proxy_cache_path /var/cache/nginx levels=1 keys_zone=my_config_cache:8M
                 inactive=60m use_temp_path=off;
server {
    location = /foo {
        proxy_pass http://test-upstream;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_ignore_headers Expires;
        proxy_ignore_headers Cache-Control;
        brotli on;
        brotli_comp_level 11;
        proxy_cache my_config_cache;
        proxy_cache_valid 10s;
        proxy_cache_use_stale error timeout http_500 http_502 http_503 http_504;
        expires 60s;
    }
}

brotli_comp_level 11;
It is too high. The recommendation is 4 for dynamic content.
You cannot do what you want with the current setup.
If you can set up your upstream to be Brotli-capable instead, then you can cache compressed responses from it by making $http_accept_encoding part of the cache key. That alone, however, will not be good enough: you will have to normalize its value, because caching a separate copy for every possible incoming Accept-Encoding header would result in a bloated and highly inefficient cache.
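The normalization could be sketched with a map block, assuming the upstream can itself produce Brotli and gzip responses. The variable name $normalized_encoding is made up for illustration, and the match on br is deliberately crude (it ignores q-values such as br;q=0), so treat this as a starting point rather than a tested config:

```nginx
# http context: collapse every incoming Accept-Encoding value into one of
# three buckets, so the cache holds at most three variants per URL.
map $http_accept_encoding $normalized_encoding {
    default   "";       # no supported compression advertised
    ~*br      "br";     # regexes are tested in order, so Brotli wins
    ~*gzip    "gzip";
}

server {
    location = /foo {
        proxy_pass        http://test-upstream;
        proxy_cache       my_config_cache;

        # Ask the upstream for exactly the normalized encoding
        # (an empty value means the header is not sent at all) ...
        proxy_set_header  Accept-Encoding $normalized_encoding;
        # ... and keep one cache entry per encoding bucket.
        proxy_cache_key   "$scheme$proxy_host$request_uri$normalized_encoding";
    }
}
```

With something like this, compression happens once in the upstream for each bucket, and nginx serves the already-compressed response from its cache instead of recompressing on every request.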
If you really care about Brotli-enabled clients (most browsers support Brotli anyway now) and the highest compression level possible, then you can enforce compression at the Brotli-capable upstream by supplying Accept-Encoding: br while proxy-passing, which would result in the cache always holding Brotli-encoded responses (you don't need to adjust your cache key then). However, this requires a feature that is currently unavailable; I'll call it unbrotli.
The idea is that every request goes to the upstream saying "I want a Brotli-encoded response". The upstream delivers a Brotli-compressed response (where applicable, of course, e.g. for text responses). But for clients that only support gzip or no compression at all, responses would have to be dynamically decompressed from Brotli (very low CPU impact). This isn't so great, but the number of Brotli-incapable clients is declining.

NGINX caching upstream server when it shouldn't be

I want to set up an NGINX server which provides the following functionality:
1. When a request is made to NGINX for the page at /path/to/page, it fetches /path/to/page from the upstream server.
2. If the upstream server is down or NGINX can't connect to it for some reason, NGINX returns a cached version of the page if it has one.
3. If the cached file is over 6 hours old, don't use it, just return a 502.
4. If the upstream server is available, never use the cache.
I have an NGINX config here which I think should work based on my understanding of the docs, but it doesn't and I can't see why. The problem is with point (4): this NGINX server returns the cached version of the file even if the upstream server is online.
daemon off;
error_log /dev/stdout info;
events {
}
http {
    proxy_cache_path
        "/home/jack/Code/NGINX Caching/Codebase/cache" # Cache path
        keys_zone=cache:10m # Name of cache, max size for keys 10 megabytes
        levels=1:2 # Don't store all cached files in a single directory
        max_size=500m # Max size of cache
        inactive=6h; # Cached file deleted if not used within six hours
    proxy_cache_valid 6h;
    proxy_cache_key "$request_method$request_uri";
    access_log /dev/stdout;
    server {
        listen 8080;
        location ~ ^/(.+)$ {
            proxy_pass http://0.0.0.0:8000/$1;
            proxy_cache cache;
            proxy_cache_valid 6h;
            proxy_buffering on;
            proxy_cache_use_stale error timeout;
        }
    }
}
To reproduce, replace the proxy_cache_path with a path to a directory on your machine, and run another web server on port 8000. When I modify a file served by the server on port 8000, NGINX doesn't see the change until I erase the cache. The issue is with NGINX and not my client (Firefox): even if I turn off caching in the browser, NGINX returns a 200 with the old file contents.

Can you please check if these two directives might help you:
proxy_cache_revalidate:
http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_revalidate
and
proxy_cache_use_stale: http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_use_stale
There is also a video from nginx.conf '17 online describing all the cool things you can achieve with caching: https://www.youtube.com/watch?v=xZrOjmAkFC8. Maybe this is also of interest to you.

So, it seems I misunderstood the NGINX proxy cache directives. The docs are quite confusing on this subject, so I'll lay it out point by point.
This official help page gives a decent overview of the various directives. However, it makes no mention of something which, as it turns out, is a very important conceptual building block in understanding how NGINX caching works: the notion of a cached file being stale.
NGINX's default behavior is to always use the cache if it's there, rather than querying the upstream server. With this config, a minimal config to do caching, NGINX will query the upstream server the first time a page is accessed, and then use the cached version forever after that:
events {
}
http {
    proxy_cache_path
        /path/to/cache
        keys_zone=my_cache:10m;
    proxy_cache_key "$request_method$request_uri";
    server {
        listen 8080;
        location ~ ^/(.+)$ {
            proxy_pass http://0.0.0.0:8000/$1;
            # must match the zone name declared in keys_zone above
            proxy_cache my_cache;
        }
    }
}
You can use the proxy_cache_valid directive to tell NGINX when a cached file should be considered "stale". For example, if we set proxy_cache_valid 5m, then 5 minutes after a cache file is created NGINX will stop serving it and query the upstream server again on the next request. If the upstream is down, NGINX will return a 502. However, during those five minutes, NGINX will still use the cache even if the upstream server is available, so this is still not what we want.
NGINX has another directive, proxy_cache_use_stale, which gives NGINX conditions under which it may use cached files even if they're stale. We can combine the two to get a server which caches pages, makes them stale immediately (or almost immediately), and then only uses them if the upstream is down:
events {
}
http {
    proxy_cache_path
        /path/to/cache
        keys_zone=my_cache:10m;
    proxy_cache_key "$request_method$request_uri";
    server {
        listen 8080;
        location ~ ^/(.+)$ {
            proxy_pass http://0.0.0.0:8000/$1;
            # must match the zone name declared in keys_zone above
            proxy_cache my_cache;
            proxy_cache_valid 1s;
            proxy_cache_use_stale error timeout;
        }
    }
}
This config has almost the behavior we want, except that if the upstream server goes down for an extended period of time, NGINX will continue to use the cache indefinitely. As far as I know there is no way to tell NGINX to totally invalidate/clear a cached file after a given amount of time. Normally that's what proxy_cache_valid is for, but we're already using that for a different purpose, to make files stale after 1 second so they're only used when the upstream is down. We would need some next level after "stale" that means the file is completely invalidated, but I don't think that exists in NGINX.
So the simplest solution is to just clear the cache manually. It's sufficient to just delete all files in the cache directory (or its subdirectories) which were last modified more than 6 hours ago, or whatever you want the expiry time to be. On a Linux system, you can run this script every 5 minutes, for example:
find /path/to/cache -type f -mmin +360 -delete

Nginx with memcache, gunzip and ssi not working together

I'm trying to store packed (gzip) HTML in Memcached and serve it from nginx:
1. load the HTML from memcached with the memcached module
2. unpack it with the nginx gunzip module if packed
3. process SSI insertions with the ssi module
4. return the result to the user
The configuration mostly works, except for the SSI step:
location / {
    ssi on;
    set $memcached_key "$uri?$args";
    memcached_pass memcached.up;
    memcached_gzip_flag 2; # net.spy.memcached use second byte for compression flag
    default_type text/html;
    charset utf-8;
    gunzip on;
    proxy_set_header Accept-Encoding "gzip";
    error_page 404 405 400 500 502 503 504 = @fallback;
}
It looks like nginx does the SSI processing before the gunzip module unpacks the response.
In the resulting HTML I see unresolved SSI instructions:
<!--# include virtual="/remote/body?argument=value" -->
No errors in the nginx log.
I have tried ssi_types * with no effect.
Any idea how to fix it?
nginx 1.10.3 (Ubuntu)
UPDATE
I have tried with one more upstream. Same result =(
In the log, I see that the ssi filter is applied after the upstream request, but no includes are detected.
upstream memcached {
    server localhost:11211;
    keepalive 100;
}
upstream unmemcached {
    server localhost:21211;
    keepalive 100;
}
server {
    server_name dev.me;
    ssi_silent_errors off;
    error_log /var/log/nginx/error1.log debug;
    log_subrequest on;
    location / {
        ssi on;
        ssi_types *;
        proxy_pass http://unmemcached;
        proxy_max_temp_file_size 0;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
    location @fallback {
        ssi on;
        proxy_pass http://proxy.site;
        proxy_max_temp_file_size 0;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        error_page 400 500 502 503 504 /offline.html;
    }
}
server {
    access_log on;
    listen 21211;
    server_name unmemcached;
    error_log /var/log/nginx/error2.log debug;
    log_subrequest on;
    location / {
        set $memcached_key "$uri?$args";
        memcached_pass memcached;
        memcached_gzip_flag 2;
        default_type text/html;
        charset utf-8;
        gunzip on;
        proxy_set_header Accept-Encoding "gzip";
        error_page 404 405 400 500 502 503 504 = @fallback;
    }
    location @fallback {
        #ssi on;
        proxy_pass http://proxy.site;
        proxy_max_temp_file_size 0;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        error_page 400 500 502 503 504 /offline.html;
    }
}
I want to avoid a solution with dynamic nginx modules if possible.

There are basically two issues to consider — whether the order of the filter modules is appropriate, and whether gunzip works for your situation.
0. The order of gunzip/ssi/gzip.
A simple search for "nginx order of filter modules" reveals that the order is determined at the compile time based on the content of the auto/modules shell script:
http://www.evanmiller.org/nginx-modules-guide.html
Multiple filters can hook into each location, so that (for example) a response can be compressed and then chunked. The order of their execution is determined at compile-time. Filters have the classic "CHAIN OF RESPONSIBILITY" design pattern: one filter is called, does its work, and then calls the next filter, until the final filter is called, and Nginx finishes up the response.
https://allthingstechnical-rv.blogspot.de/2014/07/order-of-execution-of-nginx-filter.html
The order of filters is derived from the order of execution of nginx modules. The order of execution of nginx modules is implemented within the file auto/modules in the nginx source code.
A quick glance at auto/modules reveals that ssi is between gzip and gunzip, however, it's not immediately clear which way the modules get executed (top to bottom or bottom to top), so, the default might either be reasonable, or, you may need to switch the two (which wouldn't necessarily be supported, IMHO).
One hint here is the location of the http_not_modified filter, which is given as an example of the If-Modified-Since handling on EMiller's guide above; I would imagine that it has to go last, after all the other ones, and, if so, then, indeed, it seems that the order of gunzip/ssi/gzip is exactly the opposite of what you need.
1. Does gunzip work?
As per http://nginx.org/r/gunzip, the following text is present in the documentation for the filter:
Enables or disables decompression of gzipped responses for clients that lack gzip support.
It is not entirely clear whether the above statement should be construed as the description of the module (e.g., clients lacking gzip support are why you might want to use this module), or as the description of its behaviour (e.g., the module determines by itself whether or not gzip is supported by the client). The source code at src/http/modules/ngx_http_gunzip_filter_module.c appears to imply that it simply checks whether the Content-Encoding of the reply as-is is gzip, and proceeds if so. However, the next sentence in the docs (after the one quoted above) does appear to indicate that it has some more interaction with the gzip module, so perhaps something else is involved as well.
My guess here is that if you're testing with a browser, then the browser DOES support gzip, hence, it would be reasonable for gunzip to not engage, hence, the SSI module would never have anything valid to process. This is why I suggest you determine whether the gunzip works properly and/or differently between doing simple plain-text requests through curl versus those made by the browser with the Accept-Encoding that includes gzip.
Solution.
Depending on the outcome of the investigation as above, I would try to determine the order of the modules, and, if incorrect, there's a choice whether recompiling or double-proxying would be the solution.
Subsequently, if the problem is still not fixed, I would ensure that gunzip filter would unconditionally do the decompression of the data from memcached; I would imagine you may have to ignore or reset the Accept-Encoding headers or some such.
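One way to reset the header in the two-server setup from the question's update is sketched below. It assumes that gunzip skips decompression when the request advertises gzip support (which the reference to the gzip directives in the docs suggests), and it uses the documented behaviour that proxy_set_header with an empty value drops the header. Whether this alone makes SSI see the includes is untested here:

```nginx
# Front server (dev.me): strip Accept-Encoding before proxying to the
# inner server, so the inner gunzip filter has no reason to skip
# decompression and the ssi filter here receives plain HTML.
location / {
    ssi on;
    proxy_pass http://unmemcached;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    proxy_set_header Accept-Encoding "";   # empty value = header not sent
}
```

A quick check is to request the inner server on port 21211 directly, once with and once without an Accept-Encoding: gzip header, and compare whether the body comes back compressed.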

Optimizing nginx for video streaming

I'm new to nginx, so please bear with me...
I have a Tomcat/JSP application whose primary purpose is to serve video via JWPlayer. The application is currently hosted on my laptop, but it will be migrating to an Amazon AWS EC2 instance.
On that same EC2 Instance, I have an nginx server set up to deliver said video content. Right now I have about 15 videos that I am serving as a single playlist. These videos range in size, with the smallest one being ~40 MB. The problem is that buffering is taking too long, and I'm trying to figure out how to optimize this.
I have tried to increase my buffer size in the nginx.conf file as follows:
location /videofiles {
    # activate flv/mp4 parsing for pseudostreaming
    flv;
    mp4;
    mp4_buffer_size 24M;
    mp4_max_buffer_size 48M;
}
I have also tried increasing the chunk_size (though I can't find any doc on this property):
chunk_size 16384;
Here is some other potentially useful config info from nginx.conf:
worker_processes 10;
events {
    worker_connections 1024;
}
http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    client_max_body_size 100000M;
}
keepalive_timeout 65;
For perspective, I have FIOS and my upload/download is 15Mbps, but the content is coming over at about 15Kbps (according to the Charles app). I would have expected it to stream much faster...
Are there other properties I can set to help optimize this?
Thanks!

Configuring Nginx for large URIs

I have a large URI and I am trying to configure Nginx to accept it. The URI parameters are 52000 characters in length with a size of 52kb. I have tried accessing the URI without Nginx and it works fine.
But when I use Nginx it gives me an error: 414 (Request-URI Too Large).
I have configured the large_client_header_buffers and client_header_buffer_size in the http block but it doesn't seem to be working.
client_header_buffer_size 5120k;
large_client_header_buffers 16 5120k;
Any help will be appreciated.
Thank you.

I have found the solution. The problem was that there were multiple instances of nginx running. This was causing a conflict, and that's why large_client_header_buffers wasn't working. After killing all nginx instances I restarted nginx with the configuration:
client_header_buffer_size 64k;
large_client_header_buffers 4 64k;
Everything started working after that.
Hope this helps anyone facing this problem.

The problem for me was that there was an earlier server {} block (the default one, actually). By earlier, I mean it comes first when doing nginx -T. Nginx "shouldn't" have matched it, but it did, and nginx was using its configuration, which did not specify any of the directives regarding header length, such as large_client_header_buffers. Why was it used? According to this github comment that's for an add-on nginx headers module:
nginx throws out 414 very early, during reading and parsing the request header's first line. Because it happens so early, no location is matched against the current (guilty) request yet while your more_set_headers directive is a "location configuration" which only takes effect for a request that has already been associated with a location {} block
Solutions:
Do one of these three things:
1. Put directives related to request header size in the EARLIER server {} block. It probably needs to go in the earliest one.
2. Remove the EARLIER server {} block.
3. Delete the EARLIER config file.
Example problem configs
#/etc/nginx/conf.d/default.conf
server {
    # Note: even if this block were completely empty, it'd be the same problem!
    listen 80;
    server_name localhost;
}
#/etc/nginx/conf.d/mysite
server {
    listen 80;
    server_name mysite.test;
    large_client_header_buffers 4 64k;
}
How to fix
#/etc/nginx/conf.d/default.conf
server {
    # put your request header size directives in here, or remove this file
    client_header_buffer_size 5120k;
    large_client_header_buffers 16 5120k;
}
#/etc/nginx/conf.d/mysite
server {
    listen 80;
    server_name mysite.test;
    # no need for large_client_header_buffers here
}
