Nginx - error_log for dynamic vhosts

How can I have an error log file for dynamic virtual hosts?
server {
    listen 80;
    server_name ~^(?<folder>[^.]*)\.(?<user>[^.]*)\.dev\.example\.com;
    root /var/www/projects/dev/$user/$folder/htdocs;
    access_log /var/www/projects/dev/$user/$folder/access.log;
    error_log /var/www/projects/dev/$user/$folder/error.log;
}
The root and access_log directives work correctly, but if I add the error_log line, nginx fails to start because /var/www/projects/dev/$user/$folder/ doesn't exist.

This is simply not supported: you can use variables within http://nginx.org/r/access_log, but not within http://nginx.org/r/error_log.
P.S. Note that in general it's a pretty bad idea to use user-input variables within either access_log or error_log, since you introduce the potential for a malicious user to exhaust the inodes on your filesystem by making requests with random strings in the Host header, which may result in the creation of a new file for every request. This could even happen inadvertently (without malicious intent) by someone simply trying to enumerate all the possible users on your server. Your specific code doesn't necessarily suffer from this, as directories are not normally created automatically by any UNIX software, but it's still not the best way to do things.
In nginx philosophy, it'd be a better idea to generate a separate http://nginx.org/r/server config for each user (since nginx can be restarted without any downtime). This has additional benefits: nginx relies on efficient data structures to find the correct server block by exact name, which regexp-based server_name matching cannot take advantage of. Not using variables within access_log also ensures that writes to the access_log can be buffered, which can greatly increase the effective throughput of your server (especially if you log onto non-SSD HDDs).
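For illustration, here's a minimal sketch of what one generated per-user, per-project config might look like (the user "alice" and the project "blog" are hypothetical; the templating/generation mechanism is up to you):
server {
    listen 80;
    server_name blog.alice.dev.example.com;
    root /var/www/projects/dev/alice/blog/htdocs;

    # Static paths: error_log now works, and access_log writes can be buffered.
    access_log /var/www/projects/dev/alice/blog/access.log combined buffer=32k;
    error_log /var/www/projects/dev/alice/blog/error.log;
}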
Basically, there are already many bandaids in nginx to support variables within access_log (just look at the list of limitations at http://nginx.org/r/access_log for when variables are used to specify the file), and, I guess, it was deemed inappropriate to introduce even more of such bandaids to error_log as well (especially given that the error_log in production scenarios is not supposed to be nearly as large as access_log, so, if necessary, you can easily write external tools to split it out).

Can't stream more than 5-6 static videos simultaneously on a single client

Intro
Hi! First, let me say that I am a networking n00b. With that in mind:
We have a small video-conferencing app, dedicated to eLearning, which does live streaming through Janus (WebRTC). We offer our clients the possibility to record their videos. We have the live-streaming part running pretty well now, but we are having a small issue with video playback.
When we play back the videos, everything works correctly as long as there aren't many large video files to stream. However, we have had fringe cases of trouble for a while, and I was sent to investigate them.
Problem encountered
When playing back multiple large video files, I noticed that I can't get more than 4-5 large files to stream simultaneously on a single client.
When the limit is hit, there seems to be some kind of race-condition lock happening: a few videos are stuck, and a few are playing (usually 2 or 3). Then one of the playing videos will get stuck, and one of the stuck videos will start playing.
However, it doesn't affect the playback on other clients. When it gets stuck, I can't even connect to the MinIO web interface from the same client, but I can from another client (i.e. another browser, or another machine). I can also stream just as much from the other client as I could from the one that is stuck.
I've been testing different configurations on a test minio server by loading the same file many times from different tabs in Chrome, which seems to recreate the problem.
Server Description
The files are hosted on a cloud storage that offers > 7 Gbps bandwidth. The storage is mounted on a MinIO instance in a Kubernetes cluster, behind an NGINX Ingress Controller that serves as the single point of entry to the cluster, and so it also controls traffic to the other micro-services on the k8s cluster.
Each k8s node has a guaranteed bandwidth of > 250 Mbps, if that matters in this case.
The MinIO instance is mainly used to create transient sharing rights. We call the videos simply by pointing to their location using the DNS we set up for the service.
What has been investigated and tried
At first, I thought it might be a MinIO misconfiguration. However, I looked at the config files and the documentation and couldn't find anything that seemed to limit the number of connections / requests per client.
While reading, I stumbled upon the fact that Chrome allows no more than 6 concurrent HTTP/1.1 connections per host and thought I had hit the jackpot. But then I checked, and the protocol used to fetch the files is already HTTP/2 (h2).
Then I went one level higher and looked through the configuration of the NGINX Ingress Controller. Here again, everything seems OK:
events {
    multi_accept on;
    worker_connections 16384;
    use epoll;
}
[...]
http2_max_field_size 4k;
http2_max_header_size 16k;
http2_max_requests 1000;
http2_max_concurrent_streams 128;
[...]
So I've been scouring the net for a good while now and I'm getting more and more confused by what I could investigate next, so I thought I'd come here and ask my very first StackOverflow question.
So, is there anything I could do with the current setup to make it so we can stream more large files simultaneously? If not, what are your thoughts and recommendations?
Edit:
I've found a workaround by searching hard enough: Increase Concurrent HTTP calls
At first I was not a fan; HTTP/2 is supposed, from my understanding, to support a lot of parallel requests. However, I think I found the crux of the problem here: https://en.wikipedia.org/wiki/Head-of-line_blocking
Further research led me to find these mitigations to that problem : https://tools.ietf.org/id/draft-scharf-tcpm-reordering-00.html#rfc.section.3.2
I'll have to look into SCTP and see if it is something I'd like to implement, however. At first glance, that seems rather complicated and might not be worth the time investment.
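For what it's worth, the classic way to "increase concurrent HTTP calls" without touching the clients is domain sharding: point several hostnames at the same backend so the browser opens separate TCP connections for each, spreading the streams across connections and sidestepping per-connection head-of-line blocking. A rough sketch at the nginx level (the videoN hostnames, certificate paths, and the minio upstream address are all hypothetical), with the caveat that HTTP/2 connection coalescing can defeat sharding if all names resolve to the same IP and share one certificate:
server {
    listen 443 ssl http2;
    server_name video1.example.com video2.example.com video3.example.com;

    ssl_certificate     /etc/nginx/certs/example.pem;
    ssl_certificate_key /etc/nginx/certs/example.key;

    location / {
        # All shard hostnames proxy to the same MinIO service.
        proxy_pass http://minio:9000;
    }
}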

How should I configure Nginx to maximise the throughput for a single Ruby application running on Passenger?

I want to benchmark Nginx+Passenger, and am wondering if there is anything that can be adjusted in the following nginx.conf to improve throughput and reduce latency. This is running on a 4-core i7 (8 hardware threads) with 16GB of main memory.
load_module /usr/lib/nginx/modules/ngx_http_passenger_module.so;

# One per CPU core:
worker_processes auto;

events {
}

http {
    include mime.types;
    default_type application/octet-stream;

    access_log off;
    sendfile on;
    keepalive_timeout 60;

    passenger_root /usr/lib/passenger;
    # 8 should be the number of CPU threads:
    passenger_max_pool_size 8;

    server {
        listen [::]:80;
        server_name passenger;
        root /srv/http/benchmark/public;

        passenger_enabled on;
        passenger_min_instances 8;
        passenger_ruby /usr/bin/ruby;
        passenger_sticky_sessions on;
    }
}
I am using wrk with multiple concurrent connections (e.g. 100).
Here are some specific issues:
Can the Nginx configuration be improved further?
Is it using HTTP/1.1 persistent connections to the Passenger application servers?
Is using a dynamic module causing any performance issues?
Do I need to do anything else to maximise the efficiency of how the integration is working?
I haven't set a passenger log file to ensure that logging IO is not a bottleneck.
Regarding the number of processes: I have 8 hardware threads, so I've set it to use a minimum of 8 instances.
Would it make sense to use threads per application server? I assume that's only relevant for IO-bound workloads.
If I am pegging the processors with 8 application servers, does that indicate a sufficient number of servers? Or should I try with, say, 16?
What is the expected performance difference between Nginx+Passenger vs Passenger Standalone?
Passenger dev here.
"Can the Nginx configuration be improved further?"
Probably; Nginx has a lot of levers, and if all you are doing is serving known payloads in a benchmark, then you can seriously improve performance with Nginx's caching, for example.
"Is it using HTTP/1.1 persistent connections to the Passenger application servers?"
No, it uses Unix domain sockets.
"Is using a dynamic module causing any performance issues?"
No; once nginx loads the library, making a function call into it is the same as any other C++ function call.
"Do I need to do anything else to maximize the efficiency of how the integration is working?"
You might want to look into Passenger's turbo caching, and/or nginx caching.
"I haven't set a passenger log file to ensure that logging IO is not a bottleneck."
Good, but turn the logging level down to 0 to avoid a bit of processing.
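For the nginx integration, that would be a one-liner in the http block (assuming Passenger's passenger_log_level directive; 3 is the default):
# Log only the most severe Passenger messages:
passenger_log_level 0;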
"Would it make sense to use threads per application server? I assume it's only relevant for IO bound workloads."
Not sure exactly what you mean, are you talking about Passenger's multithreading support or nginx's?
"If I am pegging the processors with 8 application servers, does that indicate a sufficient amount of servers?"
If you are CPU bound then adding more processes won't help.
"What is the expected performance difference between Nginx+Passenger vs Passenger Standalone?"
Not much; Passenger Standalone uses nginx internally. You might see some improvement if you use the builtin engine with Passenger Standalone, but that means you can't use caching, which is far more important.

Simultaneously running nginx and apache2

I am running my websites on Ubuntu 14.04 server with nginx. I would like to set up a mail server (probably using squirrelmail) which uses apache2 (presently not in use). I intend to keep the port numbers for the web server and mail server entirely different.
Can I do this? Do I have to do anything out of the ordinary (secret handshakes) to set this up, and if so, what exactly do I need to do?
Yes, it's entirely possible. Generally, the complications you will have to sort out are both servers trying to listen on the same port, and directory-permission issues.
In this case you will have to take care that the Nginx master process keeps running as root (with its workers as www-data), because only root is allowed to bind to ports below 1024, such as 80 and 443 (which I'm assuming you are using for your website).
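As a minimal sketch of the port split (port 8080 is an arbitrary example; the Apache side lives in Apache's own config and is shown here only as comments):
# nginx keeps the privileged ports for the websites:
server {
    listen 80;
    server_name example.com;
    root /var/www/html;
}
# Apache then binds a different port in /etc/apache2/ports.conf, e.g.:
#   Listen 8080
# so the mail interface becomes reachable at http://example.com:8080/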
If you work out the kinks though it's entirely possible and has been done many times for situations like yours.
With that said I think it would be a lot of extra work setting up an entire apache webserver just to run squirrelmail. You can configure squirrelmail to work with Nginx and it's not that difficult.
Tutorial here.
IMHO that is the easiest way forward, with the fewest possible ways to crash your entire server and be forced to start over. Which brings me to another point.
Before you do any work on a server, BACK UP EVERYTHING. Clone your hard drive if you can; if not, make sure you get all databases, static files, config files, etc.
It's also best practice not to work on a live production server if you can avoid it. Use a backup in the meantime while you play around with yours.
It will save you a lot of stress to work at a comfortable pace rather than work against the clock of hundreds of customers trying to view your website.
I know that doesn't apply in all cases, but it's a good practice to get into if you can to prepare you for any bigger jobs that come down the road.

Is client_body_buffer_size per-connection?

I'm not able to tell from reading the documentation whether client_body_buffer_size means per-connection or per-server (or does it depend on where the directive is set?)
I would like to create a large in-memory buffer (16m) to allow occasional large uploads to be speedy. But I want that to be a shared 16m -- if there are a lot of concurrent uploads then slowing down to disk-speed is fine.
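For reference, the directive can be set at http, server, or location level, so you can scope a big buffer to just the upload endpoint; as far as I know, though, the buffer is allocated per request rather than drawn from a shared pool, so a shared 16m budget isn't directly expressible. A sketch, with a hypothetical /upload location:
location /upload {
    # Allocated per request (up to this size), not shared across connections;
    # bodies larger than this spill to a temporary file on disk.
    client_body_buffer_size 16m;
    client_max_body_size    512m;
}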

Nginx proxy buffering - changing buffers' number vs size?

I was wondering and trying to figure out how these two parameters:
proxy_buffers [number] [size];
may affect (improve or degrade) proxy-server performance, and whether to change the buffers' size, their number, or both.
In my particular case, we're talking about a system serving dynamically generated binary files that may vary in size (~60-200kB). Nginx serves as a load balancer in front of 2 Tomcats that act as generators. I saw in Nginx's error.log that with the default buffer-size setting all of the proxied responses are buffered to a temporary file, so what I found logical was to change the setting to something like this:
proxy_buffers 4 32k;
and the warning message disappeared.
What's not clear to me here is whether I should preferably set 1 buffer with a larger size, or several smaller buffers... E.g.:
proxy_buffers 1 128k; vs proxy_buffers 4 32k; vs proxy_buffers 8 16k;, etc...
What could be the difference, and how it may affect performance (if at all)?
First, it's a good idea to see what the documentation says about the directives:
Syntax: proxy_buffers number size;
Default: proxy_buffers 8 4k|8k;
Context: http, server, location
Sets the number and size of the buffers used for reading a response from the proxied server, for a single connection. By default, the buffer size is equal to one memory page. This is either 4K or 8K, depending on a platform.
The documentation for the proxy_buffering directive provides a bit more explanation:
When buffering is enabled, nginx receives a response from the proxied server as soon as possible, saving it into the buffers set by the proxy_buffer_size and proxy_buffers directives. If the whole response does not fit into memory, a part of it can be saved to a temporary file on the disk. …
When buffering is disabled, the response is passed to a client synchronously, immediately as it is received. …
So, what does all of that mean?
Any increase of the buffer size applies per connection, so across many concurrent connections even an extra 4K each would be quite an increase.
You may notice that the default buffer size is equal to the platform's page size. Long story short, choosing the "best" number may well go beyond the scope of this question, and may depend on the operating system and CPU architecture.
Realistically, the difference between a bigger number of smaller buffers and a smaller number of bigger buffers may depend on the memory allocator provided by the operating system, as well as on how much memory you have and how much of it you are willing to waste on allocations that never get used for a good purpose.
E.g., I would not use proxy_buffers 1 1024k, because then you'd be allocating a 1MB buffer for every buffered connection even when the content would easily fit in 4KB, which would be wasteful (although, of course, there's also the little-known fact that unused-but-allocated memory has been virtually free since the 1980s). There's likely a good reason that the default number of buffers was chosen to be 8 as well.
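As a back-of-the-envelope illustration of why the number times the size is what matters (the connection count is hypothetical):
# Worst case allocated per buffered connection:
#   proxy_buffers 8 4k     ->  8 * 4k   =  32k (the default)
#   proxy_buffers 4 32k    ->  4 * 32k  = 128k
#   proxy_buffers 1 1024k  ->  1 * 1m   =   1m (wasteful for ~60-200kB responses)
# So 1000 concurrently buffered connections at "4 32k" is on the order of 128 MB.
proxy_buffers 4 32k;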
Increasing the buffers at all might actually be a bit pointless if you cache the responses for these binary files with the proxy_cache directive, because Nginx will still write them to disk for caching, and you might as well not waste the extra memory on buffering those responses as well.
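If you do want such a cache, a minimal sketch would be along these lines (the bincache zone name, the cache path, and the tomcats upstream are hypothetical):
http {
    proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=bincache:10m
                     max_size=1g inactive=60m;

    server {
        location /files/ {
            proxy_pass http://tomcats;
            # Serve repeat requests from the disk cache instead of regenerating:
            proxy_cache bincache;
            proxy_cache_valid 200 10m;
        }
    }
}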
A good operating system should already be capable of appropriately caching whatever gets written to disk, through its file-system buffer-cache functionality. (Wikipedia covers this under a somewhat strangely named article, as the "disk buffer" name was already taken by the HDD-hardware article.)
All in all, there's likely little need to duplicate buffering directly within Nginx. You might also take a look at varnish-cache for some additional ideas and inspiration about the subject of multi-level caching. The fact is, "good" operating systems are supposed to take care of many things that some folks mistakenly attempt to optimise through application-specific functionality.
If you don't do caching of responses, then you might as well ask yourself whether or not buffering is appropriate in the first place.
Realistically, buffering may come in useful to better protect your upstream servers from the Slowloris attack vector; however, if you let your Nginx have megabyte-sized buffers, then you're essentially exposing Nginx itself to consuming an unreasonable amount of resources while servicing clients with malicious intent.
If the responses are too large, you might want to look into optimising things at the response level, e.g. splitting some content into individual files, compressing at the file level, or compressing with gzip via the HTTP Content-Encoding header.
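E.g., a minimal gzip sketch (whether it helps depends entirely on how compressible those binary files are; already-compressed formats won't shrink):
# Compress responses on the fly; gzip_types controls which MIME types qualify.
gzip on;
gzip_comp_level 5;
gzip_min_length 1k;
gzip_types application/octet-stream;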
TL;DR: this is really a pretty broad question, and there are too many variables that require non-trivial investigation to come up with the "best" answer for any given situation.
