On one of our sites, we want to create a reverse proxy cache structure with a Drupal backend. Our structure is nginx + apache, and as we've tested on several sites, we don't want to do it with Boost (we have our reasons; it is not the topic of this question).
What we want is something similar to our nginx + apache reverse proxy cache, doing it all in nginx, but it seems I'm having no luck finding the right solution: all the pages I find cover nginx + Drupal + Boost.
Is there any proven solution that provides an nginx configuration to reverse proxy cache a Drupal backend without Boost?
Thank you in advance,
You can create a simple nginx reverse proxy cache like so:
http {
    proxy_cache_path /data/nginx/cache keys_zone=CACHE_NAME:10m max_size=500m;

    server {
        location / {
            proxy_pass http://localhost;
            proxy_set_header Host $host;
            proxy_cache CACHE_NAME;
            proxy_cache_valid 200 302 60m;
            proxy_cache_valid 404 10m;
        }
    }
}
Set a proxy_cache_path in your http block
Provide a cache name and shared memory zone size inside keys_zone
Set a max_size for your cache folder
In the example above, proxy_cache_valid will cache 200 and 302 responses for 60 minutes and 404 responses for 10 minutes.
Read the full documentation at http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache for advanced configurations.
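While testing, it can help to see whether a response was served from the cache. A minimal sketch building on the config above (the X-Cache-Status header name is my own choice; $upstream_cache_status is a standard nginx variable):

location / {
    proxy_pass http://localhost;
    proxy_set_header Host $host;
    proxy_cache CACHE_NAME;
    proxy_cache_valid 200 302 60m;
    proxy_cache_valid 404 10m;
    # Reports MISS, HIT, EXPIRED, STALE, etc. for each response
    add_header X-Cache-Status $upstream_cache_status;
}

A quick curl -I against the site should then show MISS on the first request and HIT on the second.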
Hope this can get you started.
Related
I want to set up an NGINX server which provides the following functionality:
1. When a request is made to NGINX for the page at /path/to/page, it fetches the page at /path/to/page from the upstream server.
2. If the upstream server is down or NGINX can't connect to it for some reason, NGINX returns a cached version of the page if it has one.
3. If the cached file is over 6 hours old, don't use it, just return a 502.
4. If the upstream server is available, never use the cache.
I have an NGINX config here which I think should work based on my understanding of the docs, but it doesn't, and I can't see why. The problem is with point (4): this NGINX server returns the cached version of the file even if the upstream server is online.
daemon off;
error_log /dev/stdout info;

events {
}

http {
    proxy_cache_path
        "/home/jack/Code/NGINX Caching/Codebase/cache"  # Cache path
        keys_zone=cache:10m  # Name of cache, max size for keys 10 megabytes
        levels=1:2           # Don't store all cached files in a single directory
        max_size=500m        # Max size of cache
        inactive=6h;         # Cached file deleted if not used within six hours

    proxy_cache_valid 6h;
    proxy_cache_key "$request_method$request_uri";
    access_log /dev/stdout;

    server {
        listen 8080;
        location ~ ^/(.+)$ {
            proxy_pass http://0.0.0.0:8000/$1;
            proxy_cache cache;
            proxy_cache_valid 6h;
            proxy_buffering on;
            proxy_cache_use_stale error timeout;
        }
    }
}
Replace the proxy_cache_path with a path to a directory on your machine, and run another webserver on your machine on port 8000. When I modify a file served by the server on port 8000, NGINX doesn't see the change until I erase the cache. The issue is with NGINX and not my client (Firefox): even if I turn off caching in the browser, NGINX returns a 200 with the old file contents.
Can you please check if these two directives might help you:
proxy_cache_revalidate:
http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_revalidate
and
proxy_cache_use_stale: http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_use_stale
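A sketch of how those two could look in your location block (an untested suggestion based on your config, not a verified fix):

location ~ ^/(.+)$ {
    proxy_pass http://0.0.0.0:8000/$1;
    proxy_cache cache;
    # Revalidate expired entries with a conditional GET
    # (If-Modified-Since) instead of a full fetch
    proxy_cache_revalidate on;
    # Serve a stale copy when the upstream errors out or times out
    proxy_cache_use_stale error timeout;
}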
There is a video from nginx.conf '17 online describing all the cool things you can achieve with caching: https://www.youtube.com/watch?v=xZrOjmAkFC8. Maybe this is also of interest to you.
So, it seems I misunderstood the NGINX proxy cache directives. The docs are quite confusing on this subject, so I'll lay it out point by point.
This official help page gives a decent overview of the various directives, but it makes no mention of something which, as it turns out, is a very important conceptual building block in understanding how NGINX caching works: the notion of a cached file being stale.
NGINX's default behavior is to always use the cache if it's there, rather than querying the upstream server. With the following minimal caching config, NGINX will query the upstream server the first time a page is accessed, and then use the cached version forever after that:
events {
}

http {
    proxy_cache_path
        /path/to/cache
        keys_zone=my_cache:10m;
    proxy_cache_key "$request_method$request_uri";

    server {
        listen 8080;
        location ~ ^/(.+)$ {
            proxy_pass http://0.0.0.0:8000/$1;
            proxy_cache my_cache;
        }
    }
}
You can use the proxy_cache_valid directive to tell NGINX when a cached file should be considered "stale". For example, if we set proxy_cache_valid 5m, then 5 minutes after a cache file is created NGINX will stop serving it and will query the upstream server again on the next request. If the upstream is down, NGINX will return a 502. However, during those five minutes NGINX will still use the cache even if the upstream server is available, so this is still not what we want.
NGINX has another directive, proxy_cache_use_stale, which gives NGINX conditions under which it may use cached files even though they're stale. We can combine the two to get a server which caches pages, makes them stale immediately (or almost immediately), and then only uses them if the upstream is down:
events {
}

http {
    proxy_cache_path
        /path/to/cache
        keys_zone=my_cache:10m;
    proxy_cache_key "$request_method$request_uri";

    server {
        listen 8080;
        location ~ ^/(.+)$ {
            proxy_pass http://0.0.0.0:8000/$1;
            proxy_cache my_cache;
            proxy_cache_valid 1s;
            proxy_cache_use_stale error timeout;
        }
    }
}
This config has almost the behavior we want, except that if the upstream server goes down for an extended period of time, NGINX will continue to use the cache indefinitely. As far as I know, there is no way to tell NGINX to totally invalidate/clear a cached file after a given amount of time. Normally that's what proxy_cache_valid is for, but we're already using that for a different purpose: making files stale after 1 second so that they're only used when the upstream is down. We would need some level beyond "stale" that means the file is completely invalidated, but I don't think that exists in NGINX.
So the simplest solution is just to clear the cache manually. It's sufficient to delete all files in the cache directory (or its subdirectories) which were last modified more than 6 hours ago, or whatever you want the expiry time to be. On a Linux system, you can run the following command every 5 minutes, for example:
find /path/to/cache -type f -mmin +360 -delete
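To run it every 5 minutes, a crontab entry could look like this (a sketch; 360 minutes matches the 6-hour expiry above):

# m h dom mon dow command
*/5 * * * * find /path/to/cache -type f -mmin +360 -delete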
All, I'm running into issues proxying a subdomain to a DO Space (I suspect that AWS/S3 would act the same way); in this case, trying to serve logos off the bucket, with the requested URL being logos.mysite.com/dev/logo1. Nginx should proxy_pass this to https://mybucket.dospace.com/logos/dev/logo1.png (and it will always be .png).
NGINX setup is currently:
server {
    server_name logos.mysite.com;

    location / {
        set $bucket "mybucket.dospace.com";
        proxy_pass https://$bucket/logos$request_uri.png;
        proxy_set_header Host $host;
        ... other proxy_set/hide settings ...
    }
}
The above is the simplest version of many rewrite/return/regex location attempts, but nothing works. In the example above (using Chrome), a redirect to https://mybucket.dospace.com/dev/logo1.png occurs, which is missing /logos in the path. Moreover, I don't want Chrome to redirect at all; it's more convenient for the user to see the original request from logos.mysite.com (if that's possible).
Has anyone been able to get a similar setup working?
To bypass the cache if the upstream is up (max-age 1) and use the cache if it is down (proxy_cache_use_stale), I created the following config:
proxy_cache_path /app/cache/ui levels=1:2 keys_zone=ui:10m max_size=1g inactive=30d;

server {
    ...
    location /app/ui/config.json {
        proxy_cache ui;
        proxy_cache_valid 1d;
        proxy_ignore_headers Expires;
        proxy_hide_header Expires;
        proxy_hide_header Cache-Control;
        add_header Cache-Control "max-age=1, public";
        proxy_cache_use_stale error timeout http_500 http_502 http_503 http_504;
        add_header X-Cache-Status $upstream_cache_status;
        add_header X-Cache-Date $upstream_http_date;
        proxy_pass http://app/config.json;
    }
}
But the cache is not used when the upstream is down, and the client only gets a 504 Gateway Timeout. I've already read the following articles:
https://nginx.org/ru/docs/http/ngx_http_proxy_module.html#proxy_cache_use_stale
How to configure NginX to serve Cached Content only when Backend is down (5xx Resp. Codes)?
https://serverfault.com/questions/752838/nginx-use-proxy-cache-if-backend-is-down
And it does not work as I expect. Any help is appreciated.
Let's discuss a really simple setup with two servers: one running apache2 serving a simple HTML page, and the other running nginx as a reverse proxy to the first.
http {
    [...]
    proxy_cache_path /var/lib/nginx/tmp/proxy levels=2:2 keys_zone=one:10m inactive=48h max_size=16g use_temp_path=off;

    upstream backend {
        server foo.com;
    }

    server {
        [...]
        location / {
            proxy_cache one;
            proxy_cache_valid 200 1s;
            proxy_cache_lock on;
            proxy_connect_timeout 1s;
            proxy_cache_use_stale error timeout updating http_502 http_503 http_504;
            proxy_pass http://backend/;
        }
    }
}
This setup works for me. The most important difference is proxy_cache_valid 200 1s; which means that only responses with HTTP code 200 will be cached, and they will only be valid for 1 second. This implies that the first request for a certain resource will be fetched from the backend and put in the cache. Any further requests for that same resource will be served from the cache for a full second. After that, the first request will go to the backend again, etc., etc.
The proxy_cache_use_stale is the important part in your scenario. It basically says in which cases nginx should still serve the cached version although the time specified by proxy_cache_valid has already passed. So here you have to decide in which cases you still want to serve from cache.
The directive's parameters are the same as for proxy_next_upstream.
You will need these:
error: in case the server is still up but not responding, or not responding correctly.
timeout: connecting to the server, or the request or the response, times out. This is also why you want to set proxy_connect_timeout to something low. The default is 60s, which is way too long for an end-user.
updating: there is already a request for new content on its way. (Not really needed, but better from a performance point of view.)
The http_xxx parameters are not going to do much for you, when that backend server is down you will never get a response with any of these codes anyway.
In my real-life case, however, the backend server is also nginx, which proxies to different ports on localhost. So when nginx is running fine but any of those backends is down, the parameters http_502, http_503 and http_504 are quite useful, as these are exactly the HTTP codes I will receive.
http_403, http_404 and http_500 I would not want to serve from cache. When a file is forbidden (403), no longer on the backend (404), or a script goes wrong (500), there is a reason for that. But that is my take on it.
This, like the other similar questions linked to, is an example of the XY Problem.
A user wants to do X, wrongly believes the solution is Y, but cannot do Y and so asks for help on how to do Y instead of actually asking about X. This invariably results in problems for those trying to give an answer.
In this case, the actual problem, X, appears to be that you would like to have a failover for your backend but would like to avoid spending money on a separate server instance, and want to know what options are available.
The idea of using a cache for this is not completely off, but you have to approach and set up the cache like a failover server, which means it has to be a totally separate and independent system from the backend. This rules out proxy_cache, which is intimately linked to the backend.
In your shoes, I would set up a memcached server and configure it to cache your stuff but not ordinarily serve your requests except on a 50x error.
There is a memcached module that comes with Nginx that can be compiled and used but it does not have a facility to add items to memcached. You will have to do this outside Nginx (usually in your backend application).
A guide to setting up memcached is easy to find with a web search. Once this is up and running, the following will work for you on the Nginx side:
server {
    location / {
        # You will need to add items to memcached yourself here
        proxy_pass http://backend;
        proxy_intercept_errors on;
        error_page 502 504 = @failover;
    }

    location @failover {
        # Assumes memcached is running on port 11211
        set $memcached_key "$uri?$args";
        memcached_pass host:11211;
    }
}
Far better than the limited standard memcached module is the 3rd-party memc module from OpenResty, which allows you to add stuff directly in Nginx.
OpenResty also has the very flexible lua-resty-memcached, which is actually the best option.
For both options, you will need to compile them into your Nginx and familiarise yourself with how to set them up. If you need help with this, ask a new question here with the OpenResty tag or try the OpenResty support system.
Summary
What you actually need is a failover server.
This has to be separate and independent of the backend.
You can use a caching system for this, but it cannot be proxy_cache if you cannot live with getting cached results for the minimum time of 1 second.
You will need to extend a typical Nginx installation to do this.
It is not working because the http_500, http_502, http_503 and http_504 codes are expected from the backend. In your case, the 504 is nginx's own code.
So you need to have the following:
proxy_connect_timeout 10s;
proxy_cache_use_stale ... timeout ...
or
proxy_cache_use_stale ... updating ...
or both.
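For illustration, a sketch of the location block from the question with these suggestions folded in (only proxy_connect_timeout and the extra use_stale parameters are new; the values are assumptions):

location /app/ui/config.json {
    proxy_cache ui;
    proxy_cache_valid 1d;
    # Fail the connection attempt quickly so the timeout condition applies
    proxy_connect_timeout 10s;
    proxy_ignore_headers Expires;
    proxy_hide_header Expires;
    proxy_hide_header Cache-Control;
    add_header Cache-Control "max-age=1, public";
    # timeout and updating cover the case where nginx generates the 504
    # itself instead of receiving it from the backend
    proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
    add_header X-Cache-Status $upstream_cache_status;
    add_header X-Cache-Date $upstream_http_date;
    proxy_pass http://app/config.json;
}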
I have three sites configured on my server using NGINX and the first two are working fine. One is a static site and one is running Rails (using Unicorn). I have attempted to mirror the NGINX/Unicorn configurations.
For the non-working site, I get "problem loading site" in my browser and absolutely nothing in my NGINX error logs (even at debug level) or my Unicorn log. I also get nothing when I attempt to cURL to the site.
I have double-checked DNS by pinging the domain name and am running out of ideas. I've also tried making this the default server and browsing by IP address.
Thoughts on how I should go about debugging? I would like to at least understand if NGINX is seeing these requests or not.
NGINX configuration:
upstream unicorn-signup {
    server unix:/home/signup/app/tmp/sockets/unicorn.sock;
}

server {
    listen 80;
    listen [::]:80;

    root /home/signup/app/current/public;
    server_name signup.quote2bill.com;

    # configure for Unicorn (NGINX acts as reverse proxy)
    location / {
        try_files $uri @unicorn;
    }

    location @unicorn {
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_redirect off;
        proxy_pass http://unicorn-signup;
        proxy_read_timeout 300s;
        proxy_send_timeout 300s;
    }
}
Fixed! It was the dreaded force_ssl flag in my production configuration. For future travelers, here is how I went about troubleshooting:
1. Went on a Costco run to clear my mind and buy huge quantities of stuff.
2. To determine whether it was a DNS, NGINX or Unicorn/Rails problem, I replaced my NGINX configuration with a very simple one (see the sketch after this list) and placed a simple index.html in my public root. This worked fine, which lets DNS off the hook (I could resolve the domain name at the web server).
3. I diffed the working and non-working NGINX configuration files for the nth time and made them as close as possible, but didn't find anything.
4. Then I noticed that when I was serving the simple index.html file in #2 above, the domain was not getting redirected to https://, but when I switched to my "normal" Unicorn/Rails version, I was always getting redirected.
5. I searched for Rails redirecting to SSL and remembered the force_ssl flag.
6. I checked my two projects and noticed the flag was not set in the working project, but set in the non-working one (smoking gun).
7. I changed it, committed, redeployed, reloaded the browser and it... didn't work (!) Fortunately, I had the good sense to clear the browser cache and try again, and it is all good now.
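The stripped-down config from step 2 looked roughly like this (a reconstruction from the paths above, not the exact file I used):

server {
    listen 80;
    listen [::]:80;
    server_name signup.quote2bill.com;

    # Serve index.html directly - no Unicorn, no proxying - to isolate
    # DNS and NGINX from the Rails stack
    root /home/signup/app/current/public;
    index index.html;
}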
Hope this helps someone.
My title isn't the best; my knowledge of web stuff is quite basic, sorry.
What I want to achieve
I have one box, fanbox, running nginx on Arch Linux that I use as the main entry point to my home LAN from the internet (namely from work, where I can only get out on ports 80 and 443) via its reverse proxy facility, using a changing domain name over which I have no control and which we will call home.net for now.
fanbox has its ports 80 and 443 mapped to home.net; that part was easy.
I have three webservers behind the firewall: web1.lan, web2.lan and web2ilo.lan. The first two have various applications (which may have the same name on different machines) that can be accessed directly on the LAN via standard URLs (the names are given as examples; I have no control over the content):
http://web1.lan/phpAdmin/
http://web1.lan/gallery/
http://web2.lan/phpAdmin/
http://web2.lan/dlna/
...and so on...
Now web2ilo.lan is a particular case: it's the out-of-band management web interface of the HP server web2.lan. That particular webserver offers only one application, so it can only be accessed via its root URL:
http://web2ilo/login.html
My goal is to access these via subpaths of home.net like this:
http://home.net/web1/phpAdmin/
http://home.net/web1/gallery/
http://home.net/web2/phpAdmin/
http://home.net/web2/dlna/
http://home.net/web2ilo/login.html
My problem
That nearly works, but the web applications tend to rewrite URLs, so that after I log in to, respectively:
http://home.net/web1/phpAdmin/login.php
http://home.net/web2ilo/login.html
the browser is redirected respectively to
http://home.net/phpAdmin/index.php
http://home.net/index.html
Note that the relative subpaths web1 and web2ilo have gone, which logically gives me a 404.
My config
So far I've searched a lot and tried many options in nginx without understanding too much of what I was doing. Here is the config that reproduces the problem. I've left SSL out for clarity.
server {
    listen 443 ssl;
    server_name localhost;

    # SSL stuff left out for clarity

    location / {
        root /usr/share/nginx/html;
        index index.html index.htm;
    }

    location /web1/ {
        proxy_set_header Host $host;
        proxy_redirect off;
        proxy_pass https://web1.lan/;
    }

    location /web2/ {
        proxy_set_header Host $host;
        proxy_redirect off;
        proxy_pass https://web2.lan/;
    }

    location /web2ilo/ {
        proxy_set_header Host $host;
        proxy_redirect off;
        proxy_pass https://web2ilo.lan/;
    }
}
After the first answers
After the first couple of answers (thanks!), I realise that my setup is far from common and that I may be heading for trouble all alone.
What would be a better way to access the webservers behind the firewall without touching frontend ports or the domain/hostname?
You may wish to consider setting proxy_redirect to let nginx know that it should modify the "backend" server response headers (Location and Refresh) to the appropriate front-end URLs. You can either use the default setting to let nginx calculate the appropriate values from your location and proxy_pass directives, or explicitly specify the mappings, like below:
proxy_redirect https://web1.lan/ /web1/;
See: http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_redirect
Note: this only affects response headers, not any links in HTML content or any JavaScript.
If you experience problems with links in content or JavaScript, you can either modify the content on the backend servers (which you've indicated may not be possible), or adjust your proxy solution so that the front-end paths are the same as the back-end ones (e.g., rather than http://frontend/web1/phpAdmin you simply have http://frontend/phpAdmin). This would entail adding location directives for each application, e.g.:
location /phpAdmin/ {
    proxy_set_header Host $host;
    proxy_redirect off;
    proxy_pass https://web1.lan/phpAdmin/;
}
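If neither of those is possible, rewriting links inside the HTML itself at the proxy is sometimes workable. This is not part of the answer above, just a sketch using nginx's standard sub_filter directive (requires ngx_http_sub_module; the link patterns are assumptions about how the apps generate URLs):

location /web1/ {
    proxy_set_header Host $host;
    proxy_pass https://web1.lan/;
    # sub_filter only works on uncompressed responses
    proxy_set_header Accept-Encoding "";
    # Rewrite root-relative links so they stay under /web1/
    sub_filter 'href="/' 'href="/web1/';
    sub_filter 'src="/' 'src="/web1/';
    sub_filter_once off;
}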
Here is a tested example.
In my docker-compose.yml I use the demo image whoami to test:
whoareyou:
  image: "containous/whoami"
  restart: unless-stopped
  networks:
    - default
  labels:
    - "traefik.enable=true"
    - "traefik.http.routers.whoareyou.rule=Path(`/whoareyou`)"
    - "traefik.http.routers.whoareyou.entrypoints=web"
    - "traefik.http.routers.whoareyou.middlewares=https-redirect@file"
    - "traefik.http.routers.whoareyou-secure.rule=Path(`/whoareyou`)"
    - "traefik.http.routers.whoareyou-secure.entrypoints=web-secure"
    - "traefik.http.routers.whoareyou-secure.tls=true"
In my config.yaml I have my https redirect:
http:
  middlewares:
    https-redirect:
      redirectScheme:
        scheme: https
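For completeness: the web and web-secure entrypoints referenced by the labels must be declared in Traefik's static configuration, and a file provider must point at config.yaml for the middleware to be found. A sketch of that static config (the file name and ports are assumptions):

# traefik.yml (static configuration)
entryPoints:
  web:
    address: ":80"
  web-secure:
    address: ":443"
providers:
  docker: {}
  # Needed so the https-redirect middleware defined in config.yaml is picked up
  file:
    filename: "/etc/traefik/config.yaml"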
I did not find an answer to my question and decided to try a different approach:
I have now containerized most of my servers.
I use Traefik (https://containo.us/traefik/) as my "fanbox" (= reverse proxy), as it covers my needs and also handles the SSL certificate stuff in a quite easy fashion.
Thanks all for your suggestions.