I use nginx as a proxy server and I just set proxy_intercept_errors on; along with the error_page directive: error_page 400 /400.html; location = /400.html { root /path/to/error; }.
The backend server, which is Tomcat (servlet), sometimes calls sendError, e.g. HttpServletResponse.sendError(404);. That request may come back to nginx and be redirected to 400.html.
In this situation I need to handle the internal redirect to 400.
My problem is that I use a Lua script which checks some things on every incoming request, so I need to tell my Lua code to skip the check when an internal request comes in.
Is it possible to identify an internal request?
ngx.req.is_internal(), documented below, is the answer:
https://github.com/openresty/lua-nginx-module#ngxreqis_internal
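For illustration, a minimal sketch of how that check could be wired into an access-phase handler (the location, the upstream name, and the placement of the existing checks are assumptions):

location / {
    access_by_lua_block {
        -- Internal redirects (e.g. error_page sending the request to
        -- /400.html) re-enter the location tree; skip the custom checks.
        if ngx.req.is_internal() then
            return
        end
        -- ... the usual checks on external client requests go here ...
    }
    proxy_pass http://backend;  # hypothetical upstream
}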
Goal
Always serve content from a CDN EDGE cache, regardless of how stale. Refresh it in the background when possible.
Problem
I have a NextJS app that renders some React components server-side and delivers them to the client. For this discussion, let's just consider my homepage, which is unauthenticated and the same for everyone.
What I'd like is for the server rendered homepage to be cached at a CDN's EDGE nodes and served to end clients from that cache as often as possible, or always.
From what I've read, CDNs (like Fastly) which properly support cache related header settings like Surrogate-Control and Cache-Control: stale-while-revalidate should be able to do this, but in practice, I'm not seeing this working like I'd expect. I'm seeing either:
requests miss the cache and return to the origin when a prior request should have warmed it
requests are served from cache, but never get updated when the origin publishes new content
Example
Consider the following timeline:
[T0] - Visitor1 requests www.mysite.com - The CDN cache is completely cold, so the request must go back to my origin (AWS Lambda) and recompute the homepage. A response is returned with the headers Surrogate-Control: max-age=100 and Cache-Control: public, no-store, must-revalidate.
Visitor1 is then served the homepage, but they had to wait a whopping 5 seconds! YUCK! May no other visitor ever have to suffer the same fate.
[T50] - Visitor2 requests www.mysite.com - The CDN cache contains my document and returns it to the visitor immediately. They only had to wait 40ms! Awesome. In the background, the CDN refetches the latest version of the homepage from my origin. Turns out it hasn't changed.
[T80] - www.mysite.com publishes new content to the homepage, making any cached content truly stale. V2 of the site is now live!
[T110] - Visitor1 returns to www.mysite.com - From the CDN's perspective, it's only been 60s since Visitor2's request, which means the background refresh initiated by Visitor2 should have resulted in a <100s-stale copy of the homepage in the cache (albeit V1, not V2, of the homepage). Visitor1 is served the 60s-stale V1 homepage from cache. A much better experience for Visitor1 this time!
This request initiates a background refresh of the stale content in the CDN cache, and the origin this time returns V2 of the website (which was published 30s ago).
[T160] - Visitor3 visits www.mysite.com - Despite being a new visitor, the CDN cache is now fresh from Visitor1's most recent trigger of a background refresh. Visitor3 is served a cached V2 homepage.
...
As long as at least 1 visitor comes to my site every 100s (because max-age=100), no visitor will ever suffer the wait time of a full roundtrip to my origin.
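For concreteness, the origin response in the timeline above would look roughly like this (the status line and Content-Type are assumed; the caching headers are the ones listed at [T0]):

HTTP/1.1 200 OK
Content-Type: text/html
Surrogate-Control: max-age=100
Cache-Control: public, no-store, must-revalidate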
Questions
1. Is this a reasonable ask of a modern CDN? I can't imagine this is more taxing than always returning to the origin (no CDN cache), but I've struggled to find documentation from any CDN provider about the right way to do this. I'm working with Fastly now, but am willing to try any others as well (I tried Cloudflare first, but read that they don't support stale-while-revalidate)
2. What are the right headers to do this with? (assuming the CDN provider supports them)
I've played around with both Surrogate-Control: maxage=<X> and Cache-Control: public, s-maxage=<X>, stale-while-revalidate in Fastly and Cloudflare, but neither seems to do this correctly (requests well within the maxage timeframe don't pick up changes on the origin until there is a cache miss).
3. If this isn't supported, are there API calls that could allow me to PUSH content updates to my CDN's cache layer, effectively saying "Hey I just published new content for this cache key. Here it is!"
I could use a Cloudflare Worker to implement this kind of caching myself using their KV store, but I thought I'd do a little more research before implementing a code solution to a problem that seems to be pretty common.
Thanks in advance!
I've been deploying a similar application recently. I ended up running a customised nginx instance in front of the Next.js server. The config below does the following:
Ignore cache headers from the upstream server.
I wanted to cache markup and JSON, but I didn't want to send Cache-Control headers to the client. You could tweak this config to use the values in Cache-Control from Next.js, and then drop that header before responding to the client if the MIME type is text/html or application/json.
Consider all responses valid for 10 minutes.
Remove cached responses after 30 days.
Use up to 800 MB for the cache.
After serving a stale response, attempt to fetch a new response from the upstream server.
This isn't perfect, but it handles the important stale-while-revalidate behaviour. You could run a CDN over this as well if you want the benefit of global propagation.
Warning: This hasn't been extensively tested. I'm not confident that all the behaviour around error pages and response codes is right.
# Available in NGINX Plus
# map $request_method $request_method_is_purge {
#     PURGE   1;
#     default 0;
# }

proxy_cache_path
    /nginx/cache
    inactive=30d
    max_size=800m
    keys_zone=cache_zone:10m;

server {
    listen 80 default_server;
    listen [::]:80 default_server;

    # Basic
    root /nginx;
    index index.html;
    try_files $uri $uri/ =404;
    access_log off;
    log_not_found off;

    # Redirect server error pages to the static page /error.html
    error_page 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 500 501 502 503 504 505 /error.html;

    # Catch the error page route to prevent it being proxied.
    location /error.html {}

    location / {
        # Let the backend server know the frontend hostname, client IP, and
        # client–edge protocol.
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_set_header X-Forwarded-Proto $scheme;

        # This header is a standardised replacement for the above two. This
        # line naively ignores any `Forwarded` header passed from the client
        # (which could be another proxy), and instead creates a new value
        # equivalent to the two above.
        proxy_set_header Forwarded "for=$remote_addr;proto=$scheme";

        # Use HTTP 1.1, as 1.0 is the default.
        proxy_http_version 1.1;

        # Available in NGINX Plus
        # proxy_cache_purge $request_method_is_purge;

        # Enable stale-while-revalidate and stale-if-error caching
        proxy_cache_background_update on;
        proxy_cache cache_zone;
        proxy_cache_lock on;
        proxy_cache_lock_age 30s;
        proxy_cache_lock_timeout 30s;
        proxy_cache_use_stale
            error
            timeout
            invalid_header
            updating
            http_500
            http_502
            http_503
            http_504;
        proxy_ignore_headers X-Accel-Expires Expires Cache-Control Vary;
        proxy_cache_valid 10m;

        # Prevent 502 error
        proxy_buffers 8 32k;
        proxy_buffer_size 64k;
        proxy_read_timeout 3600;

        proxy_pass "https://example.com";
    }
}
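If you want to verify the stale-while-revalidate behaviour while testing, one option (an addition, not part of the config above) is to expose nginx's cache status to the client from inside the location / block:

# Reports MISS, HIT, EXPIRED, STALE, UPDATING, etc. for each response.
add_header X-Cache-Status $upstream_cache_status;

A response that arrives quickly with STALE or UPDATING is the signature of the serve-stale-then-refresh-in-background path working.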
Regarding question 3.
You can use new names for every file where you made changes, to avoid serving unwanted cached copies, or use the Cloudflare API to purge the cache as you suggest.
It's possible; you can find more information here:
https://api.cloudflare.com/#zone-purge-files-by-cache-tags-or-host
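For illustration, here's a sketch of a purge-by-URL call against that API (the zone ID and API token are placeholders; the tag- and host-based purges described at the link go through the same endpoint):

POST /client/v4/zones/<zone_id>/purge_cache HTTP/1.1
Host: api.cloudflare.com
Authorization: Bearer <api_token>
Content-Type: application/json

{"files": ["https://www.mysite.com/"]}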
I have a web application that wants to access files from a third party site without CORS enabled. The requests can be to an arbitrary domain with arbitrary parameters. I'm sending a request to my domain containing the target encoded as a GET parameter, i.e.
GET https://www.example.com/proxy/?url=http%3A%2F%2Fnginx.org%2Fen%2Fdocs%2Fhttp%2Fngx_http_proxy_module.html
Then in Nginx I do
location /proxy/ {
    resolver 8.8.8.8;
    set_unescape_uri $dst $arg_url;
    proxy_pass $dst;
}
This works for single files, but the target server sometimes returns a Location header, which I want to intercept and modify for the client to retry.
Basically I would like to escape $sent_http_location, append it to https://www.example.com/proxy/?url= and pass that back to the browser to retry.
I've tried doing
set_escape_uri $tmp $sent_http_location;
proxy_redirect $sent_http_header /pass/?v=$tmp;
but this doesn't work. I've also tried saving the Location header, then ignoring the incoming header with
proxy_hide_header
and replacing it with my own
proxy_set_header
but hiding the header causes me to lose the variable that was saving it.
How can I configure Nginx to handle these redirects so that an encoded URL is returned to the user when the proxied site redirects?
There are several problems with your unsuccessful approach:
proxy_set_header sets a header that goes to the upstream server, not to the client. So even if $sent_http_location hadn't been empty, your configuration couldn't possibly work as you wanted it to.
$sent_http_<header> variables point to exactly the same area of memory as the response headers that will be sent to the client. So when proxy_hide_header takes effect, the specified header is removed from memory along with the value of the corresponding $sent_http_<header>.
set_escape_uri works at a very early stage of request processing, well before proxy_pass is called and the Location header is returned from the upstream server. So it will always process the variable $sent_http_location while it is still empty, and the result will also always be empty.
The last problem is the most serious. The only way to make set_escape_uri work after proxy_pass is to force Nginx to leave the current location and start the processing all over again. This can be done with the error_page trick:
location /proxy/ {
    resolver 8.8.8.8;
    set_unescape_uri $dst $arg_url;
    proxy_pass $dst;
    proxy_intercept_errors on;
    error_page 301 = @rewrite_301;
}

location @rewrite_301 {
    set_escape_uri $location $upstream_http_location;
    return 301 /pass/?v=$location;
}
Note the use of $upstream_http_location instead of $sent_http_location. When Nginx leaves the context of the location, it assumes that the request will be proxied to another upstream, or processed in some other way, and so it clears the headers received from the last proxy_pass to make room for new response headers.
Unlike the $sent_http_<header> variables, which represent response headers that will be sent to the client, the $upstream_http_<header> variables represent response headers that were received from the upstream. Because of that, they are only replaced with new values when the request is proxied to another upstream server. So, once set, these variables can be used at any moment; they will not be cleared.
I have a scenario where the server needs to make an authorization request before the actual request, so one request is served by two different services.
The Nginx location has to be handled by Auth-Service first, and if the response status is 200 OK, the request should be forwarded to Feature-Service. Otherwise, if the response status is 401, that status should be returned to the front end.
upstream auth_service {
    server localhost:8180;
}

upstream feature_service {
    server localhost:8080;
}

location /authAndDo {
    # suggest here
}
A code snippet in njs (nginScript) would also be OK.
Specifically for this purpose, there is http://nginx.org/r/auth_request, provided by the module documented at http://nginx.org/docs/http/ngx_http_auth_request_module.html (not built by default).
It lets you put authentication, through a subrequest, into any location you want, effectively separating the authentication from the actual resource.
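For illustration, a minimal sketch using the two upstreams from the question (the internal /auth_check path, and the URI your Auth-Service actually checks, are assumptions):

location /authAndDo {
    # The subrequest must return 2xx for the request to proceed;
    # a 401 or 403 from the subrequest is returned to the client as-is.
    auth_request /auth_check;
    proxy_pass http://feature_service;
}

location = /auth_check {
    internal;
    # Forward the check to the Auth-Service.
    proxy_pass http://auth_service;
    # The subrequest should carry no body; strip it and pass the original URI.
    proxy_pass_request_body off;
    proxy_set_header Content-Length "";
    proxy_set_header X-Original-URI $request_uri;
}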
In general, this is not possible from the web server alone. A 401 is a response delivered to the front end along with an HTTP WWW-Authenticate response header. Develop the web application according to your needs, or edit the 401 error file. HTTP 401 is specified in an RFC, and users and browsers are expected to understand the message. The Nginx documentation describes how a 401 is handled.
Nginx community edition's auth_request only lets a request proceed if the subrequest returns HTTP 200; on a 401 it will not redirect anywhere other than the 401 response by default, and other response headers from the subrequest are not processed, to protect the application and its users. Nginx community edition doesn't even support all the features of HTTP/2, and it can get worse.
The Apache2 web server has full HTTP/2 support and a custom 401 location in its auth module, but only a few browsers handle that perfectly; others fail to load the page. People have asked before, on various Stack Exchange sites, how to make this work in Apache2 for all browsers.
The most you can do in Nginx is redirect to a custom 401 page:
error_page 401 /401.html;

location ~ (401.html)$ {
    alias /usr/share/nginx/html/$1;
}
Another way may be to use a reverse proxy with another server in front, as people have discussed on GitHub. I can't guarantee the page will load in every case.
Is there any way I can get nginx to not forward a specific request header to uwsgi?
I want to enable nginx basic auth, but if the Authorization header gets forwarded to my app it breaks things (for reasons I won't go into). If it were a simple proxy_pass I would be able to do proxy_set_header Authorization "";, but I don't think this works with uwsgi_pass, and there's no equivalent uwsgi_set_header as far as I can see.
Thanks.
Try the hide-header and ignore-header directives:
uwsgi_hide_header

Syntax: uwsgi_hide_header field;
Default: —
Context: http, server, location

By default, nginx does not pass the header fields “Status” and “X-Accel-...” from the response of a uwsgi server to a client. The uwsgi_hide_header directive sets additional fields that will not be passed. If, on the contrary, the passing of fields needs to be permitted, the uwsgi_pass_header directive can be used.
uwsgi_ignore_headers

Syntax: uwsgi_ignore_headers field ...;
Default: —
Context: http, server, location

Disables processing of certain response header fields from the uwsgi server. The following fields can be ignored: “X-Accel-Redirect”, “X-Accel-Expires”, “X-Accel-Limit-Rate” (1.1.6), “X-Accel-Buffering” (1.1.6), “X-Accel-Charset” (1.1.6), “Expires”, “Cache-Control”, “Set-Cookie” (0.8.44), and “Vary” (1.7.7).

If not disabled, processing of these header fields has the following effect: “X-Accel-Expires”, “Expires”, “Cache-Control”, “Set-Cookie”, and “Vary” set the parameters of response caching; “X-Accel-Redirect” performs an internal redirect to the specified URI; “X-Accel-Limit-Rate” sets the rate limit for transmission of a response to a client; “X-Accel-Buffering” enables or disables buffering of a response; “X-Accel-Charset” sets the desired charset of a response.
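For example, a minimal sketch of uwsgi_hide_header in context (the header name here is hypothetical). Note that both directives operate on response headers coming back from the uwsgi server, not on request headers going to it:

location / {
    include uwsgi_params;
    # Strip a response header from the app before it reaches the client.
    uwsgi_hide_header X-Internal-Debug;
    uwsgi_pass 127.0.0.1:3031;
}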
It's probably too late for you, but for anyone who has the same problem, this answer provides a valid solution.
In this case the Authorization header could be passed by using the following directive:
uwsgi_param HTTP_Authorization "";
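For context, a sketch of how that line might sit alongside basic auth (the socket path is a placeholder, and the param name is copied from the answer above):

location / {
    include uwsgi_params;
    auth_basic "Restricted";
    auth_basic_user_file /etc/nginx/.htpasswd;
    # Blank out the forwarded Authorization header so the basic-auth
    # credentials used by nginx never reach the application.
    uwsgi_param HTTP_Authorization "";
    uwsgi_pass unix:/tmp/app.sock;
}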