nginx reverse proxy treats request urls differently - emr

We're using nginx to provide a security layer in front of an AWS Presto cluster. We registered an SSL certificate for nginx-presto-cluster.our-domain.com.
Requests for Presto are passed through nginx with Basic authentication.
An SQL query for Presto results in multiple sequential requests to the server, to fetch query results.
We created an nginx.conf that looks like this:
location / {
auth_basic $auth;
auth_basic_user_file /etc/nginx/.htpasswd;
sub_filter_types *;
sub_filter_once off;
sub_filter 'http://localhost:8889/' 'https://presto.nginx-presto-cluster.our-domain.com/';
proxy_pass http://localhost:8889/;
}
Presto's responses contain a nextUri to fetch results. The sub_filter rewrites these URIs from localhost:8889 to our secure domain, where they are passed through nginx again.
The problem:
The first response has a body that looks exactly as desired:
{
"id":"20171123_104423_00092_u7hmr"
, ...
,"nextUri":"https://presto.nginx-presto-cluster.our-domain.com/v1/statement/20171123_104423_00092_u7hmr/1"
, ...
}
The response to the second request, however, looks like:
{
"id":"20171123_105250_00097_u7hmr"
, ...
, "nextUri":"http://localhost:8889/v1/statement/20171123_105250_00097_u7hmr/2"
, ...
}
We would have expected the rewrite to work the same way every time.
Can you help us?

We resolved the issue by adding
proxy_set_header Accept-Encoding "";
into the config snippet above.
The reason is that the upstream may compress its responses if the request allows it.
sub_filter cannot match strings in a compressed body, so the replacement is not applied.
Sending an empty Accept-Encoding header to the upstream prevents this compression.
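For reference, a sketch of the location block from the question with the fix applied. Everything except the proxy_set_header line is copied from the question; the comment is ours, and the placement of the new line within the block is just one reasonable choice.

```nginx
location / {
    auth_basic $auth;
    auth_basic_user_file /etc/nginx/.htpasswd;

    # Ask the upstream for an uncompressed body so sub_filter sees plain text.
    proxy_set_header Accept-Encoding "";

    sub_filter_types *;
    sub_filter_once off;
    sub_filter 'http://localhost:8889/' 'https://presto.nginx-presto-cluster.our-domain.com/';

    proxy_pass http://localhost:8889/;
}
```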

Related

Nginx reverse proxy prevent changing host for other locations

I’ve been trying to get this done for a couple of days and I can’t. I have two web apps, one production and one testing, let’s say https://production.app and https://testing.app. I need to make a reverse proxy under https://production.app/location that points to https://testing.app/location (at the moment I only need one functionality from the testing env in prod). The configuration I created does proxy this exact location, but that functionality also loads resources from the /static directory, resulting in requests to https://production.app/static/xyz.js instead of https://testing.app/static/xyz.js, and /static can’t be proxied. Is there a way to change headers only in this proxied traffic so that it’s https://testing.app/static (and of course any other locations)?
Below is my current config (only directives regarding proxy):
server {
listen 443 ssl;
server_name production.app;
root /var/www/production.app;
location /location/ {
proxy_pass https://testing.app/location/;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
Have a good day :)
Your question title isn't well written. It has nothing to do with "changing headers", and nginx location, while essentially tied to request processing, isn't the right term here either. It should be something like "how to serve the assets requested by a proxied page in a different way than the others".
Ok, here we go. First of all, do you really understand what exactly the following config does?
location /location/ {
proxy_pass https://testing.app/location/;
}
The URI prefix /location/ specified after the https://testing.app upstream name has the following effect:
The /location/ prefix specified in the location directive is stripped from the processed URI.
The processed URI is then prepended with the /location/ prefix specified in the proxy_pass directive before being passed to the upstream.
That is, if these prefixes are equal, the following configuration will do the same while eliminating those steps (therefore processing the request slightly more efficiently):
location /location/ {
proxy_pass https://testing.app;
}
Understanding this is a key concept to the trick we're going to use.
Since all the requests for static assets made from the /location/ page will have the HTTP Referer header equal to https://production.app/location/ (unless you set no-referrer as your page's referrer policy, in which case the whole trick won't work until you change it), we can rewrite a request URI that matches some conditions (say, starts with the /static/ or /img/ prefix), redirecting it to a special internal location. The following if block does that (it should be placed at the server configuration level):
if ($http_referer = https://production.app/location/) {
rewrite ^/static/ /internal$uri;
rewrite ^/img/ /internal$uri;
...
}
We can use any prefix; /internal is used here only as an example. However, the chosen prefix should not interfere with your existing routes. Next, we will define that special location using the internal keyword:
location /internal/ {
internal;
proxy_pass https://testing.app/;
}
The trailing slash after the https://testing.app upstream name here is the essential part. It returns the proxied URI to its original state, removing the /internal/ prefix added by the rewrite directive earlier and replacing it with a single slash.
You can use this trick for more than one page using regex pattern to match the Referer header value, e.g.
if ($http_referer ~ ^https?://production\.app/(page1|page2|page3)/) {
...
}
You should not try this for anything but static assets, or it can break the app routing mechanism. This is also only a workaround that should be used for testing purposes, not something I can recommend for production use on a long-term basis.
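Putting the pieces together, the whole server block might look like the sketch below. It only combines the snippets above with the server block from the question; certificate directives and everything else not related to proxying are omitted, as in the question, and the proxy_set_header lines are left out to keep the sketch short.

```nginx
server {
    listen 443 ssl;
    server_name production.app;
    root /var/www/production.app;
    # ssl_certificate and other unrelated directives omitted, as in the question.

    # Asset requests issued by the proxied page carry its URL in the Referer header.
    if ($http_referer = https://production.app/location/) {
        rewrite ^/static/ /internal$uri;
        rewrite ^/img/ /internal$uri;
    }

    # The proxied page itself; the URI is passed to the upstream unchanged.
    location /location/ {
        proxy_pass https://testing.app;
    }

    # Reachable only through the rewrites above, never directly by a client.
    location /internal/ {
        internal;
        # The trailing slash replaces the /internal/ prefix with a single slash.
        proxy_pass https://testing.app/;
    }
}
```

With this in place, a request for /static/xyz.js whose Referer is the proxied page is rewritten to /internal/static/xyz.js and reaches the upstream as /static/xyz.js.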
BTW, are you sure you really need that
proxy_set_header Host $host;
line in your location? Do you really understand what it means, or are you just copy-pasting it from some other configuration? Check this answer so you are not surprised if something goes wrong.

DO Spaces [or S3] + Nginx with subdomain

All, I'm running into issues proxying a subdomain to a DO Space (I suspect that AWS/S3 would act the same way); in this case, I'm trying to serve logos off the bucket, with the requested URL being logos.mysite.com/dev/logo1. Nginx should proxy_pass this to https://mybucket.dospace.com/logos/dev/logo1.png (and it will always be .png).
NGINX setup is currently:
server {
server_name logos.mysite.com;
location / {
set $bucket "mybucket.dospace.com";
proxy_pass https://$bucket/logos$request_uri.png;
proxy_set_header Host $host;
... other proxy_set/hide settings ...
}
}
The above is the simplest version of many rewrite/return/regex location attempts, but nothing works. In the example above (using Chrome), a redirect to https://mybucket.dospace.com/dev/logo1.png occurs, which is missing /logos in the path. Moreover, I don't want Chrome to redirect at all: it's more convenient for the user to see the original request from logos.mysite.com (if that's possible).
Was anyone able to execute on a similar setup?

Redirect/Rewrite (in nginx) Requests to Subdirectory to S3-Compatible Bucket

I am attempting to get nginx to redirect/rewrite requests to a specific subdirectory so that they are served by an S3-compatible bucket instead of the server. Here is my current server block:
{snip} (See infra.)
Despite fiddling with this for some time now, I've only been able to get it to return 404s.
Additional Information
https://omnifora.com/t/redirect-rewrite-in-nginx-requests-to-subdirectory-to-s3-compatible-bucket/402
Attempted Solutions So Far
rewrite
rewrite ^/security-now/(.*) $scheme://s3.us-west-1.wasabisys.com/bits-podcasts/security-now/$1;
return
return 302 $scheme://s3.us-west-1.wasabisys.com/bits-podcasts/security-now/$1;
proxy_pass
proxy_set_header Host s3.us-west-1.wasabisys.com;
proxy_pass $scheme://s3.us-west-1.wasabisys.com/bits-podcasts/security-now/$1;
You can't use rewrite for cross-domain redirections. For this case you must use proxy_pass, for example:
location ~ ^/directory1/(.*) {
proxy_set_header Host s3.us-west-1.wasabisys.com;
proxy_pass $scheme://s3.us-west-1.wasabisys.com/target-bucket/security-now/$1;
}
Note that if you specify your server by domain name instead of an IP address, you'll need to add a resolver directive to your server configuration block (this applies here because proxy_pass contains variables, so the name is resolved at request time), for example:
server {
...
resolver 8.8.8.8;
...
}
Update.
It seems I was wrong in stating that you can't use rewrite for cross-domain redirections. You can, but in this case your user gets an HTTP redirect (302 by default, 301 with the permanent flag) instead of "transparent" content delivery. Maybe you got the 404 error because you missed a $ sign before the scheme variable?
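To make the difference concrete, here is a sketch of both variants for the /security-now/ path from the question. The bucket path is taken from the question's own attempts and the resolver address from the example above; which variant fits depends on whether the browser should see the bucket URL. Variant B is commented out because only one of the two should be active.

```nginx
server {
    # Other directives omitted.

    # Needed because proxy_pass below contains variables.
    resolver 8.8.8.8;

    # Variant A: transparent proxying, the browser keeps the original URL.
    location ~ ^/security-now/(.*) {
        proxy_set_header Host s3.us-west-1.wasabisys.com;
        proxy_pass $scheme://s3.us-west-1.wasabisys.com/bits-podcasts/security-now/$1;
    }

    # Variant B: send the client to the bucket with a redirect instead.
    # location ~ ^/security-now/(.*) {
    #     return 302 $scheme://s3.us-west-1.wasabisys.com/bits-podcasts/security-now/$1;
    # }
}
```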

Nginx: Punching a Hole Through My Cache Not Working as Expected

I have a single-page app and am trying to use the query parameter nocache=true to bypass the Nginx cache for the first response (HTML file) and ALL subsequent requests initiated by it (to get CSS, JS, etc).
According to this, I can bypass my cache using the query parameter but it is not working as expected.
Steps to reproduce the issue:
Use this minified generic configuration:
http {
...
proxy_cache_path /var/temp/ levels=1:2 keys_zone=STATIC:10m inactive=24h max_size=1g;
server {
...
location / {
# Using angularjs.org as an example
proxy_pass https://angularjs.org;
proxy_set_header Host angularjs.org;
proxy_cache STATIC;
proxy_cache_valid 200 10m;
proxy_cache_bypass $arg_nocache;
add_header X-Cache-Status $upstream_cache_status always;
}
}
}
Expected:
The response header of the requests "http://servername" and "http://servername/css/bootstrap" (or any other subsequent requests initiated by http://servername?nocache=true) to bypass the cache, i.e. contain
"X-Cache-Status: BYPASS".
Actual:
The response header of "http://servername" contains "X-Cache-Status: BYPASS" but "http://servername/css/bootstrap" does not, instead the value of "X-Cache-Status" is HIT/MISS/etc depending on the cache status.
Am I using the proxy_cache_bypass in a wrong way or do I need to do more to achieve the expected behavior?
Thanks!
I was able to solve this by using cookie_nocache.
Update the directive proxy_cache_bypass to:
proxy_cache_bypass $cookie_nocache;
If you need to bypass the cache, set a cookie named "nocache" to true (any value that is neither empty nor 0 will work). Since the browser sends its cookies with subsequent requests, the bypass applies to them as well.
To quickly test this, open the console and add the cookie like this.
document.cookie="nocache=true"
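proxy_cache_bypass accepts several parameters and bypasses the cache when at least one of them is neither empty nor "0", so the query argument and the cookie can be kept side by side. A sketch based on the location block from the question; combining the two is our suggestion, not part of the original fix:

```nginx
location / {
    # Using angularjs.org as an example, as in the question
    proxy_pass https://angularjs.org;
    proxy_set_header Host angularjs.org;
    proxy_cache STATIC;
    proxy_cache_valid 200 10m;

    # Bypass for ?nocache=true on a single request, or for every request
    # from a browser that carries the "nocache" cookie.
    proxy_cache_bypass $arg_nocache $cookie_nocache;

    add_header X-Cache-Status $upstream_cache_status always;
}
```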

Encode and decode pathname in nginx

Normally files can be accessed at:
http://example.com/cats/cat1.zip
I want to encode/encrypt the pathname (/cats/cat1.zip) so that the link is not normally accessible but accessible after the pathname is encrypted/encoded:
http://example.com/Y2F0cy9jYXQxLnppcAo=
I'm using base64 encoding above for simplicity but would prefer encryption. How do I go about doing this? Do I have to write a custom module?
You can use an Nginx rewrite rule to rewrite the URL (from encoded to unencoded). To apply your decoding logic you can use a custom function (I did it with the perl module).
Could be something like this:
http {
...
perl_modules perl/lib;
...
perl_set $uri_decode 'sub {
my $r = shift;
my $uri = $r->uri;
$uri = perl_magic_to_decode_the_url;
return $uri;
}';
...
server {
...
location /your-protected-urls-regex {
rewrite ^(.*)$ $scheme://$host$uri_decode;
}
}
}
If your only concern is limiting access to certain URLs you may take a look at this post on Securing URLs with the Secure Link Module in Nginx.
It provides fairly simple method for securing your files — the most basic and simple way to encrypt your URLs is by using the secure_link_secret directive:
server {
listen 80;
server_name example.com;
location /cats {
secure_link_secret yoursecretkey;
if ($secure_link = "") { return 403; }
rewrite ^ /secure/$secure_link;
}
location /secure {
internal;
root /path/to/secret/files;
}
}
The URL to access cat1.zip file will be http://example.com/cats/80e2dfecb5f54513ad4e2e6217d36fd4/cat1.zip where 80e2dfecb5f54513ad4e2e6217d36fd4 is the MD5 hash computed on a text string that concatenates two elements:
The part of the URL that follows the hash, in our case cat1.zip
The parameter to the secure_link_secret directive, in this case yoursecretkey
The above example also assumes the files accessible via the encrypted URLs are stored at /path/to/secret/files/secure directory.
Additionally, there is a more flexible, but also more complex method, for securing URLs with ngx_http_secure_link_module module by using the secure_link and secure_link_md5 directives, to limit the URL access by IP address, define expiration time of the URLs etc.
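A minimal sketch of that second method, closely following the example in the ngx_http_secure_link_module documentation. The argument names md5 and expires, and the reuse of yoursecretkey and the /cats/ path, are choices made for this illustration only.

```nginx
location /cats/ {
    # Hash and expiry time come from the query string, e.g.
    # /cats/cat1.zip?md5=<hash>&expires=<unix time>
    secure_link $arg_md5,$arg_expires;

    # <hash> must be the base64url-encoded MD5 of this expression,
    # which ties the link to the expiry time, the URI and the client IP.
    secure_link_md5 "$secure_link_expires$uri$remote_addr yoursecretkey";

    # Empty value: the hash is missing or does not match.
    if ($secure_link = "") { return 403; }

    # "0": the hash matches but the link has expired.
    if ($secure_link = "0") { return 410; }

    # With this root, cat1.zip is read from /path/to/secret/files/cats/cat1.zip.
    root /path/to/secret/files;
}
```

Unlike secure_link_secret, the hash here is not part of the path, so the application that generates links has to compute it over exactly the same string that the secure_link_md5 expression produces.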
If you need to completely obscure your URLs (including the cat1.zip part), you'll need to make a decision between:
Handling the decryption of the encrypted URL on the Nginx side — writing your own, or reusing a module written by someone else
Handling the decryption of the encrypted URL somewhere in your application — basically using Nginx to proxy your encrypted URLs to your application where you decrypt them and act accordingly, as #cnst writes above.
Both approaches have pros and cons, but IMO the latter one is simpler and more flexible — once you set up your proxy you don't need to worry much about Nginx, nor need to compile it with some special prerequisites; no need to write or compile code in language other than what you already are writing in your application (unless your application includes code in C, Lua or Perl).
Here's an example of a simple Nginx/Express application where you'd handle the decryption within your application. The Nginx configuration might look like:
server {
listen 80;
server_name example.com;
location /cats {
proxy_set_header Host $http_host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $remote_addr;
proxy_set_header X-NginX-Proxy true;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_pass http://127.0.0.1:8000;
}
location /path/to/secured/files {
internal;
}
}
and on the application (Node.js/Express) side you may have something like:
const express = require ('express');
const app = express();
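app.get('/cats/:encrypted', function(req, res) {
const encrypted = req.params.encrypted;
//
// Your decryption logic here
//
const decryptedFileName = decryptionFunction(encrypted);
if (decryptedFileName) {
// Hand the file over to nginx, which serves it from the internal location
res.set('X-Accel-Redirect', `/path/to/secured/files/${decryptedFileName}`);
res.end();
} else {
// Decryption failed: reject the request instead of leaving it hanging
res.status(404).end();
}
});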
app.listen(8000);
The above example assumes that the secured files are located at /path/to/secured/files directory. Also it assumes that if the URL is accessible (properly encrypted) you are sending the files for download, but the same logic would apply if you need to do something else.
The easiest way would be to write a simple backend (with interfacing through proxy_pass, for example) that would decrypt the filename from the $uri, and provide the results within the X-Accel-Redirect response header (which is subject to proxy_ignore_headers in nginx), which would subsequently be subject to an internal redirect within nginx (to a location that cannot be accessed without going through the backend first), and served with all the optimisations that are already part of nginx.
location /sec/ {
proxy_pass http://decryptor/;
}
location /x-accel-redirect-here/ {
internal;
alias …;
}
The above approach follows the ‘microservices’ architecture, in that your decryptor service's only job is to perform decryption and access control, leaving it up to nginx to ensure files are served correctly and in the most efficient way possible through the use of the internal specially-treated X-Accel-Redirect HTTP response header.
Consider using something like OpenResty with Lua.
Lua can do almost everything you want in nginx.
https://openresty.org/
https://github.com/openresty/
==== UPDATE ====
Now we have njs (https://nginx.org/en/docs/njs/), JavaScript for nginx, which can also do almost everything.
If anyone bumps into the problem of unescaping query arguments in Nginx, this is how I solved it on a standard Nginx setup on Ubuntu 20.04 (the query argument in the url is foo as in https://some.domain.com/some/path?foo=some%2Fquery%2Fvalue):
perl_modules perl/lib;
perl_set $arg_foo_decoded 'sub {
my $r = shift;
my $arg_foo = $r->variable("arg_foo");
$arg_foo =~ s/\+/ /ig;
my $arg_foo_decoded = $r->unescape($arg_foo);
return $arg_foo_decoded;
}';
I could then use the $arg_foo_decoded variable in my location blocks. It now contains some/query/value.
Edit
Added a line to ensure compatibility with form values containing spaces encoded as + (see also: https://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.1).

Resources