HttpRequest: inconsistent Scheme & Host properties


My ASP.NET Core 3.1 application runs in Kubernetes. The ingress (load balancer) terminates SSL and talks to the pod over plain HTTP.
Let's say the ingress is reachable at https://my-app.com:443, and talks to the pod (where my app is running) at http://10.0.0.1:80.
When handling a request, the middleware pipeline sees an HttpRequest object with the following:
Scheme == "http"
Host.Value == "my-app.com"
IsHttps == False
This is weird:
Judging by the Scheme (and IsHttps), it seems the HttpRequest object describes the forwarded request from the ingress to the pod, the one that goes over plain HTTP. However, why isn't Host.Value equal to 10.0.0.1 in that case?
Or vice versa: if HttpRequest is trying to be clever and represent the original request, the one to the ingress, why doesn't it show "https" along with "my-app.com"?
At no point in handling the request is there a request coming to http://my-app.com. It's either https://my-app.com or http://10.0.0.1. The combination is inconsistent.
Other details
Digging deeper, the HttpRequest object has the following headers (among others) that show the reverse proxying in action:
Host: my-app.com
Referer: https://my-app.com/swagger/index.html
X-Real-IP: 10.0.0.1
X-Forwarded-For: 10.0.0.1
X-Forwarded-Host: my-app.com
X-Forwarded-Port: 443
X-Forwarded-Proto: https
X-Scheme: https
I'm guessing HttpRequest is using these to get hold of the original host (my-app.com rather than 10.0.0.1) but it doesn't do the same for the original scheme (https rather than http).
Q1: Is this expected, and if so, what is the rationale?
Q2: What's the best way to get at the original URL (https://my-app.com)? The best I've found so far was to check if the X-Scheme and X-Forwarded-Host headers were present (by inspecting HttpRequest.Headers) and if so, using those. However, it's a little weird having to go to the raw HTTP headers in the middleware pipeline.

Q1: Is this expected, and if so, what is the rationale?
I would say yes, this is the expected behavior.
Host.Value == "my-app.com" on the HttpRequest object reflects the Host request header sent by the client (web browser, curl, ...), e.g.:
curl --insecure -H 'Host: my-app.com' https://<FRONTEND_FOR_INGRESS-MOST_OFTEN_LB-IP>
The ingress passes that value on to the backend; see [1] in the location block for the server 'my-app.com' inside the generated nginx.conf file:
...
set $best_http_host $http_host;
# [1] - Set Host value
proxy_set_header Host $best_http_host;
...
# Pass the extracted client certificate to the backend
...
proxy_set_header Connection $connection_upgrade;
proxy_set_header X-Request-ID $req_id;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $remote_addr;
proxy_set_header X-Forwarded-Host $best_http_host;
...
The $http_host variable, in turn, comes from the following ngx_http_core_module functionality:
$http_name -
arbitrary request header field; the last part of a variable name is the field name converted to lower case with dashes replaced by underscores
So the behavior you describe is not specific to ASP.NET Core 3.1: the Headers dictionary simply contains the key/value pairs that the nginx ingress controller sets explicitly, according to its current configuration. Scheme and IsHttps, by contrast, describe the connection the pod actually received, which is plain HTTP.
You can inspect the nginx controller's current configuration yourself with the following command:
kubectl exec -it po/<nginx-ingress-controller-pod-name> -n <ingress-controller-namespace> -- cat /etc/nginx/nginx.conf
Q2: What's the best way to get at the original URL (https://my-app.com)?
I would try constructing it with a configuration snippet, that is, by introducing another custom header in which you concatenate the values.
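If changing the ingress configuration is not an option, the same concatenation can be done on the application side. The following is only a minimal sketch of that logic in Python, not ASP.NET Core code: the function name original_url and its parameters are made up for illustration, and it assumes the proxy sets X-Forwarded-Proto and X-Forwarded-Host as shown in the question.

```python
from typing import Mapping


def original_url(headers: Mapping[str, str], scheme: str, host: str, path: str = "/") -> str:
    """Rebuild the URL the client used, preferring proxy-supplied headers.

    `scheme` and `host` are what the backend observed on its own connection;
    they are used as fallbacks when the forwarded headers are absent.
    """
    # Header names are case-insensitive, so normalise the keys first.
    lowered = {name.lower(): value for name, value in headers.items()}
    # A chain of proxies may append values separated by commas; the first
    # entry is the one closest to the client.
    fwd_scheme = lowered.get("x-forwarded-proto", scheme).split(",")[0].strip()
    fwd_host = lowered.get("x-forwarded-host", host).split(",")[0].strip()
    return f"{fwd_scheme}://{fwd_host}{path}"
```

Called with the headers from the question and the observed scheme "http", this yields a URL starting with https://my-app.com. These headers can be set by any client, so they should only be honoured when the request is known to come from the ingress. ASP.NET Core also ships a Forwarded Headers middleware that applies the same idea to Scheme and Host, which avoids reading the raw headers by hand.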

Related

Assigning a custom variable to real_ip_header

I am trying to get the client IP when the requests are either coming through an application load balancer, or through AWS Cloudfront.
When it's just coming through the load balancer, I use the X-Forwarded-For header (set by the load balancer), and if it's coming through CloudFront, I use the custom header CloudFront-Viewer-Address set by CloudFront.
Since the application is not aware whether the request is coming through CloudFront or the ALB, I need to make the distinction, which I do with a map:
map $http_CloudFront_Viewer_Address $remote_addr_header {
"~*" $http_CloudFront_Viewer_Address;
default $http_x_forwarded_for;
}
This map is working. I can log $remote_addr_header and it is getting the correct value.
However, this is not working:
real_ip_header $remote_addr_header;
Although the following are working:
real_ip_header X-Forwarded-For;
real_ip_header CloudFront-Viewer-Address;
So I am wondering if I am not able to directly assign a variable to real_ip_header, as the documentation says
Syntax: real_ip_header field | X-Real-IP | X-Forwarded-For | proxy_protocol;
Default: real_ip_header X-Real-IP;
Context: http, server, location
Is there a way I can use the custom variable $remote_addr_header in real_ip_header?

As the name implies and the documentation explicitly states:
The ngx_http_realip_module module is used to change the client address and optional port to those sent in the specified header field.
The directive interprets its argument as the name of an HTTP header from which the IP address should be taken; the argument is not treated as the address value itself, as you may have hoped.

airflow 1.10.10 behind Nginx Proxy: Oauth Redirect URL http instead of https

I am deploying Airflow 1.10.10 on Kubernetes using the official Helm Chart (v.7.0.0) but I am running into issues with Oauth.
Here's my setup:
Airflow Webserver with RBAC enabled. Airflow uses Flask AppBuilder in this case.
Server runs behind an Nginx reverse proxy that does https termination.
I try to authenticate against an Azure AD tenant using OAuth.
my problem
- When I try to login with the Microsoft account I get the error message "The reply URL specified in the request does not match the reply URLs configured for the application".
- The error is caused by Airflow setting the redirect URL to http://airflow.example.com/oauth_authorized/azure instead of https://airflow.example.com/oauth_authorized/azure
what I think the issue is
Since nginx sends http requests to Flask, Flask generates an http URL for the redirect URL instead of https.
So from what I understand, I need to find a way to tell Airflow/Flask that it should use https to generate the redirect URL instead.
What I tried:
I have two angles of attack:
1. setting the base URL to https explicitly in the webserver_config.py file
I tried putting environ['wsgi.url_scheme'] = 'https' in the config file, but I get an "environ is not defined" error.
Can I even set this in the config.py file? What would I need to import for it to work?
2. Setting proxy headers in nginx
I tried to set multiple headers in Nginx using Kubernetes annotations, my current settings are:
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "Upgrade";
proxy_set_header X-Forwarded-Host $host;
proxy_set_header X-Forwarded-Port $server_port;
I also tried to set
proxy_set_header Host $host;
but this leads to all traffic being redirected to a comma-separated list of domains
airflow.example.com,airflow.example.com
which obviously does not work.
I based these settings on the Flask documentation.
The rest of the Nginx config is the default of the official Nginx ingress controller I have running in my cluster.
Does anybody have an idea what the issue could be? Are my two angles of attack valid or is there a third one that I am missing?
Thanks a lot, any help is appreciated!
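For what it's worth, the first angle fails because environ is not a module-level name: in WSGI it is the per-request dictionary passed to the application, so it cannot be set once in a config file. Below is a minimal sketch of the idea as a WSGI wrapper. The class name ForwardedProtoMiddleware is hypothetical, and how to hook such a wrapper into the Airflow webserver is not covered here. Werkzeug's ProxyFix middleware is an existing implementation of the same technique.

```python
from typing import Callable, Iterable


class ForwardedProtoMiddleware:
    """Hypothetical WSGI wrapper that trusts X-Forwarded-Proto from the proxy."""

    def __init__(self, app: Callable) -> None:
        self.app = app

    def __call__(self, environ: dict, start_response: Callable) -> Iterable[bytes]:
        # WSGI exposes the X-Forwarded-Proto request header under this key.
        proto = environ.get("HTTP_X_FORWARDED_PROTO", "")
        # Several proxies may append values; the first one is closest to the client.
        proto = proto.split(",")[0].strip().lower()
        if proto in ("http", "https"):
            # URL generation in WSGI frameworks is based on this entry.
            environ["wsgi.url_scheme"] = proto
        return self.app(environ, start_response)
```

This only makes sense when nginx is the sole way to reach the webserver, because any client can send an X-Forwarded-Proto header.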

nginx reverse proxy not detecting dropped load balancer

We have the following config for our reverse proxy:
location ~ ^/stuff/([^/]*)/stuff(.*)$ {
set $sometoken $1;
set $some_detokener "foo";
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header Authorization "Basic $do_token_decoding";
proxy_http_version 1.1;
proxy_set_header Connection "";
proxy_redirect https://place/ https://place_with_token/$1/;
proxy_redirect http://place/ http://place_with_token/$1/;
resolver 10.0.0.2 valid=10s;
set $backend https://real_storage$2;
proxy_pass $backend;
}
Now, all of this works... until real_storage rotates a server. For example, say real_storage comes from foo.com. This is a load balancer which directs to two servers: 1.1.1.1 and 1.1.1.2. Now, 1.1.1.1 is removed and replaced with 1.1.1.3. However, nginx continues to try 1.1.1.1, resulting in:
epoll_wait() reported that client prematurely closed connection, so upstream connection is closed too while connecting to upstream, client: ..., server: ..., request: "GET ... HTTP/1.1", upstream: "https://1.1.1.1:443/...", host: "..."
Note that the upstream is the old server, shown by a previous log:
[debug] 1888#1888: *570837 connect to 1.1.1.1:443, fd:60 #570841
Is this something misconfigured on our side or the host for our real_storage?
The best I could find that sounds even close to my issue is https://mailman.nginx.org/pipermail/nginx/2013-March/038119.html ...
Further Details
We added
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
and it still failed. I am now beginning to suspect that, since there are two ELBs involved (ours and theirs), the resolver we are using is the problem: it is Amazon-specific (per https://serverfault.com/a/929517/443939), and Amazon still sees it as valid, but it won't resolve externally (our server trying to hit theirs).
I have removed the resolver altogether from one configuration and will see where that goes. We have not been able to reproduce this using internal servers, so we must rely on waiting for the third-party servers to cycle (about once per week).
I'm a bit uncertain about this resolver being the issue, only because a restart of nginx will solve the problem and get the latest IP pair :/
Is it possible that I have to set the DNS variable without the https?
set $backend real_storage$2;
proxy_pass https://$backend;
I know that you have to use a variable or else the re-resolve won't happen, but maybe it matters which part of the URL is in the variable. I have only ever seen it set up as above in my searches, and no reason was ever given. I'll set that up on a 2nd server and see what happens.
And for my 3rd server I am trying this comment and moving the set outside of location. Of course if anybody else has a concrete idea then I'm open to changing my testing for this go round :D
set $rootbackend https://real_storage;
location ~ ^/stuff/([^/]*)/stuff(.*)$ {
set $backend $rootbackend$2;
proxy_pass $backend;
}
Note that I have to set it inside because it uses a dynamic variable, though.

As @cnst correctly noted, using a variable in proxy_pass makes nginx resolve the address of real_storage for every request, but there are further details:
Before version 1.1.9, nginx always cached DNS answers for 5 minutes.
Since version 1.1.9, nginx caches DNS answers for a duration equal to their TTL, and the default TTL of Amazon ELB is 60 seconds.
So it is quite normal that nginx keeps using the old address for some time after a rotation. As per the documentation, the expiration time of the DNS cache can be overridden:
resolver 127.0.0.1 [::1]:5353 valid=10s;
or
resolver 127.0.0.1 ipv6=off valid=10s;

There's nothing special about using variables within http://nginx.org/r/proxy_pass: any variable use makes nginx involve the resolver on each request (unless the name is found in a server group; perhaps you have a clash?). You can even get rid of $backend if you're already using $2 in there.
As for interpreting the error message, you have to figure out whether this happens because the existing connections get dropped, or because nginx is still trying to connect to the old addresses.
You might also want to look into lowering the *_timeout values within http://nginx.org/en/docs/http/ngx_http_proxy_module.html; they all appear to default to 60s, which may be too long for your use case:
http://nginx.org/r/proxy_connect_timeout
http://nginx.org/r/proxy_send_timeout
http://nginx.org/r/proxy_read_timeout
I'm not surprised that you're not able to reproduce this issue, because there doesn't seem to be anything wrong with your existing configuration; perhaps the problem manifested itself in an earlier revision?

Nginx Reverse Proxy WebSocket Timeout

I'm using java-websocket for my websocket needs, inside a wowza application, and using nginx for ssl, proxying the requests to java.
The problem is that the connection seems to be cut after exactly 1 hour, server-side. The client-side doesn't even know that it was disconnected for quite some time. I don't want to just adjust the timeout on nginx, I want to understand why the connection is being terminated, as the socket is functioning as usual until it isn't.
EDIT:
Forgot to post the configuration:
location /websocket/ {
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
include conf.d/proxy_websocket;
proxy_connect_timeout 1d;
proxy_send_timeout 1d;
proxy_read_timeout 1d;
}
And that included config:
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_pass http://127.0.0.1:1938/;
Nginx/1.12.2
CentOS Linux release 7.5.1804 (Core)
Java WebSocket 1.3.8 (GitHub)

The timeout could be coming from the client, nginx, or the back-end. When you say that it is being cut "server side", I take that to mean that you have demonstrated that it is not the client. Your nginx configuration looks like it shouldn't time out for 1 day, so that leaves only the back-end.
Test the back-end directly
My first suggestion is that you try connecting directly to the back-end and confirm that the problem still occurs (taking nginx out of the picture for troubleshooting purposes). Note that you can do this with command line utilities like curl, if using a browser is not practical. Here is an example test command:
time curl --trace-ascii curl-dump.txt -i -N \
-H "Host: example.com" \
-H "Connection: Upgrade" \
-H "Upgrade: websocket" \
-H "Sec-WebSocket-Version: 13" \
-H "Sec-WebSocket-Key: BOGUS+KEY+HERE+IS+FINE==" \
http://127.0.0.1:8080
In my (working) case, running the above example stayed open indefinitely (I stopped with Ctrl-C manually) since neither curl nor my server was implementing a timeout. However, when I changed this to go through nginx as a proxy (with default timeout of 1 minute) as shown below I saw a 504 response from nginx after almost exactly 1 minute.
time curl -i -N --insecure \
-H "Host: example.com" \
https://127.0.0.1:443/proxied-path
HTTP/1.1 504 Gateway Time-out
Server: nginx/1.14.2
Date: Thu, 19 Sep 2019 21:37:47 GMT
Content-Type: text/html
Content-Length: 183
Connection: keep-alive
<html>
<head><title>504 Gateway Time-out</title></head>
<body bgcolor="white">
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx/1.14.2</center>
</body>
</html>
real 1m0.207s
user 0m0.048s
sys 0m0.042s
Other ideas
Someone mentioned trying proxy_ignore_client_abort but that shouldn't make any difference unless the client is closing the connection. Besides, although that might keep the inner connection open I don't think it is able to keep the end-to-end stream intact.
You may want to try proxy_socket_keepalive, though that requires nginx >= 1.15.6.
Finally, there's a note in the WebSocket proxying doc that hints at a good solution:
Alternatively, the proxied server can be configured to periodically send WebSocket ping frames to reset the timeout and check if the connection is still alive.
If you have control over the back-end and want connections to stay open indefinitely, periodically sending "ping" frames to the client should prevent the connection from being closed due to inactivity, no matter how long it's open or how many middle-boxes are involved. This makes a long proxy_read_timeout unnecessary. If the client is a web browser, no client-side change is needed, because answering pings is part of the spec.
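To make "ping frame" concrete, here is a minimal sketch of what the back-end puts on the wire, following the WebSocket framing rules (RFC 6455). The function name build_ping_frame is made up for illustration. In practice a WebSocket library would normally build and schedule these frames for you, so treat this as an explanation rather than something to paste into the server.

```python
def build_ping_frame(payload: bytes = b"") -> bytes:
    """Build an unmasked server-to-client WebSocket ping frame.

    Control frames may carry at most 125 bytes of payload, so the
    length always fits in the single 7-bit length field.
    """
    if len(payload) > 125:
        raise ValueError("control frame payload must be at most 125 bytes")
    # 0x80 sets the FIN bit, 0x9 is the ping opcode.
    first_byte = 0x80 | 0x9
    # Mask bit is 0 because frames sent by the server are not masked.
    second_byte = len(payload)
    return bytes([first_byte, second_byte]) + payload
```

Sending such a frame at an interval shorter than the smallest idle timeout on the path (for example, shorter than nginx's proxy_read_timeout) is what keeps the connection from looking idle.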

Most likely it's because your configuration for the websocket proxy needs tweaking a little, but since you asked:
There are some challenges that a reverse proxy server faces in
supporting WebSocket. One is that WebSocket is a hop‑by‑hop protocol,
so when a proxy server intercepts an Upgrade request from a client it
needs to send its own Upgrade request to the backend server, including
the appropriate headers. Also, since WebSocket connections are long
lived, as opposed to the typical short‑lived connections used by HTTP,
the reverse proxy needs to allow these connections to remain open,
rather than closing them because they seem to be idle.
Within the location block that handles your websocket proxying, you need to include the headers. This is the example Nginx gives:
location /wsapp/ {
proxy_pass http://wsbackend;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "Upgrade";
}
This should now work because:
NGINX supports WebSocket by allowing a tunnel to be set up between a
client and a backend server. For NGINX to send the Upgrade request
from the client to the backend server, the Upgrade and Connection
headers must be set explicitly, as in this example
I'd also recommend you have a look at the Nginx Nchan module which adds websocket functionality directly into Nginx. Works well.

nginx redirect not correct

I'm using nginx as the front end for my mongrel. Mongrel is listening on 3001, and nginx is listening on 3000.
In my application, there is a redirect after creating a model. Let's say I post a request to http://xxxx:3000/users: it should be redirected to http://xxxx:3000/users/1 (1 is the id of the new user), but it is actually redirected to http://xxxx/users/1, which causes a 404 error.
Why is the port 3000 missing?

Are you using proxy_pass? If so, you should add this line:
proxy_set_header Host $host:3000;
You should also post your nginx config here.
====
better solution:
proxy_set_header Host $http_host;
$host does not include the port, whereas $http_host is the value of the Host HTTP header as sent by the browser, which does include it.
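The difference can be illustrated with a small sketch. The function nginx_like_host below is hypothetical and only approximates $host for the case where the value comes from the Host header (it ignores nginx's fallbacks to the request line and the server name); $http_host would simply be the header string unchanged.

```python
def nginx_like_host(host_header: str) -> str:
    """Approximate nginx's $host: the Host header lowercased, port removed."""
    value = host_header.strip().lower()
    if value.startswith("["):
        # Bracketed IPv6 literal such as [::1]:3000; keep the brackets.
        end = value.find("]")
        return value[: end + 1] if end != -1 else value
    # Everything before the first colon is the host name.
    return value.split(":", 1)[0]
```

With a Host header of xxxx:3000, the backend sees xxxx when nginx forwards $host but xxxx:3000 when it forwards $http_host, so a redirect built from the former loses the port.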
