Fixing unescaped # in the URL with nginx

A bad HTTP client isn't escaping hash signs and is sending them to nginx, like so:
GET /foo/escaped#stuff
Instead of:
GET /foo/escaped%23stuff
This breaks my nginx configuration, since nginx strips the text after the # in the proxy_pass directive. How do I escape the hash sign?
Using return 200 "$request_uri"; does show me that nginx is reading it, so it seems like it's possible. Nginx, however, ignores it in location blocks, so I can't actually match it with anything.
You can use the below code to send unescaped HTTP GET requests in Python:
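import socket

def get(host, port, uri):
    # Send the request line verbatim so that "#" is not escaped.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((host, port))
    request = 'GET {} HTTP/1.0\r\nHost: {}\r\n\r\n'.format(uri, host)
    sock.sendall(request.encode('ascii'))  # bytes are required on Python 3
    return sock.recv(1000)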
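For comparison, here is a small Python 3 sketch of the general URL rule that makes the text after # disappear, and of the escaping the client should have applied. It illustrates standard URL parsing only, not nginx internals, and escape_hash is a hypothetical helper name introduced for this example.

```python
from urllib.parse import urlsplit


def escape_hash(uri):
    """Percent-encode literal '#' characters so they stay in the path."""
    return uri.replace('#', '%23')


raw = '/foo/escaped#stuff'

# A standards-following URL parser treats everything after '#' as a fragment.
parts = urlsplit(raw)
print(parts.path)      # /foo/escaped
print(parts.fragment)  # stuff

# After escaping, the whole string is kept as the path.
print(urlsplit(escape_hash(raw)).path)  # /foo/escaped%23stuff
```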

Related

nginx - connection timed out while reading upstream

I have a Flask server with an endpoint that processes some uploaded .csv files and returns a .zip (in a JSON response, as a base64 string).
This process can take up to 90 seconds.
I've been setting it up for production using gunicorn and nginx, and I'm testing the endpoint with smaller .csv files. They get processed fine, and within a couple of seconds I get the "got blob" log. But nginx doesn't return the response to the client, and finally it times out. I set a longer fail_timeout of 10 minutes, and the client WILL wait 10 minutes, then time out.
The proxy read timeout offered as a solution here is set to 3600s.
Also, the proxy connect timeout is set to 75s according to this.
The timeout for the gunicorn workers is also set according to this.
The error log says: "upstream timed out (110: Connection timed out) while reading upstream".
I also see cases where nginx receives an OPTIONS request and, immediately after, the POST request (some CORS weirdness from the client). Nginx passes the OPTIONS request to gunicorn but fails to pass the POST request, despite having received it.
Question:
What am I doing wrong here?
Many thanks
http {
    upstream flask {
        server 127.0.0.1:5050 fail_timeout=600;
    }
    # error log
    # 2022/08/18 14:49:11 [error] 1028#1028: *39 upstream timed out (110: Connection timed out) while reading upstream, ...
    # ...
    server {
        # ...
        location /api/ {
            proxy_pass http://flask/;
            proxy_read_timeout 3600;
            proxy_connect_timeout 75s;
            # ...
        }
        # ...
    }
}
# wsgi.py
from main import app

if __name__ == '__main__':
    app.run()

# flask endpoint
@app.route("/process-csv", methods=['POST'])
def process_csv():
    def wrapped_run_func():
        return blob, export_filename
    # ...
    try:
        blob, export_filename = wrapped_run_func()
        b64_file = base64.b64encode(blob.getvalue()).decode()
        ret = jsonify(file=b64_file, filename=export_filename)
        # return Response(response=ret, status=200, mimetype="application/json")
        print("got blob")
        return ret
    except Exception as e:
        app.logger.exception(f"0: Error processing file: {export_filename}")
        return Response("Internal server error", status=500)
Sadly, I got no response.
See the last lines for the "solution" I finally implemented.
CAUSE OF ERROR: I believe the problem is that I'm hosting the nginx server on WSL1.
I tried updating to WSL2 to see if that fixed it, but I would need to enable some kind of "nested virtualization", as WSL1 is already running on a VM.
Through configuration changes I got it to the point where no error is logged: gunicorn returns the file, and then it just stays in the ether. Nginx never gets or sends the response.
"SOLUTION":
I ended up changing the code for the client, the server and the nginx.conf file:
The server saves the resulting file and only returns the file name.
The client inserts the file name into an href, which is then displayed as a link.
On click, a request is sent to nginx, which just sends the file from a static folder, leaving gunicorn alone.
I guess this is the optimal way to do it anyway, though it still bugs me that I couldn't find the reason for the error with any certainty.
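The server-side half of that workaround could look roughly like the sketch below. It is a minimal illustration, not the code actually deployed: save_result, the export directory and the random name prefix are all assumptions made for the example.

```python
import json
import os
import uuid


def save_result(blob_bytes, export_dir, export_filename):
    """Write the generated archive to disk and return a small JSON body."""
    os.makedirs(export_dir, exist_ok=True)
    # Prefix with a random token so concurrent requests do not overwrite each other.
    stored_name = "{}-{}".format(uuid.uuid4().hex, os.path.basename(export_filename))
    with open(os.path.join(export_dir, stored_name), "wb") as fh:
        fh.write(blob_bytes)
    # Only the name goes back through the proxy, not the file contents.
    return json.dumps({"filename": stored_name})
```

The endpoint would return this JSON instead of the base64 payload, and nginx would serve the export directory from a static location, so the large response never passes through the proxied connection.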

Nginx as reverse proxy: How to display a custom error page for upstream errors, UNLESS the upstream says not to?

I have an Nginx instance running as a reverse proxy. When the upstream server does not respond, I send a custom error page for the 502 response code. When the upstream server sends an error page, that gets forwarded to the client, and I'd like to show a custom error page in that case as well.
If I wanted to replace all of the error pages from the upstream server, I would set proxy_intercept_errors on to show a custom page on each of them. However, there are cases where I'd like to return the actual response that the upstream server sent: for example, for API endpoints, or if the error page has specific user-readable text relating to the issue.
In the config, a single server is proxying multiple applications that are behind their own proxy setups and their own rules for forwarding requests around, so I can't just specify this per each location, and it has to work for any URL that matches a server.
Because of this, I would like to send the custom error page, unless the upstream application says not to. The easiest way to do this would be with a custom HTTP header. There is a similar question about doing this depending on the request headers. Is there a way to do this depending on the response headers?
(It appears that somebody else already had this question and their conclusion was that it was impossible with plain Nginx. If that's true, I would be interested in some other ideas on how to solve this, possibly using OpenResty like that person did.)
So far I have tried using OpenResty to do this, but it doesn't seem compatible with proxy_pass: the response that the Lua code generates seems to overwrite the response from the upstream server.
Here's the location block I tried to use:
location / {
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_pass http://localhost:65000;
    content_by_lua_block{
        ngx.say("This seems to overwrite the content from the proxy?!")
    }
    body_filter_by_lua_block {
        ngx.arg[1]="Truncated by code!"
        ngx.arg[2]=false
        if ngx.status >= 400 then
            if not ngx.resp.get_headers()["X-Verbatim"] then
                local file = io.open('/usr/share/nginx/error.html', 'w')
                local html_text = file:read("*a")
                ngx.arg[1] = html_text
                ngx.arg[2] = true
                return
            end
        end
    }
}
I don't think you can send custom error pages based on a response header with plain nginx. As far as I know, the only way to achieve that would be the map or if directives, and neither of them has any scope after the request has been sent to the upstream, so they cannot read the response headers.
However, you could do this with OpenResty by writing your own Lua script. A script for this would look something like:
location / {
    body_filter_by_lua '
        if ngx.resp.get_headers()["Cust-Resp-Header"] then
            -- double quotes here, since the block itself is a single-quoted nginx string
            local file = io.open("/path/to/file.html", "r")
            local html_text = file:read("*a")
            file:close()
            ngx.arg[1] = html_text
            ngx.arg[2] = true
            return
        end
    ';
    # ...
}
You could also use body_filter_by_lua_block (the Lua code goes inside curly braces instead of being written as an nginx string) or body_filter_by_lua_file (the Lua code lives in a separate file and you provide the file path).
You can find how to get started with OpenResty here.
P.S.: You can read the response status code from the upstream using ngx.status. As for reading the body, ngx.arg[1] contains the response body chunk from the upstream, which is what we're modifying here. You can save ngx.arg[1] in a local variable, extract the error message from it with a regular expression, and append it to the html_text variable later. Hope that helps.
Edit 1: Pasting here a sample working lua block inside a location block with proxy_pass:
location /hello {
    proxy_pass http://localhost:3102/;
    body_filter_by_lua_block {
        if ngx.resp.get_headers()["erratic"] == "true" then
            ngx.arg[1] = "<html><body>Hi</body></html>"
        end
    }
}
Edit 2: You can't use content_by_lua_block together with proxy_pass, or your proxy won't work. Your location block should look like this (assuming the upstream sets the X-Verbatim header to the string "false" when the error response body should be overridden):
location / {
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_pass http://localhost:65000;
    body_filter_by_lua_block {
        if ngx.status >= 400 then
            if ngx.resp.get_headers()["X-Verbatim"] == "false" then
                -- open for reading; mode 'w' would truncate the error page
                local file = io.open('/usr/share/nginx/error.html', 'r')
                local html_text = file:read("*a")
                file:close()
                ngx.arg[1] = html_text
                ngx.arg[2] = true
            end
        end
    }
}
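The two header conventions in this thread are easy to mix up, so here is the decision rule from the question written out as a plain Python function. This is only a model of the logic for reasoning and testing, not something nginx runs; should_replace_body is a hypothetical name.

```python
def should_replace_body(status, headers):
    """Return True when the proxy should swap in its own error page."""
    if status < 400:
        return False
    # Header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    # Rule from the question: replace unless the upstream opted out.
    return "x-verbatim" not in normalised


print(should_replace_body(502, {}))                     # True
print(should_replace_body(404, {"X-Verbatim": "yes"}))  # False
print(should_replace_body(200, {}))                     # False
```

Note that the Lua block in Edit 2 uses a different rule: it replaces the body only when X-Verbatim is explicitly "false", so an upstream that sends no header at all keeps its own error page.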
This is somewhat the opposite of what was requested, but I think it can fit anyway: it shows the original response unless the upstream says what to show instead.
There is a set of X-Accel custom headers that NGINX evaluates in upstream responses. X-Accel-Redirect lets the upstream tell NGINX to process another location instead. Below is an example of how it can be used.
This is a Flask application that returns normal responses and errors roughly 50/50. The error responses carry an X-Accel-Redirect header, instructing NGINX to reply with the contents of the @error_page location.
import flask
import random

application = flask.Flask(__name__)

@application.route("/")
def main():
    if random.randint(0, 1):
        resp = flask.Response("Random error")  # upstream body contents
        resp.headers['X-Accel-Redirect'] = '@error_page'  # the header
        return resp
    else:
        return "Normal response"

if __name__ == '__main__':
    application.run("0.0.0.0", port=4000)
And here is an NGINX config for that:
server {
    listen 80;
    location / {
        proxy_pass http://localhost:4000/;
    }
    location @error_page {
        return 200 "That was an error";
    }
}
Putting these together, you will see either "Normal response" from the app or "That was an error" from the @error_page location ("Random error" will be suppressed). With this setup you can create a number of locations (@error_502, @foo, @etc) for various errors and make your application use them.
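The same idea does not depend on Flask. As a hedged sketch, here is a plain WSGI callable that sets the header deterministically; the /fail path is an assumption chosen for the example, and the redirect target is a named location, which NGINX writes with a leading @.

```python
def application(environ, start_response):
    """Plain WSGI app: '/fail' asks NGINX to answer from @error_page."""
    if environ.get("PATH_INFO") == "/fail":
        headers = [
            ("Content-Type", "text/plain"),
            ("X-Accel-Redirect", "@error_page"),  # consumed by NGINX
        ]
        start_response("200 OK", headers)
        return [b"Random error"]  # suppressed once NGINX follows the redirect
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Normal response"]
```

Any WSGI server (gunicorn, for instance) can serve this callable behind the proxy_pass location shown above.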

Properly forwarding visitor's IP address from flask_restful to nginx

I'm running a flask_restful API service that receives its traffic through an nginx proxy. While the IP address is being forwarded through the proxy via some variables, flask_restful doesn't seem to be able to see these variables, as indicated by its output, which points to 127.0.0.1:
127.0.0.1 - - [25/Oct/2017 21:55:37] "HEAD sne/event/SN2014J/photometry HTTP/1.0" 200 -
While I know I can retrieve the IP address via the request object (nginx forwards X-Forwarded-For and X-Real-IP), I don't know how to make the above output from flask_restful show/use this IP address, which is important if you want to say limit the number of API calls from a given IP address with flask_limiter. Any way to make this happen?
For older versions of Werkzeug, you can use:
from werkzeug.contrib.fixers import ProxyFix
app.wsgi_app = ProxyFix(app.wsgi_app)
For newer versions of Werkzeug (1.0.0+):
from werkzeug.middleware.proxy_fix import ProxyFix
app.wsgi_app = ProxyFix(app.wsgi_app)
This will fix the IP using X-Forwarded-For. If you need an enhanced version, you can use:
class SaferProxyFix(object):
    """This middleware can be applied to add HTTP proxy support to an
    application that was not designed with HTTP proxies in mind. It
    sets `REMOTE_ADDR`, `HTTP_HOST` from `X-Forwarded` headers.

    If you have more than one proxy server in front of your app, set
    num_proxy_servers accordingly

    Do not use this middleware in non-proxy setups for security reasons.

    get_remote_addr will raise an exception if it sees a request that
    does not seem to have enough proxy servers behind it so long as
    detect_misconfiguration is True.

    The original values of `REMOTE_ADDR` and `HTTP_HOST` are stored in
    the WSGI environment as `werkzeug.proxy_fix.orig_remote_addr` and
    `werkzeug.proxy_fix.orig_http_host`.

    :param app: the WSGI application
    """

    def __init__(self, app, num_proxy_servers=1, detect_misconfiguration=False):
        self.app = app
        self.num_proxy_servers = num_proxy_servers
        self.detect_misconfiguration = detect_misconfiguration

    def get_remote_addr(self, forwarded_for):
        """Selects the new remote addr from the given list of ips in
        X-Forwarded-For. By default the last one is picked. Specify
        num_proxy_servers=2 to pick the second to last one, and so on.
        """
        if self.detect_misconfiguration and not forwarded_for:
            raise Exception("SaferProxyFix did not detect a proxy server. Do not use this fixer if you are not behind a proxy.")
        if self.detect_misconfiguration and len(forwarded_for) < self.num_proxy_servers:
            raise Exception("SaferProxyFix did not detect enough proxy servers. Check your num_proxy_servers setting.")
        if forwarded_for and len(forwarded_for) >= self.num_proxy_servers:
            return forwarded_for[-1 * self.num_proxy_servers]

    def __call__(self, environ, start_response):
        getter = environ.get
        forwarded_proto = getter('HTTP_X_FORWARDED_PROTO', '')
        forwarded_for = getter('HTTP_X_FORWARDED_FOR', '').split(',')
        forwarded_host = getter('HTTP_X_FORWARDED_HOST', '')
        environ.update({
            'werkzeug.proxy_fix.orig_wsgi_url_scheme': getter('wsgi.url_scheme'),
            'werkzeug.proxy_fix.orig_remote_addr': getter('REMOTE_ADDR'),
            'werkzeug.proxy_fix.orig_http_host': getter('HTTP_HOST')
        })
        forwarded_for = [x for x in [x.strip() for x in forwarded_for] if x]
        remote_addr = self.get_remote_addr(forwarded_for)
        if remote_addr is not None:
            environ['REMOTE_ADDR'] = remote_addr
        if forwarded_host:
            environ['HTTP_HOST'] = forwarded_host
        if forwarded_proto:
            environ['wsgi.url_scheme'] = forwarded_proto
        return self.app(environ, start_response)

from saferproxyfix import SaferProxyFix
app.wsgi_app = SaferProxyFix(app.wsgi_app)
PS: Code taken from http://esd.io/blog/flask-apps-heroku-real-ip-spoofing.html
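To make the "pick from the right" rule concrete, here is a standalone sketch of the same selection logic. It assumes each trusted proxy appends the address it saw to X-Forwarded-For (which is what nginx's $proxy_add_x_forwarded_for does); client_ip is a hypothetical helper, not part of Werkzeug.

```python
def client_ip(x_forwarded_for, num_proxies=1):
    """Pick the address added by the outermost trusted proxy, or None."""
    hops = [hop.strip() for hop in x_forwarded_for.split(",") if hop.strip()]
    if len(hops) < num_proxies:
        return None
    return hops[-num_proxies]


# One proxy: the last entry is what that proxy saw.
print(client_ip("203.0.113.7"))           # 203.0.113.7
# A client-supplied (possibly spoofed) entry on the left is ignored.
print(client_ip("6.6.6.6, 203.0.113.7"))  # 203.0.113.7
# Two proxies: skip the entry added by the inner one.
print(client_ip("6.6.6.6, 203.0.113.7, 10.0.0.2", num_proxies=2))  # 203.0.113.7
```

Counting from the right matters for rate limiting: anything to the left of the trusted hops is supplied by the client and can be forged.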

Nginx auth_request handler accessing POST request body?

I'm using Nginx (version 1.9.9) as a reverse proxy to my backend server. It needs to perform authentication/authorization based on the contents of the POST requests. And I'm having trouble reading the POST request body in my auth_request handler. Here's what I got.
Nginx configuration (relevant part):
server {
    location / {
        auth_request /auth-proxy;
        proxy_pass http://backend/;
    }
    location = /auth-proxy {
        internal;
        proxy_pass http://auth-server/;
        proxy_pass_request_body on;
        proxy_no_cache "1";
    }
}
And in my auth-server code (Python 2.7), I try to read the request body like this:
class AuthHandler(BaseHTTPServer.BaseHTTPRequestHandler):
    def get_request_body(self):
        content_len = int(self.headers.getheader('content-length', 0))
        content = self.rfile.read(content_len)
        return content
I printed out content_len and it had the correct value. However, self.rfile.read() simply hangs, and eventually it times out and returns "[Errno 32] Broken pipe".
This is how I posted test data to the server:
$ curl --data '12345678' localhost:1234
The above command hangs as well and eventually times out and prints "Closing connection 0".
Any obvious mistakes in what I'm doing?
Thanks much!
The code of the nginx-auth-request-module is annotated at nginx.com. The module always replaces the POST body with an empty buffer.
In one of the tutorials, they explain the reason, stating:
As the request body is discarded for authentication subrequests, you will
need to set the proxy_pass_request_body directive to off and also set the
Content-Length header to a null string
The reason for this is that auth subrequests are sent as HTTP GET requests, not POST. Since GET has no body, the body is discarded. The only workaround with the existing module would be to pull the needed information from the request body and put it into an HTTP header that is passed to the auth service.
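On the auth-server side, the header-based workaround might be sketched as follows in Python 3. The X-Auth-Token header name and the VALID_TOKENS set are assumptions made up for the example; the only nginx behaviour relied on is that auth_request allows the request on a 2xx response and denies it on 401 or 403.

```python
from http.server import BaseHTTPRequestHandler

VALID_TOKENS = {"secret-token"}  # hypothetical; replace with a real lookup


def decide(headers):
    """Return the status code the auth subrequest should answer with."""
    token = headers.get("X-Auth-Token")  # hypothetical header name
    return 200 if token in VALID_TOKENS else 401


class AuthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The subrequest arrives without a body, so only headers are inspected.
        self.send_response(decide(self.headers))
        self.end_headers()
```

Because the handler never calls self.rfile.read(), it cannot hang waiting for a body that nginx does not send.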

Nginx How to get the current upstream ip and port

I'm using the Nginx-Lua framework. In the log phase, I want to get the current request's upstream IP, including the port. This guide shows that upstream_addr is something similar, but it lists all of the upstream servers, not just the current one. If I want to get the current one, what should I do?
$upstream_addr returns the upstream address, which may be a single address or a list such as: 192.168.1.1:80, 192.168.1.2:80, unix:/tmp/sock. You can split the value on commas:
local addrs = _.split(ngx.var.upstream_addr, ',') -- underscore.lua
if #addrs > 0 then
    ngx.log(ngx.ERR, addrs[#addrs]) -- upstream address you want.
end
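The same parsing is sketched below in Python, for example for post-processing access logs. Besides commas, nginx separates groups of addresses with " : " after an internal redirect between server groups, so the sketch splits on both and trims whitespace; last_upstream is a hypothetical helper name.

```python
import re


def last_upstream(upstream_addr):
    """Return the final address listed in an $upstream_addr value, or None."""
    # Addresses are separated by ", "; groups after an internal redirect by " : ".
    parts = [p for p in re.split(r"\s*,\s*|\s+:\s+", upstream_addr.strip()) if p]
    return parts[-1] if parts else None


print(last_upstream("192.168.1.1:80"))  # 192.168.1.1:80
print(last_upstream("192.168.1.1:80, 192.168.1.2:80, unix:/tmp/sock"))  # unix:/tmp/sock
```

The colon inside host:port is left alone because the group separator is only matched when it has whitespace on both sides.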
