Nginx & Django, serving large files (3 GB+)

I'm having problems serving large file downloads/uploads (3 GB+).
Since I'm using Django, I guess the problem serving the files could come from either Django or Nginx.
In my Nginx enabled site I have:
server {
    ...
    client_max_body_size 4G;
    ...
}
And in Django I'm serving the file in chunks:
import mimetypes
import os
from wsgiref.util import FileWrapper
from django.http import StreamingHttpResponse

def return_file(path):
    filename = os.path.basename(path)
    chunk_size = 8192
    # Open in binary mode so the streamed chunks are raw bytes
    response = StreamingHttpResponse(FileWrapper(open(path, 'rb'), chunk_size),
                                     content_type=mimetypes.guess_type(path)[0])
    response['Content-Length'] = os.path.getsize(path)
    response['Content-Disposition'] = 'attachment; filename={0}'.format(filename)
    return response
This method allowed me to go from downloads of ~600 MB to 2.6 GB, but the downloads seem to get truncated at 2.6 GB. I traced the error:
2015/09/04 11:31:30 [error] 831#0: *553 upstream prematurely closed connection while reading upstream, client: 127.0.0.1, server: localhost, request: "GET /chat/download/photorec.zip/ HTTP/1.1", upstream: "http://unix:/web/rsmweb/run/gunicorn.sock:/chat/download/photorec.zip/", host: "localhost", referrer: "http://localhost/chat/2/"
After reading some posts I added the following to my NGinx conf:
proxy_read_timeout 300;
proxy_connect_timeout 300;
proxy_redirect off;
But I got the same error, just with *1 instead of *553.
I also thought it could be a Django database timeout, so I added:
DATABASE_OPTIONS = {
    'connect_timeout': 14400,
}
But that is not working either (the download over the development server takes about 30 seconds).
Thanks for any help!

For large files, try to use NGINX itself with X-Accel. NGINX is intended to serve static content, while Django is for your application logic.
For more information, see the NGINX X-Accel wiki and this answer.
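A minimal sketch of how that can look (the view name, the /protected/ location and the /web/protected/ directory are assumptions for illustration, not the asker's actual paths): the Django view only sets headers and hands the actual file transfer off to nginx via X-Accel-Redirect.

import mimetypes
import os
from django.http import HttpResponse

def download(request, filename):
    # Empty response body; nginx will stream the actual bytes.
    response = HttpResponse()
    response['Content-Type'] = mimetypes.guess_type(filename)[0] or 'application/octet-stream'
    response['Content-Disposition'] = 'attachment; filename={0}'.format(os.path.basename(filename))
    # nginx intercepts this header and serves the file from its internal location.
    response['X-Accel-Redirect'] = '/protected/{0}'.format(filename)
    return response

and on the nginx side:

location /protected/ {
    internal;               # not reachable directly, only via X-Accel-Redirect
    alias /web/protected/;  # directory that actually holds the files
}

With this setup the proxy timeouts no longer matter for the download itself, because Gunicorn/Django return immediately and nginx handles the multi-gigabyte transfer.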

The error from nginx indicates that the upstream closed the connection, so it's a problem on the Django side. I'd recommend looking for errors and debugging information in the Django (Gunicorn) logs.

Related

GCP deployment with nginx - uwsgi - flask fails

I have a very simple Flask app that is deployed on GKE and exposed via a Google external load balancer, and I am getting random 502 responses from the backend service (I added custom headers on the backend service and nginx to identify the source, and I can see the backend service's header but not nginx's).
The setup is:
LB -> backend-service -> NEG -> pod (nginx -> uwsgi), where the pod runs the application built with Flask and deployed via uwsgi and nginx.
The scenario is to handle image uploads in a simple, secured way. The sender sends me a token with the upload request.
My Flask app:
receives the request and checks the sent token via another service using "requests";
if the token is valid, proceeds to handle the image and returns 200;
if the token is not valid, stops and sends back a 401 response.
First, I got suspicious about the 200s and 401s, so I reverted all responses to 200. After some of the expected responses, the server starts responding 502 and keeps doing so (some of the requests at the very beginning succeeded).
The nginx error log contains lines like the following:
2023/02/08 18:22:29 [error] 10#10: *145 readv() failed (104: Connection reset by peer) while reading upstream, client: 35.191.17.139, server: _, request: "POST /api/v1/imageUpload/image HTTP/1.1", upstream: "uwsgi://127.0.0.1:21270", host: "example-host.com"
My uwsgi.ini file is as follows:
[uwsgi]
socket = 127.0.0.1:21270
master
processes = 8
threads = 1
buffer-size = 32768
stats = 127.0.0.1:21290
log-maxsize = 104857600
logdate
log-reopen
log-x-forwarded-for
uid = image_processor
gid = image_processor
need-app
chdir = /server/
wsgi-file = image_processor_application.py
callable = app
py-auto-reload = 1
pidfile = /tmp/uwsgi-imgproc-py.pid
My nginx.conf is as follows:
location ~ ^/api/ {
    client_max_body_size 15M;
    include uwsgi_params;
    uwsgi_pass 127.0.0.1:21270;
}
Lastly, my app has a healthcheck method with a simple JSON response. It does nothing extra and simply returns, and it never fails, as explained above.
Edit: my nginx access logs in the pod show the response as 401 while the client receives 502.
For those who are going to run into the same issue: the problem was POST data being read (or rather, not read).
nginx expects the proxied app (uwsgi in our case) to read the POST data, but with my logic I was not reading it in some cases and returning the response anyway.
Setting uwsgi post-buffering solved the issue.
post-buffering = %(16 * 1024 * 1024)
Which led me to this solution:
https://stackoverflow.com/a/26765936/631965
Nginx uwsgi (104: Connection reset by peer) while reading response header from upstream
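As a complementary fix on the application side, the handler can drain the request body itself before returning an early 401, so uwsgi/nginx are never left holding unread POST data; a minimal sketch (token_is_valid and the header name are placeholders, not the asker's actual code):

from flask import Flask, request

app = Flask(__name__)

def token_is_valid(token):
    # Placeholder for the real check done via another service with "requests".
    return token == 'expected-token'

@app.route('/api/v1/imageUpload/image', methods=['POST'])
def upload_image():
    if not token_is_valid(request.headers.get('X-Upload-Token')):
        request.get_data()  # read (and discard) the body so the upstream connection stays clean
        return {'error': 'invalid token'}, 401
    image_bytes = request.get_data()
    # ... handle the image ...
    return {'status': 'ok'}, 200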

Videos greater than 2 MB are not processed by the Nginx server to backend & to AWS S3 bucket

We have been developing an enterprise application for the last two years. Based on a microservice architecture, we have nine services with their respective databases and an Angular frontend on NGINX that calls/connects to the microservices. During development we ran these services and their databases on a Hetzner cloud server with 4 GB RAM and 2 CPUs over the internal network, and everything worked seamlessly. We upload all images, PDFs, and videos to AWS S3, and it has been smooth sailing: videos of all sizes were uploaded and played without any issues.
We liked Hetzner and decided to go to production with them as well. We took the first server, installed Proxmox on it, and deployed LXC containers and our services. I tested again here, and again no problems were found.
We then took another server, deployed Proxmox, and clustered them. This is where the problem started, after we hired a network guy who configured a bridged network between the containers of both nodes. Each container pings the other fine, and telnet also connects over the internal network. The MTU set on this bridge is 1400.
Primary problem – we are NOT able to upload videos over 2 MB to S3 anymore from this network.
Other problems – these are intermittent issues, noted in the logs:
NGINX –
504 Gateway Time-out errors and the like, on multiple services:
upstream timed out (110: Connection timed out) while reading response header from upstream, client: 223.235.101.169, server: abc.xyz.com, request: "GET /courses/course HTTP/1.1", upstream: "http://10.10.XX.XX:8080//courses/course/toBeApprove", host: "abc.xyz.com", referrer: "https://abc.xyz.com/"
Tomcat-
com.amazonaws.services.s3.model.AmazonS3Exception: Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed. (Service: Amazon S3; Status Code: 400; Error Code: RequestTimeout; Request ID: 7J2EHKVDWQP3367G; S3 Extended Request ID: xGGCQhESxh/Mo6ddwtGYShLIeCJYbgCRT8oGleQu/IfguEfbZpTQXG/AIzgLnG2F5YuCqk7vVE8=), S3 Extended Request ID: xGGCQhESxh/Mo6ddwtGYShLIeCJYbgCRT8oGleQu/IfguEfbZpTQXG/AIzgLnG2F5YuCqk7vVE8=
(we increased all known timeouts, both in nginx and tomcat)
MySQL –
2022-09-08T04:24:27.235964Z 8 [Warning] [MY-010055] [Server] IP address '10.10.XX.XX' could not be resolved: Name or service not known
Other key points to note – we allow videos up to 100 MB to be uploaded, so the known limits are set in the nginx and Tomcat configurations:
Nginx: client_max_body_size 100m;
And Tomcat:
<Connector port="8080" protocol="HTTP/1.1" maxPostSize="102400" maxHttpHeaderSize="102400" connectionTimeout="20000" redirectPort="8443" />
Over the last 15 days of reading and trials, we stopped all firewalls while debugging: ufw on the OS, the Proxmox firewall, and even the data center firewall.
This is our nginx.conf:
http {
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    ##
    client_body_buffer_size 16K;
    client_header_buffer_size 1k;
    client_max_body_size 100m;
    client_header_timeout 100s;
    client_body_timeout 100s;
    large_client_header_buffers 4 16k;
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 300;
    send_timeout 600;
    proxy_connect_timeout 600;
    proxy_send_timeout 600;
    proxy_read_timeout 600;
    gzip on;
    gzip_comp_level 2;
    gzip_min_length 1000;
    gzip_proxied expired no-cache no-store private auth;
    gzip_types text/plain application/x-javascript text/xml text/css application/xml;
These are our primary tests/debugging trials.
1. Testing with a small video (273 KB)
a. Nginx log- clean, nothing related to operations
b. Tomcat log-
Start- CoursesServiceImpl - addCourse - Used Memory:73
add course 703
image file not null org.springframework.web.multipart.support.StandardMultipartHttpServletRequest$StandardMultipartFile#15476ca3
image save to s3 bucket
image folder name images
buckets3 lmsdev-cloudfront/images
image s3 bucket for call
imageUrl https://lmsdev-cloudfront.s3.amazonaws.com/images/703_4_istockphoto-1097843576-612x612.jpg
video file not null org.springframework.web.multipart.support.StandardMultipartHttpServletRequest$StandardMultipartFile#13419d27
video save to s3 bucket
video folder name videos
input Stream java.io.ByteArrayInputStream#4da82ff
buckets3 lmsdev-cloudfront/videos
video s3 bucket for call
video url https://lmsdev-cloudfront.s3.amazonaws.com/videos/703_4_giphy360p.mp4
Before Finally - CoursesServiceImpl - addCourse - Used Memory:126
After Finally- CoursesServiceImpl - addCourse - Used Memory:49
c. S3 bucket – the upload appears in the bucket (screenshot: https://i.stack.imgur.com/T7daW.png)
3. Testing with a video of just under 2 MB
a. The progress bar keeps running for about 5 minutes, then:
b. Nginx logs-
2022/09/10 16:15:34 [error] 3698306#3698306: *24091 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 223.235.101.169, server: login.pathnam.education, request: "POST /courses/courses/course HTTP/1.1", upstream: "http://10.10.10.10:8080//courses/course", host: "login.pathnam.education", referrer: "https://login.pathnam.education/"
c. Tomcat logs-
Start- CoursesServiceImpl - addCourse - Used Memory:79
add course 704
image file not null org.springframework.web.multipart.support.StandardMultipartHttpServletRequest$StandardMultipartFile#352d57e3
image save to s3 bucket
image folder name images
buckets3 lmsdev-cloudfront/images
image s3 bucket for call
imageUrl https://lmsdev-cloudfront.s3.amazonaws.com/images/704_4_m_Maldives_dest_landscape_l_755_1487.webp
video file not null org.springframework.web.multipart.support.StandardMultipartHttpServletRequest$StandardMultipartFile#45bdb178
video save to s3 bucket
video folder name videos
input Stream java.io.ByteArrayInputStream#3a85dab9
And after a few minutes:
com.amazonaws.SdkClientException: Unable to execute HTTP request: Connection timed out (Write failed)
d. S3 Bucket – No entry
We then tried to upload the same video from our test server, and it was instantly uploaded to the S3 bucket.
Reading all posts about similar problems, most are related to php.ini configuration and thus not relevant to us.
I have solved the issue now: the MTU set in the LXC container differed from what was configured on the virtual switch. Proxmox does not let you set the MTU while creating an LXC container (you would expect the bridge MTU to be used), so it is easy to miss.
Go to the container's conf file; in my case it is container 100:
nano /etc/pve/lxc/100.conf
Find and edit this line:
net0: name=eno1,bridge=vmbr4002,firewall=1,hwaddr=0A:14:98:05:8C:C5,ip=192.168.0.2/24,type=veth
and add the mtu value at the end, matching the vswitch (1400 in my case):
net0: name=eno1,bridge=vmbr4002,firewall=1,hwaddr=0A:14:98:05:8C:C5,ip=192.168.0.2/24,type=veth,mtu=1400
Reboot the container for the change to persist.
And it all worked like a charm for me. I hope this helps someone who also uses the Proxmox UI to create containers and therefore missed configuring this via the CLI (a suggested enhancement to Proxmox).
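To confirm the container actually picked up the new MTU (and that packets of that size really fit across the bridge), a quick check from inside the container can help; a rough sketch, assuming the container's interface is eth0 and 10.10.10.1 is a container on the other node:

ip link show eth0                    # should now report "mtu 1400"
ping -M do -s 1372 -c 3 10.10.10.1   # 1372 bytes payload + 28 bytes of headers = 1400; anything larger should fail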

NGINX : Using Upstream server name

I am running nginx/1.19.6 on Ubuntu.
I am struggling to get the upstream module to work without returning a 404.
My *.conf files are located in /etc/nginx/conf.d/
FILE factory_upstream.conf:
upstream factoryserver {
    server factory.service.corp.com:443;
}
FILE factory_service.conf:
server
{
listen 80;
root /data/www;
proxy_cache factorycache;
proxy_cache_min_uses 1;
proxy_cache_methods GET HEAD POST;
proxy_cache_valid 200 72h;
#proxy_cache_valid any 5m;
location /factory/ {
access_log /var/log/nginx/access-factory.log main;
proxy_set_header x-apikey abcdefgh12345678;
### Works when expressed as a literal.# proxy_pass https://factory.service.corp.com/;
### 404 when using the upstream name.
proxy_pass https://factoryserver/;
}
}
I have error logging set to debug, but after reloading the configuration and attempting a call, there are no new records in the error log.
nginx -t # Scans OK
nginx -s reload # no errors
cat /var/log/nginx/error.log
...
2021/03/16 11:29:52 [notice] 26157#26157: signal process started
2021/03/16 11:38:20 [notice] 27593#27593: signal process started
The access-factory.log does record the request :
127.1.2.3 --;[16/Mar/2021:11:38:50 -0400];";GET /factory/api/manifest/get-full-manifest/PID/904227100052 HTTP/1.1" ";/factory/api/manifest/get-full-manifest/PID/904227100052" ;404; - ;/factory/api/manifest/get-full-manifest/PID/904227100052";-" ";ID="c4cfc014e3faa803c8fe136d6eae347d ";-" ";10.8.9.123:443" ";404" ";-"
To help with debugging, I cached the 404 error ("proxy_cache_valid any 5m;", commented out in the example above).
When I use the upstream name, the cache file contains the following:
<##$ non-printable characters $%^>
KEY: https://factoryserver/api/manifest/get-full-manifest/PID/904227100052
HTTP/1.1 404 Not Found
Server: nginx/1.17.8
Date: Tue, 16 Mar 2021 15:38:50 GMT
...
The key contains the name 'factoryserver'; I don't know if that matters or not. Does it?
The server version is different from what I see when I run nginx -v, which reports: nginx version: nginx/1.19.6.
Does the difference between the version in the cache file and on the command line indicate anything?
When I switch back to the literal server name in proxy_pass, I get a 200 response with the requested data. The key in the cache file then contains the literal upstream server name.
<##$ non-printable characters $%^>
KEY: https://factory.service.corp.com/api/manifest/get-full-manifest/PID/904227100052
HTTP/1.1 200
Server: nginx/1.17.8
Date: Tue, 16 Mar 2021 15:59:30 GMT
...
I will have multiple upstream servers, each providing different services, and the configuration will be deployed to multiple factories, each with its own upstream servers.
I would like the deployment team to only have to update the *_upstream.conf files and keep the *_service.conf files identical from deployment site to deployment site.
factory_upstream.conf
product_upstream.conf
shipping_upstream.conf
abc123_upstream.conf
Why do I get a 404 when using a named upstream server?
Based on the nginx version in the cached response not matching what you see on the command line, it seems the 404 is coming from the upstream server. That is, your proxying is working, but the upstream server itself is returning a 404. To troubleshoot further, I would check the upstream server's nginx logs and verify that the incoming request is what you expect.
Note that when using proxy_pass, it makes a big difference whether you have a / at the end or not. With a trailing slash, nginx treats this as the URI it should send upstream and does not include the URI matched by the location block (/factory/); without it, nginx passes the full URI as-is.
proxy_pass https://factoryserver/; – a request for /factory/foo is sent upstream as https://factory.service.corp.com:443/foo
proxy_pass https://factoryserver; – the same request is sent upstream as https://factory.service.corp.com:443/factory/foo
Docs: https://docs.nginx.com/nginx/admin-guide/web-server/reverse-proxy/
So maybe when switching between using an upstream and specifying the literal server name, you're inadvertently being inconsistent with the trailing slash. It's a really easy detail to miss, especially when you don't know it's important.
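If the slash is already consistent and the 404 persists, one more thing worth ruling out (an assumption on my part, not something confirmed by your config) is the Host/SNI name nginx sends when proxying to an HTTPS upstream by its upstream name, since the upstream's virtual host may not answer for "factoryserver"; a sketch:

location /factory/ {
    proxy_set_header Host factory.service.corp.com;  # present the name the upstream vhost expects
    proxy_ssl_server_name on;                        # send SNI during the TLS handshake
    proxy_ssl_name factory.service.corp.com;
    proxy_pass https://factoryserver/;
}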
Please provide more information about the server configuration where you proxy-pass the requests. The only difference I see right now is that you specify the port (443) in your upstream server.

flask can't load static files when using nginx

I'm using Flask + Gunicorn for my website, with nginx as a reverse proxy only, set up following the usual nginx config for a Flask proxy. I haven't added a location /static block to nginx.conf, so all static files are supposed to be handled by Flask itself. But when visiting from the public network, it can't load the CSS files. I tried running Flask + Gunicorn alone, without nginx, and it worked well. So why can't Flask serve static files by itself behind nginx?
Then I added a location /static block to the nginx config so that nginx would serve the static files instead. It still can't load the static files; when I looked into the error log, it shows:
"/var/www/my_project_folder/static/css/style.css" failed (2: No such
file or directory), client: 61.152.126.178, server: _, request: "GET
/static/css/style.css HTTP/1.1", host: "47.97.209.250", referrer:
"http://47.97.209.250"
The path in the error message is actually correct. The last thing I tried was to run all scripts and services as root and chmod the project to 666, still without success.
I searched on Stack Overflow, but no one mentioned the issue of Flask itself not being able to serve static files behind nginx.
So I hope anyone has an idea or clue about this, or about the nginx serving issue.
Thanks!
I solved it with:
https://www.reddit.com/r/flask/comments/avkjh0/ask_flask_serve_static_files_from_flask_app_via/
nginx.sh:
USE_STATIC_PATH=${STATIC_PATH:-'/app/static'}

location /webappB/static {
    #alias /path/to/webappB/static;
    alias $USE_STATIC_PATH;
}
main.py:
app = Flask(__name__, static_url_path="/webappA")
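For reference, the more conventional setup keeps Flask's defaults (static_url_path unchanged) and lets nginx serve the directory directly; a minimal sketch, assuming the project path from the error message above:

location /static/ {
    alias /var/www/my_project_folder/static/;  # the trailing slash matters with alias
}

The nginx worker user also needs read access to the files and traverse permission on every directory in that path.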

nginx Connection timed out while reading response header from upstream

I am using nginx + uwsgi in front of a Flask app. In the nginx settings, the server block has server_name *.mydomain.com; and the location block for uwsgi is like:
location /api/ {
    include uwsgi_params;
    uwsgi_pass unix:///var/uwsgi/app.sock;
    .........
}
So the issue is: I can access app.mydomain.com, but when I try app1.mydomain.com the uwsgi log shows no request, and the nginx error log shows:
upstream timed out (110: Connection timed out) while reading response header from upstream, client: 122.166.94.231, server: *.mydomain.com, request: "GET /api/client/generic/ping HTTP/1.1", upstream: "uwsgi://unix:///var/uwsgi/app.sock", host: "app1.mydomain.com
I have another test setup where all these settings are the same and it works. Any pointers? When I restart uwsgi and nginx, app1.mydomain.com works until I load app.mydomain.com (the initial load of app.mydomain.com fails, but if I keep refreshing it loads; then app1.mydomain.com raises a 504 Gateway Timeout and the log shows "Connection timed out while reading response header from upstream").
It worked when I added single-interpreter = true to the uwsgi.ini settings.
A newly added Python library was causing the issue.
I don't know whether this will help others.
I also ran into the same issue. uWSGI has "http", "http-socket" and "socket" options. When putting uWSGI behind a full webserver like Nginx, we should spawn uWSGI to natively speak the uWSGI protocol:
uwsgi --socket 127.0.0.1:3031 --wsgi-file foobar.py --master --processes 4 --threads 2 --stats 127.0.0.1:9191
More details from uwsgi documentation: https://uwsgi-docs.readthedocs.io/en/latest/WSGIquickstart.html#putting-behind-a-full-webserver
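The matching nginx side then speaks the uwsgi protocol to that address; a minimal sketch for the socket used in the command above:

location /api/ {
    include uwsgi_params;
    uwsgi_pass 127.0.0.1:3031;   # matches --socket above; use unix:/var/uwsgi/app.sock for a unix socket
}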
Looking at the uwsgi error logs and understanding what the problem was helped me. The issue was not related to the nginx configuration at all: my email host had changed, and the code threw an error when calling the send-email code.
