How to drop extra data based on the "Content-Length" header in nginx

I have a custom application deployed on an IIS instance which, among other things, acts as an HTTP file server. For various reasons (bugs), many files are corrupted, in the sense that they have an additional byte at the end of their binary content. Fortunately, I have the exact content length of each file saved in a database, and when my application returns a file it sets the Content-Length header correctly (for both corrupted and correct files).
So I have situations where the Content-Length response header says 100, while the body of the same response actually contains 101 bytes (100+1).
For some internal reason I cannot change the behavior of the application.
Calling the application directly from the browser (i.e., hitting IIS directly), there seem to be no obvious problems, but this situation messes up my nginx (version 1.15.7), behind which the application is exposed in production. Note that the file is served, but it ends up corrupted (they are Excel files), while files downloaded directly from IIS are correct.
I think there is a problem with some internal buffer, because nginx always seems to discard the last 8192 bytes, and the error log shows this warning: upstream sent more data than specified in "Content-Length" header while reading response header from upstream.
I tried adding the directive proxy_buffering off; but the result does not change (only the warning disappears from the error log).
My question is: is there any way to trim the response body based on the Content-Length value provided by my upstream, obviously if and only if that header is present?
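Stock nginx has no directive to truncate a response body to the advertised Content-Length, but if running OpenResty or building nginx with the third-party lua-nginx-module is an option, a body filter can do it. This is a minimal sketch; the upstream name is a placeholder:

```nginx
location / {
    proxy_pass http://iis_backend;   # placeholder upstream

    # remember the upstream's Content-Length before the body filter runs
    header_filter_by_lua_block {
        ngx.ctx.remaining = tonumber(ngx.header["Content-Length"])
    }

    # pass through at most Content-Length bytes, dropping any excess
    body_filter_by_lua_block {
        local rem = ngx.ctx.remaining
        if rem then
            local chunk = ngx.arg[1]
            if #chunk > rem then
                ngx.arg[1] = string.sub(chunk, 1, rem)
            end
            ngx.ctx.remaining = rem - #ngx.arg[1]
        end
    }
}
```

If the header is absent, `remaining` stays nil and the body passes through untouched, which matches the "if and only if this value is present" requirement.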
Thanks,
AleBer

Related

Serving large PDF, should I set content-length?

I create PDF documents dynamically and want to serve them from my handler. I set the Content-Type to application/pdf and it works fine. I run my server behind an nginx proxy.
My problem is that some requests generate a lot of other requests for the same document. I looked at the headers and saw that the response uses chunked transfer encoding.
My solution was to set the Content-Length, and it seems to work fine.
I wonder if that's enough, and why I never had to do this for simple HTML pages.
A comment in the source code says:
If the handler didn't declare a Content-Length up front, we either go into chunking mode or, if the handler finishes running before the chunking buffer size, we compute a Content-Length and send that in the header instead.
If you want to avoid chunking, then set the content length. Setting the content length for a large response does reduce the amount of data transferred and can reduce copying within the HTTP server.
As a rule of thumb, set the content length if the length is known in advance of producing the response body.
Your simple HTML pages may be smaller than the chunking buffer size. If so, they were not chunked.
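The quoted comment comes from Go's net/http server, but the rule of thumb applies to any server: render the body first, then declare its length. A minimal Python stdlib sketch of the pattern, with a stand-in byte string where a real handler would generate the PDF:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class PDFHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # render the whole body first, so its length is known before
        # any part of the response is sent
        body = b"%PDF-1.4 stand-in for dynamically generated bytes"
        self.send_response(200)
        self.send_header("Content-Type", "application/pdf")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

A response carrying an explicit Content-Length is never chunked, so the proxy and client both know exactly when the body ends.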

Nginx: referring too-large file uploads to the application?

I need to be able to deal with uploaded file sizes exceeding the maximum nginx and PHP limits before nginx issues a 413 error page. Instead, I want to issue an error message within my application's (Symfony) dialog.
To test the file-size limits in Symfony, my test upload file is 600 MB. When I upload the 600 MB file under nginx, the upload runs to 100%, then reports "413 Request Entity Too Large".
If I run "app/console server:run" (which uses the Symfony server instead of nginx), Symfony reports the error in the GUI before the upload occurs (as intended).
Is there any way to modify the nginx configuration so it reads the $_SERVER[CONTENT_LENGTH] or $_SERVER[HTTP_CONTENT_LENGTH], aborts the upload, and then passes the rejected request to the app? Symfony flags the error depending on CONTENT_LENGTH (and, with a work-around for a symfony issue, HTTP_CONTENT_LENGTH).
File size limits:
src/my_app/CoreBundle/Resources/config/validation.yml: maxSize: '500M'
/etc/php5/cgi/php.ini:post_max_size = 550M
/etc/php5/cgi/php.ini:upload_max_filesize = 500M
/etc/php5/cli/php.ini:post_max_size = 550M
/etc/php5/cli/php.ini:upload_max_filesize = 500M
/etc/php5/fpm/php.ini:post_max_size = 550M
/etc/php5/fpm/php.ini:upload_max_filesize = 500M
Versions:
symfony 2.5.12  
nginx 1.4.6-1ubuntu3.4
You could increase the allowed size in the nginx config http block.
client_max_body_size 800m;
Set a value that's higher than the value in your php.ini. Then nginx doesn't respond with a 413, and Symfony shows its normal error page because of the PHP limitation.
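In config form, using the limits from the question's php.ini:

```nginx
http {
    # higher than PHP's post_max_size (550M), so nginx never answers 413
    # and the oversized upload reaches PHP, which enforces its own limit
    client_max_body_size 800m;
}
```

The key point is only the ordering of the limits: nginx's cap must be the largest so that PHP, not nginx, is the one that rejects the file.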
I worked around this by catching changes in the file selection in javascript, then popping up a warning and clearing the selection if the file is too large.
My hunch is that nginx doesn't parse the incoming request in any way, i.e., it doesn't read CONTENT_LENGTH or HTTP_CONTENT_LENGTH, and rejects the request purely based on whether its size exceeds client_max_body_size. If that's true (confirmation or denial would be great), then there's no way to deal with this when using nginx except letting nginx run the upload and hit the 413 error (which is time-consuming for a large file and/or a slow network).
Content-Length works for non-chunked content; however, I suspect your upload is chunked, and then there is no way for nginx to find out what the size of your content is.
Even if it could, it should not pass it on.
Per the HTTP/1.1 spec (http://www.ietf.org/rfc/rfc2616.txt):
3. If a Content-Length header field (section 14.13) is present, its decimal value in OCTETs represents both the entity-length and the transfer-length. The Content-Length header field MUST NOT be sent if these two lengths are different (i.e., if a Transfer-Encoding header field is present). If a message is received with both a Transfer-Encoding header field and a Content-Length header field, the latter MUST be ignored.
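That "MUST be ignored" rule can be seen in practice with Python's stdlib HTTP parser. Here a canned response carries both a chunked body and a deliberately wrong Content-Length; the FakeSock class is just a test shim, not part of any real API:

```python
import http.client
import io

# canned response: chunked body plus a deliberately wrong Content-Length
raw = (b"HTTP/1.1 200 OK\r\n"
       b"Transfer-Encoding: chunked\r\n"
       b"Content-Length: 5\r\n"          # wrong on purpose
       b"\r\n"
       b"4\r\nWiki\r\n"
       b"5\r\npedia\r\n"
       b"0\r\n\r\n")

class FakeSock:
    """Just enough of a socket for http.client to parse a canned response."""
    def __init__(self, data):
        self._file = io.BytesIO(data)
    def makefile(self, *args, **kwargs):
        return self._file

resp = http.client.HTTPResponse(FakeSock(raw))
resp.begin()
body = resp.read()
print(body)  # b'Wikipedia': the bogus Content-Length: 5 was ignored
```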

Using If-Modified-Since header for dynamically generated remote files

Our web server regularly downloads images from other web servers. To prevent our server having to download the same image every day even if it has not changed, I plan to store the Last-Modified header when the image downloads and then put that date in the If-Modified-Since header of subsequent requests for the same file.
I have this working fine except when the remote file is generated on the fly when requested (e.g., the server generates a certain sized version for the web from a separate original file). In that case, the Last-Modified header is the date the remote server responded to the request, so the stored Last-Modified header from the previous download will always be earlier than the one in any subsequent response; the image will always get downloaded and I'll never get the 304 Not Modified status code.
So, is there a way to reduce the download frequency when the remote server is serving up images that are generated on the fly?
It sounds to me like this is not possible, but I thought I'd ask anyway.
If you can create some form of hash for the images, use ETags. Your server will have to check the If-None-Match request header against the hash, and if they match you can return a 304 response.
Clients will still send If-Modified-Since, but if your hashing method does not generate many collisions you should be able to ignore it and just match the ETags.
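A minimal sketch of that check, assuming a hash-based strong ETag (the function names are illustrative, not from any library):

```python
import hashlib

def etag_for(data):
    """Strong ETag derived from a content hash (one possible scheme)."""
    return '"%s"' % hashlib.sha256(data).hexdigest()

def is_not_modified(if_none_match, current_etag):
    """True when the client's cached ETag still matches: answer 304."""
    return if_none_match is not None and if_none_match == current_etag
```

On a match the server returns 304 Not Modified with no body; otherwise it sends the image along with the new ETag for the client to cache.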

nginx resumable upload with upload_module and multipart/form

I currently upload to a webservice on an nginx server using the upload module (http://www.grid.net.ru/nginx/upload.en.html) from a custom desktop application doing a simple multipart-form POST that sends a file in one part and a base64 encoded XML with the file's metadata in another part.
The server receives this POST, passes it to my webservice which reads the metadata, processes the file and all is good.
What I want to do now is use the upload module's upload_resumable directive to do the POST in several chunks to minimize disconnection chances and allow resume. I can currently do this following the protocol described here: http://www.grid.net.ru/nginx/resumable_uploads.en.html
One sends byte ranges of the file, along with some headers identifying the chunk and the session, in several POSTs, and once all the parts have been uploaded, nginx composes a final POST containing the file name and path and passes it to your upload_pass location (which in my case hands off via CGI to a Django app).
However, I am not clear on how one would send a multipart POST with this method, since the protocol indicates that the body of each POST must be exactly the bytes of the indicated byte range. I need the final POST to also contain the XML I wrote about above.
I can think of sending the XML as the first bytes of the body, with a header indicating how many bytes belong to it, but that would mean extra handling of the final file to strip it out again, and the final files are potentially in the GB size range.
Any other ideas?
Since the protocol supported by nginx specifically states that the POST must not be multipart, I ended up sending the file in the body and the rest of the parameters encoded in the URL. Not the prettiest URLs, but it works.
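A sketch of what the client side of that scheme might build: one POST per byte range, file bytes in the body, metadata in the query string. The header names only loosely follow the resumable-upload protocol; X-Session-ID in particular is an assumption, so check the module docs:

```python
import urllib.parse

def byte_ranges(total, chunk):
    """Yield (start, end) pairs covering bytes 0..total-1 in chunk-sized pieces."""
    for start in range(0, total, chunk):
        yield start, min(start + chunk, total) - 1

def chunk_requests(url, meta_b64, total, chunk, session_id):
    """Build (url, headers, (start, end)) for each resumable POST.

    The caller sends the file bytes from start to end (inclusive) as the
    body of each request; the base64 metadata rides in the query string.
    """
    qs = urllib.parse.urlencode({"meta": meta_b64})
    reqs = []
    for start, end in byte_ranges(total, chunk):
        headers = {
            "Content-Range": "bytes %d-%d/%d" % (start, end, total),
            "X-Session-ID": session_id,  # header name is an assumption
        }
        reqs.append(("%s?%s" % (url, qs), headers, (start, end)))
    return reqs
```

Once the last range arrives, the server side reassembles the file and the metadata is still available from the URL of every request, including the final composed POST.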

Is it OK to send an HTTP response with file attachment without specifying the Content-Length?

Meaning, will it work fine? I have a situation where I am attaching files to an HTTP response by pointing at the URI of a file on a different server, so I don't have access to the length of the file.
It will work fine. The client will just read until EOF. The only thing the client won't be able to do is calculate or estimate the progress of the download.
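The read-to-EOF case can be demonstrated with Python's stdlib parser against a canned response that has no Content-Length; the FakeSock class is just a test shim, not part of any real API:

```python
import http.client
import io

# a response with no Content-Length and no chunked encoding: the body
# simply runs until the server closes the connection
raw = (b"HTTP/1.0 200 OK\r\n"
       b"Content-Type: application/octet-stream\r\n"
       b"Content-Disposition: attachment; filename=\"report.bin\"\r\n"
       b"\r\n"
       b"file-bytes-until-EOF")

class FakeSock:
    """Just enough of a socket for http.client to parse a canned response."""
    def __init__(self, data):
        self._file = io.BytesIO(data)
    def makefile(self, *args, **kwargs):
        return self._file

resp = http.client.HTTPResponse(FakeSock(raw))
resp.begin()
print(resp.length)  # None: the length is unknown up front
body = resp.read()  # reads until EOF
print(body)         # b'file-bytes-until-EOF'
```

The download completes correctly; what the client loses is only the ability to show a meaningful progress bar, since `resp.length` is unknown.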
It may work fine, but the HTTP spec states that applications SHOULD send the length if it's possible to determine it:
Applications SHOULD use this field to indicate the transfer-length of the message-body, unless this is prohibited by the rules in section 4.4.
Any Content-Length greater than or equal to zero is a valid value. Section 4.4 describes how to determine the length of a message-body if a Content-Length is not given.
Note that the meaning of this field is significantly different from the corresponding definition in MIME, where it is an optional field used within the "message/external-body" content-type. In HTTP, it SHOULD be sent whenever the message's length can be determined prior to being transferred, unless this is prohibited by the rules in section 4.4.
