Nginx fails with multipart request with custom boundary containing CRLF

Nginx fails with a multipart request with a custom boundary containing CRLF, while according to the RFC it's a perfectly valid payload.
Example payload:
MIME-Version: 1.0
Content-Type: multipart/form-data;
boundary=------%^TestBoundary^%------
with multiple files.
At first the special characters in the header were causing the boundary to be dropped rather than passed to the backend, so I added ignore_invalid_headers off. Now I see the Content-Type header passed to the backend, but with a stray ":" appended to it.
multipart/form-data; boundary=------%^TestBoundary^%------:
Any clue what's causing this? How do I fix it in nginx before the request is passed to the backend?

This is a known bug that is most likely your problem, where the "boundary" part is appended to the Content-Type string using CRLF plus a TAB:
https://forum.nginx.org/read.php?29,192093,192102#msg-192102
If you cannot fix this in your code, then you would need to use HAProxy or something similar to replace the CRLF + TAB with LWS (see the "Important note" at the bottom of section 1.2.2 in https://www.haproxy.org/download/1.7/doc/configuration.txt).
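If the folding happens in a client you control, the simplest fix is to stop folding and emit the whole Content-Type value on one line. A rough sketch contrasting the two forms (only the boundary value is taken from the question; the variable names are illustrative):

// Sketch only: a folded Content-Type header (obs-fold, CRLF + TAB), which the
// linked forum post identifies as the trigger, versus the single-line form.
const boundary = '------%^TestBoundary^%------';

// Folded form: the boundary parameter is pushed onto a continuation line.
const foldedHeader =
  'Content-Type: multipart/form-data;\r\n\tboundary=' + boundary;

// Single-line form: the same header with no folding, which proxies handle cleanly.
const singleLineHeader =
  'Content-Type: multipart/form-data; boundary=' + boundary;

console.log(JSON.stringify(foldedHeader));     // the embedded \r\n\t is visible here
console.log(JSON.stringify(singleLineHeader)); // no folding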

Related

How to upload a file which name contains special characters using JMeter?

I'm struggling to upload files using a POST HTTP request in JMeter when the file name (not the file content) contains special characters such as "é è à". For example: "nameWithSpecialCharacters_éè.txt".
Is this a limitation of JMeter? I'm on version 5.5 (the latest as of now).
Thank you for your help.
I tried several things, such as putting file.encoding=UTF-8 in the JMeter settings. I also set "Content encoding" to UTF-8 in the HTTP Request sampler, but none of these work. I have the same issue if I use a BeanShell PreProcessor and set the files to upload with sampler.setHTTPFiles(filesToUpload).
No matter what I do, the characters are replaced by "?" in the body of the request:
Content-Disposition: form-data; name="file1"; filename="nameWithSpecialCharacters_??.txt"
Content-Type: text/plain
Content-Transfer-Encoding: binary
It looks like a bug in the HttpClient4 implementation; I would suggest raising it via JMeter Issues.
In the meantime, as a workaround, you could switch your HTTP Request sampler to use Java as the implementation.
If you have more than one HTTP Request sampler, you can use HTTP Request Defaults to make the change in one place for all of them.

Can the "Accept" HTTP header have a "charset" parameter?

I have encountered an HTTP client that uses the following header:
Accept: application/vnd.api+json; charset=utf-8
According to the HTTP spec, Accept headers can have parameters. The most common one is the q parameter, which sets the relative priority of different content types. However, there are a number of reasons I don't think charset is a valid Accept parameter:
HTTP already has the Accept-Charset header, which seems to make this redundant
MDN doesn't include it in their documentation on Accept, even though they do include it on their Content-Type page
Werkzeug, Flask's HTTP parser, doesn't bother to parse charsets for Accept, even though it does for Content-Type
So this charset parameter on Accept seems unusual. But is it wrong?
You quoted the spec, which says they are ok. What else needs to be said?

Write HTTP trailer headers manually

This question was motivated by the answers here:
What to do with errors when streaming the body of an Http request
In this case, I have already written an HTTP 200 OK header; if there is an error later, I need to amend the response by writing a trail header that says there was an error after the success header has already been sent.
I have this Node.js code:
import { Socket } from 'net';

const writeResponse = function (file: string, socket: Socket) {
  socket.write([
    'HTTP/1.1 200 OK',
    'Content-Type: text/javascript; charset=UTF-8',
    'Content-Encoding: UTF-8',
    'Accept-Ranges: bytes',
    'Connection: keep-alive',
  ].join('\n') + '\n\n');

  getStream(file) // getStream: the question's own helper, assumed to return a readable stream for the file
    .pipe(socket)
    .once('error', function (e: any) {
      // there was an error
      // how can I write trail headers here ?
      socket.write('some bad shit happened\n');
    });
};
how do I write a useful trail header to the response that can be displayed well by the browser?
I think this is the relevant spec for trail headers:
https://www.rfc-editor.org/rfc/rfc2616#section-14.40
I think they should be called "trailing headers", but whatever.
Firstly:
I think this is the relevant spec for trail headers: https://www.rfc-editor.org/rfc/rfc2616#section-14.40
RFC 2616 has been obsoleted by RFC 7230. The current spec for trailers is RFC 7230 § 4.1.2.
Secondly:
].join('\n') + '\n\n'
Lines in HTTP message framing are terminated with \r\n, not \n.
Thirdly:
Content-Encoding: UTF-8
Content-Encoding is for content codings (like gzip), not charsets (like UTF-8). You probably don’t need to indicate charset separately from Content-Type.
And lastly:
how do I write a useful trail header to the response that can be displayed well by the browser?
You don’t. Mainstream Web browsers do not care about trailers.
See also (by the same user?): How to write malformed HTTP response to “guarantee” something akin to HTTP 500
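That said, if the consumer is your own code rather than a browser, the framing itself is simple. A minimal sketch over a raw socket, following RFC 7230 § 4.1.2 (the X-Stream-Error field name and the helper are made up for illustration):

import { Socket } from 'net';

// Each chunk is framed as: <size in hex> CRLF <data> CRLF
function writeChunk(socket: Socket, data: string): void {
  socket.write(Buffer.byteLength(data).toString(16) + '\r\n' + data + '\r\n');
}

function writeResponseWithTrailer(socket: Socket): void {
  // The response must be chunked, and the Trailer header announces
  // which field(s) will follow the body.
  socket.write([
    'HTTP/1.1 200 OK',
    'Content-Type: text/plain; charset=UTF-8',
    'Transfer-Encoding: chunked',
    'Trailer: X-Stream-Error',
    '',
    '',
  ].join('\r\n'));

  writeChunk(socket, 'some body data');

  // The last chunk has size 0; trailer fields go between it and the final CRLF.
  socket.write('0\r\nX-Stream-Error: something went wrong\r\n\r\n');
}

If you are on Node's http module rather than a raw socket, response.addTrailers() produces this framing for you, as long as the response is chunked and you send a Trailer header first. But as noted above, browsers will not surface it.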

Chunked transfer encoding - browser behavior

I'm trying to send data in chunked mode. All headers are set properly and the data is encoded accordingly. Browsers recognize my response as chunked, accept the headers, and start receiving data.
I was expecting the browser to update the page on each received chunk; instead it waits until all chunks are received and then displays them all. Is this the expected behavior?
I was expecting to see each chunk displayed right after it was received. When using curl, each chunk is shown right after it is received. Why doesn't the same happen with GUI browsers? Are they using some sort of buffering/cache?
I set the Cache-Control header to no-cache, so I'm not sure it's about caching.
AFAIK browsers need some initial payload before they start rendering chunks as they arrive.
curl is of course an exception.
Try sending about 1 KB of arbitrary data before your first chunk.
If you are doing everything correctly, browsers should render chunks as they are received.
Fix your headers.
As of 2019, if you use Content-Type: text/html, no buffering occurs in Chrome.
If you just want to stream text, similar to text/plain, then using Content-Type: text/event-stream will also disable buffering.
If you use Content-Type: text/plain, then Chrome will still buffer 1 KiB, unless you additionally specify X-Content-Type-Options: nosniff.
RFC 2045 specifies that if no Content-Type is specified, Content-type: text/plain; charset=us-ascii should be assumed:
5.2. Content-Type Defaults
Default RFC 822 messages without a MIME Content-Type header are taken
by this protocol to be plain text in the US-ASCII character set,
which can be explicitly specified as:
Content-type: text/plain; charset=us-ascii
This default is assumed if no Content-Type header field is specified.
It is also recommend that this default be assumed when a
syntactically invalid Content-Type header field is encountered. In
the presence of a MIME-Version header field and the absence of any
Content-Type header field, a receiving User Agent can also assume
that plain US-ASCII text was the sender's intent. Plain US-ASCII
text may still be assumed in the absence of a MIME-Version or the
presence of an syntactically invalid Content-Type header field, but
the sender's intent might have been otherwise.
Browsers buffer the beginning of a text/plain response in order to check whether the content really is plain text or some other media type such as an image; the same applies when the Content-Type is omitted, since that is treated as text/plain. This is called MIME type sniffing.
MIME type sniffing is defined by Mozilla as:
In the absence of a MIME type, or in certain cases where browsers
believe they are incorrect, browsers may perform MIME sniffing —
guessing the correct MIME type by looking at the bytes of the
resource.
Each browser performs MIME sniffing differently and under different
circumstances. (For example, Safari will look at the file extension in
the URL if the sent MIME type is unsuitable.) There are security
concerns as some MIME types represent executable content. Servers can
prevent MIME sniffing by sending the X-Content-Type-Options header.
According to Mozilla's documentation:
The X-Content-Type-Options response HTTP header is a marker used by
the server to indicate that the MIME types advertised in the
Content-Type headers should not be changed and be followed. This
allows to opt-out of MIME type sniffing, or, in other words, it is a
way to say that the webmasters knew what they were doing.
Therefore adding X-Content-Type-Options: nosniff makes it work.
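As a concrete illustration of the above, a small sketch of a Node server whose chunked plain-text response Chrome should then render incrementally (port and interval are arbitrary):

import * as http from 'http';

http.createServer((req, res) => {
  res.writeHead(200, {
    'Content-Type': 'text/plain; charset=utf-8',
    // Without this, Chrome buffers the first ~1 KiB of text/plain for MIME sniffing.
    'X-Content-Type-Options': 'nosniff',
    // No Content-Length, so Node falls back to chunked transfer encoding.
  });

  let n = 0;
  const timer = setInterval(() => {
    res.write(`chunk ${n++}\n`); // each write goes out as its own chunk
    if (n === 10) {
      clearInterval(timer);
      res.end('done\n');
    }
  }, 1000);
}).listen(8080);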
The browser can process and render the data as it comes in, whether the data is sent chunked or not. Whether a browser renders the response data is a function of the data's structure and what kind of buffering the browser employs. E.g., before the browser can render an image, it needs to have the document (or enough of the document), the style sheet, etc.
Chunking is mostly useful when the length of a resource is unknown at the time the response is generated (so a Content-Length header can't be included in the response headers) and the server doesn't want to close the connection after the resource is transferred.

Header Accept in HTTP

I have a problem with the "Accept" header in HTTP. I've written an HTTP client, and when I set "Accept: image/png" I can still read any file (like txt, html, etc.).
I think that shouldn't be possible when the Accept header is set as above.
I tried to check how my Firefox behaves: I opened about:config and set network.http.accept.default to "image/png", and I can still surf the net as usual.
Am I misunderstanding the meaning of this header? I thought I should only be able to open *.png files.
Accept isn't mandatory; the server can (and often does) either not implement it, or decides to return something else.
If the [Accept] header field is present in a request and none of the available representations for the response have a media type that is listed as acceptable, the origin server can either honor the header field by sending a 406 (Not Acceptable) response or disregard the header field by treating the response as if it is not subject to content negotiation.
Source - RFC 7231 5.3.2. Accept
Actually, the behavior you observed is normal. Let me give you an example.
If the given URL points to a PDF file and the Accept header accepts only docx, then the server will blindly ignore it and send the PDF file, because the server is not set up to choose between PDF and other document formats.
If there are multiple formats available, then the server will consider the Accept header and try to send the response accordingly; if not, it will ignore the Accept header.
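As a sketch of that choice (deliberately naive matching; not how any real server implements content negotiation):

import * as http from 'http';

const available = ['application/json', 'text/html']; // representations we can produce

http.createServer((req, res) => {
  const accept = req.headers['accept'] ?? '*/*';

  // Naive check: is any available type (or a wildcard) mentioned in Accept?
  const acceptable = available.find(
    (type) => accept.includes(type) || accept.includes('*/*')
  );

  if (!acceptable) {
    // RFC 7231 lets the server either honour the header with a 406, as here,
    // or simply disregard Accept and send its only representation anyway,
    // which is what the PDF example above describes.
    res.writeHead(406, { 'Content-Type': 'text/plain' });
    res.end('Not Acceptable\n');
    return;
  }

  res.writeHead(200, { 'Content-Type': acceptable });
  res.end(acceptable === 'application/json' ? '{"ok":true}' : '<p>ok</p>');
}).listen(8080);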
As you suppose, setting Accept means that you can't accept media types other than those specified, and servers should return a 406 response code.
In practice, servers don't implement this correctly and always send a response.
All the details are available in RFC 2616.
The Accept header is poorly implemented by browsers and causes strange errors when used on public sites where crawlers make requests too.
That's why the Accept header is ignored most of the time, as in the Rails framework.
