Can the "Accept" HTTP header have a "charset" parameter? - http

I have encountered an HTTP client that uses the following header:
Accept: application/vnd.api+json; charset=utf-8
According to the HTTP spec, Accept headers can have parameters. The most common one is the q parameter, which sets the priority of different content types. However, there are a number of reasons I don't think charset is a valid Accept parameter:
HTTP already has the Accept-Charset header, which seems to make this redundant
MDN doesn't include it in their documentation on Accept, even though they do include it on their Content-Type page
Werkzeug, the Flask HTTP parser, doesn't bother to parse charsets for Accept, even though it does for Content-Type
So, it seems this Accept; charset is unusual. But is it wrong?

You quoted the spec, which says they are ok. What else needs to be said?
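To illustrate the point, here is a minimal sketch of an Accept parser that treats charset like any other media type parameter. The function name parse_accept is made up for this example, and the parsing is deliberately naive: it assumes no quoted strings containing commas or semicolons and a well-formed q value. Whether a server gives the parameter any meaning still depends on the media type.

```python
def parse_accept(header):
    """Split an Accept header into (media_range, params, q) tuples.

    Naive sketch: quoted strings containing ',' or ';' are not handled.
    """
    entries = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        params = {}
        q = 1.0
        for part in parts[1:]:
            if not part:
                continue
            name, _, value = part.partition("=")
            name = name.strip().lower()
            value = value.strip().strip('"')
            if name == "q":
                q = float(value)
                break  # anything after q is an Accept extension, ignored here
            params[name] = value  # media type parameter, e.g. charset
        entries.append((media_range, params, q))
    return entries


print(parse_accept("application/vnd.api+json; charset=utf-8"))
```

The charset ends up in the params dictionary of the media range, separate from the q weight.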

Related

Which HTTP headers can be combined in a list?

RFC 7230 says (3.2.2 Field Order, markup by me):
A sender MUST NOT generate multiple header fields with the same field
name in a message unless either the entire field value for that
header field is defined as a comma-separated list [i.e., #(values)]
or the header field is a well-known exception (as noted below).
Back in RFC2616, all the headers were contained in a single specification and one could browse through that spec for header definitions that have lists as values.
Nowadays, we have RFC7230 and friends, each specifying its own set of headers.
Is there an (authoritative) list somewhere that holds the header names with list values? Or do I need to grep all related RFCs for 1#?
Of the standard HTTP request headers, these can have multiple values separated by a comma ,:
A-IM
Accept
Accept-Charset
Accept-Encoding
Accept-Language
Access-Control-Request-Headers
Cache-Control
Connection
Content-Encoding
Expect
Forwarded
If-Match
If-None-Match
Range
TE
Trailer
Transfer-Encoding
Upgrade
Via
Warning
these can have them separated by a semicolon ;:
Content-Type
Cookie
Prefer
and these shouldn't have multiple values (or can but with some other syntax, for example the User-Agent header can pretty much be arbitrary text):
Accept-Datetime
Access-Control-Request-Method
Authorization
Content-Length
Content-MD5
Date
From
Host
HTTP2-Settings
If-Modified-Since
If-Range
If-Unmodified-Since
Max-Forwards
Origin
Pragma
Proxy-Authorization
Referer
User-Agent
This doesn't include response headers or any non-standard HTTP request headers. I made these lists by quickly glancing at the RFC listed in that table on Wikipedia for each header, so I could have misclassified a few.
The only known exception is Set-Cookie. Many HTTP frameworks that encapsulate HTTP headers tend to treat all headers the exact same way, but have a specific exception for Set-Cookie. No other headers have this problem, and no new standard headers would be introduced with this problem.
Pick a random HTTP framework for any language, and chances are that there's some special handling for just Set-Cookie.
It's possible that other non-standard headers have this issue.
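As a rough sketch of the kind of special handling frameworks tend to have, the function below folds repeated header fields into one comma-separated value but keeps Set-Cookie values separate. The name combine_headers and the (name, value) input shape are assumptions for this example, not any particular framework's API. Note that it blindly combines every other header, including ones that are not defined as lists.

```python
def combine_headers(raw_headers):
    """Fold repeated header fields into one comma-separated value.

    raw_headers is a list of (name, value) pairs in wire order.
    Set-Cookie is kept as a list because its values may contain commas.
    """
    combined = {}
    for name, value in raw_headers:
        key = name.lower()  # header field names are case-insensitive
        if key == "set-cookie":
            combined.setdefault(key, []).append(value)
        elif key in combined:
            combined[key] = combined[key] + ", " + value
        else:
            combined[key] = value
    return combined


headers = combine_headers([
    ("Accept", "text/html"),
    ("Set-Cookie", "a=1; Expires=Wed, 09 Jun 2021 10:18:14 GMT"),
    ("Accept", "text/plain"),
    ("Set-Cookie", "b=2"),
])
print(headers["accept"])
print(headers["set-cookie"])
```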

ETag with Accept-Language

I have a web service which puts an ETag on each response so future calls can make use of the HTTP 304 (Not Modified) status. The ETag I generate is really just a Base64 encoding of the query type along with the timestamp.
The problem I have is when the browser requests the same resource with a different Accept-Language. The browser currently sends the same If-None-Match header, so the response is a 304, even though the actual resource would have come back in a different language. So I thought the way to do this was to add a Vary header, to tell the client that the response varies with Accept and Accept-Language, as shown below.
Vary: Accept, Accept-Language
However my browser (Chrome) uses the same ETag regardless of the accept-language. What is the correct convention to use here?
Thanks
An ETag identifies the response contents.
So it is better to build the ETag from a hash of the response body.
At the very least, you can use a hash of the query and the language concatenated.
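Both options could look roughly like this. The function names are hypothetical, and SHA-256 is just one reasonable choice of hash; the timestamp argument stands in for the timestamp the question already mixes into its ETag.

```python
import hashlib


def etag_for_body(body: bytes) -> str:
    """ETag derived from the exact bytes that will be sent."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def etag_for_variant(query: str, timestamp: str, language: str) -> str:
    """Cheaper fallback: hash the inputs that determine the representation."""
    key = "\n".join([query, timestamp, language]).encode("utf-8")
    return '"' + hashlib.sha256(key).hexdigest() + '"'


# The same query and timestamp give different ETags per language.
print(etag_for_variant("orders", "2014-09-01T12:00:00Z", "en"))
print(etag_for_variant("orders", "2014-09-01T12:00:00Z", "de"))
```

With either approach, an If-None-Match sent for the English response no longer matches the German one, so the server does not answer 304 for the wrong language.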

Does HTTP content negotiation respect media type parameters

An HTTP request can include an Accept header, indicating the media type(s) of responses that the client will find acceptable. The server should honour the request by providing a response that has a Content-Type that matches (one of) the requested media type(s). A media type may include parameters. Does HTTP require that this process of content-negotiation respect parameters?
That is, if the client requests
Accept: application/vnd.example; version=2
(here the version parameter has a value of 2), and the server can serve media-type application/vnd.example; version=1, but not application/vnd.example; version=2, is it OK for the server to provide a response with
Content-Type: application/vnd.example; version=1
Is it OK for the server to provide a response labelled
Content-Type: application/vnd.example; version=2
but for the body of the response to actually be encoded as media-type application/vnd.example; version=1? That is, for the parameters of the media-type of a response to be an inaccurate description of the body of the response?
It seems that Spring MVC 4.1.0 does not respect media-type parameters when doing content negotiation, and gives responses for which the parameters of the media-type of the response are an inaccurate description of the body of the response. This seems to be because the org.springframework.util.MimeType.isCompatibleWith(MimeType) method does not examine the parameters of the MimeType objects.
The relevant standard, RFC 7231 section 3.1.1.1, says the following about media-types:
The type/subtype MAY be followed by parameters in the form of
name=value pairs.
So, Accept and Content-Type headers may contain media type parameters. It adds:
The presence or absence of a
parameter might be significant to the processing of a media-type,
depending on its definition within the media type registry.
That suggests that server code that handles media types should pay attention to their parameters, and not simply discard them, because for some media types they will be significant. It has to implement some smarts to decide whether the parameters of a given media type are significant.
Spring MVC 4.1.0 therefore seems to be wrong to completely ignore the parameters when doing content negotiation: the class org.springframework.web.servlet.mvc.method.annotation.AbstractMessageConverterMethodProcessor is incorrect to use org.springframework.util.MimeType.isCompatibleWith(MimeType), or that MimeType.isCompatibleWith(MimeType) method is incorrect. If you provide Spring with several HTTP message converters that differ only in the parameters of their supported media type, Spring will not reliably choose the HTTP message converter that has the media type that exactly matches the requested media type.
In section 3.1.1.5, where it describes the Content-Type header, it says:
The indicated media type defines both the data
format and how that data is intended to be processed by a recipient
As the parameters of a media type in general could vary the data format, the behaviour of Spring MVC 4.1.0 is wrong, in providing parameters that are an inaccurate description of the body of the response: the method AbstractMessageConverterMethodProcessor.getMostSpecificMediaType(MediaType, MediaType) is wrong to return the acceptType rather than the produceTypeToUse when the two types are equally specific.
However, section 3.4.1, which discusses content negotiation (Proactive Negotiation), notes:
A user agent cannot rely on proactive negotiation preferences being
consistently honored, since the origin server might not implement
proactive negotiation for the requested resource or might decide that
sending a response that doesn't conform to the user agent's
preferences is better than sending a 406 (Not Acceptable) response.
So the server is permitted to give a response that does not exactly match the media-type parameters requested, as a fall-back when it cannot provide an exact match. That is, it may choose to respond with an application/vnd.example; version=1 response body, with a Content-Type: application/vnd.example; version=1 header, despite the request saying Accept: application/vnd.example; version=2, if, and only if, generating a valid application/vnd.example; version=2 response would be impossible.
This apparently incorrect behaviour of Spring already has a Spring bug report, SPR-10903. The Spring developers closed it as "Works as Designed", noting
I don't know any rule for comparing media types with their parameters effectively. It really depends on the media type...If you're actually trying to achieve REST versioning through media types, it seems that the most common solution is to use different media types, since their format obviously changed between versions:
"application/vnd.spring.foo.v1+json"
"application/vnd.spring.foo.v2+json"
The relevant spec for content negotiation in HTTP/1.1 is RFC2616, Section 14.1.
It contains the following example, relevant to your question:
Accept: text/*, text/html, text/html;level=1, */*
and gives the precedence as
1) text/html;level=1
2) text/html
3) text/*
4) */*
So I think it is safe to say that text/html;level=1 and text/html are different media types.
I would also consider text/html;level=1 and text/html;level=2 as different.
So in your example I think it would be correct to respond with a 406 error and not respond with a different media type.
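A minimal sketch of that reading, where a media range only matches if all of its parameters match, and more specific ranges win. The function names are made up for this example, q values are ignored, and treating a parameter mismatch as "no match" is one interpretation rather than something the spec mandates for every media type.

```python
def parse_media(value):
    """Split 'type/subtype; name=value' into (type, subtype, params)."""
    parts = [p.strip() for p in value.split(";")]
    type_, _, subtype = parts[0].lower().partition("/")
    params = {}
    for part in parts[1:]:
        name, _, val = part.partition("=")
        name = name.strip().lower()
        if name and name != "q":
            params[name] = val.strip()
    return type_, subtype, params


def specificity(media_range, offered):
    """Return a rank (higher is more specific), or None if there is no match."""
    r_type, r_sub, r_params = parse_media(media_range)
    o_type, o_sub, o_params = parse_media(offered)
    if r_type == "*" and r_sub == "*":
        return 0
    if r_type != o_type:
        return None
    if r_sub == "*":
        return 1
    if r_sub != o_sub:
        return None
    # Every parameter in the range must be present with the same value.
    if any(o_params.get(k) != v for k, v in r_params.items()):
        return None
    return 2 + len(r_params)


def best_range(accept, offered):
    """Return the most specific media range in accept that matches offered."""
    ranked = [(specificity(r, offered), r.strip()) for r in accept.split(",")]
    ranked = [(s, r) for s, r in ranked if s is not None]
    return max(ranked, key=lambda pair: pair[0])[1] if ranked else None


accept = "text/*, text/html, text/html;level=1, */*"
print(best_range(accept, "text/html;level=1"))
print(best_range("application/vnd.example; version=2",
                 "application/vnd.example; version=1"))
```

Under this interpretation the second call finds no matching range, which is the case where a 406 would be sent.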

Chunked transfer encoding - browser behavior

I'm trying to send data in chunked mode. All headers are set properly and data is encoded accordingly. Browsers recognize my response as a chunked one, accept the headers and start receiving data.
I was expecting the browser would update the page on each received chunk, instead it waits until all chunks are received then displays them all. Is this the expected behavior?
I was expecting to see each chunk displayed right after it was received. When using curl, each chunk is shown right after it is received. Why does the same not happen with GUI browsers? Are they using some sort of buffering/cache?
I set the Cache-Control header to no-cache, so not sure it is about cache.
AFAIK, browsers need some payload before they start rendering chunks as they are received.
Curl is of course an exception.
Try to send about 1KB of arbitrary data before your first chunk.
If you are doing everything correctly, browsers should render chunks as they are received.
Fix your headers.
As of 2019, if you use Content-type: text/html, no buffering occurs in Chrome.
If you just want to stream text, similar to text/plain, then just using Content-type: text/event-stream will also disable buffering.
If you use Content-type: text/plain, then Chrome will still buffer 1 KiB, unless you additionally specify X-Content-Type-Options: nosniff.
RFC 2045 specifies that if no Content-Type is specified, Content-type: text/plain; charset=us-ascii should be assumed:
5.2. Content-Type Defaults
Default RFC 822 messages without a MIME Content-Type header are taken
by this protocol to be plain text in the US-ASCII character set,
which can be explicitly specified as:
Content-type: text/plain; charset=us-ascii
This default is assumed if no Content-Type header field is specified.
It is also recommend that this default be assumed when a
syntactically invalid Content-Type header field is encountered. In
the presence of a MIME-Version header field and the absence of any
Content-Type header field, a receiving User Agent can also assume
that plain US-ASCII text was the sender's intent. Plain US-ASCII
text may still be assumed in the absence of a MIME-Version or the
presence of an syntactically invalid Content-Type header field, but
the sender's intent might have been otherwise.
Browsers buffer a certain amount of a text/plain response so they can check whether the content is really plain text or some other media type, such as an image. They do this in case the Content-Type was omitted, which would then be treated as text/plain. This is called MIME type sniffing.
MIME type sniffing is defined by Mozilla as:
In the absence of a MIME type, or in certain cases where browsers
believe they are incorrect, browsers may perform MIME sniffing —
guessing the correct MIME type by looking at the bytes of the
resource.
Each browser performs MIME sniffing differently and under different
circumstances. (For example, Safari will look at the file extension in
the URL if the sent MIME type is unsuitable.) There are security
concerns as some MIME types represent executable content. Servers can
prevent MIME sniffing by sending the X-Content-Type-Options header.
According to Mozilla's documentation:
The X-Content-Type-Options response HTTP header is a marker used by
the server to indicate that the MIME types advertised in the
Content-Type headers should not be changed and be followed. This
allows to opt-out of MIME type sniffing, or, in other words, it is a
way to say that the webmasters knew what they were doing.
Therefore adding X-Content-Type-Options: nosniff makes it work.
The browser can process and render the data as it comes in whether data is sent chunked or not. Whether a browser renders the response data is going to be a function of the data structure and what kind of buffering it employs. For example, before the browser can render an image, it needs to have the document (or enough of the document), the style sheet, etc.
Chunking is mostly useful when the length of a resource is unknown at the time the resource response is generated (a "Content-Length" can't be included in the response headers) and the server doesn't want to close the connection after the resource is transferred.
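For reference, this is roughly what chunked framing looks like on the wire: each chunk is its length in hexadecimal, CRLF, the data, CRLF, and a zero-length chunk ends the body. The helper name encode_chunked is made up for this sketch, and it ignores chunk extensions and trailers.

```python
def encode_chunked(chunks):
    """Yield the wire bytes for each chunk, then the terminating chunk."""
    for chunk in chunks:
        if chunk:  # a zero-length chunk would end the body early
            yield b"%x\r\n%s\r\n" % (len(chunk), chunk)
    yield b"0\r\n\r\n"  # last chunk, with no trailers


for piece in encode_chunked([b"Hello, ", b"world"]):
    print(piece)
```

Because each piece is complete on its own, the server can flush it as soon as it is produced without knowing the total length up front.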

How to interpret HTTP Accept headers?

According to the HTTP/1.1 spec, an Accept header of the following
Accept: text/plain; q=0.5, text/html, text/x-dvi; q=0.8, text/x-c
is interpreted to mean
text/html and text/x-c are the preferred media types, but if they do not
exist, then send the text/x-dvi entity, and if that does not exist, send
the text/plain entity
Let's change the header to:
Accept: text/html, text/x-c
What is returned if neither of these is supported? For example, let's pretend that I only support application/json.
Maybe you should respond with a 406 Not Acceptable. That's how I read this.
Or a 415 Unsupported Media Type?
I would opt for a 406, because in that case, according to the specs, a response SHOULD include a list of alternatives, although it is not clear to me what that list should look like.
"If an Accept header field is present, and if the server cannot send a response which is acceptable according to the combined Accept field value, then the server SHOULD send a 406 (not acceptable) response." -- RFC2616, Section 14.1
You have a choice. You can either reply with 406 and include an "entity" (e.g. HTML or text file) describing the available formats; OR if you are using HTTP 1.1, you can send the format you support even though it wasn't listed in the Accept header.
(see section 10.4.7 of RFC 2616)
"Note: HTTP/1.1 servers are allowed
to return responses which are not
acceptable according to the accept
headers sent in the request. In some
cases, this may even be preferable to
sending a 406 response. User agents
are encouraged to inspect the headers
of an incoming response to determine
if it is acceptable."
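The two options can be sketched like this. The function names and the strict flag are assumptions for illustration; wildcards and media type parameters other than q are not handled, and breaking ties by position in the header is just a choice made here.

```python
def preferences(accept):
    """Return media types from an Accept header, most preferred first."""
    ranked = []
    for position, item in enumerate(accept.split(",")):
        parts = [p.strip() for p in item.split(";")]
        q = 1.0  # default weight when no q parameter is given
        for part in parts[1:]:
            name, _, value = part.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
        if q > 0:  # q=0 means "not acceptable"
            ranked.append((-q, position, parts[0].lower()))
    return [media for _, _, media in sorted(ranked)]


def negotiate(accept, supported, strict=True):
    """Return (status, media_type) for the first acceptable supported type."""
    for media in preferences(accept):
        if media in supported:
            return 200, media
    if strict:
        return 406, None  # the body should describe the available formats
    return 200, supported[0]  # send what we have despite the Accept header


print(preferences("text/plain; q=0.5, text/html, text/x-dvi; q=0.8, text/x-c"))
print(negotiate("text/html, text/x-c", ["application/json"]))
print(negotiate("text/html, text/x-c", ["application/json"], strict=False))
```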

Resources