Wrangling Control of HTTP Headers in ASP.NET

I'm working with ASP.NET MVC3, and I'm trying to get absolute control over my headers because a client application that I'm working with expects a very specific content type. What I'm finding when using Fiddler to examine the HTTP traffic is that the text encoding is being returned as part of the header.
For example, the client is expecting application/appname in the Content-Type header, but the server is returning application/appname; charset=utf-8. I think the client is using a strict comparison for checking the type, so I want to be able to specify exactly what is emitted in the headers.
Right now I have a custom ActionResult in which I clear the headers and then specify only the content type, but the encoding still seems to be added on.
How can I remove the encoding from the Content-Type header?

Charset Encoding in ASP.NET Response by Rick Strahl is an older (2007) article, but it addresses exactly this; give it a try:
Response.ContentType = "application/appname";
Response.Charset = null;
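
If you want to keep the custom ActionResult approach from the question, a minimal sketch might look like the following. The class name and payload handling are illustrative rather than taken from the post; the relevant lines are the ContentType and Charset assignments (an empty Charset suppresses the "; charset=..." suffix).

using System.Web.Mvc;

// Hypothetical ActionResult that emits a bare Content-Type with no charset suffix.
public class AppNameResult : ActionResult
{
    private readonly byte[] _payload;

    public AppNameResult(byte[] payload)
    {
        _payload = payload;
    }

    public override void ExecuteResult(ControllerContext context)
    {
        var response = context.HttpContext.Response;
        response.ClearHeaders();
        response.ContentType = "application/appname";
        // An empty string keeps the header as just "Content-Type: application/appname".
        response.Charset = "";
        response.BinaryWrite(_payload);
    }
}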

Related

Generating PDF on the fly with standard HTTP response fields

I'm developing a web page with a form which returns a PDF document based on the form data. Currently I use the HTTP response fields
Content-Type: application/pdf
Content-Disposition: attachment; filename="foo.pdf"
However, since the Content-Disposition field is non-standard and doesn't work in all browsers, I'm looking for a different approach. Do I have to save the PDF document on the server? What is the modus operandi?
Edit: By "doesn't work in all browsers" I mean that with some browsers the filename is not set to foo.pdf. Dillo, for instance, just sets the default filename (in the download dialog) to the basename of the URL path (plus query string).
Do I have to save the PDF document on the server?
No. As far as the HTTP client is concerned, the inner workings of the server are completely opaque. All it sees is a TCP stream of bytes from the server, and how exactly that stream is produced doesn't matter as long as it matches the specified Content-Type.
Just send the PDF right after the HTTP headers and you're done.
Update due to comment
So if you're wondering how to supply a filename without using a header field: Just augment the URL with it. I.e. something like
http://${DOMAIN}/${PDF_GENERATOR}/${DESIRED_FILENAME}
In the HTTP server add a rewrite rule to simply omit the filename part and redirect to just
http://${DOMAIN}/${PDF_GENERATOR}
The HTTP client does not see that; all it sees is some URL ending with a "filename", which it can present to the user as a default for saving.
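
In ASP.NET MVC terms, to match the rest of this page, the same idea is a route whose last segment carries the desired filename but is never read by the action. Everything below (route, controller, and the PDF generator stub) is illustrative, not taken from the original answer.

using System.Web.Mvc;
using System.Web.Routing;

public class PdfRoutes
{
    public static void Register(RouteCollection routes)
    {
        // The {fileName} segment exists only so the browser's save dialog
        // defaults to it; the action below never uses the value.
        routes.MapRoute(
            "PdfWithFileName",
            "pdf/{fileName}",
            new { controller = "Pdf", action = "Download" });
    }
}

public class PdfController : Controller
{
    // GET /pdf/foo.pdf
    public ActionResult Download(string fileName)
    {
        byte[] pdfBytes = GeneratePdf();          // stand-in for the real form-driven generator
        return File(pdfBytes, "application/pdf"); // no Content-Disposition header needed
    }

    private byte[] GeneratePdf()
    {
        return new byte[0];
    }
}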

How does ASP.NET Web API "sort" Request Headers into Header Collections?

I was trying to read the value of the Content-Type header in a custom delegating handler in ASP.NET Web API. When I queried the request.Headers collection, the header value wasn't in there. However, it was contained in request.Content.Headers. Other non-standard headers (such as Content-Test) starting with Content- were available in request.Headers only; Content-Length, on the other hand, could only be found within request.Content.Headers, just like Content-Type.
Is it correct to assume that Web API is putting all known content headers into the request.Content.Headers collection while putting all other headers into request.Headers?
That's how HttpClient was designed in the first place. Requests and responses are separate from the actual content, hence content-related headers go into HttpContent.Headers rather than HttpRequestMessage.Headers. Keeping the content headers with the content is a good way of separating the concerns; on the other hand, getting at the content headers is a bit more cumbersome.
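
A quick way to see the split in practice is a delegating handler along these lines (a sketch, assuming the standard Web API message types; the handler name is made up):

using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Illustrative handler that shows where the two kinds of headers live.
public class HeaderInspectionHandler : DelegatingHandler
{
    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // Content-Type (and Content-Length) are content headers...
        var contentType = request.Content != null
            ? request.Content.Headers.ContentType
            : null;

        // ...while Accept, Host, and custom headers such as Content-Test
        // are found on the request message itself.
        var accept = request.Headers.Accept;

        return base.SendAsync(request, cancellationToken);
    }
}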

Header Accept in HTTP

I have a problem with the "Accept" header in HTTP. I've written an HTTP client, and when I set "Accept: image/png" I can still read any file (like txt, html, etc.).
I think that shouldn't be possible when the "Accept" header is set like that.
I tried to check how my Firefox behaves. I went to "about:config", set "network.http.accept.default" to "image/png", and I can surf the net as usual.
Am I misunderstanding the meaning of this header? I thought I should only be able to open *.png files.
Accept isn't mandatory; the server can (and often does) either not implement it or decide to return something else.
If the [Accept] header field is present in a request and none of the available representations for the response have a media type that is listed as acceptable, the origin server can either honor the header field by sending a 406 (Not Acceptable) response or disregard the header field by treating the response as if it is not subject to content negotiation.
Source - RFC 7231 5.3.2. Accept
Actually, this behavior is normal. Let me give you an example.
If the given URL points to a PDF file and the Accept header accepts only docx, then the server will blindly ignore it and send the PDF file, because the server is not set up to choose between the PDF and other formats.
If there are multiple formats available, then the server will consider the "Accept" header and try to send the response accordingly; if not, it will ignore the "Accept" header.
As you suppose, setting Accept means that you can't accept media types other than those specified, and servers should return a 406 response code otherwise.
In practice, servers don't implement this correctly and always send a response.
All details are available in RFC 2616
The accept header is poorly implemented by browsers and causes strange errors when used on public sites where crawlers make requests too.
That's why the Accept header is ignored most of the time, as in the Rails framework.
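
On the client side, the practical takeaway is to treat Accept as a preference: send it, but be prepared either for a 406 or for a response in some other media type. A rough C# sketch (the URL is a placeholder):

using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

class AcceptDemo
{
    static async Task Main()
    {
        using (var client = new HttpClient())
        {
            var request = new HttpRequestMessage(HttpMethod.Get, "http://example.com/some/resource");
            request.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue("image/png"));

            HttpResponseMessage response = await client.SendAsync(request);

            if (response.StatusCode == HttpStatusCode.NotAcceptable)
            {
                // The server honored Accept and refused to serve anything else.
                Console.WriteLine("406 Not Acceptable");
            }
            else
            {
                // Many servers simply ignore Accept and send what they have.
                Console.WriteLine("Got: " + response.Content.Headers.ContentType);
            }
        }
    }
}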

Why is ASP.NET replacing a Content-Length header with a Transfer-Encoding header when manually flushing a response?

Our web application (ASP.NET Web Forms) has a page that will display a recently generated PDF file to users. Because the PDF file is sometimes quite large, we've implemented a "streaming" approach to send it down to the client browser in chunks.
Despite sending the data down in chunks, we know the full size of the file prior to sending it, so we set the Content-Length header appropriately. This had been working in our production environment for a while (and continues to work in our test environment with a virtually identical configuration) until today. The issue reported was that Chrome would attempt to open the PDF file but would hang with the "Loading" animation stuck.
Because everything was still working fine in our test environment I was able to use Firebug to take a look at the response headers that were coming back in both environments. In the test environment, I was seeing a proper 'Content-Length' header, while in production that had been replaced with a Transfer-Encoding: chunked header. Chrome doesn't like this, hence the hang-up.
I've read some articles and posts talking about how the Transfer-Encoding header can show up when no Content-Length header is provided, but we are specifying the Content-Length header and everything still appears to work while running the same code for the same PDF file on a test server.
Both test and production servers are running IIS 7.5 and both have Dynamic and Static Compression enabled.
Here is the code in question:
var fileInfo = new FileInfo(fileToSendDown);
Response.ClearHeaders();
Response.ContentType = "application/pdf";
Response.AddHeader("Content-Disposition", "filename=test.pdf");
Response.AddHeader("Content-Length", fileInfo.Length.ToString());

var buffer = new byte[1024];
using (var fs = File.Open(fileToSendDown, FileMode.Open, FileAccess.Read, FileShare.Read))
{
    int read;
    while ((read = fs.Read(buffer, 0, 1024)) > 0)
    {
        if (!Response.IsClientConnected) break;
        Response.OutputStream.Write(buffer, 0, read);
        Response.Flush();
    }
}
I was fortunate to see the same behavior on my local workstation so using the debugger I have been able to see that the 'Transfer-Encoding: chunked' header is being set on the 2nd pass through the while loop during the call to 'Flush'. At that point, the response has both a Content-Length header and Transfer-Encoding header, but somehow by the time the response reaches the browser Firebug is only showing the Transfer-Encoding header.
UPDATE
I think I've tracked this down to using a combination of sending the data down in "chunks" AND attaching a 'Filter' to the HttpResponse object (we were using a filter to track the size of viewstate being sent down to each page). There's no sense in us using an HTTP filter when sending a PDF down to the browser, so clearing the filter here has resolved our issue. I decided to dig in a little deeper purely out of curiosity and have updated this question should anyone else ever stumble onto this problem in the future.
I've got a simple app up on AppHarbor that reproduces the issue: http://transferencodingtest.apphb.com/. If you check both the 'Use Filter?' and 'Send In Chunks?' boxes you should be able to see the 'transfer-encoding: chunked' header show up (using Chrome dev tools, Firebug, Fiddler, whatever). If either of the boxes are not checked, you'll get a proper content-length header. The underlying code is up on github so you can see what's going on behind the scenes:
https://github.com/appakz/TransferEncodingTest
Note that to repro locally you'd need to setup a local website in IIS 7.5 (7 may also work, I haven't tried). The ASP .NET development server that ships with Visual Studio DOES NOT repro the issue.
I've added some more details to a blog post here: 'Content-Length' Header Replaced With 'Transfer-Encoding: Chunked' in ASP .NET
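
For reference, the fix described in the update amounts to not installing the ViewState-measuring filter when a binary file is being streamed. The sketch below is an assumption about how such a hook might look, not the actual project code:

using System.IO;
using System.Web;

// Hedged sketch: only attach the (hypothetical) ViewState-size filter to ordinary
// pages. Leaving Response.Filter untouched for PDF downloads lets the Content-Length
// header survive, so IIS has no reason to switch to Transfer-Encoding: chunked.
public static class ResponseFilterInstaller
{
    public static void AttachIfAppropriate(HttpResponse response, Stream viewStateSizeFilter)
    {
        bool isBinaryDownload = response.ContentType == "application/pdf";
        if (!isBinaryDownload)
        {
            response.Filter = viewStateSizeFilter;
        }
    }
}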
From an article on MSDN it seems that you can disable chunked encoding:
appcmd set config /section:asp /enableChunkedEncoding:False
But it's mentioned under ASP settings, so it may not apply to a response generated from an ASP.NET handler.
Once Response.Flush() has been called, the response body is in the process of being sent to the client, so no additional headers can be added to the response. I find it very unlikely that a second call to Response.Flush() is adding the Transfer-Encoding header at that time.
You say you have compression enabled. That almost always requires a chunked response. So it would make sense that if the server knows the Content-Length prior to compression, it might drop that header in favor of a Transfer-Encoding: chunked header. However, even with compression enabled on the server, the client has to explicitly state support for compression in its Accept-Encoding request header or else the server cannot compress the response. Did you check for that in your tests?
On a final note, since you are calling Response.Flush() manually, try setting Response.Buffer = true and Response.BufferOutput = false. Apparently they have conflicting effects on how Response.Flush() operates. See the comment at the bottom of this page and this page.
I had a similar problem when writing a large CSV (the file didn't exist on disk; I generated it line by line by iterating over an in-memory collection) by calling Response.Write on the response stream with BufferOutput set to false. The solution was to change
Response.ContentType = "text/csv" to Response.ContentType = "application/octet-stream"
When the content type wasn't set to application/octet-stream, a bunch of other response headers were added, such as Content-Encoding: gzip.
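
For completeness, the change described above looks roughly like this in a Web Forms code-behind (the page class and line source are made up; only the ContentType value comes from the answer):

using System;
using System.Collections.Generic;
using System.Web.UI;

public class ExportCsvPage : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        Response.BufferOutput = false;

        // With "text/csv" a bunch of extra headers (e.g. Content-Encoding: gzip)
        // were added; an opaque binary type avoided that.
        Response.ContentType = "application/octet-stream";

        foreach (string line in BuildCsvLines())
        {
            Response.Write(line + "\r\n");
        }
    }

    private IEnumerable<string> BuildCsvLines()
    {
        yield return "col1,col2"; // stand-in for the real in-memory collection
    }
}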

How do web servers know the charset using in forms posted to them?

When a web server gets a POST of a form, parsing it into parameter/value pairs is quite straightforward. However, if the values contain non-English characters that have been encoded by the browser, it must know the charset used in order to decode them.
I've examined the requests sent by two posts. One was done from a page using UTF-8, and one from a page using Windows-1255. The same text was encoded differently. AFAIK, the Content-Type header could carry a charset after application/x-www-form-urlencoded, but it didn't (I was using Firefox).
In a servlet, when you use request.getParameter(), you're supposed to get the decoded value. How does the servlet container do that? Does it always bet on UTF-8, use some heuristics, or is there some deterministic way I'm missing?
From the Servlet 3.0 Spec, section 3.10 Request Data Encoding (emphasis mine):
Currently, many browsers do not send a char encoding qualifier with the ContentType header, leaving open the determination of the character encoding for reading HTTP requests. The default encoding of a request the container uses to create the request reader and parse POST data must be “ISO-8859-1” if none has been specified by the client request. However, in order to indicate to the developer, in this case, the failure of the client to send a character encoding, the container returns null from the getCharacterEncoding method.
If the client hasn’t set character encoding and the request data is encoded with a different encoding than the default as described above, breakage can occur. To remedy this situation, a new method setCharacterEncoding(String enc) has been added to the ServletRequest interface. Developers can override the character encoding supplied by the container by calling this method. It must be called prior to parsing any post data or reading any input from the request. Calling this method once data has been read will not affect the encoding.
In practice, I find that setting the charset in a response influences the charset used in the subsequent POST. To be extra sure, you can write a Servlet Filter that calls the setCharacterEncoding on every request object before it is used.
You may also find this thread useful - Detecting the character encoding of an HTTP POST request
The appropriate header for specifying charsets is Accept-Charset.
The latest Chrome for Linux, for example, sends:
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
on each request.
Section 14.2 from http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html states:
The Accept-Charset request-header field can be used to indicate what character sets are acceptable for the response. This field allows clients capable of understanding more comprehensive or special- purpose character sets to signal that capability to a server which is capable of representing documents in those character sets.
(...)
If no Accept-Charset header is present, the default is that any character set is acceptable. If an Accept-Charset header is present, and if the server cannot send a response which is acceptable according to the Accept-Charset header, then the server SHOULD send an error response with the 406 (not acceptable) status code, though the sending of an unacceptable response is also allowed.
So if you receive such a header from a client, the value with the highest q tells you which charset the client prefers.
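
If you do want to act on such a header server-side, the q-values can be parsed and ranked like this (a small sketch; the header string is the Chrome example above):

using System;
using System.Linq;
using System.Net.Http.Headers;

class AcceptCharsetDemo
{
    static void Main()
    {
        const string raw = "ISO-8859-1,utf-8;q=0.7,*;q=0.3";

        // Parse each element into a value/quality pair and rank by q (default 1.0).
        var ranked = raw.Split(',')
            .Select(s => StringWithQualityHeaderValue.Parse(s.Trim()))
            .OrderByDescending(v => v.Quality ?? 1.0)
            .ToList();

        Console.WriteLine("Preferred charset: " + ranked.First().Value); // ISO-8859-1
    }
}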
