I would like to set the default HTTP headers in the Tcl http package to empty and then selectively add headers of my own using a dictionary. I need to do this because I see many items (like sock, binary, -binary, -strict, queryoffset etc.) in my Tcl http request header which are not present in the headers sent by a web browser like Firefox. I get the correct response in the browser, so I want to send exactly the headers that the browser sends. For this I need to set the default HTTP headers in the Tcl http package to empty and manually set the headers (which I can do). How do I empty the default headers?
I'm not quite sure which header you've got a problem with, but the majority of headers can be set quite easily via the -headers option to http::geturl:
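package require http
# Easiest to build the option as a dictionary
dict set opts X-Example-Header "fruitbats are mammals and this is nonsense"
dict set opts DNT "1 (Do Not Track Enabled)"
# Keep the returned token so the response can be inspected afterwards
set token [http::geturl $theurl -headers $opts]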
Almost everything can be set or overridden this way, and the things that can't are typically related to the management of the network connection itself (such as keep-alive management, chunking, compression) and are probably best left to the library in the first place, as HTTP/1.1 is a pretty complex protocol for something supposedly stateless.
Note that options to http::geturl do not directly translate into request options. It's a higher-level interface…
Related
I feel like this has to be easy to Google, but I can't find it: from the perspective of an HTTP cache, what determines if two requests are equivalent?
I imagine one ingredient is that their URLs need to be identical; for example, rearranging (but not changing) query string parameters seems to cause a cache miss. Presumably they need to have the same Accept header. What else determines if a request can be served from cache?
This is mostly described in this RFC: https://www.rfc-editor.org/rfc/rfc7234#section-4
Summary:
The method
The full uri
Caching-related headers in response influence whether something got stored.
Any request headers that appeared in the list of the Vary response header.
It also matters whether you are caching for a specific user (for example a browser), or many users (for example a proxy).
I also struggled with this. Changing my Google search to use "http cache key" generated better results. Using the URL as the key seems to be the most common approach. Query strings are also generally included.
https://support.cloudflare.com/hc/en-us/articles/115004290387-Using-Custom-Cache-Keys describes Cloudflare's default cache key and discusses the impact of using different keys.
Another parameter that could be useful is the type of asset that you want to cache. Or leave it open (no filtering).
"Authorization" header is specifically mentioned in the HTTP spec (https://www.rfc-editor.org/rfc/rfc7234) and needs to be handled.
Upon further reading, I noticed the section on "Secondary keys" in the standard (https://www.rfc-editor.org/rfc/rfc7234#section-4.1) and the use of "Vary" header in a response. Headers presented in the "Vary" response header have to match in both the original and the new request for the cache to declare it as a match.
And as for the primary key, the standard says "The primary cache key consists of the request method and target URI." in https://www.rfc-editor.org/rfc/rfc7234#section-2
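To make the primary and secondary keys concrete, here is a minimal Python sketch of how a cache might build a lookup key. The function names (`primary_key`, `secondary_key`, `cache_key`) are made up for illustration, and the sketch deliberately ignores `Vary: *`, header-value normalization, and the freshness rules that decide whether a stored response may be reused at all.

```python
from typing import Mapping, Tuple

SecondaryKey = Tuple[Tuple[str, str], ...]


def primary_key(method: str, uri: str) -> Tuple[str, str]:
    """Primary cache key: the request method plus the target URI, compared as-is."""
    return (method.upper(), uri)


def secondary_key(request_headers: Mapping[str, str], vary: str) -> SecondaryKey:
    """Secondary key: the request's values for every header named in Vary."""
    # Header names are case-insensitive, so compare them in lowercase.
    lowered = {name.lower(): value.strip() for name, value in request_headers.items()}
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    return tuple((name, lowered.get(name, "")) for name in names)


def cache_key(method: str, uri: str, request_headers: Mapping[str, str], vary: str = ""):
    """Hypothetical combined key: two requests match only if both parts are equal."""
    return (primary_key(method, uri), secondary_key(request_headers, vary))
```

With this key, two requests that differ only in headers not listed in `Vary` are equivalent, while reordering the query string or changing a header that is listed in `Vary` produces a different key.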
There are also the conditional request headers used for cache validation: If-Match, If-Unmodified-Since, If-None-Match and If-Modified-Since. For example, If-Modified-Since works this way: suppose you have already requested a page and now you want to reload it. If the header is present, the server sends the page again ONLY if it was modified after the date given as the value of If-Modified-Since; otherwise it returns a 304 (Not Modified) status.
Accept and the Accept-* headers, by contrast, are used for content negotiation, for example to choose the language in which the page is returned.
More on conditional requests here: https://www.rfc-editor.org/rfc/rfc7232#page-13
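The If-Modified-Since behaviour described above can be sketched in a few lines of Python. This is only an illustration of the server-side decision: the function name `status_for_if_modified_since` is hypothetical, `last_modified` is assumed to be a timezone-aware datetime, and the other conditional headers are not handled.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def status_for_if_modified_since(last_modified: datetime,
                                 if_modified_since: Optional[str]) -> int:
    """Return 304 if the resource is unchanged since the given date, else 200."""
    if if_modified_since is None:
        return 200  # unconditional request: send the full page
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: behave as if the header were absent
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)  # HTTP dates are always GMT
    # HTTP dates have one-second resolution, so drop microseconds before comparing.
    if last_modified.replace(microsecond=0) <= since:
        return 304  # Not Modified: the client may reuse its stored copy
    return 200
```

A client would send the Last-Modified value it received earlier as If-Modified-Since, and only download the body again when the answer is 200.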
I'm currently trying to optimize http-based data transfer between several applications.
Our current approach, downloading first and then creating the POST request, obviously adds extra I/O and memory load as well as latency, which I'd like to avoid.
The core question of all:
Is it required to send a "Content-Length" header in HTTP POST requests?
IIRC, RFC 2616 (HTTP/1.1) declares that it's optional, but I'm not sure how applications actually behave at this point.
Depends what you mean by optional. If you mean that you can just omit the header anytime you like then no, it is not optional. The HTTP spec has very specific rules when to use that header. There are different ways of sending the data if you don't know the length. Chunked encoding for example.
4.4 Message Length
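As an illustration of the chunked encoding mentioned above, here is a small Python sketch of the HTTP/1.1 chunk framing, which lets a sender stream a body without knowing its total length. The function name `encode_chunked` is made up for this example, the request would also need a `Transfer-Encoding: chunked` header instead of Content-Length, and whether a given server accepts chunked request bodies is something to verify rather than assume.

```python
from typing import Iterable, Iterator


def encode_chunked(parts: Iterable[bytes]) -> Iterator[bytes]:
    """Frame each piece of data as one HTTP/1.1 chunk, then emit the terminator."""
    for part in parts:
        if not part:
            continue  # a zero-length chunk would signal the end of the body
        # Each chunk is: size in hex, CRLF, the data itself, CRLF.
        yield b"%x\r\n" % len(part) + part + b"\r\n"
    # The last chunk has size 0 and is followed by an empty trailer section.
    yield b"0\r\n\r\n"
```

Because the generator consumes its input lazily, the pieces can come straight from an ongoing download, so nothing has to be buffered in full before the POST starts.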
I have a problem with the "Accept" header in HTTP. I've written an HTTP client, and when I set "Accept: image/png" I can still read any file (like txt, html, etc).
I think it shouldn't be possible when header "Accept" is set like above.
I tried to check how my Firefox behaves. I opened "about:config" and set "network.http.accept.default" to "image/png", and I can surf the net as usual.
Am I misunderstanding meaning of this header? I think that I should only be able to open files *.png.
Accept isn't mandatory; the server can (and often does) either not implement it, or decides to return something else.
If the [Accept] header field is present in a request and none of the available representations for the response have a media type that is listed as acceptable, the origin server can either honor the header field by sending a 406 (Not Acceptable) response or disregard the header field by treating the response as if it is not subject to content negotiation.
Source - RFC 7231 5.3.2. Accept
Actually, the former behavior is normal. Let me give you an example.
If the given URL points to a PDF file and the Accept header accepts only docx, then the server will blindly ignore it and send the PDF file, because the server is not set up to choose between PDF and other document formats.
If there are multiple formats available, then the server will consider the "Accept" header and try to send the response accordingly; if not, it will ignore the "Accept" header.
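The two server behaviours described in these answers (ignore the header, or answer 406) can be sketched in Python as follows. This is a simplified illustration, not how any particular server implements it: the names `parse_accept`, `matches` and `negotiate` and the `strict` flag are invented here, and the sketch picks the highest q-value without applying the full precedence rules for more specific media ranges.

```python
from typing import List, Optional, Sequence, Tuple


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Split an Accept header into (media_range, q) pairs; q defaults to 1.0."""
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media = parts[0].lower()
        if not media:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        ranges.append((media, q))
    return ranges


def matches(media_range: str, media_type: str) -> bool:
    """True if a range such as image/* or */* covers the concrete media type."""
    if media_range == "*/*":
        return True
    r_type, _, r_sub = media_range.partition("/")
    m_type, _, m_sub = media_type.partition("/")
    return r_type == m_type and r_sub in ("*", m_sub)


def negotiate(accept: Optional[str], available: Sequence[str],
              strict: bool = False) -> Tuple[int, Optional[str]]:
    """Return (status, media_type); `available` must not be empty."""
    if not accept:
        return 200, available[0]
    ranges = parse_accept(accept)
    best, best_q = None, 0.0
    for media_type in available:
        q = max((q for r, q in ranges if matches(r, media_type)), default=0.0)
        if q > best_q:
            best, best_q = media_type, q
    if best is not None:
        return 200, best
    # Nothing acceptable: either honor the header with 406 or disregard it.
    return (406, None) if strict else (200, available[0])
```

With `strict=False`, a request for `image/png` against a resource that only exists as `text/html` still gets the HTML, which is the behaviour the question observed.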
As you suppose, setting Accept means that you can't accept media types other than those specified, and servers should return a 406 response code.
In practice, servers don't implement this correctly and always send a response.
All details are available in RFC 2616
The Accept header is poorly implemented by browsers and causes strange errors when used on public sites where crawlers make requests too.
That's why the Accept header is ignored most of the time, as in the Rails framework.
I was wondering how companies like Double-Click include a cookie in their image responses to track users. Similarly, how do the images (e.g. smart pixels) send information back to their servers?
Please provide a scripting example if possible (any language is okay) [note: if this is resolved by doing something server side, please describe how this would be accomplished using Apache].
Cheers,
Rob
How do they include a cookie? Configure the server, probably via a script, to send cookies with the responses. Images are fetched with HTTP requests that follow the HTTP protocol; there is nothing magical about them.
"Smart pixels" convey their information simply via the request the browser must send to the server in order to load the image. Information about the user or browser can be gathered via JavaScript and embedded in the URL.
To do this in PHP, you'd use the setcookie function.
<?php
$value = 'something from somewhere';
setcookie("TestCookie", $value);
setcookie("TestCookie", $value, time()+3600); /* expire in 1 hour */
setcookie("TestCookie", $value, time()+3600, "/~rasmus/", ".example.com", 1);
?>
That code was taken from the PHP doc I referenced above. Basically this adds a Set-Cookie header to the HTTP response, like: Set-Cookie: UserID=JohnDoe; Max-Age=3600; Version=1
See http://en.wikipedia.org/wiki/List_of_HTTP_header_fields and search for Set-Cookie
ALSO, in scripting languages, like PHP, make sure you set the header before you render any content. This is because the HTTP Headers are the first thing sent in the response, so as soon as you write content, the headers should've already been written.
Another quote from the PHP:setcookie doc:
Like other headers, cookies must be sent before any output from your
script (this is a protocol restriction). This requires that you place
calls to this function prior to any output, including <html> and <head>
tags as well as any whitespace.
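Putting the two answers together, a tracking pixel endpoint could look roughly like the Python sketch below. It uses Python's built-in http.server rather than Apache or PHP, so it is an illustration of the idea and not what any ad network actually runs. The cookie name `visitor`, the helper `track`, the handler `PixelHandler` and the `serve` function are all hypothetical.

```python
import base64
import uuid
from http.cookies import SimpleCookie
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlsplit

# A commonly used 1x1 transparent GIF, stored as base64.
PIXEL = base64.b64decode("R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7")


def track(path, cookie_header):
    """Return (visitor_id, is_new, params) for one pixel request."""
    cookies = SimpleCookie(cookie_header or "")
    # Anything the page embedded in the image URL arrives as query parameters.
    params = parse_qs(urlsplit(path).query)
    if "visitor" in cookies:
        return cookies["visitor"].value, False, params
    return uuid.uuid4().hex, True, params


class PixelHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        visitor, is_new, params = track(self.path, self.headers.get("Cookie"))
        # The "information sent back" is simply what came with the request.
        print("visitor", visitor, "params", params,
              "referer", self.headers.get("Referer"))
        # All headers, including Set-Cookie, go out before the image bytes.
        self.send_response(200)
        self.send_header("Content-Type", "image/gif")
        self.send_header("Content-Length", str(len(PIXEL)))
        self.send_header("Cache-Control", "no-store")
        if is_new:
            self.send_header("Set-Cookie", f"visitor={visitor}; Max-Age=3600; Path=/")
        self.end_headers()
        self.wfile.write(PIXEL)


def serve(port=8000):
    """Run the sketch locally; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), PixelHandler).serve_forever()
```

Calling `serve()` and embedding an image tag that points at `http://127.0.0.1:8000/pixel.gif?page=home` in a page would log one line per view, with the same `visitor` value on repeat visits once the cookie is set.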
I have a program that is supposed to interact with a web server and retrieve a file containing structured data using http and cgi. I have a couple questions:
1. The CGI script on the server needs to specify a body, right? What should the Content-Type be?
2. Should I be using POST or GET?
3. Could anyone tell me a good resource for reading about HTTP?
If you just want to retrieve the resource, I’d use GET. And with GET you don’t need a Content-Type since a GET request has no body. As for HTTP, I’d suggest reading the HTTP 1.1 specification.
The Content-Type specified by the server will depend on what type of data you plan to return. As Jim said, if it's JSON you can use 'application/json'. The obvious payload for the response would be whatever data you're sending to the client.
From the server's perspective it shouldn't matter that much. In general, if you're not expecting a lot of information from the client, I'd set up the server to respond to GET requests as opposed to POST requests. An advantage I like is simply being able to specify what I want in the URL (this can't be done if it's expecting a POST request).
I would point you to the RFC for HTTP. It's probably the best source of information: maybe not the most user-friendly way to get your answers, but it should have all the answers you need.
For (1) the Content-Type depends on the structured data. If it's XML you can use application/xml, JSON can be application/json, etc. Content-Type is set by the server. Your client would ask for that type of content using the Accept header. (Try to use existing data format standards and content types if you can.)
For (2) GET is best (you aren't sending up any data to the server).
I found RESTful Web Services by Richardson and Ruby a very interesting introduction to HTTP. It takes a very strict, but very helpful, view of HTTP.
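As a rough Python sketch of points (1) and (2): a CGI script writes its headers, a blank line, and then the body to standard output, and the client asks for the data with a plain GET plus an Accept header. The helper names `cgi_response` and `build_request`, the JSON format, and the URL in the usage note are assumptions made for illustration only.

```python
import json
from urllib.request import Request


def cgi_response(data) -> str:
    """Server side: the text a CGI script would write to stdout."""
    body = json.dumps(data)
    # Headers first, then an empty line, then the body.
    return "Content-Type: application/json\r\n\r\n" + body


def build_request(url: str) -> Request:
    """Client side: a GET with no body that states the format it would like."""
    return Request(url, headers={"Accept": "application/json"}, method="GET")
```

The script on the server would end with something like `sys.stdout.write(cgi_response(rows))`, and the client would pass `build_request("http://example.com/cgi-bin/data.cgi")` to `urllib.request.urlopen` and decode the result with `json.load`.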