HTTP Response before Request - http

My question might sound stupid, but I just wanted to be sure:
Is it possible to send an HTTP response before having the request for that resource?
Say for example you have an HTML page index.html that only shows a picture called img.jpg.
Now, if your server knows that a visitor will request the HTML file and then the jpg image every time:
Would it be possible for the server to send the image just after the HTML file to save time?
I know that HTTP is a synchronous protocol, so in theory it should not work, but I just wanted someone to confirm it (or not).

A recent post by Jacques Mattheij, referencing your very question, claims that although HTTP was designed as a synchronous protocol, the implementations are not. In practice the browser (he doesn't specify which exactly) accepts answers to requests that have not been sent yet.
On the other hand, if you are looking for something less hacky, you could have a look at:
push techniques that allow the server to send content to the browser. The modern implementation that replaces long-polling/Comet "hacks" is WebSockets. You may want to have a look at socket.io as well.
Alternatively, you may want to have a look at client-side routing. Some implementations combine this with caching techniques (as in derby.js, I believe).

If someone requests /index.html and you send two responses (one for /index.html and the other for /img.jpg), how do you know the recipient will get the two responses and know what to do with them before the second request goes in?
The problem is not really with the sending. The problem is with the receiver possibly getting unexpected data.
One other issue is that you're denying the client the ability to use HTTP caching tools like If-Modified-Since and If-None-Match (i.e. the client might not want /img.jpg to be sent because it already has a cached copy).
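The cache-validation exchange mentioned above can be sketched as follows. This is a hedged illustration, not a real server API: the `respond()` helper and the ETag values are made up for the example. The client sends the ETag of its cached copy in `If-None-Match`, and the server answers `304 Not Modified` when the resource is unchanged, skipping the body entirely.

```python
# Illustrative sketch of ETag-based cache validation.
# respond() stands in for a server's request handler; it is not a real API.
def respond(request_headers, current_etag, body):
    if request_headers.get("If-None-Match") == current_etag:
        return 304, b""  # client's cached copy is still valid; send no body
    return 200, body     # resource changed (or no validator sent): full response

# Client already holds the current version, so the server skips the body.
status, payload = respond({"If-None-Match": '"abc123"'}, '"abc123"', b"<jpg bytes>")
print(status, len(payload))
```

Blindly pushing `/img.jpg` would bypass this exchange and resend bytes the client may already have.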
That said, you can approximate the server-push benefits by using Comet techniques. But that is much more involved than simply anticipating incoming HTTP requests.

You'll get a better result by caching resources effectively, i.e. setting proper cache headers and configuring your web server for caching. You can also inline images using Base64 encoding, if that's a specific concern.
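Inlining an image as a Base64 `data:` URI, as suggested above, might look like this. The image bytes here are a stand-in; in practice you would read them from `img.jpg` on disk:

```python
import base64

# Stand-in payload; in a real page these bytes would come from img.jpg.
image_bytes = b"\xff\xd8\xff\xe0-fake-jpeg-payload"

# Encode the bytes and embed them directly in the HTML, so the browser
# needs no second request to fetch the image.
encoded = base64.b64encode(image_bytes).decode("ascii")
img_tag = '<img src="data:image/jpeg;base64,%s">' % encoded
print(img_tag)
```

The trade-off is that the inlined image can no longer be cached separately from the HTML, and Base64 inflates the payload by roughly a third.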
You can also look at long polling javascript solutions.

You're looking for server push: it isn't available in HTTP. Protocols like SPDY have it, but you're out of luck if you're restricted to HTTP.

I don't think it is possible to mix .html and image data in the same HTTP response. As for sending image data 'immediately', right after the first request: there is a concept of 'static resources' which could be of help (but it will require the client to create a new request for a specific resource).
There are a couple of interesting things mentioned in the article.

No it is not possible.
The first line of the request holds the resource being requested so you wouldn't know what to respond with unless you examined the bytes (at least one line's worth) of the request first.
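The point above can be shown concretely: the request line is the first thing on the wire, and parsing it is how the server learns which resource to serve. The raw bytes below are a hypothetical request, made up for the example:

```python
# A hypothetical HTTP/1.1 request as it arrives on the socket.
raw = b"GET /img.jpg HTTP/1.1\r\nHost: example.com\r\n\r\n"

# The server must read at least this first line before it knows
# what to respond with.
request_line = raw.split(b"\r\n", 1)[0].decode("ascii")
method, target, version = request_line.split(" ")
print(method, target, version)  # only now is the requested resource known
```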

No. HTTP is defined as a request/response protocol. One request: one response. Anything else is not HTTP, it is something else, and you would have to specify it properly and implement it completely at both ends.

Related

nginx: Take action on proxy_pass response headers

I'd like to use nginx as a front-end proxy, but then have it conditionally proxy to another URL depending on the MIME type (Content-Type header) of the response.
For instance, suppose 1% of my clients are using a User-Agent that doesn't handle PNGs. For that UA, if the response is of type, image/png, I want to proxy_pass again to a special URL that'll get the PNG and convert it for me.
Ideally I'd do this without hurting performance and caching for the 99% of users that don't need this special handling. I can't modify the backend application. (Otherwise I could have it detect the UA and fix the response, or send an X-Accel-Redirect to get nginx to run another location block.)
If this isn't possible or has bad performance, where would I look to start writing a module to achieve the desired effect? As in, which extension point gets me closest to implementing this logic?
Edit: It seems like I could use Lua to perform a subrequest and then inspect the response headers there. But that would mean passing every request through Lua, which seems suboptimal.
Although I'm sure there could be valid reasons to do what you want to do, your actual image/png example is not as straightforward as it may look. Browsers no longer include image/png in their Accept HTTP request headers like they used to in the old days when PNG was new, so you'd have to maintain detection and mapping tables for the really, really old browsers.
Additionally, from the architecture perspective, it's not very clear what you are trying to accomplish.
If this is static data, then why are you proxying it to the backend in the first place, instead of serving the static data directly from nginx? Will the \.png$ regex not match the affected request URIs? Couldn't you solve this without involving a backend, or even by rewriting the request without sending the wrong one to the backend first?
If this is really dynamic, then why do you need to waste the time for making a request only to receive a reply of an unacceptable type, instead of having the special-case mapping tables based on your knowledge of how the app works, and bypassing the needless requests from the start, instead of discarding them later on?
If the app is truly a black box, and you require a general purpose solution that'll work for any app, then it's still unclear what the usecase is, and why the extra requests have to be made, only to be discarded.
If you're really looking to mess with only 1% of your traffic as in your image/png example, then perhaps it might make sense to redirect all requests from the affected 1% of the old browsers to a separate backend, which would have the logic to do what you require.
Frankly, if you want to target the really-really old browsers, then I think png support should be the last of your worries. Many webapps include very complex and special-purpose JavaScript that wouldn't even work in new alternative browsers with new User-Agent strings, let alone any old webbrowsers that didn't even have png support.
So, according to http://wiki.nginx.org/HttpCoreModule#.24http_HEADER, I assume you can have something along the lines of
if ($content_type ~ "whatever your content type") {
    proxy_pass http://your_url;
}
Would that be something that works for you?

Robot request to an ASP.Net app

Is there a way to determine if a http request to an ASP.Net application is made from a browser or from a robot/crawler? I need to differentiate this two kind of requests.
Thanks!
No, there isn't. There is no foolproof way to determine what originated a request - all HTTP headers can be spoofed.
Some crawlers (GoogleBot and such) do advertise themselves, but that doesn't mean a person browsing can't pretend to be GoogleBot.
The best strategy is to look for the well-known bots (by User-Agent header and possibly by known IP address) and assume those are crawlers.
Well... if the robot wants to be recognized as a robot, yes, because it can easily pretend to be a web browser.
Personally, I would use this list to start: http://www.robotstxt.org/db.html
Have a look at Request.Browser.Crawler, but that only works for some crawlers.
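The "look for well-known bots" strategy from the answers above can be sketched as a simple substring match on the User-Agent. The bot list here is illustrative, not exhaustive, and as noted, a spoofed User-Agent will still slip through:

```python
# Illustrative (not exhaustive) list of substrings found in the
# User-Agent strings of well-known crawlers.
KNOWN_BOTS = ("googlebot", "bingbot", "slurp", "duckduckbot", "baiduspider")

def looks_like_crawler(user_agent):
    """Best-effort guess; a spoofed User-Agent defeats this check."""
    ua = user_agent.lower()
    return any(bot in ua for bot in KNOWN_BOTS)

print(looks_like_crawler("Mozilla/5.0 (compatible; Googlebot/2.1)"))
print(looks_like_crawler("Mozilla/5.0 (Windows NT 10.0; rv:120.0) Firefox/120.0"))
```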

Custom HTTP Headers with old proxies

Is it true that some old proxies/caches will not honor some custom HTTP headers? If so, can you prove it with sections from the HTTP spec or some other information online?
I'm designing a REST API interface. For versioning I'm debating whether to use version as a part of the URL like (/path1/path2/v1 OR /path1/path2?ver=1) OR to use a custom Accepts X-Version header.
I was just reading in O'Reilly's Even Faster Websites about how mainly internet security software, but really anything that has to check the contents of a page, might filter the Accept-Encoding header in order to reduce the CPU time spent decompressing and reading the file. The book cites that about 15% of users have this issue.
However, I see no reason why other, custom headers would be filtered. On the other hand, there also isn't really any reason to send it as a header and not with GET, is there? It's not really part of the HTTP protocol; it's just your API.
Edit: Also, see the actual section of the book I mention.

Tamper with first line of URL request, in Firefox

I want to change first line of the HTTP header of my request, modifying the method and/or URL.
The (excellent) Tamperdata firefox plugin allows a developer to modify the headers of a request, but not the URL itself. This latter part is what I want to be able to do.
So something like...
GET http://foo.com/?foo=foo HTTP/1.1
... could become ...
GET http://bar.com/?bar=bar HTTP/1.1
For context, I need to tamper with (make correct) an erroneous request from Flash, to see if an error can be corrected by fixing the url.
Any ideas? Sounds like something that may need to be done on a proxy level. In which case, suggestions?
Check out Charles Proxy (multiplatform) and/or Fiddler2 (Windows only) for more client-side solutions - both of these run as a proxy and can modify requests before they get sent out to the server.
If you have access to the webserver and it's running Apache, you can set up some rewrite rules that will modify the URL before it gets processed by the main HTTP engine.
For those coming to this page from a search engine, I would also recommend the Burp Proxy suite: http://www.portswigger.net/burp/proxy.html
Although more specifically targeted towards security testing, it's still an invaluable tool.
If you're trying to intercept the HTTP packets and modify them on the way out, then Tamperdata may be the route you want to take.
However, if you want minute control over these things, you'd be much better off simulating the entire browser session using a utility such as curl.
curl: http://curl.haxx.se/

Is there any way to check if a POST url exists?

Is there any way to determine if a POST endpoint exists without actually sending a POST request?
For GET endpoints, it's no problem to check for 404s, but I'd like to check POST endpoints without triggering whatever action resides at the remote URL.
Sending an OPTIONS request may work
It may not be implemented widely but the standard way to do this is via the OPTIONS verb.
WARNING: This should be idempotent but a non-compliant server may do very bad things
OPTIONS
Returns the HTTP methods that the server supports for specified URL. This can be used to check the functionality of a web server by requesting '*' instead of a specific resource.
More information here
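A sketch of the OPTIONS approach: send an OPTIONS request and read the Allow header to see whether POST is listed, without ever triggering the POST action. The local server below is a stand-in for the remote endpoint (and, per the warning above, a non-compliant server may not answer this way at all):

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in for the remote endpoint: advertises which methods it supports.
class Handler(BaseHTTPRequestHandler):
    def do_OPTIONS(self):
        self.send_response(204)
        self.send_header("Allow", "OPTIONS, GET, POST")
        self.end_headers()

    def log_message(self, *args):  # keep the example's output clean
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Probe the endpoint with OPTIONS instead of POST.
conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("OPTIONS", "/submit")
resp = conn.getresponse()
allowed = resp.getheader("Allow", "")
print("POST supported:", "POST" in allowed)
server.shutdown()
```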
This is not possible by definition.
The URL that you're posting to could be run by anything, and there is no requirement that the server behave consistently.
The best you could do is to send a GET and see what happens; however, this will result in both false positives and false negatives.
You could send a HEAD request, if the server you are calling supports it - the response will typically be way smaller than a GET.
Does endpoint = script? It was a little confusing.
I would first point out: why would you be POSTing somewhere if it doesn't exist? It seems a little silly.
Anyway, if there is really some element of uncertainty about your POST URL, you can use cURL and set the header option in the cURL response. I would suggest that if you do this, you save all validated POST URLs, since it's likely the POST URL would be used again.
You can send your entire POST at the same time as doing the cURL, then check to see if it errored out.
I think you probably answered this question yourself in the tags of your question, with cURL.
