I wanted to do an existence check before I actually GET an item, and I was planning to use a HEAD request. But my server is having problems with HEAD requests.
It returns an error 403 for new items. I have to make a GET request before making a HEAD request for new items, or my HEAD request consistently returns a 403.
I cannot change anything about my server. What alternatives do I have? I really don't want to download the items to do an existence check (the items are images).
HTTP ranges could be an option, for example, using curl to get the first 200 bytes:
curl -r 0-199 http://example.com
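The same idea can be sketched in Python. This is only an illustration under assumptions: the function name exists_via_range is made up, redirects are not followed, and treating 416 (for example an empty file) as "exists" is a choice, not something the server in the question is known to do.

```python
import http.client
from urllib.parse import urlsplit


def exists_via_range(url: str, timeout: float = 10.0) -> bool:
    """Probe `url` with a GET for its first byte instead of a HEAD."""
    parts = urlsplit(url)
    connection_class = (
        http.client.HTTPSConnection
        if parts.scheme == "https"
        else http.client.HTTPConnection
    )
    connection = connection_class(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        connection.request("GET", path, headers={"Range": "bytes=0-0"})
        response = connection.getresponse()
        # 206: the server honoured the range and sends a single byte.
        # 200: the server ignored the range; the body is never read here
        #      and the connection is closed below.
        # 416: the range cannot be satisfied (e.g. empty resource);
        #      counted as existing, which is an assumption of this sketch.
        return response.status in (200, 206, 416)
    finally:
        connection.close()
```

Any other status (404, 403, a redirect, ...) is reported as "does not exist" here; adjust that set to whatever your server actually returns.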
I’m wondering if there is a general convention for this: when implementing an HTTP health check for an application where you are not interested in the response body, only the status code, what would the default/expected endpoint look like?
Using a HEAD request - and returning 200 or 204 status code (which one of those?)
Using a GET with 204
something else?
In my experience, people mostly use GET and 200. A health check wouldn't return much content, so there is little point in making a HEAD request. But this mostly applies when there is a dedicated health check URL.
Today's cloud systems often use Kubernetes or OpenShift. Their HTTP probes use a GET request and treat any status code from 200 through 399 as success:
https://docs.openshift.com/enterprise/3.0/dev_guide/application_health.html
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
Another example, Drupal defines the HTTP response code to be 200:
https://www.drupal.org/project/health_check_url
In Oracle's Infrastructure-as-a-Service docs you can choose between GET and HEAD requests, but the default is HEAD:
https://docs.oracle.com/en-us/iaas/api/#/en/healthchecks/20180501/HttpMonitor/
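As a rough illustration of what such a probe does on the client side, here is a minimal Python sketch. The path /healthz and the function name are assumptions for the example, and the 200 to 399 success window follows the Kubernetes convention; other tools may accept only 200.

```python
import http.client


def is_healthy(host: str, port: int, path: str = "/healthz",
               method: str = "GET", timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers with a 2xx or 3xx status."""
    connection = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        connection.request(method, path)
        status = connection.getresponse().status
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts and malformed replies count as unhealthy.
        return False
    finally:
        connection.close()
    # Only the status code matters; the body is never read.
    return 200 <= status < 400
```

Passing method="HEAD" gives the variant some monitors default to, provided the server answers HEAD on that path.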
Use a GET with 204, possibly also supporting HEAD with the same status code
A HEAD should give the same response as GET but without the response body, so you should first know/define what the GET response returns in terms of headers (and status code). Then, if you want, you can also support HEAD on the same endpoint, returning the same status, in this case 204.
Note that if GET employee/34 answers with 404, HEAD must answer with the same code. That means one must do the same work as for GET (check whether the employee exists, set the status, etc.) but must not write any response body. Tomcat supports this automatically: for HEAD requests it uses a response object that never writes to the "real" response, so one can use the same code that handles GET.
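To make the "same work, no body" point concrete, here is a small sketch using Python's standard http.server instead of Tomcat. The EMPLOYEES table and the handler name are invented for the example.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical data store for the example.
EMPLOYEES = {"34": b'{"id": 34, "name": "Example"}'}


class EmployeeHandler(BaseHTTPRequestHandler):
    def _send_head(self):
        """Shared work for GET and HEAD: look up, send status and headers.

        Returns the body bytes, or None if there is nothing to write.
        """
        prefix = "/employee/"
        key = self.path[len(prefix):] if self.path.startswith(prefix) else None
        body = EMPLOYEES.get(key)
        if body is None:
            self.send_response(404)
            self.send_header("Content-Length", "0")
            self.end_headers()
            return None
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        return body

    def do_GET(self):
        body = self._send_head()
        if body is not None:
            self.wfile.write(body)

    def do_HEAD(self):
        # Same status and headers as GET, but the body is never written.
        self._send_head()

    def log_message(self, format, *args):
        pass  # keep the example quiet
```

To try it, pass the handler to http.server.HTTPServer(("127.0.0.1", 8000), EmployeeHandler) and call serve_forever(); the port is arbitrary.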
For a check one may also consider TRACE, but it is different: it produces a response body that mirrors what you send to it. I haven't seen it implemented anywhere.
TRACE allows the client to see what is being received at the other
end of the request chain and use that data for testing or diagnostic
information.
I am looking for an appropriate HTTP status code that tells the receiver that just the meta-data is being sent, not the complete data.
For example, say you do an HTTP GET:
GET /foo?meta_data_only=yes
the server won't look up the complete data, just send some metadata back about the endpoint, for example. Is there an HTTP status code for the response that can represent this? I would guess it's in the 200s or 300s somewhere?
Since your metadata is being returned in the headers, I would send a status code of 204 No Content.
https://httpstatuses.com/204
The server has successfully fulfilled the request and that there is no
additional content to send in the response payload body.
Metadata in
the response header fields refer to the target resource and its
selected representation after the requested action was applied.
This sounds exactly like what you’re looking for: a successful response that contains no body, and metadata in the headers that provides additional information about the resource.
Another thing worth noting is that it’s common practice to use the HTTP verb HEAD when you only want metadata. HEAD is very similar to GET, except that it specifies that you do not want a body back. For example, if you do a HEAD to an image URL, you will get the same status a GET would return (normally 200 OK) and some metadata about the file such as Content-Type, Content-Length, maybe ETag, but you won’t be sent any of the file data. A lot of web servers (such as Nginx) support this behavior out of the box for static files. I would recommend that you stop using your querystring parameter and instead implement HEAD versions of your endpoints. That would make the intention even more clear and intuitive.
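From the client side, a HEAD request for metadata could look roughly like this in Python. The function name and the choice of headers to return are assumptions for the sketch; a given server may omit any of them.

```python
import http.client
from urllib.parse import urlsplit


def head_metadata(url: str, timeout: float = 10.0):
    """Send a HEAD request and return (status, selected headers)."""
    parts = urlsplit(url)
    connection_class = (
        http.client.HTTPSConnection
        if parts.scheme == "https"
        else http.client.HTTPConnection
    )
    connection = connection_class(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        connection.request("HEAD", path)
        response = connection.getresponse()
        # A HEAD response carries headers only; there is no body to read.
        wanted = ("Content-Type", "Content-Length", "ETag")
        headers = {name: response.getheader(name) for name in wanted}
        return response.status, headers
    finally:
        connection.close()
```

Headers the server does not send come back as None, so callers should check before using them.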
I'm using "curl -L --post302 --request PUT --data-binary #file " to send a file to a redirected address. At the moment the redirection is not optional, since it allows for signed headers and a new destination. The GET version works well. The PUT version also works under a certain file size threshold. I need a way for the PUT to allow itself to be redirected without sending the file on the first request (to the redirectorURL), and then only send the file when the request is redirected to the new URL. In other words, I don't want to transfer the same file twice. Is this possible? According to the RFC (https://www.rfc-editor.org/rfc/rfc2616#section-8.2) it appears that a server may send a 100 "with an undeclared wait for 100 (Continue) status, applies only to HTTP/1.1 requests without the client asking to send its payload", so what I'm asking for may be thwarted by the server. Is there a way around this with one curl call? If not, two curl calls?
Try curl -L -T file $URL as the more "proper" way to PUT that file. (Often repeated by me: -X and --request should be avoided if possible, they cause misery.)
curl will send "Expect: 100-continue" by itself in this case, but you'll probably also learn that servers widely don't care about supporting it anyway, so it will most likely still end up having to PUT twice...
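For readers unfamiliar with the handshake, here is a bare-bones Python sketch of what "Expect: 100-continue" is meant to achieve: send only the headers, wait briefly, and upload the body only if the server does not answer with a final status first. It is an illustration, not curl's implementation; the function names and the one-second wait are assumptions, and it handles a single hop (the caller would repeat the call against the Location it gets back).

```python
import socket


def _read_head(sock, buffer=b""):
    """Read one response head (status line plus headers) from the socket."""
    while b"\r\n\r\n" not in buffer:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("connection closed before a response head arrived")
        buffer += chunk
    head, _, rest = buffer.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status = int(lines[0].split(" ", 2)[1])
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers, rest


def put_with_expect(host, port, path, body, wait_seconds=1.0):
    """PUT `body`, holding it back until the server reacts to the headers.

    Returns (status, headers, body_was_sent).
    """
    request_head = (
        f"PUT {path} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Expect: 100-continue\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    with socket.create_connection((host, port)) as sock:
        sock.sendall(request_head)
        sock.settimeout(wait_seconds)
        try:
            status, headers, rest = _read_head(sock)
        except socket.timeout:
            # No reaction at all: the server probably ignores Expect, so the
            # body has to be sent anyway (the "PUT twice" case).
            status, headers, rest = 100, {}, b""
        if status != 100:
            # A final answer (for example a 307 redirect) arrived before any
            # body bytes left the client.
            return status, headers, False
        sock.settimeout(30.0)
        sock.sendall(body)
        status, headers, _ = _read_head(sock, rest)
        return status, headers, True
```

If the redirector answers the headers with a 3xx right away, the file is never transferred to it; if it stays silent, the sketch falls back to sending the body, which is exactly the situation described above.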
I use the SoundCloud API to retrieve the stream URL for a streamable track.
I follow the redirect and I end up with a URL that looks like:
http://ec-media.soundcloud.com/eodihgiuh.128.mp3?<a string>
AWSAccessKeyId=<access key>
&Expires=<timestamp>
&Signature=<signature>
or
http://ak-media.soundcloud.com/euieuieie.128.mp3?
AWSAccessKeyId=<access key>
&Expires=<timestamp>
&Signature=<signature>
&__gda__=<a string>
Then I start streaming the MP3 data at this URL.
First I send a HEAD request to read the Content-Length header, so that I know how many GET requests I will have to send in order to play the whole song.
Then I send several partial GET requests, each one with a different Range header.
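To illustrate the arithmetic only (the real code runs on an 8-bit device, not in Python), the Range headers for a given Content-Length can be derived like this. The function name and the chunk size are assumptions for the example.

```python
from typing import List


def range_headers(content_length: int, chunk_size: int) -> List[str]:
    """Split `content_length` bytes into Range header values of `chunk_size`."""
    if content_length < 0 or chunk_size <= 0:
        raise ValueError("content_length must be >= 0 and chunk_size > 0")
    headers = []
    for start in range(0, content_length, chunk_size):
        # Byte ranges are inclusive, so the last byte is start + size - 1.
        end = min(start + chunk_size, content_length) - 1
        headers.append(f"bytes={start}-{end}")
    return headers


print(range_headers(10, 4))  # ['bytes=0-3', 'bytes=4-7', 'bytes=8-9']
```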
The problem is that sometimes the HEAD request returns a 403 status code, even though a GET request to the exact same URL returns with a 200 status code. It seems that this happens if and only if the host is ak-media.soundcloud.com.
Is this supposed to happen? I expected the HEAD request to return exactly the same headers as the GET request, only without the body response.
Cheers,
PB
P.S: I should probably mention that my code is not running on a computer, but on an audio device with a tiny 8-bit processor which has extremely limited resources.
Unfortunately, currently we only offer guaranteed proper response for GET requests.
As a hack, you could try to do requests with very short ranges.
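A sketch of that hack in Python: request a single byte and read the total size from the Content-Range header of the 206 response, which replaces the Content-Length that the HEAD request was supposed to provide. The function names are made up, and whether a particular host answers range requests this way is an assumption.

```python
import http.client
import re
from typing import Optional


def total_length_from_content_range(value: str) -> Optional[int]:
    """Parse e.g. 'bytes 0-0/12345' and return 12345 (None if unknown)."""
    match = re.fullmatch(r"bytes\s+(?:\d+-\d+|\*)/(\d+|\*)", value.strip())
    if match is None or match.group(1) == "*":
        return None
    return int(match.group(1))


def probe_length(host: str, path: str, port: int = 80,
                 timeout: float = 10.0) -> Optional[int]:
    """GET only the first byte and report the size of the whole resource."""
    connection = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        connection.request("GET", path, headers={"Range": "bytes=0-0"})
        response = connection.getresponse()
        if response.status == 206:
            return total_length_from_content_range(
                response.getheader("Content-Range", "")
            )
        if response.status == 200:
            # Range was ignored: Content-Length describes the whole body,
            # which is not read here.
            length = response.getheader("Content-Length")
            return int(length) if length is not None else None
        return None
    finally:
        connection.close()
```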
I want the browser to reflect some other URL than the one used to create the request, but without roundtripping to the server.
I would maybe do this:
POST /form HTTP/1.1
...
...and then return:
HTTP/1.1 200 OK
Location: /hello
But that would cause a redirect: the browser would then request the URL /hello again.
I would like to just tell the browser that, while the request you just sent was POST /some_url, the actual resource that I'm now returning is called GET /hello/1, but without performing a roundtrip. i.e. Location: ...
Is there any way to do this with JavaScript or the base="" attribute? Something that will tell the browser to request /hello/1 when I hit F5 (refresh), instead of showing that post submission warning?
HTTP/1.1 200 OK
Location: /hello
Actually that probably wouldn't work; it should be a 30x status rather than 200 (“303 See Other” is best for the response to a POST), and ‘Location’ should be a complete absolute URL.
(If your script just says ‘Location: /relativeurl’ without the 30x status, CGI servers will usually do an internal redirect by fetching the new URL and returning it without telling the browser anything funny happened. This may sound like what you want but it isn't really because from the browser's point of view it's no different from the original script returning a 200 and direct page.)
But that would cause a redirect: the browser would then request the URL /hello again.
In practice that's probably not as bad as you think, thanks to HTTP/1.1 keep-alives. The client should be able to respond to the redirect straight away (in the next packet) as long as it's on the same server.
Is there any way [...] That will tell the browser to request /hello/1 when I hit F5 (refresh) instead of that, post submission warning?
Nope. Stick with the POST-Redirect-GET model for solving this.
No. HTTP is stateless, and every request has one answer. When you POST, you need to redirect to a GET page immediately to prevent a double post; you don't want the browser to sit on that POST URL. The redirect is what tells the browser that it is on a new page. That's just the way it works.
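For completeness, a minimal sketch of the POST-Redirect-GET model using Python's standard http.server. The handler name, the /hello/1 target and the plain-text page are assumptions for the example; the absolute Location URL is built from the request's Host header, following the advice above.

```python
from http.server import BaseHTTPRequestHandler


class FormHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Consume the submitted form data (real processing is omitted).
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        # 303 See Other: the browser fetches the result with a fresh GET,
        # so pressing F5 afterwards repeats the GET, not the POST.
        host = self.headers.get("Host", "localhost")
        self.send_response(303)
        self.send_header("Location", f"http://{host}/hello/1")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        if self.path != "/hello/1":
            self.send_response(404)
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        body = b"Hello, your form was submitted.\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet
```

To try it, pass the handler to http.server.HTTPServer(("127.0.0.1", 8000), FormHandler) and call serve_forever(); the port is arbitrary.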