SoundCloud API: GET succeeds, HEAD fails - http

I use the SoundCloud API to retrieve the stream URL for a streamable track.
I follow the redirect and I end up with an URL that looks like:
http://ec-media.soundcloud.com/eodihgiuh.128.mp3?<a string>
AWSAccessKeyId=<access key>
&Expires=<timestamp>
&Signature=<signature>
or
http://ak-media.soundcloud.com/euieuieie.128.mp3?
AWSAccessKeyId=<access key>
&Expires=<timestamp>
&Signature=<signature>
&__gda__=<a string>
Then I start streaming the MP3 data at this URL.
First I send a HEAD request to read the Content-Length header, so that I know how many GET requests I will have to send in order to play the whole song.
Then I send several partial GET requests, each one with a different Range header.
The problem is that sometimes the HEAD request returns a 403 status code, even though a GET request to the exact same URL returns with a 200 status code. It seems that this happens if and only if the host is ak-media.soundcloud.com.
Is this supposed to happen? I expected the HEAD request to return exactly the same headers as the GET request, only without the body response.
Cheers,
PB
P.S: I should probably mention that my code is not running on a computer, but on an audio device with a tiny 8-bit processor which has extremely limited resources.

Unfortunately, currently we only offer guaranteed proper response for GET requests.
As a hack, you could try to do requests with very short ranges.

Related

HTTP GET for a large string payload

I have a requirement where I need to make a HTTP request to a Flask server where the payload is a question(string) and a paragraph(string). The server uses machine learning to find the answer to the question within the paragraph and return it.
Now, the paragraph can be huge, as in thousands of words. So will a GET request with a JSON payload be appropriate? or should I be using POST?
will a GET request with a JSON payload be appropriate?
No - the problem here is that the payload of a GET request has no defined semantics; you have no guarantees that intermediate components will do the right thing with your request.
For example: caches are going to assume that the payload of the request is irrelevant, so your GET request might get a response for a completely different document.
should I be using POST?
Today, you should be using POST.
Eventually, you'll probably end up using the safe-method-with-body, once the HTTP-WG figures out the semantics of the new method and adoption has taken hold.

HTTP health check - GET or HEAD and 200 or 204 response?

I’m wondering if there is a general convention for this: When implementing a HTTP health check for any given application where you are not interested in any response body but just the status code, what would the default/expected endpoint look like?
Using a HEAD request - and returning 200 or 204 status code (which one of those?)
Using a GET with 204
something else?
As of my experience, people use mostly GET and 200. A health check wouldn't respond too much content, so no use of making a HEAD request. But this is mostly the case with a dedicated health check URL.
Today's cloud systems often use Kubernetes or OpenShift. They appear to use a GET request. I think they'll probably want to get a 200ish response code, so 200-299:
https://docs.openshift.com/enterprise/3.0/dev_guide/application_health.html
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
Another example, Drupal defines the HTTP response code to be 200:
https://www.drupal.org/project/health_check_url
In Oracle's Infrastructure-as-a-Service docs you can choose between GET and HEAD requests, but the default is HEAD:
https://docs.oracle.com/en-us/iaas/api/#/en/healthchecks/20180501/HttpMonitor/
Use a GET with 204 possibly supporting also HEAD with same status code
A HEAD should give the same response as GET but without response body, so you should first know/define what the GET response gives out in terms of headers (and status code), then, if you want, you can support also HEAD on the same endpoint, returning the same status, in this case 204.
Note that if GET employee/34 anwswers with 404 also HEAD must anwser with same code. That means one must do the same work as for GET: check if employee esists, set status etc. but must not write any response. Tomcat supports this automatically as it uses for HEAD request a response object that never writes to the "real" response, so one can use same code handling GET
For a check one may consider also TRACE but it produces a response body / output mirroring what you send to it, is different, I haven't seen implemented anywhere.
TRACE allows the client to see what is being received at the other
end of the request chain and use that data for testing or diagnostic
information.

HTTP status code for sending back just the meta-data not full data

I am looking for an appropriate HTTP status code that tells the receiver that just the meta-data is being sent, not the complete data.
For example, say you do an HTTP GET:
GET /foo?meta_data_only=yes
the server won't look up the complete data, just send some metadata back about the endpoint, for example. Is there an HTTP status code for the response that can represent this? I would guess it's in the 200s or 300s somewhere?
Since your metadata is being returned in the headers, I would send a status code of 204 No Content.
https://httpstatuses.com/204
The server has successfully fulfilled the request and that there is no
additional content to send in the response payload body.
Metadata in
the response header fields refer to the target resource and its
selected representation after the requested action was applied.
This sounds exactly like what you’re looking for: a successful response that contains no body, and metadata in the headers that provide additional about the resource.
Another thing worth noting is that it’s common practice to use the HTTP verb HEAD when you only want metadata. HEAD is very similar to GET, except that it specifies that you do not want a body back. For example if you do a HEAD to an image url, you will get a 204 No Content response and some metadata about the file such as Content-Type, Content-Size, maybe ETag, but you won’t be sent all of the file data. A lot of web servers (such as Nginx) support this behavior out of the box for static files. I would recommend that you stop using your querystring parameter, and instead implement HEAD versions of your endpoints. That would make the intention even more clear and intuitive.

What does client send after receiving a "100 Continue" status code?

Client sends a POST or PUT request that includes the header:
Expect: 100-continue
The server responds with the status code:
100 Continue
What does the client send now? Does it send an entire request (the previously send request line and headers along with the previously NOT sent content)? Or does it only send the content?
I think it's the later, but I'm struggling to find concrete examples online. Thanks.
This should be all the information you need regarding the usage of a 100 Continue response.
In my experienced this is really used when you have a large request body. It could be considered to be roughly complementary to the HEAD method in relation to GET requests - fetch just the header information and not the body (usually) to reduce network load. 100 responses are used to determine whether the server will accept the request based purely on the headers - so that, for example, if you try and send a large POST/PUT request to a non-existent server resource it will result in a 404 before the entire request body has been sent.
So the short answer to your question is - yes, it's the latter. Although, you should always read the RFC for a complete picture. RFC2616 contains 99% of the information you would ever need to know about HTTP - there are some more recent RFCs that settle some ambiguities and offer a few small extensions to the protocol but off the top of my head I can't remember what they are.

Does sending POST data to a server that doesn't accept post data recieve the data?

I am setting up a back end API in a script of mine that contacts one of my sites by sending XML to my web server in the form of POST data. This script will be used by many and I want to limit the bandwidth waste for people that accidentally turn the feature on without a proper access key.
I will be denying requests that do not have the correct access key by maybe generating a 403 access code.
Lets say the POST data is ~500kb of data. Does the server receive all 500kb of data when this attempt is made regardless of the status code?
How about if I made the url contain the key mydomain/api/123456789 and generate 403 status on all bad access keys.
Does the POST data still get sent/received regardless or is it negotiated before the data is finally sent.
Thanks in advance!
Generally speaking, the entire request will be sent, including post data. There is often no way for the application layer to return a response like a 403 until it has received the entire request.
In reality, it will depend on the language/framework used and how closely it is linked to the HTTP server. Section 8.2.2 of RFC2616 HTTP/1.1 specification has this to say
An HTTP/1.1 (or later) client sending
a message-body SHOULD monitor the
network connection for an error status
while it is transmitting the request.
If the client sees an error status, it
SHOULD immediately cease transmitting
the body. If the body is being sent
using a "chunked" encoding (section
3.6), a zero length chunk and empty trailer MAY be used to prematurely
mark the end of the message. If the
body was preceded by a Content-Length
header, the client MUST close the
connection.
So, if you can find a language environemnt closely linked with the HTTP server (for example, mod_perl), you could do this in a way which does comply with standards.
An alternative approach you could take is to make an initial, smaller request to obtain a URL to use for the larger POST. The application can then deny providing the URL to clients without an appropriate key.
Here is great book about RESTful Web Services, where it's explained how HTTP works: http://oreilly.com/catalog/9780596529260
You can consider any request as envelope, where on top of it it's written address (URL), some properties (HTTP Headers) and inside it there's some data (if request is initiated by post method). So as you might guess you can't receive envelope partially.
Oh I forgot, it's when you are using HTTP Post with standard HTTP header "application/x-www-form-urlencoded" but if you are uploading files (correspondingly using ""multipart/form-data") Django gives you control over streamed chunks of files using Middleware classes: http://docs.djangoproject.com/en/dev/topics/http/middleware/

Resources