What does "subrange" mean in the HTTP spec? - http

See, for example, §13.3.3 and §13.3.4.
It doesn't seem to me that this could be related to "media range" (§14.1, e.g. Accept: text/*), nor "language range" (§14.4, e.g. Accept-Language: da, en-gb;q=0.8, en;q=0.7).
Maybe it's the "accept range" (§14.5), which puts byte limitations on a response? If that's true, how do ETags relate?

I'm pretty sure it's for range retrieval requests, i.e. requesting part of a document (resuming a file download, for example).
14.35.2 Range Retrieval Requests
HTTP retrieval requests using conditional or unconditional GET methods MAY request one or more sub-ranges of the entity, instead of the entire entity, using the Range request header, which applies to the entity returned as the result of the request:
If the ETag is weak (starts with W/) then it can't be used for a range retrieval - only strong validators can be used for that or the client may end up with an inconsistent file.
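As a minimal sketch of the point about validators, the helper below builds the headers for resuming a download and refuses to combine a weak ETag with a range request. The function name and the choice to fall back to a full download (empty header set) are assumptions for illustration; Range and If-Range are the standard header names.

```python
def range_request_headers(first_byte, etag=None):
    """Build headers to resume a download from first_byte onward.

    Hypothetical helper: returns an empty dict when only a weak ETag
    is available, meaning "request the whole entity instead".
    """
    if etag is not None and etag.startswith("W/"):
        # Weak validators cannot guarantee byte-for-byte equality,
        # so a partial response could be stitched onto stale bytes.
        return {}
    headers = {"Range": f"bytes={first_byte}-"}
    if etag is not None:
        # If-Range: send the range only if the entity is unchanged,
        # otherwise the server sends the full entity.
        headers["If-Range"] = etag
    return headers
```

With a strong ETag the client sends both headers; with a weak one it simply restarts the download.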


KDB: Difference between .Q.hp/.Q.hg and the built in HTTP request. And do either persist/Keep alive?

It seems an HTTP GET can be performed using either .Q.hg or the built-in HTTP request, like
`:http://host:port "string to send as HTTP method etc"
(from https://code.kx.com/q/kb/programming-examples/)
Is there any difference?
And do either persist/keep-alive by default?
Thank you
Using .Q.hg allows you to use a string formatted like a web-based URL request, e.g. for requesting some csv data from a server:
t:.Q.hg`$":http://www.website.com/report1/format=csv&cols=sym&cols=price&date=20200630";
/the resulting string contains the data only (no metadata/headers) and can be parsed directly
("SF";1#csv)0:t
The GET equivalent is not like a browser url, however it does return the metadata/headers (which in turn makes it messier to parse), e.g.
t:(hsym`$"http://www.website.com") "GET /report1/format=csv&cols=sym&cols=price&date=20200630 HTTP/1.1\r\nhost:www.website.com\r\n\r\n";
/result looks like
"HTTP/1.1 200 OK\r\nDate: Fri, 03 Jul 2020 14:46:33 GMT\r\nContent-Type: application/txt\r\nContent-Length: 1345\r\nConnection: keep-alive ...."
/parsed using something like (strip away metadata to get to the data)
("SF";1#csv)0:_[;t]3+first t ss "\n\r\n"
The resulting metadata/header shows "Connection: keep-alive" in my example that I've just tested so perhaps that's the default? I'm not 100% on that.
.Q.hg also has the advantage of being compatible with HTTPS and making use of proxies as per the documentation: https://code.kx.com/q/ref/dotq/#qhg-http-get
.Q.hg and .Q.hp have a similar functionality to the example outlined in the link, without having to construct the HTTP requests as strings (these functions will construct the strings for you). The example was perhaps written before the .Q.hg/.Q.hp functions were introduced in v3.4.
I don't think either persists by default, assuming they use the HTTP 1.0 protocol.
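For readers who want to see the header-stripping step outside q, here is a rough Python sketch of the same idea: split the raw response at the blank line and read the Connection header. The function name and the sample response are made up for illustration and are not part of kdb+.

```python
def split_http_response(raw):
    """Split a raw HTTP response string into status line, headers, body."""
    # Headers and body are separated by an empty line (CRLF CRLF).
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lower-cased.
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


# Hypothetical response, shaped like the one shown above.
raw = ("HTTP/1.1 200 OK\r\n"
       "Content-Type: application/txt\r\n"
       "Connection: keep-alive\r\n"
       "\r\n"
       "sym,price\nA,1.5\n")
status, headers, body = split_http_response(raw)
```

Inspecting `headers["connection"]` is one way to check what the server actually advertised for a given request.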

What is the correct response format of a 406 HTTP status error

I want to issue a 406 Not Acceptable error in my application and I want to alert the client about the available alternative formats.
From the HTTP protocol spec:
Unless it was a HEAD request, the response SHOULD include an entity
containing a list of available entity characteristics and location(s)
from which the user or user agent can choose the one most appropriate.
The entity format is specified by the media type given in the
Content-Type header field. Depending upon the format and the
capabilities of the user agent, selection of the most appropriate
choice MAY be performed automatically. However, this specification
does not define any standard for such automatic selection.
Do I add that entity in the response body?
What's the format of that list?
The main thing in the spec is:
The entity format is specified by the media type given in the Content-Type header field
This page is quite useful:
http://chimera.labs.oreilly.com/books/1234000001708/apc.html#_proactive_negotiation
As-is the article it references:
http://bit.ly/agent-conneg
The examples there describe negotiation, in situations where you would return a 300 Multiple Choices response, a situation similar to how you might provide alternatives in a 406 Not Acceptable response. It follows the same ideas where you provide an entity in the format of the returning content type - if you are returning text, write some text; if you are returning HTML, write some HTML.
HTTP/1.1 300 Multiple Choices
Host: www.example.org
Content-Type: application/xhtml
Content-Length:XXX
<p>
Select one:
</p>
French
US English
German
A standard format would indeed be useful for automatic renegotiation, but the spec stays away from defining that. I would agree with the writer that the best way to inform of alternatives to a 406 is to use the same "Link" headers in the example from the oreilly.com page, as they describe:
An alternative approach is to use Link headers. This has the advantage of being a standard header that any client can understand. Here is an example:
HTTP/1.1 300 Multiple Choices
Host: www.example.org
Content-Length: 0
Link: <http://www.example.org/results/png>; type="image/png",
<http://www.example.org/results/jpeg>;type="image/jpeg",
<http://www.example.org/results/gif>;type="image/gif"
However, don't expect guaranteed results on every client with something that is not standardized.
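A small sketch of how a server might assemble such a 406 response with Link headers. The function names are hypothetical, and the empty body (Content-Length: 0) is just one choice; the spec says the response SHOULD carry an entity, so a real service may prefer to add one.

```python
def link_header(alternatives):
    """Format (url, media_type) pairs as a single Link header value."""
    return ", ".join(f'<{url}>; type="{media_type}"'
                     for url, media_type in alternatives)


def not_acceptable_response(alternatives):
    """Build a bare 406 response that advertises the alternatives."""
    lines = [
        "HTTP/1.1 406 Not Acceptable",
        "Content-Length: 0",
        "Link: " + link_header(alternatives),
        "",  # blank line that terminates the header section
        "",
    ]
    return "\r\n".join(lines)


# URLs taken from the example above.
alternatives = [
    ("http://www.example.org/results/png", "image/png"),
    ("http://www.example.org/results/jpeg", "image/jpeg"),
]
response = not_acceptable_response(alternatives)
```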

Why are POST params put in the request body, instead of in the URL like GET?

I understand that GET requests are meant to read data, while POST requests are meant to alter data (i.e. if a POST request is sent more than once, dicey things can happen). But why the difference in URL vs body? Putting the text in the body doesn't seem significantly more secure or private.
It's not about security or privacy, but about data.
You can send anything you want in the body, while the URI (specifically the query string) is quite restrictive in content and length.
The HTTP request has two parts: The header and the body
The header contains all information which describes the request and the requested object (path, request parameters, options, etc) and the requested operation (GET, POST, PUT, DELETE, etc).
The body contains all data which are sent by the client to process. This data could be some kind of binary data (an image for example), or some kind of form data (POST data).
This is the HTTP request specification: http://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html
Here are the definitions of the HTTP request methods:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
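To make the URL-versus-body difference concrete, this sketch builds the same parameters once as a GET query string and once as a POST body using Python's standard library. Nothing is sent over the network; the URL and parameter names are placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request

params = {"name": "Ada Lovelace", "lang": "en"}
encoded = urlencode(params)

# GET: the parameters travel in the URI's query string.
get_req = Request("http://www.example.org/search?" + encoded)

# POST: the same parameters travel in the request body, which could
# just as well hold JSON, an image, or any other bytes.
post_req = Request(
    "http://www.example.org/search",
    data=encoded.encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)
```

The encoding of the form fields is identical in both cases; only where they are carried differs.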

HTTP Range Header for Entity lists

I have resources like this
/entities # GET, POST
/entities/<id> # GET, PUT, DELETE
GET /entities gets the list of all entities.
Now I want to poll for updates. The case for a single entity is straightforward:
GET /entities/2
If-Modified-Since: <http date>
The list is tricky. I want the response to be a list of entities, updated or created since a given point in time. I'd intuitively use
GET /entities
Range: after <http date>
Which is a valid request by HTTP specification http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35.2 . But the spec also mandates a 206 Partial Content response, which has to include a Content-Range header. A Content-Range header, in turn, mandates a byte range to be specified http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.16 . This is obviously very inconvenient for my use case.
How would you request a semantic range over HTTP?
From reading section 14.35.1, I would say that the Range header is used to request a specific range of bytes from a resource, not to request a group of entities according to when they were modified.
In this case, I believe you should treat your range as a filter and pass the date as a query string parameter:
GET /entities?modified-since=<date>
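A rough sketch of the server side of that filter, assuming the date is passed in ISO 8601 form without a timezone and that each entity carries a `modified` timestamp. Both are assumptions for illustration; the original leaves the date format open.

```python
from datetime import datetime
from urllib.parse import parse_qs, urlsplit


def entities_modified_since(url, entities):
    """Return the entities changed after the modified-since parameter."""
    query = parse_qs(urlsplit(url).query)
    if "modified-since" not in query:
        # No filter given: behave like a plain GET /entities.
        return list(entities)
    since = datetime.fromisoformat(query["modified-since"][0])
    return [e for e in entities if e["modified"] > since]


# Hypothetical in-memory data set.
entities = [
    {"id": 1, "modified": datetime(2020, 7, 1)},
    {"id": 2, "modified": datetime(2020, 7, 4)},
]
updated = entities_modified_since(
    "/entities?modified-since=2020-07-03T14:46:33", entities)
```

The response is then an ordinary 200 with a filtered list, so no 206 or Content-Range is involved.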

How should I implement a COUNT verb in my RESTful web service?

I've written a RESTful web service that supports the standard CRUD operations, and that can return a set of objects matching certain criteria (a SEARCH verb), but I'd like to add a higher-order COUNT verb, so clients can count the resources matching search criteria without having to fetch all of them.
A few options that occur to me:
Ignoring the HTTP specification and returning the object count in the response body of a HEAD request.
Duplicating the SEARCH verb's logic, but making a HEAD request instead of a GET request. The server then would encode the object count in a response header.
Defining a new HTTP method, COUNT, that returns the object count in the response body.
I'd prefer the API of the first approach, but I have to strike that option because it's non-compliant. The second approach seems most semantically correct, but the API isn't very convenient: clients will have to deal with response headers, when most of the time they want to be able to do something easy like response.count. So I'm leaning toward the third approach, but I'm concerned about the potential problems involved with defining a new HTTP method.
What would you do?
The main purpose of REST is to define a set of resources that you interact with using well-defined verbs, so you should avoid defining your own verbs. The number of resources should be considered a different resource, with its own URI that you can simply GET.
For example:
GET resources?crit1=val1&crit2=val2
returns the list of resources and
GET resources/count?crit1=val1&crit2=val2
returns only their count.
Another option is to use content negotiation (conneg): e.g. Accept: text/uri-list returns the resources list and Accept: text/plain returns only the count.
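A minimal sketch of the separate count resource, with the same criteria applied to both paths. The dispatcher, the in-memory data, and the exact-match filtering are all assumptions made for illustration.

```python
def handle_get(path, criteria, resources):
    """Serve either the matching resources or just their count."""
    matches = [r for r in resources
               if all(r.get(key) == value for key, value in criteria.items())]
    if path == "resources":
        return matches
    if path == "resources/count":
        # Same search logic, but only the number is returned.
        return len(matches)
    raise LookupError(f"no such resource: {path}")


# Hypothetical data set.
resources = [
    {"id": 1, "crit1": "val1", "crit2": "val2"},
    {"id": 2, "crit1": "val1", "crit2": "other"},
    {"id": 3, "crit1": "val1", "crit2": "val2"},
]
criteria = {"crit1": "val1", "crit2": "val2"}
```

Because both paths share the filter, the count stays consistent with what a full GET would return.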
You can use HEAD without breaking the HTTP specification, and you can indicate the count by using a Content-Range header in the response:
HEAD /resource/?search=lorem
Response from the service, assuming that you return the first 20 results by default:
...
Content-Range: resources 0-20/12345
...
This way you transfer the amount of resources to the client within the header of the response message without the need to return a message body.
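On the client side, reading the total out of such a header could look like the sketch below. It assumes the `unit first-last/total` shape shown above with a custom `resources` unit; the function name is hypothetical.

```python
import re


def parse_content_range(value):
    """Parse 'unit first-last/total' into (unit, first, last, total)."""
    match = re.fullmatch(r"(\w+) (\d+)-(\d+)/(\d+)", value.strip())
    if match is None:
        raise ValueError(f"unexpected Content-Range value: {value!r}")
    unit, first, last, total = match.groups()
    return unit, int(first), int(last), int(total)


unit, first, last, total = parse_content_range("resources 0-20/12345")
```

A thin wrapper around this can expose the total as something like `response.count`, which addresses the convenience concern raised in the question.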
Update:
The solution suggested by Yannick Loiseau will work fine. I just wanted to provide one other alternative approach that achieves what you need without defining a new resource or verb.
You can use GET and add the count to the body of the message. Then, if your API allows clients to request a range of results, you can use that to keep the message body to a minimum (since you only want the count). One way to do that would be to request an empty range (from 0 to 0), for example:
GET /resource/?search=lorem&range=0,0
The service could then respond as follows, indicating that there are 1234 matching resources in the result set:
<?xml version="1.0" encoding="UTF-8" ?>
<resources range="0-0/1234" />
Ignoring the HTTP specification and returning the object count in the response body of a HEAD request.
IMHO, this is a very bad idea. It may not work simply because you might have intermediaries that don't ignore the HTTP spec.
Defining a new HTTP method, COUNT, that returns the object count in the response body.
There is no problem with this approach. HTTP is extensible and you can define your own verbs. Some firewalls prohibit this, but they usually also prohibit POST and DELETE, and the X-HTTP-Method-Override header is widely supported.
Another option is to add a query param to your URL, something like: ?countOnly=true
