using different request parser depending on the queried route - http

I'm Implementing a mini http server using boost beast. the server has two different routes POST /upload/... and the other one is POST /info. The first one is used for uploading some big files and the other one is for hadling json objects. To keep the performance as hight as possible am I trying to parse each route with the suitable parser file_body and string_body/dynamic_body.
I was hoping that it is possible to do something like:
http::async_read_header(
socket_,
buffer_,
request_,
[self](beast::error_code ec, std::size_t)
{
if (!ec)
self->request_.body().data();
});
but it seems not possible.
Is there any way to use different request bodies depending on header info?
Many thanks in advance

This should be covered in the docs but here's how to do it: Use the type beast::request_parser<beast::empty_body> to first read the header, and then depending on the contents of the header you move-construct a new parser from the old one with the body type you want. Example:
// Deferred body type commitment
request_parser<empty_body> req0;
...
request_parser<string_body> req{std::move(req0)};
You can read the complete documentation on switching body types here:
https://www.boost.org/doc/libs/1_69_0/libs/beast/doc/html/beast/ref/boost__beast__http__parser/parser/overload5.html

Related

How to list all parameters available to query via API?

As a end-point user of an API, how can I list all parameters available to pass the query? In my case (stats about Age of Empires 2 matches), the website describing the API has a list with some of them but it seems there are more available.
To provide more context, I'm extracting the following information:
GET("https://aoe2.net/api/matches?game=aoe2de&count=1000&since=1632744000&map_type=12")
but for some reason the last condition, map_type=12 does nothing (output is the same as without it). I'm after the list of parameters available, so I can extract what I want.
PD: this post is closely related but does not focus on API. Perhaps this makes a difference, as the second answer there seems to suggest.
It is not possible to find out all available (undocumented) query parameters for a query, unless the API explicitly provides such a method or you can find out how the API server processes the query.
For instance, if the API server code is open source, you could find out from the code how the query is processed. Provided that you find the code also.
The answers in the post you linked are similarly valid for an API site as well as for one that provides content for a web browser (a web server can be both).
Under the hood, there is not necessarily any difference between an API server or a server that provides web content (html) in terms of how queries are handled.
As for the parameters seemingly without an effect, it seems that the API in question does not validate the query parameters, i.e., you can put arbitrary parameters in the query and the server will simply ignore parameters that it is not specifically programmed to use.
The documentation on their website is all any of us have to go by https://aoe2.net/#api
You can't just add your own parameters to the URL and expect it to return a value back as they have to have coded it to work that way.
Your best bet is to just extract as much data as you can by increasing the count parameter, then loop through the JSON response and extract the map_type from there.
JavaScript example:
<script>
json=[{"match_id":"1953364","lobby_id":null,"game_type":0},
{"match_id":"1961217","lobby_id":null,"game_type":0},
{"match_id":"1962068","lobby_id":null,"game_type":1},
{"match_id":"1962821","lobby_id":null,"game_type":0},
{"match_id":"1963814","lobby_id":null,"game_type":0},
{"match_id":"1963807","lobby_id":null,"game_type":0},
{"match_id":"1963908","lobby_id":null,"game_type":0},
{"match_id":"1963716","lobby_id":null,"game_type":0},
{"match_id":"1964491","lobby_id":null,"game_type":0},
{"match_id":"1964535","lobby_id":null,"game_type":12},];
for(var i = 0; i < json.length; i++) {
var obj = json[i];
if(obj.game_type==12){
//do something with game_type 12 json object
console.log(obj);
}
}
</script>

Is it valid to combine a form POST with a query string?

I know that in most MVC frameworks, for example, both query string params and form params will be made available to the processing code, and usually merged into one set of params (often with POST taking precedence). However, is it a valid thing to do according to the HTTP specification? Say you were to POST to:
http://1.2.3.4/MyApplication/Books?bookCode=1234
... and submit some update like a change to the book name whose book code is 1234, you'd be wanting the processing code to take both the bookCode query string param into account, and the POSTed form params with the updated book information. Is this valid, and is it a good idea?
Is it valid according HTTP specifications ?
Yes.
Here is the general syntax of URL as defined in those specs
http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]
There is no additional constraints on the form of the http_URL. In particular, the http method (i.e. POST,GET,PUT,HEAD,...) used don't add any restriction on the http URL format.
When using the GET method : the server can consider that the request body is empty.
When using the POST method : the server must handle the request body.
Is it a good idea ?
It depends what you need to do. I suggest you this link explaining the ideas behind GET and POST.
I can think that in some situation it can be handy to always have some parameters like the user language in the query part of the url.
I know that in most MVC frameworks, for example, both query string params and form params will be made available to the processing code, and usually merged into one set of params (often with POST taking precedence).
Any competent framework should support this.
Is this valid
Yes. The POST method in HTTP does not impose any restrictions on the URI used.
is it a good idea?
Obviously not, if the framework you are going to use is still clue-challenged. Otherwise, it depends on what you want to accomplish. The major use case (redirection of a data subset to a new POST target) has been irretrievably broken by browser implementations (all mechanically following the broken lead of Mosaic/Netscape), so the considerations here are mostly theoretical.

Designing proper REST URIs

I have a Java component which scans through a set of folders (input/processing/output) and returns the list of files in JSON format.
The REST URL for the same is:
GET http://<baseurl>/files/<foldername>
Now, I need to perform certain actions on each of the files, like validate, process, delete, etc. I'm not sure of the best way to design the REST URLs for these actions.
Since its a direct file manipulation, I don't have any unique identifier for the files, except their paths. So I'm not sure if the following is a good URL:
POST http://<baseurl>/file/validate?path=<filepath>
Edit: I would have ideally liked to use something like /file/fileId/validate. But the only unique id for files is its path, and I don't think I can use that as part of the URL itself.
And finally, I'm not sure which HTTP verb to use for such custom actions like validate.
Thanks in advance!
Regards,
Anand
When you implement a route like http:///file/validate?path you encode the action in your resource that's not a desired effect when modelling a resource service.
You could do the following for read operations
GET http://api.example.com/files will return all files as URL reference such as
http://api.example.com/files/path/to/first
http://api.example.com/files/path/to/second
...
GET http://api.example.com/files/path/to/first will return validation results for the file (I'm using JSON for readability)
{
name : first,
valid : true
}
That was the simple read only part. Now to the write operations:
DELETE http://api.example.com/files/path/to/first will of course delete the file
Modelling the file processing is the hard part. But you could model that as top level resource. So that:
POST http://api.example.com/FileOperation?operation=somethingweird will create a virtual file processing resource and execute the operation given by the URL parameter 'operation'. Modelling these file operations as resources gives you the possibility to perform the operations asynchronous and return a result that gives additional information about the process of the operation and so on.
You can take a look at Amazon S3 REST API for additional examples and inspiration on how to model resources. I can highly recommend to read RESTful Web Services
Now, I need to perform certain actions on each of the files, like validate, process, delete, etc. I'm not sure of the best way to design the REST URLs for these actions. Since its a direct file manipulation, I don't have any unique identified for the files, except their paths. So I'm not sure if the following is a good URL: POST http:///file/validate?path=
It's not. /file/validate doesn't describe a resource, it describes an action. That means it is functional, not RESTful.
Edit: I would have ideally liked to use something like /file/fileId/validate. But the only unique id for files is its path, and I don't think I can use that as part of the URL itself.
Oh yes you can! And you should do exactly that. Except for that final validate part; that is not a resource in any way, and so should not be part of the path. Instead, clients should POST a message to the file resource asking it to validate itself. Luckily, POST allows you to send a message to the file as well as receive one back; it's ideal for this sort of thing (unless there's an existing verb to use instead, whether in standard HTTP or one of the extensions such as WebDAV).
And finally, I'm not sure which HTTP verb to use for such custom actions like validate.
POST, with the action to perform determined by the content of the message that was POSTed to the resource. Custom “do something non-standard” actions are always mapped to POST when they can't be mapped to GET, PUT or DELETE. (Alas, a clever POST is not hugely discoverable and so causes problems for the HATEOAS principle, but that's still better than violating basic REST principles.)
REST requires a uniform interface, which in HTTP means limiting yourself to GET, PUT, POST, DELETE, HEAD, etc.
One way you can check on each file's validity in a RESTful way is to think of the validity check not as an action to perform on the file, but as a resource in its own right:
GET /file/{file-id}/validity
This could return a simple True/False, or perhaps a list of the specific constraint violations. The file-id could be a file name, an integer file number, a URL-encoded path, or perhaps an unencoded path like:
GET /file/bob/dir1/dir2/somefile/validity
Another approach would be to ask for a list of the invalid files:
GET /file/invalid
And still another would be to prevent invalid files from being added to your service in the first place, ie, when your service processes a PUT request with bad data:
PUT /file/{file-id}
it rejects it with an HTTP 400 (Bad Request). The body of the 400 response could contain information on the specific error.
Update: To delete a file you would of course use the standard HTTP REST verb:
DELETE /file/{file-id}
To 'process' a file, does this create a new file (resource) from one that was uploaded? For example Flickr creates several different image files from each one you upload, each with a different size. In this case you could PUT an input file and then trigger the processing by GET-ing the corresponding output file:
PUT /file/input/{file-id}
GET /file/output/{file-id}
If the processing isn't near-instantaneous, you could generate the output files asynchronously: every time a new input file is PUT into the web service, the web service starts up an asynchronous activity that eventually results in the output file being created.

How should I implement a COUNT verb in my RESTful web service?

I've written a RESTful web service that supports the standard CRUD operations, and that can return a set of objects matching certain criteria (a SEARCH verb), but I'd like to add a higher-order COUNT verb, so clients can count the resources matching search criteria without having to fetch all of them.
A few options that occur to me:
Ignoring the HTTP specification and returning the object count in the response body of a HEAD request.
Duplicating the SEARCH verb's logic, but making a HEAD request instead of a GET request. The server then would encode the object count in a response header.
Defining a new HTTP method, COUNT, that returns the object count in the response body.
I'd prefer the API of the first approach, but I have to strike that option because it's non-compliant. The second approach seems most semantically correct, but the API isn't very convenient: clients will have to deal with response headers, when most of the time they want to be able to do something easy like response.count. So I'm leaning toward the third approach, but I'm concerned about the potential problems involved with defining a new HTTP method.
What would you do?
The main purpose of rest is to define a set of resources that you interact with using well defined verbs. You must thus avoid to define your own verbs. The number of resources should be considered as a different resource, with its own uri that you can simply GET.
For example:
GET resources?crit1=val1&crit2=val2
returns the list of resources and
GET resources/count?crit1=val1&crit2=val2
Another option is to use the conneg: e.g. Accept: text/uri-list returns the resources list and Accept: text/plain returns only the count
You can use HEAD without breaking the HTTP specification and you can indicate the count by using an HTTP Range header in the response:
HEAD /resource/?search=lorem
Response from the service, assuming that you return the first 20 results by default:
...
Content-Range: resources 0-20/12345
...
This way you transfer the amount of resources to the client within the header of the response message without the need to return a message body.
Update:
The solution suggested Yannick Loiseau will work fine. Just wanted to provide one other alternative approach which can be used to achieve what you need without the need to define a new resource of verb.
You can use GET and add the count into the body of the message. Then, if you API allows clients to request a range of results, you can use that in order to limit the size of message body to a minimum (since you only want the count). One way to do that would be to request an empty range (from 0 to 0), for example:
GET /resource/?search=lorem&range=0,0
The service could then respond as follows, indicating that there are 1234 matching resources in the result set:
<?xml version="1.0" encoding="UTF-8" ?>
<resources range="0-0/1234" />
Ignoring the HTTP specification and returning the object count in the response body of a HEAD request.
IMHO, this is a very bad idea. It may not work simply because you might have intermediaries that don't ignore the HTTP spec.
Defining a new HTTP method, COUNT, that returns the object count in the response body.
There is no problem with this approach. HTTP is extendable and you can define your own verbs. Some firewalls prohibit this, but they are usually also prohibit POST and DELETE and X-HTTP-Method-Override header is widely supported.
Another option, to add a query param to your url, something like: ?countOnly=true

Is it considered bad practice to perform HTTP POST without entity body?

I need to invoke a process which doesn't require any input from the user, just a trigger. I plan to use POST /uri without a body to trigger the process. I want to know if this is considered bad from both HTTP and REST perspectives?
I asked this question on the IETF HTTP working group a few months ago. The short answer is: NO, it's not a bad practice (but I suggest reading the thread for more details).
Using a POST instead of a GET is perfectly reasonable, since it also instructs the server (and gateways along the way) not to return a cached response.
POST is completely OK. In difference of GET with POST you are changing the state of the system (most likely your trigger is "doing" something and changing data).
I used POST already without payload and it "feels" OK. One thing you should do when using POST without payload: Pass header Content-Length: 0. I remember problems with some proxies when I api-client didn't pass it.
If you use POST /uri without a body it is something like using a function which does not take an argument .e.g int post (void); so it is reasonable to have function to your resource class which can change the state of an object without having an argument. If you consider to implement the Unix touch function for a URI, is not it be good choice?
Yes, it's OK to send a POST request without a body and instead use query string parameters. But be careful if your parameters contain characters that are not HTTP valid you will have to encode them.
For example if you need to POST 'hello world' to and end point you would have to make it look like this: http://api.com?param=hello%20world
Support for the answers that POST is OK in this case is that in Python's case, the OpenAPI framework "FastAPI" generates a Swagger GUI (see image) that doesn't contain a Body section when a method (see example below) doesn't have a parameter to accept a body.
the method "post_disable_db" just accepts a path parameter "db_name" and doesn't have a 2nd parameter which would imply a mandatory body.
#router.post('/{db_name}/disable',
status_code=HTTP_200_OK,
response_model=ResponseSuccess,
summary='',
description=''
)
async def post_disable_db(db_name: str):
try:
response: ResponseSuccess = Handlers.databases_handler.post_change_db_enabled_state(db_name, False)
except HTTPException as e:
raise (e)
except Exception as e:
logger.exception(f'Changing state of DB to enabled=False failed due to: {e.__repr__()}')
raise HTTPException(HTTP_500_INTERNAL_SERVER_ERROR, detail=e.__repr__())
return response

Resources