As an end user of an API, how can I list all the parameters available to pass in a query? In my case (stats about Age of Empires 2 matches), the website describing the API lists some of them, but it seems there are more available.
To provide more context, I'm extracting the following information:
GET("https://aoe2.net/api/matches?game=aoe2de&count=1000&since=1632744000&map_type=12")
but for some reason the last parameter, map_type=12, does nothing (the output is the same as without it). I'm after the list of available parameters, so I can extract what I want.
P.S.: this post is closely related but does not focus on APIs. Perhaps this makes a difference, as the second answer there seems to suggest.
It is not possible to find out all available (undocumented) query parameters for a query, unless the API explicitly provides such a method or you can find out how the API server processes the query.
For instance, if the API server code is open source, you could find out from the code how the query is processed, provided that you can also locate the relevant code.
The answers in the post you linked are similarly valid for an API site as well as for one that provides content for a web browser (a web server can be both).
Under the hood, there is not necessarily any difference between an API server and a server that provides web content (HTML) in terms of how queries are handled.
As for the parameters seemingly without an effect, it seems that the API in question does not validate the query parameters, i.e., you can put arbitrary parameters in the query and the server will simply ignore parameters that it is not specifically programmed to use.
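If you want to check empirically whether a given parameter has any effect, one option is to compare responses with and without it. A minimal sketch in Python with the requests library (endpoint and parameters taken from the question):
import requests

base = "https://aoe2.net/api/matches"
params = {"game": "aoe2de", "count": 10, "since": 1632744000}

plain = requests.get(base, params=params).json()
probed = requests.get(base, params=dict(params, map_type=12)).json()

# Identical responses suggest the server ignored map_type (assuming the
# underlying match data did not change between the two calls).
print(plain == probed)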
The documentation on their website (https://aoe2.net/#api) is all any of us have to go by.
You can't just add your own parameters to the URL and expect the API to honor them; the server has to have been coded to handle each parameter.
Your best bet is to extract as much data as you can by increasing the count parameter, then loop through the JSON response and filter on map_type yourself.
JavaScript example:
<script>
var json = [
    {"match_id":"1953364","lobby_id":null,"map_type":0},
    {"match_id":"1961217","lobby_id":null,"map_type":0},
    {"match_id":"1962068","lobby_id":null,"map_type":1},
    {"match_id":"1962821","lobby_id":null,"map_type":0},
    {"match_id":"1963814","lobby_id":null,"map_type":0},
    {"match_id":"1963807","lobby_id":null,"map_type":0},
    {"match_id":"1963908","lobby_id":null,"map_type":0},
    {"match_id":"1963716","lobby_id":null,"map_type":0},
    {"match_id":"1964491","lobby_id":null,"map_type":0},
    {"match_id":"1964535","lobby_id":null,"map_type":12}
];
for (var i = 0; i < json.length; i++) {
    var obj = json[i];
    if (obj.map_type == 12) {
        // do something with the map_type 12 object
        console.log(obj);
    }
}
</script>
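If you are working from Python instead, the same fetch-and-filter approach might look like this (a sketch using the requests library; the map_type field name is assumed from the aoe2.net match schema):
import requests

url = "https://aoe2.net/api/matches"
params = {"game": "aoe2de", "count": 1000, "since": 1632744000}
matches = requests.get(url, params=params).json()

# Keep only the matches played on the map type we care about.
wanted = [m for m in matches if m.get("map_type") == 12]
print(len(wanted))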
I don't see nextSyncToken in the response. I followed the docs (https://developers.google.com/calendar/api/guides/sync) and paginated using nextPageToken, but I couldn't see the nextSyncToken on the last page.
API Used: GET /calendars/primary/events?maxResults=10&singleEvents=true&pageToken=********
I don't know if I'm missing anything here. Could anyone help me with this?
I have seen from the response you linked in a comment on the other answer that you are using orderBy on the request.
This is why the nextSyncToken is not showing up.
As mentioned in the documentation under Events: list -> Parameters -> syncToken:
Token obtained from the nextSyncToken field returned on the last page of results from the previous list request. It makes the result of this list request contain only entries that have changed since then. All events deleted since the previous list request will always be in the result set and it is not allowed to set showDeleted to False.
There are several query parameters that cannot be specified together with nextSyncToken to ensure consistency of the client state.
These are:
iCalUID
orderBy
privateExtendedProperty
q
sharedExtendedProperty
timeMin
timeMax
updatedMin
If the syncToken expires, the server will respond with a 410 GONE response code and the client should clear its storage and perform a full synchronization without any syncToken.
Learn more about incremental synchronization.
Optional. The default is to return all entries.
You should remove the orderBy from the request to get the nextSyncToken.
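As a rough sketch of that flow in Python with the requests library (authentication shown as a placeholder bearer token; maxResults and singleEvents taken from your request, with orderBy removed):
import requests

BASE = "https://www.googleapis.com/calendar/v3/calendars/primary/events"
HEADERS = {"Authorization": "Bearer YOUR_ACCESS_TOKEN"}  # placeholder

params = {"maxResults": 10, "singleEvents": "true"}  # note: no orderBy
page_token = None
while True:
    if page_token:
        params["pageToken"] = page_token
    data = requests.get(BASE, headers=HEADERS, params=params).json()
    # ... process data.get("items", []) here ...
    page_token = data.get("nextPageToken")
    if not page_token:
        # Only the last page of results carries the sync token.
        print(data.get("nextSyncToken"))
        break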
Could you please provide the response from the Google Calendar API? It's hard to say more without detailed information. I don't even know which language you are using.
Try to use a vendor library to sort that out:
a) https://packagist.org/packages/google/apiclient (for PHP)
b) https://www.npmjs.com/package/google-calendar (for JavaScript)
and/or
Try to use alternative endpoint: GET https://www.googleapis.com/calendar/v3/calendars/calendarId/events.
I want to do something very similar to what's shown in the docs for FSharp.Data:
The URL I'm requesting from, though (TFS), requires client authentication. Is there any way I can provide this by propagating my Windows creds? I notice JsonProvider has a few other compile-time parameters, but none seem to support this.
You don't have to provide a live URL as a type parameter to JsonProvider; you can also provide the filename of a sample file that reflects the structure you expect to see. With that feature, you can take the following steps:
First, log in to the service and save a JSON file that reflects the API you're going to use.
Next, do something like the following:
type TfsData = JsonProvider<"/path/to/sample/file.json">

let url = "https://example.com/login/etc"
// Use a standard .NET API to fetch the JSON with your Windows credentials,
// e.g. a WebClient with UseDefaultCredentials set (one option among several):
let client = new System.Net.WebClient(UseDefaultCredentials = true)
let jsonResults = client.DownloadString(url)
let parsedResults = TfsData.Parse(jsonResults)
printfn "%A" parsedResults.Foo // At this point, Intellisense should work
This is all very generic, of course, since I don't know precisely what you need to do to log in to your service; presumably you already know how to do that. The key is to retrieve the JSON yourself, then use the .Parse() method of your provided type to parse it.
I'm doing some heavy web scraping using Python. In some cases, POST data is sent not through a form submit but through some JavaScript, which I cannot interact with using this approach. To circumvent this, I've been appending the names and values for the POST request to the URL and then visiting that URL.
This method was working fine until I came across a site that used this kind of structure: [sitename].com/?[pagename].do/. I admit total ignorance about this .do extension, though some light searching tells me that it has to do with Struts and a Java-based backend. In this case it seems to be a way of dynamically generating a table; I'm trying to filter the results of that table. What I want to enter is something like [sitename].com/?[pagename].do?[name]=[value]&[name]=[value], but this doesn't work, nor does it even seem like it should work. I attempted it using several variations in syntax. It seems like something I don't quite understand is going on here.
I wish I could direct you to the actual site, but unfortunately I cannot due to the sensitive nature of the project. Let me know, though, if there's any additional information that would be helpful in providing an answer. Thanks in advance.
Edit: This is not really a "my code isn't working" question, as it's the underlying functionality that I would like to emulate in my code which is troubling me, but I'll do my best to get grittier. I'm contractually bound not to share the names of the sites that we're studying, but I will try to model the problem. I am hoping that someone with some familiarity with the back-end activity sending this .do page to the browser will be able to shed some light.
import urllib
import urllib2
#
## case 1: a site that i have success in scraping
url = 'http://[sitename]/[pagename]'
values = {'s' : '40', 'pg' : '1'}
data = urllib.urlencode(values)
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
the_page = response.read()
print the_page #i get the filtered data that i am looking for
#
## case 2: the site that poses a problem for the encoding of post parameters
url = 'http://[sitename]/?[pagename].do/' # this site uses a .do file to generate
# the content i want to filter. note that the page name is preceded by ?.
values = {'s' : '40', 'pg' : '1'}
data = urllib.urlencode(values)
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
the_page = response.read()
print the_page # i am taken back to the root of the site,
# the same result i would get if i entered nonsense
# post parameters that did not correspond with actual control names.
Here, also, is an example of some JavaScript on the page that accomplishes what I'd like to do with my scraper:
function page_next (id) {
$("#loading").fadeIn("normal");
$.post("/?dumps.do/", {s: id, pg: 2},
function( data ) {
var content = $( data ).find( '#dumps' );
}
)
}
I don't know what site you are parsing, but this: [sitename].com/?[pagename].do/ is not something I would call default Struts behaviour, assuming it's indeed a Struts application.
Having a .do extension was indeed something Struts used to use as a request mapping, but the URL in that case should be [sitename].com/[pagename].do, not [sitename].com/?[pagename].do/
In the second form, the action is in fact a parameter in a query string. This is why this syntax is broken: [sitename].com/?[pagename].do?[name]=[value]&[name]=[value]. You want to send a query string to the action but the action itself is a parameter in the query string.
But that's not the issue. The issue is that the site is doing something with that parameter and expects to receive its data in a certain way, a way you were not able to reverse engineer.
Assuming again that this is a Struts application, Struts uses a front controller to intercept all action.do URLs and then uses the action to invoke a particular class in the application, a class that is mapped to that particular action. The format for this should be [sitename].com/[pagename].do. That would be similar to having, say, [sitename].com/[pagename].php.
But having the action as a parameter makes me think that the site has a different front controller (not that of Struts) that is taking the parameter from the query string and passing it downstream to the Struts framework.
There could be a lot of reasons for this funky way of handling requests, including making it harder for others to scrape the site, although this one seems kinda straightforward:
$.post("/?dumps.do/", {s: id, pg: 2}, ...
Have you tried doing a POST to the root of the application with the action in the query string?
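For example, mirroring the jQuery $.post above with the same urllib2 approach from your question (a sketch; dumps.do is the action name taken from the JavaScript snippet):
import urllib
import urllib2

url = 'http://[sitename]/?dumps.do/'  # action stays in the query string
values = {'s' : '40', 'pg' : '2'}     # same fields the JavaScript sends
data = urllib.urlencode(values)
req = urllib2.Request(url, data)      # supplying data makes this a POST
response = urllib2.urlopen(req)
print response.read()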
When leveraging a Web API endpoint hosted on a different server, what is the best way to implement pagination that avoids excessive serialization and deserialization while still exposing the total number of matches found by the GET request?
Web Api:
[Queryable]
public IQueryable<Customer> Get(string input) {
return _repo.GetCustomers(input).AsQueryable();
}
Controller Code:
HttpResponseMessage response = client.GetAsync("api/customer?input=test&$top=2&$skip=0").Result; // test string
if (response.IsSuccessStatusCode) {
CustomerSearchResultViewModel resultViewModel = new CustomerSearchResultViewModel();
resultViewModel.Customers = response.Content.ReadAsAsync<IEnumerable<Customer>>().Result.ToList();
resultViewModel.TotalResults = resultViewModel.Customers.Count;
}
In this example, TotalResults will be capped at 2 rather than reflecting the total number of matches found before pagination.
Is there a way to get the total results out by leveraging OData? If that is not possible, how can I best preserve OData functionality while getting out the number I need?
You can use the OData $inlinecount option. The request would look like:
api/customer?input=test&$top=2&$skip=0&$inlinecount=allpages
Now, we don't support sending the value of count with the default JSON and XML formatters out of the box. Your client code is the best example of why we decided not to send it by default: the client wouldn't know what to do with the extra count property. Our OData formatter supports the $inlinecount option by default, as the OData format has clear rules for where the count value fits in the response.
That said, you can do a simple change to support count in your responses. Your client code also has to change though. Refer to this answer for details.
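For illustration, a response to a $inlinecount=allpages request from the OData formatter wraps the results together with the count along these lines (a sketch only; the exact property names depend on the OData version and formatter in use):
{
  "odata.count": "42",
  "value": [
    { "Name": "Customer 1" },
    { "Name": "Customer 2" }
  ]
}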
I know that in most MVC frameworks, for example, both query string params and form params will be made available to the processing code, and usually merged into one set of params (often with POST taking precedence). However, is it a valid thing to do according to the HTTP specification? Say you were to POST to:
http://1.2.3.4/MyApplication/Books?bookCode=1234
... and submit an update, like a change to the name of the book whose book code is 1234, you'd want the processing code to take into account both the bookCode query string param and the POSTed form params with the updated book information. Is this valid, and is it a good idea?
Is it valid according to the HTTP specifications?
Yes.
Here is the general syntax of an HTTP URL as defined in those specs:
http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]
There are no additional constraints on the form of the http_URL. In particular, the HTTP method used (POST, GET, PUT, HEAD, ...) doesn't add any restriction on the URL format.
When using the GET method, the server can consider the request body to be empty.
When using the POST method, the server must handle the request body.
Is it a good idea?
It depends on what you need to do. I suggest this link explaining the ideas behind GET and POST.
I can imagine that in some situations it can be handy to always have certain parameters, like the user language, in the query part of the URL.
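For instance, here is a sketch in Python with the requests library, using the URL from the question (the form field name is hypothetical):
import requests

response = requests.post(
    "http://1.2.3.4/MyApplication/Books",
    params={"bookCode": "1234"},         # ends up in the query string
    data={"name": "Updated book name"},  # ends up in the POST body
)
print(response.status_code)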
I know that in most MVC frameworks, for example, both query string params and form params will be made available to the processing code, and usually merged into one set of params (often with POST taking precedence).
Any competent framework should support this.
Is this valid
Yes. The POST method in HTTP does not impose any restrictions on the URI used.
is it a good idea?
Obviously not, if the framework you are going to use is still clue-challenged. Otherwise, it depends on what you want to accomplish. The major use case (redirection of a data subset to a new POST target) has been irretrievably broken by browser implementations (all mechanically following the broken lead of Mosaic/Netscape), so the considerations here are mostly theoretical.