RESTful API: syncing only changed models on big collections over HTTP

I'm trying to find a best-practice method to make my API respond with a 204. Let's consider the following simplified steps:
Client GET: server responds 200 with the full collection of 10 records
No changes are made
Client GET: server responds 304, data not changed
Changes are made to record 5. Record 11 is added. Record 2 is deleted
Client GET: server responds 200 with the new collection of 10 records
For 10 records this is not a big issue. However, when a collection is, let's say, a few thousand records, you don't want to refresh your entire locally stored collection. In that case it's easier to apply just the 3 changes (delete record 2, update record 5, add record 11). So I want something like this:
Client GET: server responds 200 with the full collection (paginated or not)
No changes are made
Client GET: server responds 304, data not changed
Changes are made to record 5. Record 11 is added. Record 2 is deleted
Client GET: server responds 204 with only information about the 3 changed records
In the above case the request cycle is optimized, but the problem is: how do I get the server to respond like this? I must send some information from the client in either a header or the body. I was thinking about exploiting the If-Modified-Since header, which would carry the date of the last change on the server. The client stores this date whenever the status is 200 or 204. When the client sends this header to the server, the server responds with only the records that have been changed or deleted since then. For example:
{
    "total": 113440,
    "from": 0,
    "till": 2,
    "count": 3,
    "removed": [1, 2],
    "parameters": ["id", "title"],
    "collection": [
        {
            "id": 3,
            "title": "Updated record"
        },
        {
            "id": 4,
            "title": "New record"
        },
        {
            "id": 5,
            "title": "Another new record"
        }
    ]
}
The downside is that the server becomes more complex, because it needs to keep track of deleted records and when each record was last updated.
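To make this concrete, here's a minimal sketch of such an endpoint, assuming Express; the in-memory "store" (a records map plus deletion tombstones) is hypothetical and stands in for a real database. Note it answers the delta with a 200, since a 204 response cannot carry a body:

// Minimal sketch, assuming Express. The in-memory store is a stand-in.
const express = require('express');
const app = express();

const records = new Map();   // id -> { id, title, updatedAt: Date }
const tombstones = [];       // { id, deletedAt: Date } for deleted records

app.get('/api/records', (req, res) => {
  const since = req.get('If-Modified-Since');
  if (!since) {
    return res.json({ collection: [...records.values()] }); // first sync: full set
  }
  const cutoff = new Date(since);
  const changed = [...records.values()].filter((r) => r.updatedAt > cutoff);
  const removed = tombstones.filter((t) => t.deletedAt > cutoff).map((t) => t.id);
  if (changed.length === 0 && removed.length === 0) {
    return res.status(304).end(); // nothing changed since the client's last sync
  }
  res.set('Last-Modified', new Date().toUTCString());
  // A 200 with a delta body, since a 204 must not carry a body.
  res.json({ count: changed.length, removed, collection: changed });
});

app.listen(3000);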
Keep in mind that I did think of sending silent push updates, but I don't want to do this since the user is not always happy with background data traffic.
What do you guys think about this solution, and do you have a similar or better solution, keeping the following in mind?
lower the number of needed requests
make the API descriptive and self-describing (the API being its own documentation, allowing client-side generators)
be as live as possible
effectively deal with huge collections (e.g. pagination, only fetching updated records, caching, etc.)

You could send an If-Modified-Since header with your GET requests for the collection. The endpoint could then return only the contents that have changed since the provided date.
Note that this is not the behaviour specified by the HTTP spec for conditional requests, so carefully consider whether this is the approach you want to take. I would not suggest it.
As the spec is written, there is no way I'm aware of to conditionally retrieve a subset of a collection. The closest I can think of is to have the collection contain only links to the individual elements. The client then calls GET on each individual element, using If-None-Match or If-Modified-Since, so that only stale entities pass over the wire. You'd still have to make a server hit for the collection entity, though you could use a conditional header on that request as well.
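A rough sketch of that per-element pattern, assuming the collection returns element URLs and each element response carries an ETag (the cache shape here is an assumption):

// Fetch one element conditionally: a 304 means our cached copy is still fresh.
async function fetchElement(url, cache) {
  const cached = cache.get(url);
  const res = await fetch(url, {
    headers: cached ? { 'If-None-Match': cached.etag } : {},
  });
  if (res.status === 304) return cached.body;             // unchanged, no body sent
  const body = await res.json();                          // stale or new: take fresh copy
  cache.set(url, { etag: res.headers.get('ETag'), body });
  return body;
}

// Usage: walk the link-only collection and resolve each element.
// const cache = new Map();
// const links = await (await fetch('/api/records')).json();
// const items = await Promise.all(links.map((url) => fetchElement(url, cache)));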

IMHO this calls for a query interface, where you tell the REST service in the GET parameters what you want, similar to how you would do it in SQL or in a Solr or Elasticsearch query.
Why hide something in HTTP headers that the client is explicitly querying for?
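For example, the same sync point could be an explicit query parameter rather than a header (the endpoint and parameter names here are hypothetical):

// Client asks for the delta explicitly in the URL.
const since = localStorage.getItem('lastSyncedAt'); // saved from the previous response
const res = await fetch('/api/records?changedSince=' + encodeURIComponent(since));
const delta = await res.json(); // same shape as the payload shown above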

Related

Cosmos DB library Microsoft.Azure.DocumentDB.Core (2.1.0) - Actual REST invocations

We are attempting to use WireMock (https://github.com/WireMock-Net/WireMock.Net) to mock CosmosDb invocations, so we can build integration tests in our .NET Core 2.1 microservice.
By looking at the WireMock instance Request/Response entries, we can observe the following:
1) GET towards "/"
We mock the returned metadata of the databases
THIS IS OK
2) GET towards the collection (in our case: "/dbs/Chunker/colls/RHTMLChunks")
Returns metadata about the collection
THIS IS OK
3) POST of a query that returns one document, towards the documents endpoint on the collection (in our case: "/dbs/Chunker/colls/RHTMLChunks/docs")
I have tried to emulate what we get when we run the exact same query against the CosmosDb instance in Postman, including headers and response.
However, I observe that the lib runs the query again, and again, and again...
(I can see this by pausing in Visual Studio and then looking at the RequestLog in WireMock.)
Does anyone know what should be returned? I have set up WireMock to return the following JSON payload:
{
    "_rid": "q0dcAOelSAI=",
    "Documents": [
        {
            "id": "gL20020621z2D34-1",
            "ChunkSize": 658212,
            "TotalChunks": 2,
            "Metadata": {
                "Active": true,
                "PublishedDate": ""
            },
            "ChunkId": 1,
            "Markup": "<h1>hello</h1>",
            "MainDestination": "gL20020621z2D34",
            "_rid": "q0dcAOelSAIHAAAAAAAAAA==",
            "_self": "dbs/q0dcAA==/colls/q0dcAOelSAI=/docs/q0dcAOelSAIHAAAAAAAAAA==/",
            "_etag": "\"0100e92a-0000-0000-0000-5ba96cf70000\"",
            "_attachments": "attachments/",
            "_ts": 1537830135
        }
    ],
    "_count": 0
}
Problems:
1) Cannot find the .pdb belonging to Microsoft.Azure.DocumentDB.Core v2.1.0
2) What payload/headers should be returned so that the library will NOT blow up and retry when we invoke:
var response = await documentQuery.ExecuteNextAsync<DocumentDto>(); // this hangs forever
Please help :)
We're working on open-sourcing the C# code base and some other fun improvements to make this easier. In the meantime, I'd advocate using the emulator for local testing etc., although I understand mocking is still a lot faster and nicer - it'll just be hard :)
My best pointer is actually our Node.js code base, since that's public already. The query code is relatively hard to follow, but basically: you create a query, we look up all the partitions we need to talk to, then we send a request for each partition and keep querying until we don't get back a continuation token anymore (or MaxBufferedItemCount etc. goes over the limit, and we pause until it goes back down, and so on).
Effectively, we send out N requests for each partition, where N is the number of pages of results and can vary per partition and query. You'd likely be able to mock a single-partition, single-page response relatively easily, but a full partition response isn't gonna be fun.
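The paging loop looks roughly like this (an illustrative sketch, not the actual SDK source; queryPage is a hypothetical helper that performs one POST against the docs endpoint):

// Drain one partition: keep requesting pages until no continuation comes back.
async function drainPartition(queryPage, query, partitionKeyRange) {
  const docs = [];
  let continuation;
  do {
    // Each response carries an x-ms-continuation header until the partition
    // is exhausted; the client re-issues the query with that token.
    const page = await queryPage(query, partitionKeyRange, continuation);
    docs.push(...page.Documents);
    continuation = page.continuation; // absent on the final page
  } while (continuation);
  return docs;
}

For the WireMock scenario, the practical takeaway is that the stubbed response should not include an x-ms-continuation response header, or the client will assume another page exists and query again; it may also be worth making _count match the number of documents actually returned.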
As I mentioned in the beginning, we've got some cool stuff coming, hopefully before the end of the year, which will make offline mocking easier, as well as finally open-sourcing it. You might be better off with the emulator until then.

Mobile data reported in GA Measurement Protocol appear in realtime but not in daily summary

I've been attempting to log activity from a mobile-like device using the Google Analytics Measurement Protocol. All of these attempts have validated successfully at the validation URL, and I can see the activity when I look at the real-time reports on the Analytics website. But when I look at the Home or Overview reports for the day, no activity is shown.
The view is set for "All Mobile App Data".
The POST body looks something like this:
v=1&tid=UA-000000000-1&ds=app&qt=1601&uid=uid-zzzzz&t=screenview&cd=Foo&an=Foo%20App%20Name&aid=com.example.foo&aiid=com.example.foo&av=0.0.1&ua=Mozilla%2F5.0%20(Linux%3B%20Android%207.0%3B%20SM-G930V%20Build%2FNRD90M)%20AppleWebKit%2F537.36%20(KHTML%2C%20like%20Gecko)%20Chrome%2F59.0.3071.125%20Mobile%20Safari%2F537.36
The ua field is just a pre-defined string. I found that if I omitted it, the real-time monitoring listed the hits as desktop hits, even though I was in a Mobile report and the ds field was "app".
Am I missing a required field? Is there some reason why it shows up in the real-time report but not in a daily report? Is there some other way to diagnose why the data is vanishing, or to confirm that the data is actually being captured?
When I check the debug endpoint, the hit is valid.
Request:
https://www.google-analytics.com/debug/collect?v=1&tid=UA-XXX-1&ds=app&qt=1601&uid=uid-zzzzz&t=screenview&cd=Foo&an=Foo%20App%20Name&aid=com.example.foo&aiid=com.example.foo&av=0.0.1&ua=Mozilla%2F5.0%20(Linux%3B%20Android%207.0%3B%20SM-G930V%20Build%2FNRD90M)%20AppleWebKit%2F537.36%20(KHTML%2C%20like%20Gecko)%20Chrome%2F59.0.3071.125%20Mobile%20Safari%2F537.36
Response
{
"hitParsingResult": [ {
"valid": true,
"parserMessage": [ ],
"hit": "/debug/collect?v=1\u0026tid=UA-53766825-1\u0026ds=app\u0026qt=1601\u0026uid=uid-zzzzz\u0026t=screenview\u0026cd=Foo\u0026an=Foo%20App%20Name\u0026aid=com.example.foo\u0026aiid=com.example.foo\u0026av=0.0.1\u0026ua=Mozilla%2F5.0%20(Linux%3B%20Android%207.0%3B%20SM-G930V%20Build%2FNRD90M)%20AppleWebKit%2F537.36%20(KHTML%2C%20like%20Gecko)%20Chrome%2F59.0.3071.125%20Mobile%20Safari%2F537.36"
} ],
"parserMessage": [ {
"messageType": "INFO",
"description": "Found 1 hit in the request."
} ]
}
I cannot use one of the mobile libraries from Firebase - this is not one of the platforms they support. I do not wish to pretend this is a web page - there is no associated hostname or path. I do not wish to use Events, since I can't do event Behavior Flow, which is one of the things I'm interested in seeing.
I'm aware that it can sometimes take "a day or so" for results to first appear. The site was set up over five days ago at this point, and has received data during that time.
Good thought about the anti-spam setting; however, the setting appears to be correct.
I've also tried using GET instead of POST - no change; it still shows the hit in real-time, but then it vanishes.
However, I know that it can record hits permanently. There were two hits from a spammer in Russia that showed up in the daily report (I wasn't there to see them show up in real-time). I don't know what they did, but I would love to find out, since it might help figure out how I can add a record.
In the real-time reports, it correctly points out the data center all the hits are coming from. Perhaps that is being filtered out somewhere out of my control?
Try adding cid. I know it says this is an optional parameter, but for mobile accounts I believe it may be required.
Client ID
Optional.
This field is required if User ID (uid) is not specified in the request. This anonymously identifies a particular user, device, or browser instance. For the web, this is generally stored as a first-party cookie with a two-year expiration. For mobile apps, this is randomly generated for each particular instance of an application install. The value of this field should be a random UUID (version 4) as described in http://www.ietf.org/rfc/rfc4122.txt.
Example value: 35009a79-1a05-49d7-b876-2b884d0f825b
Although this says it needs to be a UUID v4, it does work with other UUIDs (I've tested it with a v5, which is a hash of the value used for the uid parameter).
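For illustration, a minimal sketch (Node 18+, where fetch and randomUUID are built in) of the same screenview hit with cid added; the tid/uid values are the placeholders from the question:

const { randomUUID } = require('crypto');

// Build the same screenview hit, now including a client ID.
const params = new URLSearchParams({
  v: '1',
  tid: 'UA-000000000-1',
  ds: 'app',
  cid: randomUUID(), // UUID v4; generate once per app install and persist it
  uid: 'uid-zzzzz',
  t: 'screenview',
  cd: 'Foo',
  an: 'Foo App Name',
  aid: 'com.example.foo',
  av: '0.0.1',
});

fetch('https://www.google-analytics.com/collect', {
  method: 'POST',
  body: params.toString(),
});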

Is it true that my AJAX requests aren't necessarily made to my Controller in order?

So I have a basic
$.ajax({
    url: 'MyController/MyAction',
    method: 'POST',
    async: true,
    // ...
});
that can be called very frequently, as it is event-driven. It may be called 50 times in 1 second if the user is being obnoxious. It updates values in the database.
My friend told me that it's possible the updates may be sent to the database in the wrong order. Is this true? This is causing me major cognitive dissonance and I can't sleep tonight.
I should mention that the values being updated in the database are associated with a user. In particular, the data looks like
data: {
    userId: '21EC2020-3AEA-4069-A2DD-08002B30309D',
    answerId: '69',
    val: 'd'
}
where the only values changing in rapid succession are answerId and val.
Unfortunately, my understanding is "YES, the order is not guaranteed".
You send the HTTP request 50 times in 1 second, and the server saves each one to the DB.
When the network is good and the server is keeping up, the rows will usually be saved to the DB in order.
But if the HTTP server is busy or the network is interrupted, there is no guarantee the data will be stored in the sequence in which it actually happened; for example, the order of one or two records may end up swapped in the DB.
My suggestion is: if the order is critical and the update traffic is heavy, you should add a client-side timestamp to the HTTP data and save it to the DB.
When you select data from the DB, you can order by that timestamp, which ensures the order matches the order the events happened and avoids the mis-ordering caused by a busy server or network.
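A sketch of that suggestion applied to the request from the question, with a hypothetical happenedAt field added client-side:

// Attach a client-side timestamp so the server can order writes by when
// they actually happened, not by when they arrived.
$.ajax({
    url: 'MyController/MyAction',
    method: 'POST',
    data: {
        userId: '21EC2020-3AEA-4069-A2DD-08002B30309D',
        answerId: '69',
        val: 'd',
        happenedAt: Date.now() // hypothetical field; store it and ORDER BY it
    }
});

At 50 events per second, two events can share the same millisecond, so a monotonically increasing counter (alone, or as a tiebreaker next to the timestamp) is the safer choice.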

Most efficient method of pulling in weather data for multiple locations

I'm working on a Meteor mobile app that displays information about local places of interest, and one of the things I want to show is the weather in each location. I currently have my locations stored with lat/lng coordinates, and they're searchable by radius. I'd like to use the OpenWeatherMap API to pull in some useful 'current conditions' information so that when a user looks at an entry they can see basic weather data. Ideally I'd like to limit the number of outgoing requests to keep the pages snappy (and API requests down).
I'm wondering if I can create a server collection of weather data that I update regularly, server-side (hourly?), which my clients then query (perhaps using a mongo $near lookup?) - that way all of my data is handled within Meteor, rather than each client going out to grab the latest data from the API. I don't want to have to iterate through all of the locations in my list and do a separate call out to the API for each, as I have approx. 400 locations(!). I'm afraid I'm new to API requests (and Meteor itself), so apologies if this is a poorly phrased question.
I'm not entirely sure if this is doable, or if it's even the best approach - any advice (and links to any useful code snippets!) would be greatly appreciated!
EDIT / UPDATE!
OK, I haven't managed to get this working yet, but I have some more useful details on the data!
If I make a request to the OpenWeatherMap API I can get data back for all of their locations (which I would like to add to, and update in, a collection). I could then do a regular lookup against that collection, instead of making a client request straight out to them every time a user looks at a location. The JSON data looks like this:
{
    "message": "accurate",
    "cod": "200",
    "count": 50,
    "list": [
        {
            "id": 2643076,
            "name": "Marazion",
            "coord": {
                "lon": -5.47505,
                "lat": 50.125561
            },
            "main": {
                "temp": 292.15,
                "pressure": 1016,
                "humidity": 68,
                "temp_min": 292.15,
                "temp_max": 292.15
            },
            "dt": 1403707800,
            "wind": {
                "speed": 8.7,
                "deg": 110,
                "gust": 13.9
            },
            "sys": {
                "country": ""
            },
            "clouds": {
                "all": 75
            },
            "weather": [
                {
                    "id": 721,
                    "main": "Haze",
                    "description": "haze",
                    "icon": "50d"
                }
            ]
        }, ...
Ideally I'd like to build my own local 'weather' collection that I can search using Mongo's $near (to keep outbound requests down and speed things up), but I don't know if this will be possible because of the format the data comes back in - I think I'd need to structure my location data like this in order to use a geo search:
"location": {
    "type": "Point",
    "coordinates": [-5.47505, 50.125561]
}
My questions are:
How can I build that collection (I've seen this - could I do something similar and update existing entries in the collection on a regular basis?)
Does it just need to live on the server, or client too?
Do I need to manipulate the data in order to get a geo search to work?
Is this even the right way to approach it??
EDIT/UPDATE2
Is this question too long/much? It feels like it. Maybe I should split it out.
Yes, easily possible. Because your question is so large, I'll give you a high-level explanation of what I think you need to do.
You need to create a collection to save the weather data in.
Add a request worker that fetches new data and updates the collection on a set interval. Use something like cron-tick for scheduling the interval.
Requesting data should only happen server-side, and I can recommend the request npm package for that.
Meteor.publish the weather collection and have the client subscribe to that, optionally with a filter for its location.
You should now be getting the weather data on your client and should be able to get freaky with it. A sketch of these steps follows below.
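A minimal server-side sketch of those steps, assuming Meteor's mongo and http packages; the collection name, field names, query parameters, and YOUR_API_KEY are placeholders, not working values:

// Hedged sketch: hourly server-side refresh plus a location-filtered publication.
import { Meteor } from 'meteor/meteor';
import { Mongo } from 'meteor/mongo';
import { HTTP } from 'meteor/http';

export const Weather = new Mongo.Collection('weather');

if (Meteor.isServer) {
  Meteor.startup(() => {
    // A 2dsphere index is needed for $near queries on the GeoJSON field.
    Weather.rawCollection().createIndex({ location: '2dsphere' });

    Meteor.setInterval(() => {
      const res = HTTP.get('https://api.openweathermap.org/data/2.5/find', {
        params: { lat: 50.12, lon: -5.47, cnt: 50, appid: 'YOUR_API_KEY' },
      });
      res.data.list.forEach((w) => {
        Weather.upsert({ owmId: w.id }, {
          $set: {
            name: w.name,
            main: w.main,
            weather: w.weather,
            // Reshape coord into GeoJSON ([lon, lat]) so $near works.
            location: { type: 'Point', coordinates: [w.coord.lon, w.coord.lat] },
          },
        });
      });
    }, 60 * 60 * 1000); // hourly
  });

  Meteor.publish('weatherNear', function (lng, lat) {
    return Weather.find({
      location: { $near: { $geometry: { type: 'Point', coordinates: [lng, lat] } } },
    });
  });
}

On the client, Meteor.subscribe('weatherNear', lng, lat) then gives each page a locally queryable copy. This also answers the geo question above: yes, the coordinates need reshaping into GeoJSON, which is done here at upsert time.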

Dynamics AX 2009 AIF Tables

Background
I have an issue where, roughly once a month, the AIFQueueManager table is populated with ~150 records relating to messages which had been sent to AX over 6 months ago (where they "successfully failed"; i.e. errored due to violation of business rules, but returned an exception as expected).
Question
What tables are involved in the AIF inbound message process, and in what order do events occur? E.g. the XML file is picked up and recorded in the AifDocumentLog, the data is extracted and added to the AifQueueManager and AifGatewayQueue tables, records from there are then inserted into the AifMessageLog, etc.
Thanks in advance.
There are 4 main AIF classes; I will be talking about inbound only, and focusing on the included file system adapter and flat XML files. I hope this makes things a little less hazy.
AIFGatewayReceiveService - Uses adapters/channels to read messages in from different sources, and dumps them in the AifGatewayQueue table
AIFInboundProcessingService - Processes the AifGatewayQueue table data and sends it to the Ax[Document] classes
AIFOutboundProcessingService - The inverse of #2: it creates XMLs with the relevant metadata
AIFGatewaySendService - The inverse of #1: it uses adapters/channels to send messages out to different locations from the AifGatewayQueue
For #1
So #1 basically fills the AifGatewayQueue, which is just a queue of work. It loops through all of your channels and finds the relevant adapter by ClassId. The adapters are classes that implement AifIntegrationAdapter and AifReceiveAdapter, if you wanted to make your own custom one. As it loops over the different channels, it then loops over each "message" and tries to receive it into the queue.
If it can't process the file for some reason, it catches the exception and throws it into the SysExceptionTable [Basic > Periodic > Application Integration Framework > Exceptions]. These messages are scraped from the infolog, and are generated mostly by the receive adapter, which would be AifFileSystemReceiveAdapter for my example.
For #2
So #2 processes the inbound messages sitting in the queue (ready/in process). AifRequestProcessor\processServiceRequest does the work.
From this method, it will make:
Various calls to Classes\AifMessageManager, which puts records in the AifMessageLog and the AifDocumentLog.
This key line: responseMessage = AifRequestProcessor::executeServiceOperation(message, endpointActionPolicy); which actually performs the operation against the Ax[Document] classes by eventually getting to AifDispatcher::callServiceMethod(...)
It gets the return XML and packages it into an AifMessage called responseMessage, then returns that so it may be logged. It also takes that return value and, if there is a response channel, submits it back into the AifGatewayQueue.
AifQueueManager is actually cleared and populated on the fly by calling AifQueueManager::createQueueManagerData();.
