What would be a good API format for “IS NOT SET” clause? - api-design

I have a query API against my service that looks like this (JSON-ish format):
{
  filter: {
    attribute2: [val21, val22],
    attribute3: []
  }
}
This effectively means select data WHERE attribute2 IN ("val21", "val22") AND attribute3 IS NOT NULL in SQL-ish syntax (meaning the object being returned has attribute3 set, but I really don't care what its value is; SQL isn't very good at expressing this, of course, since my data is in a key-value store where a key may be "not set" entirely instead of being null-valued).
I need to expand this API to be able to express IS NOT SET predicate, and I'm at a loss as to what a good way to do so would be.
The only thing I can possibly think of is to add a special "NOT_SET" value in the request API that would produce NOT SET semantics, but it seems really clunky and hard to grasp:
{
  filter: {
    attribute2: [val21, val22],
    attribute4: [__NOT_SET__]
  }
}
The API syntax can be thought of as JSON in terms of its expressiveness/capability.
An ideal answer would reference some well-accepted rules of API design, to show that it's "good".

My suggestion would be to move away from trying to use a key-value pair to represent a predicate phrase. You should have a lot more flexibility with a structure similar to:
{
  filters: [
    { attribute: "attribute2", verb: "IN", values: [val21, val22] },
    { attribute: "attribute2", verb: "NOT IN", values: [val21, val22] },
    { attribute: "attribute4", verb: "IS NOT SET" }
  ]
}
You'd want an enum of verbs, of course, and values would have to be optional. You can add more verbs later if you need them, and you're no longer putting quite so much pressure on the poor :. You can also provide to the client a list of supported verbs and how many values (if any) of what type they take, so the client can build the UI dynamically, if desired.
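As an illustration, here's a minimal sketch of such a schema in TypeScript (the verbs come from the example above; everything else is illustrative):
// Sketch of the predicate-based filter schema.
type Verb = "IN" | "NOT IN" | "IS NOT SET";

interface Predicate {
  attribute: string;
  verb: Verb;
  values?: string[]; // optional: presence verbs like IS NOT SET take no values
}

interface Query {
  filters: Predicate[];
}

// Example request body:
const query: Query = {
  filters: [
    { attribute: "attribute2", verb: "IN", values: ["val21", "val22"] },
    { attribute: "attribute4", verb: "IS NOT SET" },
  ],
};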
Of course, this is a breaking change, which may or may not be an issue.

How to avoid "Multiple properties exist for the provided key, use Vertex.properties(name)"?

The vertex has a property called name, and it has multiple values. How do I get any one value when the property has multiple values?
%%gremlin
g.V('d65135a3-8cd3-4edd-bc8d-f7087557e2a9').
project('s','s1').
by(values('name')).
by(outE('owns').inV().hasLabel('xy').elementMap())
Error:
{
"detailedMessage": "Multiple properties exist for the provided key, use Vertex.properties(name)",
"requestId": "71391776-ad7f-454d-8413-3032a9800211",
"code": "InternalFailureException"
}
I tried reproducing your issue using this sample graph:
g.addV('set-test').
property('mySet','one').
property(set, 'mySet','two').
property(id,'set-test1')
but I was able to return properties OK.
g.V('set-test1').
project('s').
by(values('mySet'))
{'s': 'one'}
and to get every member of the set:
g.V('set-test1').
project('s','s2').
by(values('mySet').fold())
{'s': ['one', 'two'], 's2': ['one', 'two']}
However, I was able to reproduce the message by doing this:
g.V('set-test1').
project('s1','s2').
by(values('mySet'))
{
"detailedMessage": "Multiple properties exist for the provided key, use Vertex.properties(mySet)",
"requestId": "04e43bad-173c-454b-bf3c-5a59a3867ef6",
"code": "InternalFailureException"
}
Please note, however, that Neptune is showing the same behavior in this case that you would see from Apache TinkerPop's TinkerGraph, so using fold is probably the way to go here, as it will allow the query to complete successfully.
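Applied to the original query, that would look something like this (an untested sketch using the same labels as above):
g.V('d65135a3-8cd3-4edd-bc8d-f7087557e2a9').
project('s','s1').
by(values('name').fold()).
by(outE('owns').inV().hasLabel('xy').elementMap())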
As a side note, "multi property" values (such as sets) do allocate an ID to each set member. However, this is highly implementation dependent, and I would not rely on these IDs. For example, Neptune does not persist property IDs in the database. They are generated "just in time" and can change. For completeness though, here is an example of using property ID values:
g.V('set-test1').properties('mySet').id()
1002125571
1002283485
We can then use those ID values in a query such as:
g.V('set-test1').
project('s1').
by(properties('mySet').hasId('1002283485').value())
{'s1': 'two'}

API Gateway as DynamoDB non-proxy integration with Integration Response for variable Map returns

Given an AWS Service integration (DynamoDB) in API Gateway, I'm wondering if there's a way to dynamically parse DynamoDB-structured JSON and return it in standard JSON format WITHOUT a standalone Lambda function mapper? For easier demonstration, here's an example of two possible data elements existing in my DynamoDB table I'd like to return in standard JSON format using a single Integration Response mapping template, if possible:
Ex. 1
{
  "id": "1",
  "desc": "here's a string type at depth 1",
  "info": {
    "key1": "a string at depth 2"
  }
}
Ex. 2
{
  "id": "2",
  "desc": "here's a string type at depth 1",
  "info": {
    "key1": "a string at depth 2",
    "key2": {
      "subkey1": "look, a string at depth 3"
    }
  }
}
Based on these two examples, we can see these are nested data structures that share top-level keys, but have a variable number of nested tiers.
I've noted the following in all of the answers to questions regarding Integration Mapping Templates WITHOUT using a separate Lambda function to resolve the mapping:
Parsing Integration Response mapping templates for "flat" DynamoDB data -- e.g. something like this, where there's effectively a data depth of 1 since all elements in the data are basic types like S or N:
{
  "param1": "some stuff",
  "param2": "other stuff"
}
Any answers dealing with more complex data types like M presume the question writer knows without a doubt exactly what the DynamoDB table elements look like. So, they'll say something like 'create the mapping template for each of the layers with the appropriate, expected return type for that field.' Using Ex. 1 to illustrate a solution Integration Response mapping template:
#set($inputRoot = $input.path('$'))
{
  "id": "$inputRoot.Item.id.S",
  "desc": "$inputRoot.Item.desc.S",
  "info": {
    "key1": "$inputRoot.Item.info.M.key1.S"
  }
}
Now, this works only if we actually know what the expected DynamoDB result fields will be at every depth and how many tiers each table element has -- i.e., it assumes the data in a NoSQL table is rigidly structured. Barring there being some version of the AppSync VTL resolver $util.dynamodb.toJSON method that can effectively unwrap dynamically-shaped DynamoDB returns, is there a way to do this directly without having to utilize a standalone Lambda function? Is there a way to generalize the response mapping template to account for variably-nested M type data?
One idea I've had is to loop through all of the keys and send the M types to a secondary loop (sketched below), but this seems impractical given:
the number of iterations that could theoretically be required to unwrap a deeply-nested object (DynamoDB can support up to 32 layers, last I checked), and
the variability in the nesting from element to element in the table.
To point 1, #foreach has a hard cap on the number of iterations: 1000 (https://forums.aws.amazon.com/thread.jspa?threadID=225222).
I'm dubious as to the prospect of a non-Lambda solution, but still curious. Thoughts?
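For what it's worth, a rough, untested sketch of that two-level loop in an Integration Response mapping template might look like the following, assuming only S values and a single level of M nesting; every additional tier needs yet another hand-written inner loop (VTL has no recursion), which is exactly why this doesn't generalize:
#set($inputRoot = $input.path('$'))
{
#foreach($key in $inputRoot.Item.keySet())
#set($attr = $inputRoot.Item.get($key))
## plain string attribute at depth 1
#if($attr.S)
  "$key": "$attr.S"#if($foreach.hasNext),#end
## one level of map nesting, assuming its members are strings
#elseif($attr.M)
  "$key": {
#foreach($subKey in $attr.M.keySet())
    "$subKey": "$attr.M.get($subKey).S"#if($foreach.hasNext),#end
#end
  }#if($foreach.hasNext),#end
#end
#end
}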

Advanced filter can't express ISNULL?

These two filters return zero results:
resource.labels:* AND resource.labels.namespace_name:*
resource.labels:* AND NOT resource.labels.namespace_name:*
While this one returns plenty:
resource.labels:*
I have three questions about this:
1. What's going on here?
2. More importantly, how do I exclude a particular value of namespace_name while not excluding records that don't define namespace_name?
3. Similarly, how do I write a filter for all records that don't define namespace_name?
I work on Stackdriver Logging and have worked with the code that handles queries.
You are correct: something's up with the presence operator (:*), and it works differently than the other operators. As a result the behavior of a negated presence operator is not intuitive (or particularly useful).
We consider this a bug, and it's something that I'd really like to fix; however, fixing this class of bug is a lengthy process, so I've proposed some workarounds.
What's going on here?
I cannot reproduce your first "zero result" filter: resource.labels:* AND resource.labels.namespace_name:*
This gives me a large list of logs that contain the namespace_name label. For what it's worth, resource.labels.namespace_name:* implies resource.labels:*, so really you only need the latter half of this filter.
Your second "zero result" filter: resource.labels:* AND NOT resource.labels.namespace_name:*
... runs into a bug where field presence check (:*) does not interact properly with negation.
More importantly, how do I exclude a particular value of namespace_name while not excluding records that don't define namespace_name?
While not required by the logging API, GCP-emitted resources generally emit the same sets of labels for a given resource type. You can take advantage of this by using resource.type to isolate resources-with-label from resources-without-label, then only apply the label constraint to the resources-with-label clause:
(resource.type != "k8s_container") OR
(resource.type = "k8s_container" AND resource.labels.namespace_name != "my-value")
Here, we are relying on all k8s_container-type entries having the namespace_name label, which should generally be the case. You can modify this to select multiple Kubernetes-prefixed resources:
(NOT resource.type:"k8s_") OR
(resource.type:"k8s_" AND resource.labels.namespace_name != "my-value")
... or use a complex resource.type clause to specifically select which you want to include/exclude from the namespace matching.
(NOT (resource.type = "k8s_container" OR resource.type = "k8s_pod")) OR
((resource.type = "k8s_container" OR resource.type = "k8s_pod") AND resource.labels.namespace_name != "my-value")
You cannot query for a k8s_container type that does not have the namespace_name label, but those should generally not be emitted in the first place.
Similarly, how do I write a filter for all records that don't define namespace_name?
You can't do this right now because of the bug. I think your best bet is to identify all of the resource types that use namespace_name and exclude those types with a resource.type filter:
NOT (
resource.type = "k8s_container" OR
resource.type = "k8s_pod" OR
resource.type = "knative_revision")
Note that, as mentioned earlier, while it's possible (allowed by the API) to have a k8s_container resource without a namespace_name label, emitted k8s_container logs should generally have the label.

Storing timestamp in joining node value instead of Boolean in Firebase database

Say that I have nodes user and item, and a user_items node used to join them.
Typically one would (as advised in official documents and videos) use a structure like this:
"user_items": {
"$userKey": {
"$itemKey1": true,
"$itemKey2": true,
"$itemKey3": true
}
}
I would like to use the following structure instead:
"user_items": {
"$userKey": {
"$itemKey1": 1494912826601,
"$itemKey2": 1494912826602,
"$itemKey3": 1494912826603
}
}
with the values being timestamps, so that I can order them by creation date while also being able to tell the associated time. Seems like a kill-two-birds-with-one-stone situation. Or is it?
Any down sides to this approach?
EDIT: I'm also using this approach for boolean fields such as approved_at, seen_at, etc., instead of using two fields like:
"some_message": {
"is_seen": true,
"seen_timestamp": 1494912826602,
}
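That is, collapsing the two fields into a single timestamp field whose presence doubles as the boolean (illustrative):
"some_message": {
  "seen_at": 1494912826602
}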
You can model your database in any way you want, as long as you follow the Firebase rules. The most important rule is to keep the data as flat as possible. According to this rule, your database is structured correctly. There is no 100% solution for a perfect database, but given your needs, using one of the following options can be considered good practice:
1. "$itemKey1": true,
2. "$itemName1": true,
3. "$itemKey1": 1494912826601,
4. "$itemName1": 1494912826601,
What is the meaning of "$itemKey1": 1494912826601? Because you have already set a timestamp, it means that your item was uploaded into your database and is linked to the specific user, which in other words also means true. So it is not a bad approach to do something like this.
Hope it helps.
Great minds must think alike, because I do the exact same thing :) In my case, the "items" are posts that the user has upvoted. I use the timestamps with orderBy(), along with limitToLast(50) to get the "last 50 posts that the user has upvoted". And from there they can load more. I see no downsides to doing this.
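For reference, a minimal sketch of that query with the Firebase JS SDK (v9 modular syntax; userKey is a placeholder and this is untested):
import { getDatabase, ref, query, orderByValue, limitToLast, get } from "firebase/database";

const userKey = "someUserKey"; // placeholder for the actual $userKey
const db = getDatabase();

// Order user_items/$userKey by its values (the timestamps) and keep the newest 50.
const recent = query(ref(db, `user_items/${userKey}`), orderByValue(), limitToLast(50));
const snapshot = await get(recent); // assumes an async context
snapshot.forEach((child) => {
  console.log(child.key, child.val()); // itemKey, creation timestamp
});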

APIGEE querying data that DOESN'T match condition

I need to fetch from the BaaS data store all records that don't match a condition.
I use a query string like:
https://api.usergrid.com/<org>/<app>/<collection>?ql=location within 10 of 30.494697,50.463509 and Partnership eq 'Reject'
That works correctly (I don't URL-encode the string after ql).
But any attempt to put "not" in this query causes "The query cannot be parsed".
I also tried <>, !=, NE, and some variations of "not".
How do I configure the query to fetch all records in the range where Partnership is NOT equal to 'Reject'?
Not operations are supported, but they are not performant because they require a full scan. When coupled with a geolocation call, this can be quite slow. We are working on improving this in the Usergrid core.
Having said that, in general it is much better to invert the logic if possible. For example, instead of adding the property only when the case is true, always write the property to every new entity (even when false), then edit the property when the case becomes true.
Instead of doing this:
POST
{
  'name': 'fred'
}
PUT
{
  'name': 'fred',
  'had_cactus_cooler': true
}
Do this:
POST
{
  'name': 'fred',
  'had_cactus_cooler': 'no'
}
PUT
{
  'name': 'fred',
  'had_cactus_cooler': 'yes'
}
In general, try to store your data in the way you want to get it out. Since you know upfront that you want to query on whether this property exists, simply add it, but with a negative value. Then update it when the condition becomes true.
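With that model in place, the original kind of query becomes a plain equality test, e.g.:
https://api.usergrid.com/<org>/<app>/<collection>?ql=location within 10 of 30.494697,50.463509 and had_cactus_cooler eq 'no'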
You should be able to use this syntax:
https://api.usergrid.com/<org>/<app>/<collection>?ql=location within 10 of 30.494697,50.463509 and not Partnership eq 'Reject'
Notice that the not operator comes before the expression (as indicated in the docs).
