Riak: searchable list of maps (with CRDTs) - riak

I have a use-case which consist in storing several maps under an attribute. It's the equivalent JSON list:
[
{"tel": "+33123456789", "type": "work"},
{"tel": "+33600000000", "type": "mobile"},
{"tel": "+79001234567", "type": "mobile"}
]
Also, I'd like to have it searchable at the object level. For instance:
- search entries that have a mobile number
- search entries that have a mobile number and whose number starts with the string "+7"
Since Riak doesn't support sets of maps (but only sets of registers), I was wondering if there is a trick to achieve it. So far I've had 2 ideas
Map of maps
The idea is to generate (randomly?) keys for the objects, and store each object of the list in a parent map whose keys are the ones generated for this only purpose to have a key.
It seems to me that it doesn't allow to search the object (maps inside the parent map) because Riak Solr search requires the full path to the attribute. One cannot simply write the following Solr query: phone_numbers.*.tel:+7*. Also composed search (eg. search entries that have a mobile number and whose number starts with the string "+7") seem hard to achieve.
Sets with simulated multi-valued attributes
This solution consists in using a set and insert all the values of the object as a single string, with separators between them. Yes, it's a hack. An attribute value would look like: $tel:+79001234567$type:mobile$ with : as the attribute name-value separator and $ as the attribute separator.
Search could be feasible using the * wildcard (ugly, but still), except the problem with escaping the separators: what if the type attribute includes a :? Are they some separators that are accepted by Riak and would not be acceptable in a string (I'm thinking of control characters)?
In a nutshell
I'm looking for a solution whatever the hackiness of it, as long as it works. Any idea is welcome.

Related

Using shacl to validate a property that has at most one value in its properties

I'm trying to create a shacl based on the ontology that my organization is developing (in dutch): https://wegenenverkeer.data.vlaanderen.be/
The objects described have attributes (properties), that have a specified datatype. The datatype can a primitive (like string or decimal) or complex, which means the property will have properties itself (nested properties). For example: an asset object A will have an attribute assetId which is a complex datatype DtcIdentificator, which consists of two properties itself. I have succesfully created a shacl that validates objects by creating multiple shapes and nesting them.
I now run into the problem of what we call union datatypes. These are a special kind of complex datatypes. They are still nested datatypes: the attribute with the union datatypes will have multiple properties but only exactly zero or one of those properties may have a value. If the attribute has 2 properties with values, it is invalid. How can I create such a constraint in shacl?
Example (in dutch): https://wegenenverkeer.data.vlaanderen.be/doc/implementatiemodel/union-datatypes/#Afmeting%20verkeersbord
A traffic sign (Verkeersbord, see https://wegenenverkeer.data.vlaanderen.be/doc/implementatiemodel/signalisatie/#Verkeersbord) can have a property afmeting (size) of the datatype DtuAfmetingVerkeersbord.
If an asset A of this type would exist, I could define its size as (in dotnotation):
A.afmeting.rond.waarde = 700
-or-
A.afmeting.driehoekig.waarde = 400
Both are valid ways of using the afmeting property, however, if they are both used for the same object, this becomes invalid, as only one property of A.afmeting may have a value.
I have tried using the union constraint in shacl, but soon found out that that has nothing to do with what we call "union datatypes"
I think the reason you are struggling is because this kind of problem is usually modelled differently. Basically you have different types of Traffic signs and these signs can have measurements. With the model as you described, A.afmeting.rond.waarde captures 2 ideas using 1 property: (a) the type and (b) the size. From your question, this seems to be the intend. However, this is usually not how this kind of problem is addressed.
A more intuitive design is for Traffic sign to have 2 different properties: (a) type and (b) a measurement. The Traffic sign types are achthoekig, driehoekig, etc. Then you can use SHACL to check that a traffic sign has either both or no properties for a traffic sign.

How can I make collection in Cloud Firestore that can be user reordered without writing every document?

Lets say I have a checklist collection where each item is it's own document since it contains a lot of other data.
I want the user to be able to drag and drop to reorder this list an save it that way. My initial thought was to have a field that is changed to reflect this order but moving one object requires changing the value on every document that is after the new location.
Is there a way to achieve this without a massive number of writes?
If the order of documents changes frequently, you can avoid writing the contents of each document by instead using a whole different document to maintain the order, using an array of strings containing the document IDs. In fact, you could hold lots of different orderings depending on how you want to display the documents.
Say you have a collection of documents:
collection
- docA
- docB
- docC
Now you want to store mutable orderings in a document called "order" in another collection:
collection-meta
- order
- byAlpha: ["docA", "docB", "docC"]
- byScore: ["docC", "docA", "docB"]
Just query the "order" document first, then get each document for display in the order defined in the array. To reorder the documents, just update the contents of the single array in the "order" doc.
I usually do this by using a floating point value for the order.
Say you have a list with these 3 documents:
ID=a, order=1.0
ID=b, order=2.0
ID=c, order=3.0
Now let's assume we want to move document a between b and c. You'd do that by changing its order to 2.0 + (3.0 - 2.0) / 2 = 2.5.
ID=b, order=2.0
ID=a, order=2.5
ID=c, order=3.0
This works for a reasonable number of swaps, which is the scenario I usually deal with.
If you're dealing with a large number/potentially infinite of iterations, you'll want to look at the precision of the floating point operation. In that case your alternative might be to use a custom value type, i.e. encoding the value into a string field and then use a custom library to do the division at a higher numeric precision.
If i understood your question correctly, what you can do is You can have a separate document that contains index of all the other checklist document. All the checklist documents can keep their respective info as it is supposed to be. For example
Checklist1 Checklist3
Checklist2 Checklist2
Checklist3 Checklist1
Index Index
(Before) (After Drag)
and Index contains structure like below after the drag is performed:
Index -> field value
Checklist1 3
Checklist2 2
Checklist3 1

AQL include for multi value property returns only first value

What's the right AQL format to include all values for a multi value property field. Neither #<prop name> nor properties.* seem to work.
When running an AQL query and including a property field which contains multiple values the result contains the first value and not a list containing all values
items.find(...).include("*","#distro")
At present, I run one query to generate a list of artifacts and then iterate through the list running a query for each artifacts properties
f'/api/storage/{artifact.repo}/{artifact.path}/{artifact.name}?properties'
Result
...properties {'key': 'distro', 'value': 'Ubuntu'}
Desired Result
...properties {'key': 'distro', 'value': ['Ubuntu', 'CentOS',...]}
I heard back from jfrog support and the problem seems to be that the use of '#propertyname' allows for the collapse of a potentially multi value property into a single value and this blocks the gathering of all properties.
A more efficient approach would be
items.find(...).include("property")
This results in all properties returned in the json payload as it includes the property domain which includes all the properties.
Additionally, by not using # the query doesn't collapse the properties from lists of values to a single value. So, if build_number is a property it becomes ['25'] instead of 25.
When requesting the property domain, be sure to treat each property as a list.

How does one convert string to number in JDOQL?

I have a JDOQL/DataNucleus storage layer which stores values that can have multiple primitive types in a varchar field. Some of them are numeric, and I need to compare (</>/...) them with numeric constants. How does one achieve that? I was trying to use e.g. (java.lang.)Long.parse on the field or value (e.g. java.lang.Long.parseLong(field) > java.lang.Long.parseLong(string_param)), supplying a parameter of type long against string field, etc. but it doesn't work. In fact, I very rarely get any errors, for various combinations it would return all values or no values for no easily discernible reasons.
Is there documentation for this?
Clarification: the field is of string type (actually a string collection from which I do a get). For some subset of values they may store ints, e.g. "3" string, and I need to do e.g. value >= 2 filters.
I tried using casts, but not much, they do produce errors, let me investigate some more
JDO has a well documented set of methods that are valid for use with JDOQL, upon which DataNucleus JDO adds some additional ones and allows users to add on support for others as per
http://www.datanucleus.org/products/accessplatform_3_3/jdo/jdoql.html#methods
then you also can use JDOQL casts (on the same page as that link).

dynamodb creating a string set

I have a lot of objects with unique IDs. Every object can have several labels associated to it, like this:
123: ['a', 'hello']
456: ['dsajdaskldjs']
789: (no labels associated yet)
I'm not planning to store all objects in DynamoDB, only these sets of labels. So it would make sense to add labels like that:
find a record with (id = needed_id)
if there is one, and it has a set named label_set, add a label to this set
if there is no record with such id, or the existing record doesn't have an attribute named label_set, create a record and an attribute, and initialize the attribute with a set consisting of the label
if I used sets of numbers, I could use just ADD operation of UPDATE command. This command does exactly what I described. However, this does not work with sets of strings:
If no item matches the specified primary key:
ADD— Creates an item with supplied primary key and number (or set of numbers) for the attribute value. Not valid for a string type.
so I have to use a PUT operation with Expected set to {"label_set":{"Exists":false}}, followed (in case it fails) by an ADD operation. These are two operations, and it kinda sucks (since you pay per operation, the costs of this will be 2 times more than they could be).
This limitations seems really weird to me. Why are something what works with numbers sets would not work with string sets? Maybe I'm doing something wrong.
Using many records like (123, 'a'), (123, 'hello') instead of one record per object with a set is not a solutions: I want to get all the values from the set at once, without any scans.
I use string sets from the Java SDK the way you describe all the time and it works for me. Perhaps it has changed? I basically follow the pattern in this doc:
http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/API_UpdateItem.html
ADD— Only use the add action for numbers or if the target attribute is
a set (including string sets). ADD does not work if the target
attribute is a single string value or a scalar binary value. The
specified value is added to a numeric value (incrementing or
decrementing the existing numeric value) or added as an additional
value in a string set. If a set of values is specified, the values are
added to the existing set. For example if the original set is [1,2]
and supplied value is [3], then after the add operation the set is
[1,2,3], not [4,5]. An error occurs if an Add action is specified for
a set attribute and the attribute type specified does not match the
existing set type.
If you use ADD for an attribute that does not exist, the attribute and
its values are added to the item.
When your set is empty, it means the attribute isn't present. You can still ADD to it. In fact, a pattern that I've found useful is to simply ADD without even checking for the item. If it doesn't exist, it will create a new item using the specified key and create the attribute set with the value(s) I am adding. If the item exists but the attribute doesn't, it creates the attribute set and adds the value(s). If they both exist, it just adds the value(s).
The only piece that caught me up at first was that the value I had to add was a SS (String set) even if it was only one string value. From DynamoDB's perspective, you are always merging sets, even if the existing set is an empty set (missing) or the new set only contains one value.
IMO, from the way you've described your intent, you would be better off not specifying an existing condition at all. You are having to do two steps because you are enforcing two different situations but you are trying to perform the same action in both. So might as well just blindly add the label and let DynamoDB handle the rest.
Maybe you could: (pseudo code)
try:
add_with_update_item(hash_key=42, "label")
except:
element = new Element(hash_key=42, labels=["label"])
element.save()
With this graceful recovery approach, you need 1 call in the general case, 2 otherwise.
You are unable to use sets to do what you want because Dynamo Db doesn't support empty sets. I would suggest just using a string with a custom schema and building the set from that yourself.
To avoid two operations, you can add a "ConditionExpression" to your item.
For example, add this field/value to your item:
"ConditionExpression": "attribute_not_exists(RecordID) and attribute_not_exists(label_set)"
Source documentation.
Edit: I found a really good guide about how to use the conditional statements

Resources