How to implement a sub-resource relationship in CRNK (JSON-API) without top level repository - json-api

This question mainly relates to HOW to do this using the CRNK framework however I have also tagged it as JSON-API as I am interested in whether or not the approach is generally correct within the confines of JSON-API spec.
I don't want to complicate things by going into specifics of the exact domain the problem relates to, so I am going to simplify things a little.
I have a queue, which has various attributes such as name, description etc. Another attribute of the queue is some historical timestamped data, essentially an array of objects that look something like:
{ "time": "21/10/2018 10:15 GMT", "value": 35 }
In fact the queue can have a number of these attributes that relate to different data for that queue. The amount of data in the array can be quite big depending on how much data has been collected.
My first instinct was to model this as an attribute on the queue:
{
  data: {
    ...
    attributes: {
      ...
      history: [
        { "time": "21/10/2018 10:15 GMT", "value": 35 },
        { "time": "21/10/2018 10:30 GMT", "value": 35 },
        { "time": "21/10/2018 10:45 GMT", "value": 35 }
      ]
    }
  }
}
The issue I have with this approach however is that the entire data set is going to come back with the queue (which could be quite large and not always required). I could combat this issue by using sparse fieldsets, but I don't particularly like this concept of requesting the queue over and over again with a different fields parameter in order to get the data I am after in a particular scenario.
What I wanted to do was to model this history data as a relationship, so that the data can be accessed through a relationship URL, e.g. /api/queues/1/history. This seems to make the most sense to me, as the intended use of the API is that various screens would make use of the different sets of data attached to a queue, so each screen would have the queue object and could then request the data it is interested in through these relationship links.
The issue I am having however is that the history data here doesn't exist as an identifiable resource in the backend, only as a sub-resource of the queues (i.e. select * from historydata where queueid = 1). This is where I am unsure how to implement it in CRNK. It seems as though to model a relationship I have to also create a ResourceRepository for the sub-resource (/api/history/{id}). But I don't want this.
So my question around the CRNK implementation is how do I configure my resources and repositories such that:
GET /api/history/{id} - always returns 404 (ideally without having to implement this myself in a HistoryResourceRepository)
GET/PATCH /api/queues/1/history - will go through the queue repository to access and update the history data using the queue ID as the identifier
Also, on a side note, what is the recommended approach for assigning an ID to the sub-resource, given it doesn't exist as an identifiable entity in that respect and the ID is largely irrelevant?

The way repositories are implemented aligns strongly with what the JSON API specification says about working with relationships (see http://jsonapi.org/format/#crud-updating-relationships). This means each history item must be a resource that can be put into relation with, for example, those queue items. As such, a resource repository and a relationship repository must be implemented. Relationship repositories only establish connections and cannot work with data by themselves, so only the resource repositories are able to insert, update, and delete data.
However, in this particular use case (history), GET access with a relationship repository would be sufficient. It would not be overly hard to make the resource repository optional (or at least hide it from the rest api/crnk-home). But it may go slightly against the JSON API specification.
Another thing that can be done, should you have multiple history records, is to make use of nested URLs like "history/queue", "history/xy" to establish a clean API and keep all history-related resources in one place / sub-directory. Personally, I do that in applications.


Validating a Contact has a Unique Email in Axon

I am curious to understand what the best practice approach is when using the Axon Framework to validate that an email field is unique to a Set of emails for a Contact Aggregate.
Example setup
ContactCreateCommand {
    identifier = '123'
    name = 'ABC'
    email = 'info@abc.com'
}
ContactAggregate {
    ContactAggregate(ContactCreateCommand cmd) {
        // 1. cannot validate email here
        AggregateLifecycle.apply(
            new ContactCreatedEvent(/* fields ... */)
        );
    }
}
From my understanding of how this might be implemented, I have identified a number of possible ways to handle this, but perhaps there are more.
1. Do nothing in the Aggregate
This approach imposes that the invoker (of the command) does a query to find Contacts by email prior to sending the command, allowing for some milliseconds where eventual consistency allows for duplication.
Drawbacks:
Any "invoker" of the command would then be required to perform this validation check, as it's not possible to do this check inside the Aggregate using an Axon Query Handler.
Duplication can occur, so all projections based on these events need to handle this duplication somehow.
2. Validate in a separate persistence layer
This approach introduces a new persistence layer that would validate uniqueness inside the aggregate.
Inside the ContactAggregate command handler for ContactCreateCommand we can then issue a query against this persistence layer (e.g. a table in Postgres with a unique index on it) and validate the email against this database, which contains the set of all email addresses.
Drawbacks:
Introduces an external persistence layer (external to the microservice) to guarantee uniqueness across Contacts
Scaling should be considered in the persistence layer, hitting this with a highly scaled aggregate could prove a bottleneck
3. Use a Saga and Singleton Aggregate
This approach enhances the previous setup by introducing an Aggregate that can only have at most 1 instance (e.g. Target Identifier is always the same). This way we create a 'Singleton Aggregate' that is responsible only to encapsulate the Set of all Contact Email Addresses.
ContactEmailValidateCommand {
    identifier = 'SINGLETON_ID_1'
    email = 'info@abc.com'
    customerIdentifier = '123'
}
UniqueContactEmailAggregate {
    @AggregateIdentifier
    private String identifier;
    Set<String> email = new HashSet<>();
    on(ContactEmailValidateCommand cmd) {
        if (email.contains(cmd.email)) {
            // the email is already in the set, so it is not unique
            AggregateLifecycle.apply(
                new ContactEmailInvalidatedEvent(/* fields ... */)
            );
        } else {
            // the email is not in the set yet, so it is unique
            AggregateLifecycle.apply(
                new ContactEmailValidatedEvent(/* fields ... */)
            );
        }
    }
}
After we do this check, we could then react appropriately to the ContactEmailInvalidatedEvent or ContactEmailValidatedEvent, which might invalidate the contact afterwards.
The benefit of this approach is that it keeps the persistence local to the Aggregate, which could give better scaling (as more nodes are added, more aggregates with locally managed Sets exist).
Drawbacks:
Quite a lot of boilerplate to replace "create unique index"
This approach allows an 'invalid' Contact to pollute the Event Store forever
It is complex to ensure the 'Singleton Aggregate' is a true singleton (perhaps there is a simpler or better way)
The 'invoker' of the ContactCreateCommand must check to see the outcome of the Saga
What do others do to solve this? I feel option 2 is perhaps the simplest approach, but are there other options?
What you are essentially looking for is Set Based Validation (I think this blog does a nice job explaining the concept, and how to deal with it in Axon). In short, it means validating that some field is (or is not) contained in a set of data. When doing CQRS, this becomes a somewhat interesting concept to reason about, with several solutions out there (as you've already portrayed).
I think the best solution to this is summarized under your second option: use a dedicated persistence layer for the email addresses. You'd simply create a very concise model containing just the email addresses, which you would validate prior to issuing the ContactCreateCommand. Note that this persistence layer belongs to the Command Model, as it is used to perform business validation. You'd thus introduce an example where you not only have Aggregates in your Command Model, but also Views. And as you've rightfully noted, this View needs to be optimized for its use case, of course. Maybe introducing a cache which is created on application start-up wouldn't be too bad.
To ensure this email addresses view is as up to date as possible, it's smartest to ensure it is updated in the same transaction as when the ContactCreatedEvent (which contains a new email address, I assume) is published. You can do this by having a dedicated Event Handling Component for your "Email Addresses View" which is updated through a SubscribingEventProcessor (a SEP). This would work as the SEP is invoked by the same thread publishing the event (your aggregate).
You have a couple of options when it comes to querying this model prior to sending the command. You could use a MessageDispatchInterceptor which only reacts to the ContactCreateCommand, for example. Or, you introduce a Handler Enhancer which is dedicated to reacting to the ContactCreateCommand to perform this validation. Or, you introduce another command like RequestContactCreationCommand which is targeted towards a regular component. This component would handle the command, validate the model and, if approved, dispatch a ContactCreateCommand.
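The last of these options (a separate request step handled by a regular component) can be sketched in plain Java without any Axon types. Everything here is an assumption made for illustration: EmailLookup, ContactCreationRequestHandler, and the Consumer that stands in for a command gateway are hypothetical names, and the in-memory set stands in for a table with a unique index. The sketch also claims the address at validation time, which simplifies the approach described above, where the view is updated when the ContactCreatedEvent is published.

```java
import java.util.Locale;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical stand-in for the "email addresses view" of the command model.
// A real implementation would be a database table with a unique index.
class EmailLookup {
    private final Set<String> emails = ConcurrentHashMap.newKeySet();

    // Atomically claims the email; returns false if it was already claimed.
    boolean claim(String email) {
        return emails.add(email.toLowerCase(Locale.ROOT));
    }
}

// Hypothetical command object mirroring the ContactCreateCommand from the question.
class ContactCreateCommand {
    final String identifier;
    final String name;
    final String email;

    ContactCreateCommand(String identifier, String name, String email) {
        this.identifier = identifier;
        this.name = name;
        this.email = email;
    }
}

// Hypothetical regular component that handles the "request contact creation" step.
class ContactCreationRequestHandler {
    private final EmailLookup lookup;
    // Stands in for whatever dispatches commands in the real application.
    private final Consumer<ContactCreateCommand> dispatcher;

    ContactCreationRequestHandler(EmailLookup lookup, Consumer<ContactCreateCommand> dispatcher) {
        this.lookup = lookup;
        this.dispatcher = dispatcher;
    }

    // Validates uniqueness first and only then dispatches the create command.
    boolean handle(String identifier, String name, String email) {
        if (!lookup.claim(email)) {
            return false; // duplicate email: reject without touching the aggregate
        }
        dispatcher.accept(new ContactCreateCommand(identifier, name, email));
        return true;
    }
}
```

With this shape, the aggregate itself stays free of any lookup logic, and the uniqueness rule lives in one component that every caller goes through.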
That's my two cents on the situation, hope this helps @vcetinick!

Sparse fields on complex JSON API attributes

According to https://jsonapi.org/format/#document-resource-object-attributes it is allowed to have 'complex' values for attributes, i.e. any valid JSON value.
With https://jsonapi.org/format/#fetching-sparse-fieldsets it is possible to select a subset of the content. However, all examples match on the attribute name.
For example:
{
  "data": [
    {
      "type": "dogs",
      "id": "3f02e",
      "attributes": {
        "name": "doggy",
        "body": {
          "head": "small",
          "legs": [
            {
              "position": "front",
              "side": "right"
            },
            {
              "position": "front",
              "side": "left"
            }
          ],
          "fur": {
            "color": "brown"
          }
        }
      }
    }
  ]
}
In the result I am only interested in the name, body.head and body.fur.color.
What would be a correct way to solve this (preferably without requiring relations, since this data is valid)?
JSON:API's Sparse Fieldsets feature allows a client to request only specific fields of a resource:
A client MAY request that an endpoint return only specific fields in the response on a per-type basis by including a fields[TYPE] parameter.
https://jsonapi.org/format/#fetching-sparse-fieldsets
A field is either an attribute or a relationship in JSON:API:
A resource object’s attributes and its relationships are collectively called its “fields”.
https://jsonapi.org/format/#document-resource-object-fields
Sparse Fieldsets are not meant to have an impact on the value of an attribute or a relationship. If you have such a need you shouldn't model the data as a complex value but expose it as a separate resource.
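To make that concrete, here is a minimal sketch of how a server might apply a fields[TYPE] value to a resource's fields. The class and method names are hypothetical and not taken from any JSON:API library; the point is only that selection happens by field name, so a kept field such as body comes back with its whole value.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of any JSON:API library.
class SparseFieldsets {
    // Keeps only the fields named in a fields[TYPE] value such as "name,body".
    // Each kept field is returned with its complete value; nothing inside it is filtered.
    static Map<String, Object> select(Map<String, Object> fields, String fieldsParam) {
        Set<String> wanted = new LinkedHashSet<>(Arrays.asList(fieldsParam.split(",")));
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : fields.entrySet()) {
            if (wanted.contains(entry.getKey())) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }
}
```

In this sketch a request for "body.head" selects nothing, because no field has that name; how a real server treats unknown field names is up to the implementation.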
Please note that your database schema and the resources exposed by your API do not need to be the same. Actually, it often makes sense not to have a 1-to-1 relationship between database tables and resources in your JSON:API.
Don't be afraid of having multiple resources. It's often much better for the long-term than having one resource with complex objects:
You can include the related resource (e.g. dog-bodies, dog-legs, dog-furs in your case) by default.
You can generate the IDs for those resources automatically based on the persisted ID of a parent resource.
You can have much stricter constraints and easier documentation for your API with separate resources.
You can reduce the risk of collisions as you can support updating specific parts (e.g. color attribute of a dog-furs) rather than replacing the full body value of a dogs resource.
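As a small illustration of generating IDs from the parent's persisted ID, the following sketch derives a child ID from the parent ID and a suffix, and recovers the parent ID again. The class name, the "-" separator, and the suffix values are assumptions chosen for the example, not something JSON:API prescribes.

```java
// Hypothetical helper; the ID scheme is an assumption for illustration only.
class DerivedIds {
    // Builds a stable ID for a child resource from its parent's ID and a
    // suffix naming the child, e.g. "3f02e" + "body" gives "3f02e-body".
    static String childId(String parentId, String suffix) {
        return parentId + "-" + suffix;
    }

    // Recovers the parent ID so the server can load the child through its parent.
    static String parentIdOf(String childId, String suffix) {
        String tail = "-" + suffix;
        if (!childId.endsWith(tail)) {
            throw new IllegalArgumentException("not a " + suffix + " id: " + childId);
        }
        return childId.substring(0, childId.length() - tail.length());
    }
}
```

Because the child ID is a pure function of the parent ID, nothing extra needs to be persisted for it.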
The main drawback that I currently see with having multiple resources instead of one is the limitation that you can't create or update more than one resource in the same request with JSON:API v1.0. But it's very likely that the upcoming v1.1 won't have that limitation anymore. An official extension called Atomic Operations is proposed for that use case by a member of the core team working on the spec.

What is the correct JSONAPI way to post multiple related entities in a single request?

At some point in my hypothetical app, I want to create multiple related entities of different types in a single request, for efficiency's sake. In the example below I serialize the request in a way that it contains the data about the new User as well as its related Avatar.
// POST /api/users
{
  data: {
    attributes: { ... },
    type: 'user',
    relationships: {
      avatar: {
        data: {
          attributes: { ... },
          type: 'avatar',
        }
      }
    }
  }
}
The question is, what would be the correct/recommended way (if there's any) to do that in JSONAPI?
Creating or updating multiple resources in a single request is not supported by JSON:API spec yet. However there is a proposal for an Atomic Operations extension for the upcoming v1.1 of the spec.
But in most cases such a feature is not required for efficiency. You might even cause more load to the server by bundling multiple create or update requests into one. Doing multiple requests in parallel is cheap with HTTP/2 nowadays.
It might not be as performant as doing it with one request if the operations depend on each other (e.g. you must wait for a post to be created before a comment for this post can be created). But in that case atomic transactions are also a strong requirement. That's the main driver behind that extension.
So to answer your question:
It's currently not supported in JSON:API spec.
There is a good chance that it will be supported in the next version (v1.1) by an extension.
If efficiency is the only reason you are looking for such a feature, you might not need it at all.
Since it is common, and moreover often encouraged, to decouple REST API resources from internal representations, nothing recommends against defining a specific 'virtual' endpoint whose attributes in turn become attributes of two or more different resources under different endpoints.
It may not solve your problem, if you want such feature in general, but if this is only needed for some resource combinations, you can always make a dedicated endpoint for a resource which incorporates all attributes of all related resources.
In your case it could be something like:
// POST /api/users_with_avatar
{
  data: {
    attributes: {
      "user_attribute_1": "...",
      "user_attribute_2": "...",
      "user_attribute_3": "...",
      "avatar_attribute_1": "...",
      "avatar_attribute_2": "..."
    },
    type: 'user-with-avatar'
  }
}
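On the server, such a combined resource has to be split back into its parts. A minimal sketch, assuming the attribute names carry the "user_" and "avatar_" prefixes used in the example above (the class and method names are hypothetical):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for a 'virtual' combined resource.
class CompositeResourceSplitter {
    // Returns the attributes whose names start with the given prefix,
    // with the prefix removed, preserving the original order.
    static Map<String, Object> withPrefix(Map<String, Object> attributes, String prefix) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : attributes.entrySet()) {
            if (entry.getKey().startsWith(prefix)) {
                result.put(entry.getKey().substring(prefix.length()), entry.getValue());
            }
        }
        return result;
    }
}
```

Calling withPrefix(attributes, "user_") and withPrefix(attributes, "avatar_") yields the two attribute sets, which the endpoint could then persist together in one server-side transaction.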

How to remove Cloud Firestore field type specifiers when using REST API?

I totally made up the name "type specifiers." What I mean is the stringValue key in front of a value. Usually I would expect a more-standard response: "name" : "name_here".
{
  "fields": {
    "name": {
      "stringValue": "name_here"
    }
  }
}
Is it possible to remove those when making a GET call?
More importantly, it would be nice to understand why it's structured like it is. Even for POST-ing data? The easy answer is probably because Cloud Firestore, unlike Realtime Database, needs to know the specific types, but what are all the deeper reasons? Is there an "official" name for formatting like this where I could do more research?
For example, is the reasoning at all related to Protocol Buffers? Is there a way to request a protobuf instead of JSON?
Schema:
Is it possible to remove those when making a GET call?
In short, no. The Firestore REST API GET returns an instance of Document.
See https://firebase.google.com/docs/firestore/reference/rest/v1beta1/projects.databases.documents#Document
{
  "name": string,
  "fields": {
    string: {
      object(Value)
    },
    ...
  },
  "createTime": string,
  "updateTime": string,
}
Regarding Protocol Buffers: once the data is deserialized, you could simply have a function convert it into the structure you wish to use, possibly using protocol buffers if you wish. However, as there appear to be libraries for Swift, Objective-C, Android, Java, Python, Node.js, and Go, maybe you won't need to use the REST API and craft a Protocol Buffer yourself.
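A conversion function of that kind could look roughly like the sketch below. It assumes the response has already been parsed by some JSON library into nested Map and List objects, and the class and method names are hypothetical. It handles nullValue, integerValue (which the REST API encodes as a string), mapValue, and arrayValue specially and returns the payload of other types such as stringValue or booleanValue unchanged.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper operating on already-parsed JSON (maps and lists).
class FirestoreValues {
    // Converts a Firestore "fields" object into a plain map without type keys.
    static Map<String, Object> unwrapFields(Map<String, Object> fields) {
        Map<String, Object> plain = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : fields.entrySet()) {
            plain.put(entry.getKey(), unwrapValue(asMap(entry.getValue())));
        }
        return plain;
    }

    // Unwraps a single typed value such as {"stringValue": "name_here"}.
    static Object unwrapValue(Map<String, Object> value) {
        if (value.isEmpty() || value.containsKey("nullValue")) {
            return null;
        }
        if (value.containsKey("integerValue")) {
            // The REST API encodes 64-bit integers as JSON strings.
            return Long.parseLong(String.valueOf(value.get("integerValue")));
        }
        if (value.containsKey("mapValue")) {
            Object nested = asMap(value.get("mapValue")).get("fields");
            return nested == null ? new LinkedHashMap<String, Object>() : unwrapFields(asMap(nested));
        }
        if (value.containsKey("arrayValue")) {
            List<Object> items = new ArrayList<>();
            Object values = asMap(value.get("arrayValue")).get("values");
            if (values instanceof List) {
                for (Object item : (List<?>) values) {
                    items.add(unwrapValue(asMap(item)));
                }
            }
            return items;
        }
        // stringValue, booleanValue, doubleValue, timestampValue, ...: return the payload as-is.
        return value.values().iterator().next();
    }

    @SuppressWarnings("unchecked")
    private static Map<String, Object> asMap(Object o) {
        return (Map<String, Object>) o;
    }
}
```

Applied to the "fields" object from the question, unwrapFields gives a map in which "name" maps directly to "name_here".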
Hopefully this addresses your "More importantly" comment:
As you alluded to in your question, Firestore has a different data model from the Realtime Database.
Realtime database data model allows JSON objects with the schema and keywords as you want to define it.
As you point out, the Firestore data model uses predefined schemas, in that respect some of the keywords and structure cannot be changed.
The Cloud Firestore Data Model is described here: https://firebase.google.com/docs/firestore/data-model
Effectively the data model is collection/document, where a document can contain a subcollection, and the keywords "name", "fields", "createTime", "updateTime" are in a Firestore document (a pre-defined JSON document schema).
A successful Firestore REST API GET request results in a Document instance, which could contain a collection of documents or a single document. See https://firebase.google.com/docs/firestore/reference/rest/. Also, the API discovery document gives some detail about the API:
https://firestore.googleapis.com/$discovery/rest?version=v1beta1
An example REST API URL structure is of the form:
https://firestore.googleapis.com/v1beta1/projects/<yourprojectid>/databases/(default)/documents/<collectionName>/<documentID>
It is possible to mask certain fields in a document but still the Firestore Document schema will persist. See the three examples GET:
collection https://pastebin.com/98qByY7n
document https://pastebin.com/QLwZFGgF
document with mask https://pastebin.com/KA1cGX3k
Looking at another example, the REST API to run Queries
https://firebase.google.com/docs/firestore/reference/rest/v1beta1/projects.databases.documents/runQuery
the response body is of the form:
{
"transaction": string,
"document": {
object(Document)
},
"readTime": string,
"skippedResults": number,
}
In summary:
The Realtime database REST API will return the JSON for the object according to the path/nodes as per your “more-standard response”.
The Firestore REST API returns a specific Firestore predefined response structure.
There are API libraries available for several languages, so maybe it's not necessary to use the REST API and craft your own Protocol Buffer, but if you needed to, it's probably feasible.
I don't understand why somebody would just say that you can't, without trying to think of a solution that helps. Is that really solving the problem?
Anyway, I created a script that will help you (maybe it's late now, hahaha).
The script encodes the JSON and then manipulates it as a string to remove the Google type fields (low process).
It's simple code; I know that you can improve it if necessary!
WARNING!!
You may have problems with values that contain '{}' or '[]'. This can be solved with a foreach that converts these characters in all strings to other characters (like '◘' or '♦', characters that you know will not appear in a value).
Ex.: Hi {Lorena}! ------> Hi ◘Lorena♦!
After the process, convert them back to '{}' or '[]'.
YOU CAN'T HAVE FIELDS WITH THE SAME NAMES AS THE GOOGLE FIELDS
Ex.: stringValue, arrayValue, etc.
You can see and download the script in this link:
https://github.com/campostech/scripts-helpers/blob/master/CLOUD%20STORE%20JSON%20FIELDS%20REMOVER/csjfr.php

What's the RESTful way of attaching one resource to another?

This is one of the few moments where I couldn't find the same question that I have on this site, so I'm trying to describe my problem and hope to get some help and ideas!
Let's say...
I want to design a RESTful API for a domain model, that might have entities/resources like the following:
class Product
{
    String id;
    String name;
    Price price;
    Set<Tag> tags;
}
class Price
{
    String id;
    String currency;
    float amount;
}
class Tag
{
    String id;
    String name;
}
The API might look like:
GET /products
GET /products/<product-id>
PUT /prices/<price-id>?currency=EUR&amount=12.34
PATCH /products/<product-id>?name=updateOnlyName
When it comes to updating references:
PATCH /products/<product-id>?price=<price-id>
PATCH /products/<product-id>?price=
may set the Products' Price-reference to another existing Price, or delete this reference.
But how can I add a new reference of an existing Tag to a Product?
If I wanted to store that reference in a relational database, I needed a relationship table 'products_tags' for that many-to-many-relationship, which brings us to a clear solution:
POST /product_tags [product: <product-id>, tag: <tag-id>]
But a document-based NoSQL database (like MongoDB) could store this as a one-to-many-relationship for each Product, so I don't need to model a 'new resource' that has to be created to save a relationship.
But
POST /products/<product-id>/tags/ [name: ...]
creates a new Tag (in a Product),
PUT /products/<product-id>/tags/<tag-id>?name=
creates a new Tag with <tag-id> or replaces an existing
Tag with the same id (in a Product),
PATCH /products/<product-id>?tags=<tag-id>
sets the Tag-list and doesn't add a new Tag, and
PATCH /products/<product-id>/tags/<tag-id>?name=...
sets a certain attribute of a Tag.
So I might want to say something like this:
ATTACH /products/<product-id>?tags=<tag-id>
ATTACH /products/<product-id>/tags?tag=<tag-id>
So the point is:
I don't want to create a new resource,
I don't want to set the attribute of a resource, but
I want to ADD a resource to another resource's attribute, which is a set. ^^
Since everything is about resources, one could say:
I want to ATTACH a resource to another.
My question: Which method is the right one, and what should the URL look like?
Your REST API is an application state driver; it is not meant to be a reflection of your entity relationships.
As such, there's no 'if this was the case in the db' in REST. That said, you have pretty good URIs.
You talk about IDs. What is a tag? Isn't a tag a simple string? Why does it have an id? Why isn't its id its namestring?
Why not have PUT /products/<product-id>/tags/tag_name=?
PUT is idempotent, so you are basically asserting the existence of a tag for the product referred to by product-id. If you send this request multiple times, you'd get 201 Created the first time and 200 OK the next time.
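That behaviour can be sketched with a small in-memory store. The class name, the URL shapes in the comments, and the 204/404 status codes for DELETE are assumptions made for the example; only the 201-then-200 behaviour of PUT comes from the paragraph above.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory store illustrating idempotent PUT on a tag set.
class ProductTagStore {
    private final Map<String, Set<String>> tagsByProduct = new HashMap<>();

    // PUT /products/<product-id>/tags/<tag-name> (assumed URL shape):
    // asserts that the tag exists on the product.
    // Returns 201 when the tag was newly attached and 200 when it was already present.
    int putTag(String productId, String tagName) {
        Set<String> tags = tagsByProduct.computeIfAbsent(productId, id -> new LinkedHashSet<>());
        return tags.add(tagName) ? 201 : 200;
    }

    // DELETE /products/<product-id>/tags/<tag-name> (assumed URL shape):
    // returns 204 if the tag was removed and 404 if it was not attached.
    int deleteTag(String productId, String tagName) {
        Set<String> tags = tagsByProduct.get(productId);
        return tags != null && tags.remove(tagName) ? 204 : 404;
    }
}
```

Repeating the same PUT leaves the set unchanged, while a PUT after a DELETE attaches the tag again and reports 201.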
If you are building a simple system with a single concurrent user running on a single web server with no concurrency in requests, you may stop reading now.
If someone in between goes and deletes that tag, your next put request would re-create the tag. Is this what you want?
With optimistic concurrency control, you would pass along the ETag a of the document every time, and return 409 Conflict if you have a newer version b on the server and the diff a..b cannot be reconciled. In the case of tags, you are just using PUT and DELETE verbs, so you wouldn't have to diff or look at reconciliation.
If you are building a moderately advanced concurrent system, with first-writer-wins semantics, running on a single server, you can stop reading now.
That said, I don't think you have considered your transactional boundaries. What are you modifying? A resource? No, you are modifying value objects of the product resource; its tags. So then, according to your model of resources, you should be using PATCH. Do you care about concurrency? Well, then you have much more to think about with regards to PATCH:
How do you represent the diff of a hierarchical JSON object?
How do you know which PATCH requests conflict in a semantic way? For example, we may not care about DELETEs on Tags, but two other properties might interact semantically.
The RFC for HTTP PATCH says this:
With PATCH, however, the enclosed entity contains a set of
instructions describing how a resource currently residing on the
origin server should be modified to produce a new version. The PATCH
method affects the resource identified by the Request-URI, and it also
MAY have side effects on other resources; i.e., new resources may be
created, or existing ones modified, by the application of a PATCH.
PATCH is neither safe nor idempotent as defined by [RFC2616], Section
9.1.
I'm probably going to stop putting strange ideas in your head now. Comment if you want me to continue down this path a bit longer ;). Suffice to say that there are many more considerations that can be done.
