How to write a Spring JPA repository method name to activate the Distinct clause in Couchbase?

Working on a Java application that uses Spring Data Couchbase 2.2.0.RELEASE...
Starting with a list of JSON objects that represent Book objects:
[
{id: 123, title: "Abc", category: "A"},
{id: 456, title: "Efg", category: "B"},
{id: 789, title: "Abc", category: "A"}
]
The array of Book objects is inserted into Couchbase. Later, the application needs to get back a list of distinct book titles based on a category filter. Following some of the Spring documentation, I've arrived at this method name in the BookRepository interface:
List<Book> findDistinctTitleByCategory(String category);
However, the query that Spring generates does not contain a DISTINCT clause for title. Here is the final query that Spring sends to the CB cluster, where the bucket name is default:
Executing N1QL query: {"statement":"SELECT META(`default`).id AS _ID, META(`default`).cas AS _CAS, `default`.* FROM `default` WHERE (`category` = \"A\")","scan_consistency":"not_bounded"}
Am I writing the method name wrong?

SDC currently does not support query derivation for distinct. I have created a ticket for enhancement here. In the meantime, you can work around this by using @Query directly, instead of relying on n1ql.selectEntity, and providing the select part yourself.
If you are fetching only the title, SDC supports projections.
interface OnlyTitle {
String getTitle();
}
@Query(...)
OnlyTitle findDistinctTitleByCategory(String category);

Related

Project by, with optional properties

I believe this question is for Tinkerpop, not specific to the CosmosDB implementation; just some semantics might be baked into my query examples.
I've developed a data layer that creates queries based on some metadata information. Currently, my data layer only persists non-null data values to the graph vertex, which is causing trouble with my retrieval mechanism.
Consider the following data model, where the field "HomeRoute" may or may not exist on the actual vertex (depending on whether it was populated).
{
"ApplicationModule": string
"Title": string
"HomeRoute": string?
}
My initial query structure is as follows, which does not support the optional properties (discussed later).
g.V()
.has('ApplicationsTest', 'partitionId', '')
.project('ApplicationModule','Title','HomeRoute')
.by('ApplicationModule')
.by('Title')
.by('HomeRoute');
To simulate, we can insert a vertex:
g.addV('ApplicationsTest')
.property('partitionId', '')
.property('ApplicationModule', 'TestApp')
.property('Title', 'Test App')
.property('HomeRoute', 'testapphome');
And we can successfully query it using my base query noted above, which returns it in my desired JSON format.
[
{
"ApplicationModule": "TestApp",
"Title": "Test App",
"HomeRoute": "testapphome"
}
]
If we now insert a vertex without the HomeRoute property (since it was null within the application layer), my base query will fail.
g.addV('ApplicationsTest')
.property('partitionId', '')
.property('ApplicationModule', 'TestApp')
.property('Title', 'Test App');
Executing my base query now results in error:
Gremlin Query Execution Error: Project By: Next: The provided
traverser of key "HomeRoute" maps to nothing.
I can apply a coalesce operation against "optional" fields; my current understanding has allowed me to return a constant value in the case of undefined properties. Updating my base query as follows will return "!dbnull" when a property does not exist on the vertex:
g.V()
.has('ApplicationsTest', 'partitionId', '')
.project('ApplicationModule','Title','HomeRoute')
.by('ApplicationModule')
.by('Title')
.by(values('HomeRoute')
.fold()
.coalesce(unfold(), constant('!dbnull')));
This query when executed returns the values as expected, again in JSON format.
[
{
"ApplicationModule": "TestApp",
"Title": "Test App",
"HomeRoute": "testapphome"
},
{
"ApplicationModule": "TestApp",
"Title": "Test App",
"HomeRoute": "!dbnull"
}
]
My question (still new to Gremlin / Tinkerpop queries) - is there any way that I can get this result with only the properties which are present on the respective vertices?
My desired output from this example is below, which would allow my data layer to only unbundle the values present on the graph vertex and not have to consider string "!dbnull" values.
[
{
"ApplicationModule": "TestApp",
"Title": "Test App",
"HomeRoute": "testapphome"
},
{
"ApplicationModule": "TestApp",
"Title": "Test App"
}
]

I've found a way to achieve what I'm looking for. I would still love input from the community, though, if there are optimizations or other considerations.
g.V()
.has('ApplicationsTest', 'partitionId', '')
.project('ApplicationModule','Title','HomeRoute')
.by('ApplicationModule')
.by('Title')
.by(values('HomeRoute')
.fold()
.coalesce(unfold(), constant('!dbnull')))
.local(unfold()
.where(select(values).is(without('!dbnull')))
.group().by(select(keys)).by(select(values)))
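If the extra local() step turns out to be awkward on a given provider, the sentinel could also be stripped in the data layer after the coalesce query returns. This is a different technique from the traversal above (client-side filtering rather than in-query filtering), and the helper name stripSentinel is hypothetical. A minimal sketch, assuming the rows arrive as plain objects:

```javascript
// Hypothetical client-side helper: removes every property whose value
// equals the sentinel that the coalesce() step substituted for a
// missing vertex property.
function stripSentinel(rows, sentinel = '!dbnull') {
  return rows.map((row) =>
    Object.fromEntries(
      Object.entries(row).filter(([, value]) => value !== sentinel)
    )
  );
}

// Sample rows shaped like the query output shown in the question.
const rows = [
  { ApplicationModule: 'TestApp', Title: 'Test App', HomeRoute: 'testapphome' },
  { ApplicationModule: 'TestApp', Title: 'Test App', HomeRoute: '!dbnull' },
];

console.log(JSON.stringify(stripSentinel(rows), null, 2));
```

This keeps the traversal simple, at the cost of still sending the sentinel over the wire.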

If you only need specific keys that already exist on the vertex, you can use valueMap; there is no need to use project:
g.V()
.has('ApplicationsTest', 'partitionId', '')
.valueMap("ApplicationModule", "Title", "HomeRoute").by(unfold())
example: https://gremlify.com/9fua9jsu0dh

Firebase REST API query with different keys

So this is the structure of my Firebase DB right now, I am using the Firebase REST API:
"company": {
company1_id {
id: company_id,
userId: userid,
name: name
//someotherstuff
}
company2_id {
id: company_id,
userId: userid,
name: name,
//someotherstuff
}
}
So, right now I am getting the companies belonging to one user by calling:
"firebasedbname.firebaseio.com/company.json?orderBy="userId"&equalTo=userId"
This works perfectly fine and gets the corresponding data, but now I want to order the companies alphabetically by name, so I tried this:
"firebasedbname.firebaseio.com/company.json?orderBy="name"&equalTo=userId"
But this time, it returns no data, even though I have added .indexOn: "name" to the company node. Any help will be appreciated.

As explained in the doc, if you want to filter data you need to first "specify how you want your data to be filtered using the orderBy parameter", and then you need to "combine orderBy with any of the other five parameters: limitToFirst, limitToLast, startAt, endAt, and equalTo".
So if you added "an .indexOn: "name" to the company node", it means that you intend to query as follows:
https://xxxx.firebaseio.com/company.json?orderBy="name"&equalTo="companyName"
You cannot order by (company) name and filter on userId.
If you want to get all the companies corresponding to a specific user and order them by the company name, you will need to use ?orderBy="userId"&equalTo=userId and do the sorting in the client/application calling the REST API.
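The client-side sorting step can be sketched as follows. This is a minimal example, assuming the REST response is a JSON object keyed by company id, as in the structure shown in the question; the helper name sortCompaniesByName and the sample values are hypothetical.

```javascript
// Hypothetical helper: `companies` is the parsed JSON body returned by
// the userId-filtered request (an object keyed by company id).
function sortCompaniesByName(companies) {
  return Object.entries(companies || {})
    // Keep the Firebase key alongside each company's fields.
    .map(([key, company]) => ({ key, ...company }))
    // Sort alphabetically by company name.
    .sort((a, b) => String(a.name).localeCompare(String(b.name)));
}

// Sample response with made-up values.
const response = {
  company1_id: { id: 'company1_id', userId: 'u1', name: 'Zeta' },
  company2_id: { id: 'company2_id', userId: 'u1', name: 'Alpha' },
};

console.log(sortCompaniesByName(response).map((c) => c.name));
```

Converting the object to an array first matters because the key order of a JSON object should not be relied on for ordering.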

can getstream aggregation work on properties of the "object" property

I set up the aggregation rule:
{{ object.experienceId }}
on a notification feed in getstream.io, expecting it to aggregate based on the object.experienceId, but instead it seems to aggregate everything into one, regardless of object.experienceId. Am I misunderstanding how aggregation works? What could be the issue?
var activity = {
time: new Date(),
verb: 'created',
actor: { id: 1, name: 'User One' },
object: {
id: 2,
experienceId: 12,
caption: 'Moment 1',
photo:
{ id: '314e00a2-2455-11e5-b696-feff819cdc9f',
mainColor: 'ff3333',
width: 1000,
height: 400 },
createdBy: {
id: 1, name: 'User One'
},
type: 'Moment' },
context: 'http://derbyapp.co'
};
notifications.addActivity(activity,

This does not work because the object field is expected to be a string (http://getstream.io/docs/#add-remove-activities), so the aggregation rule cannot reference properties of the activity's object field. There are multiple solutions to this problem.
First, you could supply the experienceId as a separate property of the activity, so you can use the aggregation template {{ experienceId }}, since all the additional properties provided to an activity can be used in the aggregation rule (http://getstream.io/docs/#aggregated-feeds).
Second, you could supply an object on any additional field of the activity, for instance item. Additional fields can reference their child properties, so you could use the aggregation rule {{ item.experienceId }}. But be careful not to send data to the getstream.io API that is not actually needed at getstream.io's end. In this example you could send only the object's id field instead of the entire object, and retrieve the object from your local database once you retrieve activities from the API (the same holds for the actor field). If you do not want to implement this logic yourself, you could use one of getstream's integration libraries (there are libraries for rails/django/laravel etc.).
var activity = {
time: new Date(),
verb: 'created',
actor: 1,
object: '1',
experienceId: 12
};
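To go from the nested activity in the question to the flat shape above, a small mapping function in the application could look like this. It is only a sketch: the helper name toFlatActivity is hypothetical, and converting the actor and object ids to strings is an assumption based on the object field being expected as a string.

```javascript
// Hypothetical helper: reduces a rich, nested activity to the flat form,
// keeping only ids plus the field used by the aggregation rule.
function toFlatActivity(activity) {
  return {
    time: activity.time,
    verb: activity.verb,
    // Assumption: ids are sent as strings; the full records stay in the
    // local database and are looked up again when reading the feed.
    actor: String(activity.actor.id),
    object: String(activity.object.id),
    // Top-level copy so the rule {{ experienceId }} can see it.
    experienceId: activity.object.experienceId,
  };
}

const rich = {
  time: new Date(),
  verb: 'created',
  actor: { id: 1, name: 'User One' },
  object: { id: 2, experienceId: 12, caption: 'Moment 1', type: 'Moment' },
};

console.log(toFlatActivity(rich));
```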

Retrieving data from firebase Array

My Firebase Array has the following structure:
[
{
'userId': '12345',
'itemOrdered' : 'abc',
'status': 'pending'
...other attributes
},
{
'userId': '6789',
'itemOrdered' : 'def',
'status' : 'pending',
...other attributes
},
{
'userId': '12345',
'itemOrdered' : 'def',
'status' : 'complete',
...other attributes
},
]
I am not able to figure out how to retrieve the following data:
Get records with userId = xxx
Get all records where 'itemOrdered' = 'def'
Firebase docs talk about using orderByChild but that doesn't make much sense.

Assuming you're using the JavaScript SDK to access Firebase:
ref.orderByChild('userId').equalTo('xxx')
ref.orderByChild('itemOrdered').equalTo('def')
If you're trying to build a query that gets the orders of item def from user xxx, then that's not currently possible with Firebase's querying. The only way to query the value of multiple properties is to combine them in a single property in a way that allows the query you want. E.g.
ref.orderByChild('userId_itemOrdered').equalTo('xxx_def')
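For that query to match anything, each record has to carry the combined property when it is written. A minimal sketch of how the application might add it before saving; the helper names compositeKey and withCompositeKey are hypothetical, and the underscore separator simply mirrors the example above:

```javascript
// Hypothetical helper: builds the combined value used for querying.
function compositeKey(userId, itemOrdered) {
  return `${userId}_${itemOrdered}`;
}

// Returns a copy of the record with the combined property added,
// leaving the original record untouched.
function withCompositeKey(record) {
  return {
    ...record,
    userId_itemOrdered: compositeKey(record.userId, record.itemOrdered),
  };
}

const record = { userId: '12345', itemOrdered: 'def', status: 'complete' };
console.log(withCompositeKey(record));
```

If either value can itself contain the separator, two different pairs could produce the same combined value, so a separator that cannot appear in the ids is the safer choice.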

There are additional options depending on your platform.
1) Query for the userId and then filter in code for the item you are looking for.
2) Query for the userId and build another query based on those results.
3) Flatten your data!
For example: create a node called user_purchases where each child is a userId node holding that user's purchased item IDs (that makes it a snap to know exactly what items a user purchased). Or create an items_purchased node where each child is an item number node holding the userIds of the users who purchased that item.
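The flattened nodes could be derived from the existing array along these lines. This is a sketch under assumptions: the helper name buildPurchaseIndexes is hypothetical, and storing true as the leaf value is just one common way to represent set membership in this kind of structure.

```javascript
// Hypothetical helper: builds both lookup nodes from the order records.
function buildPurchaseIndexes(orders) {
  const userPurchases = {};  // userId -> { itemId: true }
  const itemsPurchased = {}; // itemId -> { userId: true }

  for (const { userId, itemOrdered } of orders) {
    userPurchases[userId] = userPurchases[userId] || {};
    userPurchases[userId][itemOrdered] = true;

    itemsPurchased[itemOrdered] = itemsPurchased[itemOrdered] || {};
    itemsPurchased[itemOrdered][userId] = true;
  }

  return { user_purchases: userPurchases, items_purchased: itemsPurchased };
}

// The three records from the question, reduced to the relevant fields.
const orders = [
  { userId: '12345', itemOrdered: 'abc', status: 'pending' },
  { userId: '6789', itemOrdered: 'def', status: 'pending' },
  { userId: '12345', itemOrdered: 'def', status: 'complete' },
];

console.log(JSON.stringify(buildPurchaseIndexes(orders), null, 2));
```

With these nodes in place, "which items did user 12345 buy" and "who bought item def" each become a direct read of one child node instead of a query.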

ColdFusion: ORM Collection with Multiple Foreign Keys

Most tables in my database have composite primary keys, so multiple columns are required per join. I'm trying to use the ColdFusion (11 to be specific) ORM collection property. It seems that a comma-separated list of columns in the fkColumn attribute doesn't work, like it does for relationship properties. I have filed a bug with Adobe, but I'm wondering if anyone else has run into this and found workarounds. Or maybe I'm just doing it wrong.
Table Setup
Years         Staff          StaffSites     Sites
===========   ============   ============   ===========
YearID (PK)   StaffID (PK)   YearID (PK)    SiteID (PK)
YearName      StaffName      StaffID (PK)   SiteName
                             SiteID (PK)
Staff ORM CFC
component persistent=true table='Staff' {
property name='id' column='StaffID' fieldType='id';
property name='year' column='YearID' fieldType='id';
property name='sites' elementColumn='SiteID' fieldType='collection' table='StaffSites' fkColumn='StaffID,YearID';
}
The Problem
There is an error when running the generated query: [Macromedia][SQLServer JDBC Driver][SQLServer]An expression of non-boolean type specified in a context where a condition is expected, near ','.
Taking a look at the generated query, it appears that the list of columns is not properly parsed for the where clause, but it somewhat understands that there are multiple columns in the select expression.
select
sites0_.StaffID,
YearID as StaffID1_2_0_,
sites0_.SiteID as SiteID4_0_
from
StaffSites sites0_
where
sites0_.StaffID,YearID=?
The Goal
For the ORM collection property to correctly support a multi-key "join". Why not use a relationship? I'd like to use ORM objects to then serialize as JSON for use in the REST services. The serialized JSON needs to contain the ID for the relationships, not the actual relationship data. For example, the JSON payload should be:
{
"id": 1234,
"year": 2015,
"sites": [1,2,3]
}
Instead of something like:
{
"id": 1234,
"year": 2015,
"sites": [
{"id": 1, "name": "Foo"},
{"id": 2, "name": "Bar"},
{"id": 3, "name": "Baz"}
]
}

For your DB structure, the simplest way to translate it into ORM would be to use "StaffSites" as the link table for a many-to-many relationship.
You should also try CF11's custom serializer: http://blogs.coldfusion.com/post.cfm/language-enhancements-in-coldfusion-splendor-improved-json-serialization-2