NEO4J - Recursive query aggregation - graph

I have the following query
MATCH (category:Category {id:'qwe'} )
MATCH (category)-[:CHILD_OF*0..50]->(subcats:Category)<-[:PHOTO_OF]-(photo:Photo)
return (subcats.name), count(photo)
This query returns only the photo counts for the exact categories that photos belong to. I want an aggregated count per category that includes the photos of its child categories. I believe this used to work, but it broke after the 2.3.x update.

Try this query:
MATCH (category:Category {id:'qwe'} )
MATCH (category)-[:CHILD_OF*0..50]->(subcats:Category)<-[:PHOTO_OF]-(photo:Photo)
WITH category.name AS cat, collect(subcats.name) AS subcats, count(photo) AS num
RETURN cat, subcats, num
Is that the behavior you are trying to achieve?
I'm not sure what might have changed between 2.2.x and 2.3 with your query. Would you be willing to share your data so we can try to reproduce?
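To pin down the behavior being asked for, here is a minimal Python sketch of the intended roll-up, independent of Neo4j. It assumes each category has at most one parent; the names `parent_of`, `direct_counts` and the sample categories are hypothetical and not taken from the question's data.

```python
from collections import defaultdict

def rolled_up_counts(parent_of, direct_counts):
    """Return photo counts per category, including photos of all descendants.

    parent_of: maps each category to its parent (None for a root).
    direct_counts: maps each category to the photos attached directly to it.
    """
    totals = defaultdict(int)
    for category, count in direct_counts.items():
        node = category
        seen = set()
        # Walk up the parent chain, crediting every ancestor (and the node itself).
        while node is not None and node not in seen:
            seen.add(node)
            totals[node] += count
            node = parent_of.get(node)
    return dict(totals)

# Hypothetical sample tree: animals > cats > kittens, animals > dogs
parent_of = {"animals": None, "cats": "animals", "dogs": "animals", "kittens": "cats"}
direct_counts = {"animals": 1, "cats": 2, "dogs": 3, "kittens": 4}
totals = rolled_up_counts(parent_of, direct_counts)
```

With this sample data, "cats" is credited with its own photos plus those of "kittens", which is the aggregated count the question describes.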

Related

How to use an ifnull() inside a case

I am writing a query for a SQLite database in which I need to find the best selling product for each region in a list of 6 regions. However, some of the regions don't have any product sold in which case I am supposed to output 'No Product Sold'. I am trying to use an ifnull() statement, but can't get it working.
Currently, my query is as follows
SELECT area.name AS region,
CASE ifnull(sales.area_id, 'No Product Sold')
END AS most_sales
FROM product
JOIN sales ON product.rank = sales.rank
GROUP BY area name
I also know I will likely need to use a nested query, but I'm not sure how yet.
Unfortunately I cannot base this off of rank, it has to be the item with the highest sales in the region. Does anyone know what I am doing wrong?
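One possible shape for such a query, sketched with Python's built-in sqlite3 module so it can be run as-is. The schema here (area, product, sales with an amount column) is an assumption made for illustration, since the question does not show the real tables. The idea is to start from the list of regions and wrap a correlated subquery in ifnull(), so regions without sales still produce a row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data; the real tables in the question may differ.
conn.executescript("""
CREATE TABLE area (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales (area_id INTEGER, product_id INTEGER, amount INTEGER);
INSERT INTO area VALUES (1, 'North'), (2, 'South'), (3, 'East');
INSERT INTO product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO sales VALUES (1, 1, 10), (1, 2, 25), (2, 1, 7);
""")

QUERY = """
SELECT area.name AS region,
       ifnull(
           -- Best-selling product in this region; NULL when the region has no sales
           (SELECT product.name
              FROM sales
              JOIN product ON product.id = sales.product_id
             WHERE sales.area_id = area.id
             GROUP BY product.id
             ORDER BY SUM(sales.amount) DESC
             LIMIT 1),
           'No Product Sold') AS most_sales
  FROM area
 ORDER BY area.name
"""
rows = conn.execute(QUERY).fetchall()
```

If two products tie for the highest total in a region, this sketch returns an arbitrary one of them; a secondary ORDER BY column would make the choice deterministic.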

Querying firestore using multiple in clauses

I'm working on a product listing page (similar to any e-commerce site) where users are expected to filter products based on multiple attributes and multiple values per attribute.
Let's assume the data model is as shown below:
Product
Category - Shirt
Size - Medium
Colour - Blue
Dropdown filters on the search page would be,
1. Category: Shirts, T-Shirts, Trousers etc
2. Size: Medium, Large etc
I'm clueless as to how to query Firestore when a user would like to search for all Shirts & T-Shirts in Small & Medium sizes.
A query like this isn't supported in Firestore:
Firebase.firestore().collection("Products")
.where("Category", "in", ["Shirt", "T-Shirts"])
.where("Size", "in", ["Medium", "Large"])
On top of this, I need to paginate the response so filtering on the client side doesn't seem like an option.
Please suggest if there is any option.
Since you can only have a single in condition in the query, your current approach won't work. The only workaround I know of is to keep a separate field where each value is the combination of sizes that is available. So something like:
"available_in": "Medium_Large"
And then query with:
Firebase.firestore().collection("Products")
.where("Category", "in", ["Shirt", "T-Shirts"])
.where("available_in", "==", "Medium_Large")
This type of solution only works if the number of combinations is reasonable, but that seems to be the case here.
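A small Python sketch of how such a combined field could be derived so that the same set of sizes always yields the same string. The `SIZE_ORDER` list, the `combination_key` helper and the sample products are hypothetical; the filtering at the end is done in memory only to show which documents the single `in` plus equality query would match.

```python
SIZE_ORDER = ["Small", "Medium", "Large"]  # hypothetical canonical ordering

def combination_key(sizes):
    """Join sizes into one string, in a fixed order so equal sets give equal keys."""
    unique = set(sizes)
    unknown = unique - set(SIZE_ORDER)
    if unknown:
        raise ValueError(f"unknown sizes: {sorted(unknown)}")
    return "_".join(s for s in SIZE_ORDER if s in unique)

# Hypothetical product documents
products = [
    {"Category": "Shirt", "sizes": ["Large", "Medium"]},
    {"Category": "T-Shirts", "sizes": ["Small"]},
    {"Category": "Trousers", "sizes": ["Medium", "Large"]},
]
for p in products:
    # Stored alongside the document at write time
    p["available_in"] = combination_key(p["sizes"])

# In-memory equivalent of one `in` filter plus one equality filter
wanted = combination_key(["Medium", "Large"])
matches = [
    p for p in products
    if p["Category"] in ["Shirt", "T-Shirts"] and p["available_in"] == wanted
]
```

Note that the equality filter matches only products whose available sizes are exactly that combination, which is why the approach depends on the number of combinations staying small.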

Arango DB performance: edge vs. DOCUMENT()

I'm new to ArangoDB with graphs. I simply want to know whether it is faster to build edges or to use 'DOCUMENT()' for very simple 1:1 connections where querying the graph is not needed.
LET a = DOCUMENT(#from)
FOR v IN OUTBOUND a
CollectionAHasCollectionB
RETURN MERGE(a,{b:v})
vs
LET a = DOCUMENT(#from)
RETURN MERGE(a,{b:DOCUMENT(a.bId)})
A simple benchmark you can try:
Create the collections products, categories and an edge collection has_category. Then generate some sample data:
FOR i IN 1..10000
INSERT {_key: TO_STRING(i), name: CONCAT("Product ", i)} INTO products
FOR i IN 1..10000
INSERT {_key: TO_STRING(i), name: CONCAT("Category ", i)} INTO categories
FOR p IN products
LET random_categories = (
FOR c IN categories
SORT RAND()
LIMIT 5
RETURN c._id
)
LET category_subset = SLICE(random_categories, 0, RAND()*5+1)
UPDATE p WITH {
categories: category_subset,
categoriesEmbedded: DOCUMENT(category_subset)[*].name
} INTO products
FOR cat IN category_subset
INSERT {_from: p._id, _to: cat} INTO has_category
Then compare the query times for the different approaches.
Graph traversal (depth 1..1):
FOR p IN products
RETURN {
product: p.name,
categories: (FOR v IN OUTBOUND p has_category RETURN v.name)
}
Look-up in categories collection using DOCUMENT():
FOR p IN products
RETURN {
product: p.name,
categories: DOCUMENT(p.categories)[*].name
}
Using the directly embedded category names:
FOR p IN products
RETURN {
product: p.name,
categories: p.categoriesEmbedded
}
Graph traversal is the slowest of all 3, the lookup in another collection is faster than the traversal, but by far the fastest query is the one with embedded category names.
If you query the categories for just one or a few products however, the response times should be in the sub-millisecond area regardless of the data model and query approach and therefore not pose a performance problem.
The graph approach should be chosen if you need to query for paths with variable depth, long paths, shortest path etc. For your use case, it is not necessary. Whether the embedded approach is suitable or not is something you need to decide:
Is it acceptable to duplicate information, and potentially have inconsistencies in the data? (If you want to change the category name, you need to change it in all product records instead of just one category document, that products can refer to via the immutable ID)
Is there a lot of additional information per category? If so, all that data needs to be embedded into every product document that has that category - basically trading memory / storage space for performance
Do you need to retrieve a list of all (distinct) categories often? You can do this type of query really cheap with the separate categories collection. With the embedded approach, it will be much less efficient, because you need to go over all products and collect the category info.
Bottom line: you should choose the data model and approach that fits your use case best. Thanks to ArangoDB's multi-model nature you can easily try another approach if your use case changes or you run into performance issues.
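The trade-offs in the questions above can be made concrete with a plain Python sketch that mimics the two data models using dictionaries instead of ArangoDB collections. The sample documents and the `rename_category` helper are hypothetical; the point is only to show the extra work the embedded model needs on a rename and when listing distinct categories.

```python
# Referenced model: products point at category documents by ID
categories = {
    "categories/1": {"name": "Shoes"},
    "categories/2": {"name": "Sale"},
}
products = [
    {"name": "Product 1", "categories": ["categories/1", "categories/2"]},
    {"name": "Product 2", "categories": ["categories/1"]},
]

# Embedded model: category names are copied into each product
for p in products:
    p["categoriesEmbedded"] = [categories[c]["name"] for c in p["categories"]]

def rename_category(cat_id, new_name):
    """Rename a category; return how many product documents had to be rewritten."""
    categories[cat_id]["name"] = new_name  # one write in the referenced model
    touched = 0
    for p in products:
        if cat_id in p["categories"]:
            # The embedded copy must be refreshed in every affected product
            p["categoriesEmbedded"] = [categories[c]["name"] for c in p["categories"]]
            touched += 1
    return touched

touched = rename_category("categories/1", "Footwear")

# Distinct categories: a direct read vs. a scan over all products
distinct_referenced = sorted(c["name"] for c in categories.values())
distinct_embedded = sorted({n for p in products for n in p["categoriesEmbedded"]})
```

Here the rename costs one write in the referenced model but one extra write per affected product in the embedded model, and the distinct list requires visiting every product when the names are embedded.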
Generally speaking, the latter variant
LET a = DOCUMENT(#from)
RETURN MERGE(a,{b:DOCUMENT(a.bId)})
should have lower overhead than the full-featured traversal variant. This is because the DOCUMENT variant will do a point lookup of a document whereas the traversal variant is very general purpose: it can return zero to many results from a variable number of collections, needs to keep track of the path seen etc.
When I tried both variants in a local test case, the non-traversal variant was also a lot faster, supporting this claim.
However, the traversal-based variant is more flexible: it can also be used should there be multiple edges (no 1:1 mapping) and for longer paths.

Avoiding salesforce governing limits on soql queries getting group members for each group?

I am working in apex on salesforce platform. I have this loop to grab all group names, Ids, and their respective group members, place them in an object to collect all this info, then put that in a list to have a list of all groups and all information I need:
List<groupInfo> memberList = new List<groupInfo>();
for(Id key : groupMap.keySet()){
groupInfo newGroup = new groupInfo();
Group g = groupMap.get(key);
if(g.Name != null){
set<Id> memberSet = getGroupEventRelations(new set<Id>{g.Id});
if(memberSet.size() != 0){
newGroup.groupId = g.Id;
newGroup.groupName = g.Name;
newGroup.groupMemberIds = memberSet;
memberList.add(newGroup);
}
}
}
My getGroupEventRelations method is as such:
global static set<Id> getGroupEventRelations(set<Id> groupIds){
set<Id> nestedIds = new set<Id>();
set<Id> returnIds = new set<Id>();
List<GroupMember> members = [SELECT Id, GroupId, UserOrGroupId FROM GroupMember WHERE GroupId IN :groupIds];
for(GroupMember member : members){
if(Schema.Group.SObjectType == member.UserOrGroupId.getSObjectType()){
nestedIds.add(member.UserOrGroupId);
} else{
returnIds.add(member.UserOrGroupId);
}
}
if(nestedIds.size() > 0){
returnIds.addAll(getGroupEventRelations(nestedIds));
}
return returnIds;
}
getGroupEventRelations contains a SOQL query, and it is called inside a loop over groups. If someone has over 100 groups with group members, or a chain of 100 groups nested inside groups, this is going to hit the Salesforce governor limit on SOQL queries pretty quickly...
I am wondering if anyone knows of a way to remove the SOQL query from getGroupEventRelations so that no query runs inside the loop. When I want the group members for a specific group, I don't really see a way around this without more loops inside loops, where I could risk running into the Salesforce CPU timeout governor limit :(
Thank you in advance for any help!
At large enough numbers there's no solution, you'll run into SOME governor limit. But you can certainly make your code work with bigger numbers than it does now. Here's a quick little cheat you could do to cut nesting 5-fold. Instead of just looking at the immediate parent (single level of children) look for parent, grandparent, great grandparent, etc, all in one query.
[SELECT Id, GroupId, UserOrGroupId FROM GroupMember WHERE (GroupId IN :groupIds OR Group.GroupId IN :groupIds OR Group.Group.GroupId IN :groupIds OR Group.Group.Group.GroupId IN :groupIds OR Group.Group.Group.Group.GroupId IN :groupIds OR Group.Group.Group.Group.Group.GroupId IN :groupIds) AND Id NOT IN :returnIds];
You just got 5 (or is it 6?) levels of children in one SOQL call, so you can support that many times more nest levels now. Note that I added a 'NOT IN' clause to make sure you don't repeat children that you already have, since you won't know which Ids came from the bottom level.
You can also make your very first call for all groups instead of each group at a time. So if someone has 100 groups you'll make just one call instead of 100.
List<Group> groups = groupMap.values();
List<GroupMember> allMembers = [SELECT Id, GroupId, UserOrGroupId FROM GroupMember WHERE GroupId IN :groups];
Lastly, you could query all GroupMembers in a single SOQL call and then iterate yourself. Like you said, you risk running into the 10 second limit here, but if the number of groups isn't in the millions you'll likely be just fine, especially if you do some O(n) analysis and choose good data structures and algorithms. On the plus side, you won't have to worry about SOQL limits regardless of the nesting and the tree complexity. This answer should be very helpful; it does almost exactly what you would have to do if you pulled all members in one call:
How to efficiently build a tree from a flat structure?
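As a language-neutral illustration of the "query everything once, then iterate yourself" idea, here is a minimal Python sketch. It assumes the single query result has already been reduced to (group id, member id) pairs and that the set of group ids is known; the names `build_children`, `user_members` and the sample ids are hypothetical. In Apex the same structure would be a Map<Id, Set<Id>> built from one GroupMember query.

```python
from collections import defaultdict

def build_children(members):
    """Map each group id to the ids (users or groups) that are its direct members."""
    children = defaultdict(set)
    for group_id, member_id in members:
        children[group_id].add(member_id)
    return children

def user_members(group_id, children, group_ids):
    """Collect all user ids under group_id, expanding nested groups iteratively."""
    users, seen, stack = set(), {group_id}, [group_id]
    while stack:
        current = stack.pop()
        for member in children.get(current, ()):
            if member in group_ids:
                if member not in seen:  # guard against cyclic nesting
                    seen.add(member)
                    stack.append(member)
            else:
                users.add(member)
    return users

# Hypothetical rows from a single query: G1 contains G2, G2 contains G3, G3 contains G1
members = [
    ("G1", "U1"), ("G1", "G2"),
    ("G2", "U2"), ("G2", "G3"),
    ("G3", "U3"), ("G3", "G1"),
]
group_ids = {"G1", "G2", "G3"}
children = build_children(members)
result = user_members("G1", children, group_ids)
```

Each group is expanded at most once per call, so the work is linear in the number of membership rows reachable from the starting group, and no further queries are needed however deep the nesting goes.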

After join, cannot filter by attribute qty - getting products from inventory that are in stock

You have been so helpful in the past that I keep coming back searching for help and learning.
This time I am trying to get all products that have a quantity greater than 1 and that are in stock (is_in_stock = 1)
$products = Mage::getModel('catalog/product')->getCollection();
//$products->addAttributeToSelect('*');
//SELECT `e`.*, `stock`.`qty` FROM `catalog_product_entity` AS `e` LEFT JOIN `cataloginventory_stock_item` AS `stock` ON stock.product_id = e.entity_id
$products->getSelect()->joinLeft(
array('stock'=>'cataloginventory_stock_item'),
'stock.product_id = e.entity_id',
array('stock.qty', 'stock.is_in_stock')
);
This returns qty and is_in_stock columns attached to the products table. You can test it as follows:
$products->getFirstItem()->getQty();
$products->getFirstItem()->getIsInStock();
The issue begins when I try to filter by qty and is_in_stock.
$products->addFieldToFilter(array(
array('Qty','gt'=>'0'),
array('Is_in_stock','eq'=>'1'),
));
This throws an "Invalid attribute name" exception and never performs the filtering. I am guessing it is searching for e.qty but cannot find it.
So, I tried to filter differently:
$products->getSelect()->where("`qty` > 0");
$products->getSelect()->where("`is_in_stock` = 1");
This does not filter either, even though the SQL query it generates (var_dump((string) $products->getSelect())) works when I run it in phpMyAdmin.
Alan Storm in his tutorial mentions that 'The database query will not be made until you attempt to access an item in the Collection'. So I make the $products->getFirstItem() call, but it still does not execute the query; in other words, no filtering happens.
What am I doing wrong? Any ideas how to filter by attributes that are joined to the table?
Thank you again,
Margots
I would suggest that you try using $products->addAttributeToFilter... instead of $products->addFieldToFilter... - the addField method only works when the field is on the main table that you are querying (in this case catalog_product_entity). Because the inventory fields are in a joined table, you need to use addAttribute.
Hope this helps,
JD
After looking under the hood, I learned that the _selectAttributes instance field was not assigned in the Mage_Eav_Model_Entity_Collection_Abstract class, and that is why I get the exception. The usual solution would be what Jonathan Day suggested above: use the addAttributeToFilter() method. However, it returns an error because it cannot find such an attribute for catalog/product (qty and is_in_stock are in cataloginventory_stock_item). I found two solutions to my problem, both of which required going in a different direction:
The first was to run the Select statement that I had set on the product collection (see above) directly, since the collection itself was somehow not reloading with the filtered products. When I copied that SQL statement into phpMyAdmin it worked, so this is how to run that statement from the product collection:
$stmt = $products->getConnection('core_write')->query($products->getSelect()->__toString());
while($rows = $stmt->fetch(PDO::FETCH_ASSOC)){
echo "<br>Product Id ".$rows['entity_id'];
}
The second was to use the flat table cataloginventory_stock_item instead of the catalog/product entity table to accomplish the same thing:
// Assumption: the stock item collection was intended here, since addQtyFilter() is a collection method
$stockItems = Mage::getModel('cataloginventory/stock_item')->getCollection();
$stockItems->addQtyFilter('>', 0);
$stockItems->addFieldToFilter('is_in_stock', array('eq' => '1'));
Now there is a collection of all products with qty > 0 and that are in stock.
