Query Wikidata using Wikipedia categories

I'm wondering if it's possible to write a Wikidata SPARQL query that retrieves all entities in a Wikipedia category.
For example, the Wikipedia page for Barack Obama lists a number of categories, including "African-American Christians", "African-American educators", "African-American feminists", "African-American lawyers".
I'm trying to find a way to select all "humans" that match those categories. The Wikidata page for Obama doesn't list any of those categories, so I'm not sure how to query this.
Thanks

SELECT * WHERE {
  wd:Q76 wdt:P910 ?category .
  ?link schema:about ?category; schema:isPartOf <https://en.wikipedia.org/>; schema:name ?title .
  SERVICE wikibase:mwapi {
    bd:serviceParam wikibase:endpoint "en.wikipedia.org";
      wikibase:api "Generator";
      mwapi:generator "categorymembers";
      mwapi:gcmtitle ?title;
      mwapi:gcmprop "ids|title|type";
      mwapi:gcmlimit "max".
    ?member wikibase:apiOutput mwapi:title.
    ?ns wikibase:apiOutput "@ns".
    ?item wikibase:apiOutputItem mwapi:item.
  }
}
adapted from here
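If the query is going to be run from a script for different items, it can be templated on the item ID. The following is only a minimal Python sketch: `build_category_members_query` and `QUERY_TEMPLATE` are hypothetical names, the helper only builds the query text and does not contact the query service, and it assumes the item has a P910 (topic's main category) statement, as Q76 does.

```python
import re

# Hypothetical template (not part of any Wikidata library): the query from the
# answer with the item ID and the wiki host turned into placeholders.
# "@ns" selects the namespace attribute of each API result.
QUERY_TEMPLATE = """SELECT * WHERE {
  wd:%(qid)s wdt:P910 ?category .
  ?link schema:about ?category; schema:isPartOf <https://%(host)s/>; schema:name ?title .
  SERVICE wikibase:mwapi {
    bd:serviceParam wikibase:endpoint "%(host)s";
      wikibase:api "Generator";
      mwapi:generator "categorymembers";
      mwapi:gcmtitle ?title;
      mwapi:gcmprop "ids|title|type";
      mwapi:gcmlimit "max".
    ?member wikibase:apiOutput mwapi:title.
    ?ns wikibase:apiOutput "@ns".
    ?item wikibase:apiOutputItem mwapi:item.
  }
}"""


def build_category_members_query(qid: str, host: str = "en.wikipedia.org") -> str:
    """Return the SPARQL text for the members of the item's main category."""
    # Reject anything that is not a plain item ID, so that arbitrary text
    # cannot be spliced into the query.
    if not re.fullmatch(r"Q[1-9][0-9]*", qid):
        raise ValueError("expected an item ID such as 'Q76', got %r" % qid)
    return QUERY_TEMPLATE % {"qid": qid, "host": host}


print(build_category_members_query("Q76"))
```

The printed text can be pasted into the query service or sent with whatever HTTP client you already use.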

Related

Lighthouse @paginate with all records

I am using the @paginate directive on my query, and my client wants to get all Post records from it. This is my schema definition:
posts: [Post!]! @paginate
I tested these queries:
posts(first:0) {id} # works, but doesn't return all records
posts(first:-1) {id} # error
One way to get all records is to read the value of total inside paginatorInfo and then run a second query with that value as first:
posts(first:0) {
paginatorInfo {
total
}
}
Making two queries just to get all records is wasteful.
The best approach I found was to add a second field (so there are now two fields for posts, with different names and directives), like:
allPosts: [Post!]! @all
Other approaches, not as clean:
Set pagination.max_count to null in config/lighthouse.php and query with a very large value such as first: 100000000 (GraphQL's Int is a signed 32-bit integer, so there is an upper limit).
Change the @paginate directive to @field(resolver:) and do the pagination in plain PHP.
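For reference, the two-query approach described above (read `paginatorInfo.total`, then request exactly that many records) could look roughly like this from a client. This is a sketch under assumptions: `fetch_all_posts` and `fake_run_query` are hypothetical names, `fake_run_query` stands in for the real HTTP call to the GraphQL endpoint and serves made-up data, and the result shape assumes the `data` / `paginatorInfo` fields of a paginated Lighthouse field.

```python
import re
from typing import Any, Callable, Dict, List

COUNT_QUERY = "{ posts(first: 0) { paginatorInfo { total } } }"
ALL_QUERY_TEMPLATE = "{ posts(first: %d) { data { id } } }"


def fetch_all_posts(run_query: Callable[[str], Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Two round trips: read the total, then ask for exactly that many rows."""
    total = run_query(COUNT_QUERY)["posts"]["paginatorInfo"]["total"]
    if total == 0:
        return []
    return run_query(ALL_QUERY_TEMPLATE % total)["posts"]["data"]


def fake_run_query(query: str) -> Dict[str, Any]:
    """Stand-in for the HTTP call to the GraphQL endpoint (hypothetical)."""
    posts = [{"id": str(i)} for i in range(1, 8)]  # seven made-up posts
    if "paginatorInfo" in query:
        return {"posts": {"paginatorInfo": {"total": len(posts)}}}
    first = int(re.search(r"first: (\d+)", query).group(1))
    return {"posts": {"data": posts[:first]}}


all_posts = fetch_all_posts(fake_run_query)
print([post["id"] for post in all_posts])
```

Against a real server the second request may still be rejected or capped if pagination.max_count is set lower than the total, which is one more reason the separate @all field is the cleaner option.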

Get the complete info of a Wikidata item

I'm using the following query to get the info of a specific Wikidata item.
For example, this one gets the info about the movie Titanic
SELECT ?wd ?wdLabel ?ps ?ps_Label ?wdpqLabel ?pq_Label {
  VALUES (?film) {(wd:Q44578)}
  ?film ?p ?statement .
  ?statement ?ps ?ps_ .
  ?wd wikibase:claim ?p.
  ?wd wikibase:statementProperty ?ps.
  OPTIONAL {
    ?statement ?pq ?pq_ .
    ?wdpq wikibase:qualifier ?pq .
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
} ORDER BY ?wd ?statement ?ps_
It works well and I do get the info, but I would also like to get the item IDs (the "Q" numbers) next to the labels.
For example, if the genre is "romance film" I would like to get Q1054574 beside it. And if the actor is Leonardo DiCaprio I would like to get Q38111.
How can I achieve this in this kind of query?
You could add ?ps_ to the SELECT:
SELECT ?wd ?wdLabel ?ps ?ps_Label ?ps_ ?wdpqLabel ?pq_Label
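When the results are fetched programmatically, an item value in `?ps_` comes back as a full entity URI, while dates, strings and quantities come back as plain literals. If only the bare "Q" number is wanted, it can be cut out afterwards. A minimal sketch, where `qid_from_value` is a hypothetical helper name:

```python
import re
from typing import Optional

# Prefix that Wikidata uses for entity URIs.
ENTITY_PREFIX = "http://www.wikidata.org/entity/"


def qid_from_value(value: str) -> Optional[str]:
    """Return the item ID if value is a Wikidata item URI, otherwise None."""
    if not value.startswith(ENTITY_PREFIX):
        return None  # a literal such as a date or a string
    tail = value[len(ENTITY_PREFIX):]
    # Only accept item IDs; property or lexeme IDs are left out of this sketch.
    return tail if re.fullmatch(r"Q[1-9][0-9]*", tail) else None


print(qid_from_value("http://www.wikidata.org/entity/Q38111"))
print(qid_from_value("1997-11-01T00:00:00Z"))
```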

Sanity query to filter array of object using reference value

My database structure looks something like this (omitted less-important details):
Hospital{
Name,
Variants[] (references HospitalVariant)
}
HospitalVariant{
DiseaseVariant (references Disease),
Description,
Rating
}
Disease{
Name,
slug
}
Now I want to fetch all hospitals that treat breast-cancer, and for each of those hospitals I only want the breast-cancer entry from its Variants array.
*[_type="Hospital" && Variants[].DiseaseVariant->slug.current match "breast-cancer"]{
  ...,
  Name,
  Variants[DiseaseVariant->slug.current match "breast-cancer"]
}
The first part of the query works correctly, i.e. it fetches the hospitals that treat breast-cancer, but the filtered Variants array comes back empty.
If I filter on a non-referenced field, the query works correctly, i.e.
*[_type="Hospital" && Variants[].DiseaseVariant->slug.current match "breast-cancer"]{
  ...,
  Name,
  Variants[Rating > 95]
}
This returns the correct results. But when I filter on the referenced object (Disease), it does not work.
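To pin down the result the question is after, here is the same filter written in plain Python over documents whose DiseaseVariant references have already been resolved. It is not GROQ and not a fix for the query; the sample data and the name `hospitals_treating` are made up for illustration, and the slug is compared with plain equality rather than GROQ's `match`.

```python
from typing import Any, Dict, List

Hospital = Dict[str, Any]


def hospitals_treating(hospitals: List[Hospital], slug: str) -> List[Hospital]:
    """Keep hospitals with a matching variant, and only their matching variants."""
    result = []
    for hospital in hospitals:
        matching = [
            variant
            for variant in hospital["Variants"]
            if variant["DiseaseVariant"]["slug"]["current"] == slug
        ]
        if matching:
            # Copy the hospital and swap in the filtered Variants array.
            result.append({**hospital, "Variants": matching})
    return result


# Made-up sample data with the Disease reference already dereferenced.
breast = {"Name": "Breast cancer", "slug": {"current": "breast-cancer"}}
lung = {"Name": "Lung cancer", "slug": {"current": "lung-cancer"}}
sample = [
    {"Name": "A", "Variants": [
        {"DiseaseVariant": breast, "Rating": 97},
        {"DiseaseVariant": lung, "Rating": 90},
    ]},
    {"Name": "B", "Variants": [{"DiseaseVariant": lung, "Rating": 99}]},
]

for hospital in hospitals_treating(sample, "breast-cancer"):
    print(hospital["Name"], [v["Rating"] for v in hospital["Variants"]])
```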

Searching not exists in Neo4j via Cypher

I have some relationships between persons in my graph.
My data (generation script below):
create (s:Person {name: "SUE"})
create (d:Person {name: "DAVID"})
create (j:Person {name: "JACK"})
create (m:Person {name: "MARY"})
create (js:Person {name: "JASON"})
create (b:Person {name: "BOB"})
create (a1:Adress {id: 1})
create (a2:Adress {id: 2})
create (a3:Adress {id: 3})
create (a4:Adress {id: 4})
create (a5:Adress {id: 5})
merge (d)-[:MOTHER]->(s)
merge (j)-[:MOTHER]->(s)
merge (js)-[:MOTHER]->(m)
merge (b)-[:MOTHER]->(m)
merge (b)-[:CURRENT_ADRESS]->(a1)
merge (js)-[:CURRENT_ADRESS]->(a2)
merge (j)-[:CURRENT_ADRESS]->(a3)
merge (s)-[:CURRENT_ADRESS]->(a4)
merge (d)-[:CURRENT_ADRESS]->(a5)
;
I can get the mothers who live with their child:
MATCH (p:Person)-[:CURRENT_ADRESS]->(a:Adress)<-[:CURRENT_ADRESS]-(t), (t)-[:MOTHER]->(p)
return p.name,t.name
p.name t.name
MARY JASON
But I want to get the mothers who do not live with any of their children.
How can I do that in Cypher?
Actually, in your graph everybody lives at a different address, because each address node has a different id.
Let's build an example graph that introduces a sister who lives at the same address:
CREATE
(p:Person)-[:MOTHER]->(m:Person),
(p)-[:FATHER]->(f:Person),
(p)-[:SISTER]->(s:Person),
(p)-[:CURRENT_ADDRESS]->(a:Adress),
(m)-[:CURRENT_ADDRESS]->(b:Adress),
(f)-[:CURRENT_ADDRESS]->(c:Adress),
(s)-[:CURRENT_ADDRESS]->(a)
Now this is very simple: match the family members for which there is no path of two CURRENT_ADDRESS relationships (person, address, family member) between the person and the family member:
MATCH (p:Person)-[:MOTHER|:FATHER|:SISTER]->(familyMember)
WHERE NOT EXISTS((p)-[:CURRENT_ADDRESS*2]-(familyMember))
RETURN familyMember
Try this:
MATCH (p:Person)-[:CURRENT_ADRESS]-(a:Adress), (p)-[:MOTHER|:FATHER]->(t)
WITH p,a,t
MATCH (p), (t) where not (t)-[:CURRENT_ADRESS]-(a)
return p.name,t.name
This should work:
MATCH (p:Person)-[:CURRENT_ADRESS]-(a:Adress), (p)-[:MOTHER|:FATHER]->(t)-[:CURRENT_ADRESS]-(b:Adress)
WHERE a <> b
return p.name, t.name;
By the way, I'd also put the appropriate direction arrow on the CURRENT_ADRESS relationships.
Finally I found it:
match path=(p:Person)-[:MOTHER]->(m:Person)-[:CURRENT_ADRESS]->(a:Adress)
where all(x in nodes(path) where not exists((p)-[:CURRENT_ADRESS]->(a)))
return path
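As a cross-check on what such a query should return for the data in the question, the same rule can be written in plain Python over two dictionaries that mirror the MOTHER and CURRENT_ADRESS relationships. This is only a sketch: the function name is hypothetical, and it assumes that a mother with no address at all (MARY in the script) counts as not living with her children, whereas the path-based Cypher query above only returns mothers that have an address.

```python
from typing import Dict, List

# child -> mother, mirroring (child)-[:MOTHER]->(mother) in the script
MOTHER = {"DAVID": "SUE", "JACK": "SUE", "JASON": "MARY", "BOB": "MARY"}
# person -> address id, mirroring (person)-[:CURRENT_ADRESS]->(address)
CURRENT_ADRESS = {"BOB": 1, "JASON": 2, "JACK": 3, "SUE": 4, "DAVID": 5}


def mothers_not_living_with_any_child(
    mother_of: Dict[str, str], address_of: Dict[str, int]
) -> List[str]:
    """Return the mothers who share an address with none of their children."""
    children: Dict[str, List[str]] = {}
    for child, mother in mother_of.items():
        children.setdefault(mother, []).append(child)

    result = []
    for mother, kids in children.items():
        home = address_of.get(mother)  # None when the mother has no address
        lives_with_child = home is not None and any(
            address_of.get(kid) == home for kid in kids
        )
        if not lives_with_child:
            result.append(mother)
    return sorted(result)


print(mothers_not_living_with_any_child(MOTHER, CURRENT_ADRESS))
```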

Filter a product collection by two categories in Magento

I'm trying to find products that are in two categories.
I've found an example to get products that are in category1 OR category2.
http://www.alphadigital.cl/blog/lang/en-us/magento-filter-by-multiple-categories.html
I need products that are in category1 AND category2.
The example in the blog is:
class ModuleName_Catalog_Model_Resource_Eav_Mysql4_Product_Collection
    extends Mage_Catalog_Model_Resource_Eav_Mysql4_Product_Collection{
    public function addCategoriesFilter($categories){
        $alias = 'cat_index';
        $categoryCondition = $this->getConnection()->quoteInto(
            $alias.'.product_id=e.entity_id AND '.$alias.'.store_id=? AND ',
            $this->getStoreId()
        );
        $categoryCondition.= $alias.'.category_id IN ('.$categories.')';
        $this->getSelect()->joinInner(
            array($alias => $this->getTable('catalog/category_product_index')),
            $categoryCondition,
            array('position'=>'position')
        );
        $this->_categoryIndexJoined = true;
        $this->_joinFields['position'] = array('table'=>$alias, 'field'=>'position' );
        return $this;
    }
}
When I use this filter alone, it performs an OR query across several categories.
When I combine this filter with prepareProductCollection of Mage_Catalog_Model_Layer, it somehow removes the filter's effect.
How can I change the filter to AND and combine it with prepareProductCollection?
Thanks
This code lets you filter by multiple categories while avoiding the performance hit of multiple collection loads:
$iNumberFeaturedItems = 4;
$oCurrentCategory = Mage::registry('current_category');
$oFeaturedCategory = Mage::getModel('catalog/category')->getCollection()
    ->addAttributeToFilter('name','Featured')
    ->getFirstItem();
$aFeaturedCollection = Mage::getResourceModel('catalog/product_collection')
    ->addAttributeToSelect(array('name', 'price', 'small_image', 'url_key'), 'inner')
    ->addStoreFilter()
    ->addCategoryFilter($oFeaturedCategory)
    ->addCategoryIds();
The first step is to get a collection of products for one category (in this case, a Featured category). The next step is to get the IDs of the products; note that this does NOT perform a load (see Mage_Core_Model_Mysql4_Collection_Abstract::getAllIds()):
$aFeaturedProdIds = $aFeaturedCollection->getAllIds();
shuffle($aFeaturedProdIds); //randomize the order of the featured products
Then get the IDs for a second category:
$aCurrentCatProdIds = $oCurrentCategory->getProductCollection()->getAllIds();
And intersect the arrays to find product IDs that exist in both categories:
$aMergedProdIds = array_intersect($aFeaturedProdIds,$aCurrentCatProdIds);
For this particular use case, we loop until we have enough intersecting products, traversing up the category tree until we find a large enough match (but stopping at the root category!):
while(count($aMergedProdIds) < $iNumberFeaturedItems && $oCurrentCategory->getId() != Mage::app()->getStore()->getRootCategoryId()):
    $oCurrentCategory = $oCurrentCategory->getParentCategory();
    $aParentCatProdIds = $oCurrentCategory->getProductCollection()->getAllIds();
    $aMergedProdIds = array_intersect($aFeaturedProdIds,$aParentCatProdIds);
endwhile;
Finally, filter our initial collection by the IDs of the intersecting products, and return:
$aFeaturedItems = $aFeaturedCollection->addIdFilter(array_slice($aMergedProdIds,0,$iNumberFeaturedItems))->getItems();
return $aFeaturedItems;
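Stripped of the Magento API, the logic of this answer is an ordered intersection of ID lists plus a walk up the category tree. The following Python sketch shows just that logic; the names `pick_featured`, `parent_of` and `products_in` and all the sample IDs are made up, and plain dictionaries stand in for the category and product collections.

```python
from typing import Dict, List


def intersect(first: List[int], second: List[int]) -> List[int]:
    """IDs present in both lists, in the order of the first list."""
    second_set = set(second)
    return [item for item in first if item in second_set]


def pick_featured(
    featured_ids: List[int],
    category: str,
    parent_of: Dict[str, str],
    products_in: Dict[str, List[int]],
    wanted: int,
    root: str,
) -> List[int]:
    """Intersect with the current category, widening to parents if too few match."""
    merged = intersect(featured_ids, products_in[category])
    # Climb the tree until there are enough matches, but stop at the root.
    while len(merged) < wanted and category != root:
        category = parent_of[category]
        merged = intersect(featured_ids, products_in[category])
    return merged[:wanted]


# Made-up category tree and product assignments.
parent_of = {"shoes": "clothing", "clothing": "root"}
products_in = {
    "shoes": [1, 2],
    "clothing": [1, 2, 3, 4],
    "root": [1, 2, 3, 4, 5, 6],
}
featured = [2, 4, 6, 8]

print(pick_featured(featured, "shoes", parent_of, products_in, 2, "root"))
```

Working on ID lists keeps the expensive part, loading full product objects, to a single filtered collection at the end, which is the point the answer makes about getAllIds().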
I am also working on this, to no avail. It was possible in Magento 1.3 using the attribute filter with finset on the category_ids column; however, that column was moved from the entity table into the index table, so this no longer works.
There is one possible solution, but it requires an override function, which I found here
But this solution is far from ideal.
I think it would be enough to call addCategoryFilter() twice on your product collection, once for each category. I have not tested it, though, so I might be wrong.

Resources