Multiple aggregates in SPARQL - aggregate-functions

I have a triple store that contains mail archive data. So let's say I have a lot of persons (foaf:Person) that have sent (ex:hasSent) and received (ex:hasReceived) emails (ex:Email).
Example:
SELECT ?person ?email
WHERE {
?email rdf:type ex:Email.
?person rdf:type foaf:Person;
ex:hasSent ?email.
}
The same works for ex:hasReceived, of course. Now I would like to do some statistics and analytics, i.e. determine how many emails an individual has sent and received. Doing this for only one predicate is a simple aggregation:
SELECT ?person (COUNT(?email) AS ?count)
WHERE {
?email rdf:type ex:Email.
?person rdf:type foaf:Person;
ex:hasSent ?email.
}
GROUP BY ?person
However, I need the number of received emails as well, and I would like to do this without having to issue a separate query. So I tried the following:
SELECT ?person (COUNT(?email1) AS ?sent_emails) (COUNT(?email2) AS ?received_emails)
WHERE {
?person rdf:type foaf:Person.
?sent_email rdf:type ex:Email.
?person ex:hasSent ?sent_email.
?received_email rdf:type ex:Email.
?person ex:hasReceived ?received_email.
}
GROUP BY ?person
This did not seem to be right, as the numbers for the emails sent vs. received were exactly the same. I assume this is because my SPARQL statement results in a cross product of all mails a person has ever sent and received, right?
What do I need to do in order to get the statistics right on a per-individual basis?

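COUNT(?email1) isn't counting anything because ?email1 is never bound in the WHERE clause (the pattern uses ?sent_email and ?received_email). Also, as you mention, there is a partial cross product of sent and received emails per person; DISTINCT will help.
Try (COUNT(DISTINCT ?sent_email) AS ?sent_emails) and (COUNT(DISTINCT ?received_email) AS ?received_emails)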
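To see why a plain COUNT on both variables gives identical numbers, here is a minimal Python sketch of what the join does for a single person. The sample data (three sent and two received emails) is made up for illustration, and the code only models the row multiplication; it is not a SPARQL engine.

```python
from itertools import product

# Hypothetical data for one person: 3 sent and 2 received emails.
sent = ["s1", "s2", "s3"]
received = ["r1", "r2"]

# Matching both ex:hasSent and ex:hasReceived in one graph pattern yields
# every (sent, received) combination for that person.
rows = list(product(sent, received))

# A plain COUNT counts the rows in which the variable is bound, so both
# aggregates see all 6 rows.
count_sent = sum(1 for s, _ in rows if s is not None)
count_received = sum(1 for _, r in rows if r is not None)

# COUNT(DISTINCT ...) collapses the duplicates introduced by the join.
distinct_sent = len({s for s, _ in rows})
distinct_received = len({r for _, r in rows})

print(count_sent, count_received)        # 6 6
print(distinct_sent, distinct_received)  # 3 2
```

Note that in this model a person with sent emails but no received ones produces no rows at all, so they would drop out of the result. Wrapping each of the two patterns in OPTIONAL is one way to keep such people.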

Related

insert values to empty field from different field in SPARQL

I've just started using SPARQL and I've encountered a challenge with inserting.
I want to insert values into an empty field from a different field (it's like a hierarchy: I want to insert data from the parent).
Graph is the same for both fields.
I think I searched for examples everywhere and tried all of them. I even got status 200, but no values were inserted. One of the queries I tried:
PREFIX rdf:
PREFIX rdfs:
PREFIX edg:
PREFIX metadata:
PREFIX op:
PREFIX owl:
PREFIX c: <http://abc/>
INSERT
{ GRAPH <http://xyz/> {
?subclass1 ?iBT ?Bterm } }
WHERE
{SELECT ?subclass1 ?iBT ?Bterm
WHERE
{
{?subclass rdfs:subClassOf op:BusinessObject.
?subclass rdfs:label ?label.
?subclass op:bT ?BT.
?BT rdfs:label ?Bterm.
?subclass1 rdfs:subClassOf ?subclass.
?subclass1 rdfs:label ?label1.
OPTIONAL{?subclass1 op:bT ?BT1.}
OPTIONAL{?subclass1 op:iBT ?iBT.
?iBT rdfs:label ?ilabel}
FILTER(?label="something").
}
} }
I want to paste values from ?Bterm to ?iBt.
I added filter to check on one example.
I would appreciate any suggestions with this.
INSERT
{
GRAPH <urn:sparql:tests:insert:informative2> { ?book ?p ?v }
}
WHERE
{
GRAPH <urn:sparql:tests:insert:informative>
{
?book <http://purl.org/dc/elements/1.1/date> ?date .
FILTER ( ?date > "1970-01-01T00:00:00-02:00"^^xsd:dateTime )
?book ?p ?v
}
}
I expected the value of ?Bterm (from the example above) to be inserted into ?iBT. I even got status 200, but no values were inserted.

query wikidata cities names in arabic and english

I need to get city names in Arabic and English from Wikidata.
After that, I also need to get state names in Arabic and English.
Is it possible to download the query results, or copy them to Excel or CSV?
If you want to use the Wikidata Query Service and receive labels in multiple languages at the same time, you have to repeat the rdfs:label statement for each language, like this:
SELECT * WHERE {
?item (wdt:P31/(wdt:P279*)) wd:Q515;
rdfs:label ?cityLabelEN.
FILTER((LANG(?cityLabelEN)) = "en")
?item rdfs:label ?cityLabelAR.
FILTER((LANG(?cityLabelAR)) = "ar")
}
LIMIT 10
If you want to get states instead of cities, change the first object from wd:Q515 to wd:Q7275 (and adapt the variable names for the labels).
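On the CSV part of the question: the Wikidata Query Service web page has a download menu for results, and the SPARQL endpoint can also be asked for CSV directly. The following is a minimal sketch using only the Python standard library. It assumes the public endpoint at https://query.wikidata.org/sparql returns CSV when sent an Accept: text/csv header; the User-Agent string and the output file name are placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Same idea as the query above: one rdfs:label pattern per language.
QUERY = """
SELECT ?item ?cityLabelEN ?cityLabelAR WHERE {
  ?item wdt:P31/wdt:P279* wd:Q515;
        rdfs:label ?cityLabelEN, ?cityLabelAR.
  FILTER(LANG(?cityLabelEN) = "en")
  FILTER(LANG(?cityLabelAR) = "ar")
}
LIMIT 10
"""


def build_request(query: str) -> Request:
    """Build a GET request that asks the endpoint for CSV results."""
    url = WDQS_ENDPOINT + "?" + urlencode({"query": query})
    headers = {
        "Accept": "text/csv",
        # Placeholder User-Agent; replace it with one identifying your tool.
        "User-Agent": "example-city-labels-script/0.1",
    }
    return Request(url, headers=headers)


def download_csv(query: str, path: str) -> None:
    """Run the query and write the CSV response to `path` (needs network)."""
    with urlopen(build_request(query)) as response, open(path, "wb") as out:
        out.write(response.read())


# Usage (not executed here): download_csv(QUERY, "cities.csv")
```

The resulting file can be opened in Excel. Arabic labels come back as UTF-8, so the import may need the encoding set explicitly.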

Include a range of "possible" dates in a SPARQL filter

I'm working on a DBPedia project to locate female singers who would have been active during the 1960s (approx).
Unfortunately, when I try to select singers who were active from 1955 to 1972, I miss out on singers who started before 1955 (the results exclude some singers, for instance Umm Kulthum, who was active from 1925 to 1973).
My code is below; its filter only includes artists whose active years started within this date range. I want a filter that says "give me all singers who were musically active during this date range, including those who started before it and were still active during it". I don't want those who were only active before this date range.
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX dbc: <http://dbpedia.org/resource/Category:>
SELECT distinct ?name ?person ?thumbnail ?birthDate ?active
where {
?person foaf:name ?name .
?person dct:subject ?subject.
?person dbo:birthDate ?birthDate.
OPTIONAL {?person dbo:thumbnail ?thumbnail}
OPTIONAL {?person dbo:activeYearsStartYear ?active}
{ ?person a dbo:MusicalArtist .
filter exists { ?person dct:subject/skos:broader* dbc:Female_singers_by_nationality }
}
filter (?active > '1955-04-18T22:29:33.667Z'^^xsd:dateTime && ?active < '1974-01-01T21:37:37.708Z'^^xsd:dateTime)
}
order by ?active
One solution is to add a second condition to your filter with a boolean OR, so that it also matches singers who started before the period but were still active after it began. Like this:
SELECT distinct ?name ?person ?thumbnail ?birthDate ?activeStart ?activeEnd
where {
?person foaf:name ?name .
?person dct:subject ?subject.
?person dbo:birthDate ?birthDate.
?person dbo:activeYearsStartYear ?activeStart.
?person dbo:activeYearsEndYear ?activeEnd
OPTIONAL {?person dbo:thumbnail ?thumbnail }
{ ?person a dbo:MusicalArtist .
filter exists {
?person dct:subject/skos:broader* dbc:Female_singers_by_nationality
}
}
BIND('1955-04-18T22:29:33.667Z'^^xsd:dateTime as ?startPeriod)
BIND('1974-01-01T21:37:37.708Z'^^xsd:dateTime as ?endPeriod)
filter ( (?activeStart > ?startPeriod && ?activeStart < ?endPeriod)
|| (?activeStart < ?startPeriod && ?activeEnd > ?startPeriod))
}
order by ?activeStart
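The two-branch filter above is a form of interval-overlap test. Here is a small Python sketch that compares it with the more common one-line overlap check. It uses plain integer years instead of xsd:dateTime values, and the sample careers other than Umm Kulthum's are made up; both simplifications are assumptions for illustration only.

```python
def answer_filter(active_start, active_end, period_start, period_end):
    """The two-branch condition from the FILTER above."""
    started_in_period = period_start < active_start < period_end
    started_before_still_active = (
        active_start < period_start and active_end > period_start
    )
    return started_in_period or started_before_still_active


def overlaps(active_start, active_end, period_start, period_end):
    """Generic overlap test: starts before the period ends, ends after it starts."""
    return active_start < period_end and active_end > period_start


PERIOD = (1955, 1974)

careers = {
    "Umm Kulthum": (1925, 1973),        # from the question: should match
    "only before (made up)": (1930, 1950),
    "started inside (made up)": (1960, 1980),
    "only after (made up)": (1980, 1990),
}

for name, (start, end) in careers.items():
    print(name, answer_filter(start, end, *PERIOD), overlaps(start, end, *PERIOD))
```

On these samples the two functions agree. They differ only at the boundary: a career starting exactly at the period start fails both strict comparisons in the two-branch version but passes the generic test.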

Part of a request depending on parameter type (if URI or not)

I'm creating an interface in SPARQL to query DBpedia.
For example, you can search for people who were born in Paris, or people who were born in 1966.
My request is generalized and the value changes according to your choice.
In the example above, variable1 = dbo:birthPlace or variable1 = dbo:birthDate.
SELECT *
WHERE {
?x a dbo:Person .
?x variable1 ?z.
}
I add a line to write the name of the place you want:
SELECT *
WHERE {
?x a dbo:Person .
?x variable1 ?z.
?z rdfs:label variable2.
}
But this only works if ?z is a URI, which is not the case for a date.
Does anyone know a way to make both situations work?
I tried to add an if statement saying:
If ?z is a URI, add the line ?z rdfs:label variable2.
Otherwise check if ?z = variable2
But it seems that if statement works only to create a new parameter, in this example ?type.
BIND (IF(isURI(?z),"URI","Not")AS ?type).
While I would like something like :
BIND (IF(isURI(?z),?z rdfs:label ?nameobject,?nameobject)AS ?nameobject).
Sorry if my question is not asked correctly; I tried to make it as clear as I could.
EDIT: Using OPTIONAL, thanks to Stanislav Kralin
I tried with optional, here is my code:
SELECT distinct *
WHERE {
?x a dbo:Person .
?x rdfs:label ?name .
?x dbp:birthName ?z .
OPTIONAL{ ?z rdfs:label ?nameobject .}
OPTIONAL{BIND(?z as ?nameobject) .}
BIND (concat("http://wikipedia.org/wiki/",replace(?name," ","_")) as ?wikilink) .
}
LIMIT 100
So if ?z is a URI, it gives the rdfs:label; if not (that is, a typed literal or a plain literal with a language tag), it should keep ?z.
It applies the first OPTIONAL but not the second one. However, if I write this
OPTIONAL{BIND("Try" as ?nameobject) .}
it writes the "Try" statement. So I think I am not far from the solution, perhaps I'm not writing correctly the BIND.
Finally, here is the solution! :)
Here is the beginning of my code :
SELECT distinct *
WHERE {
?x a dbo:Activity .
?x rdfs:label ?name .
?x dbp:skills ?z .
}
ORDER BY ?x
LIMIT 100
My problem was that I needed to make 2 different queries according to the data type of my ?z variable.
I tried to do it with IF, but as explained here, in SPARQL IF is an operator and not a statement.
So I tried with OPTIONAL by saying :
OPTIONAL{ ?z rdfs:label ?nameobject .}
OPTIONAL{BIND( ?z as ?nameobject) .}
That means, if rdfs:label of ?z exists, put it in ?nameobject, otherwise, put ?z in ?nameobject.
But that didn't work, probably because of the different types of variables.
Finally my solution is to create 2 columns, to put the data in the same type, and then to put them in the same column:
SELECT distinct *
WHERE {
?x a dbo:Activity .
?x rdfs:label ?name .
?x dbp:skills ?z .
OPTIONAL{ ?z rdfs:label ?nameobjectURI .}
BIND( IF(isURI(?z),"",concat(?z," ")) as ?nameobjectOTH) .
BIND( IF(bound(?nameobjectURI),STR(?nameobjectURI),?nameobjectOTH) as ?nameobject) .
}
ORDER BY ?x
LIMIT 100
And that works! I hope it will help someone else :)
EDIT with COALESCE solution, from Stanislav Kralin
It is possible to simplify the code like this :
SELECT distinct *
WHERE {
?x a dbo:Activity .
?x rdfs:label ?name .
BIND(STR(?name) as ?namestr) .
?x dbp:skills ?z .
OPTIONAL{ ?z rdfs:label ?nameobjectURI .}
BIND (COALESCE(STR(?nameobjectURI),concat(?z," ")) as ?nameobject) .
}
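To make the COALESCE behaviour concrete: it returns the first argument that evaluates without an error, and STR() applied to an unbound variable is an error, so the second argument is used whenever the OPTIONAL found no label. The Python sketch below models just that. The helper names (Unbound, sparql_str, coalesce, name_object) are hypothetical, None stands in for an unbound variable, and the model ignores the type error SPARQL's CONCAT would raise for an IRI argument.

```python
class Unbound(Exception):
    """Stands in for the SPARQL error raised when a variable has no value."""


def sparql_str(value):
    # STR() of an unbound variable is an error in SPARQL; None models "unbound".
    if value is None:
        raise Unbound()
    return str(value)


def coalesce(*expressions):
    """Return the value of the first expression that evaluates without error."""
    for expression in expressions:
        try:
            return expression()
        except Unbound:
            continue
    raise Unbound()


def name_object(z, label_of_z):
    """Model of COALESCE(STR(?nameobjectURI), CONCAT(?z, " "))."""
    return coalesce(
        lambda: sparql_str(label_of_z),   # used when ?z has an rdfs:label
        lambda: sparql_str(z) + " ",      # fallback: ?z itself, as a string
    )


# ?z is a URI with a label: the label wins.
print(name_object("http://dbpedia.org/resource/Cooking", "Cooking"))
# ?z is a literal without a label: fall back to the literal.
print(repr(name_object("cooking", None)))
```

Passing the arguments as zero-argument callables mirrors the fact that COALESCE evaluates its arguments lazily, in order.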

DQL - filter entities based on parent country

I have run into a problem with a DQL query.
To give some context: I have protocols with a many-to-many relation to criterions,
criterions which have a many-to-many relation to details,
details which have a many-to-many relation to organes, and organes which have a many-to-one relation to atelier, which has a many-to-one relation to country.
I need to retrieve the protocols that have NO country attached, or whose country is contained in an array. My problem is that when one parent is null and another is not, the entity still passes the filter. I don't have much background in SQL, so maybe I have misunderstood something.
This is my DQL query:
$queryBuilder->select('p')
->from($this->getClass(), 'p')
->leftJoin('p.criteres', 'c')
->leftJoin('c.details', 'd')
->leftJoin('d.organes', 'o')
->leftJoin('o.atelier', 'a')
->where('a.country IN (' . $this->getUserCountriesFormated() . ') OR a.country IS NULL')
->orderby('p.name', 'ASC');
Thanks in advance for your reply.
Edit: an example of a protocol that is returned because of a null detail (the country ITA isn't in my formatted countries array).
Your data model seems to be a bit crazy (too many NtoN relationships), so make sure it is correct. I assume the method getUserCountriesFormated() returns a comma-separated string of country ids, for example "1,2,3".
Then, instead of
->where('a.country IN (' . $this->getUserCountriesFormated() . ') OR a.country IS NULL')
try to use:
->leftJoin('a.country', 'n')
->where('(IDENTITY(n) IN (' . $this->getUserCountriesFormated() . ') OR IDENTITY(n) IS NULL)')
