How to filter non-resolvable URIs in a SPARQL query?

Is it possible to filter out results that contain a non-resolvable URI within the SPARQL query itself?
An example: I'm running the following query (endpoint: http://linkeddata.systems:8890/sparql):
PREFIX RO: <http://www.obofoundry.org/ro/ro.owl#>
PREFIX SIO: <http://semanticscience.org/resource/>
PREFIX EDAM: <http://edamontology.org/>
PREFIX PHIO: <http://linkeddata.systems/ontologies/SemanticPHIBase#>
PREFIX PUBMED: <http://linkedlifedata.com/resource/pubmed/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?disn_1 ?label ?rel ?valor
WHERE {
  ?disn_1 ?rel ?valor .
  ?disn_1 rdfs:label ?label
  FILTER(?disn_1 = <http://linkeddata.systems/SemanticPHIBase/Resource/host/HOST_00561>)
}
In the results, as you can see, the ?valor variable is bound to a non-resolvable URI (the text /hostncbitaxid/). I would like to know if there is some specific FILTER that can be added to the SPARQL query to remove those results with non-resolvable URIs.
I'm having problems with the API that I'm using to process these results in C#: it throws an exception on the non-resolvable URIs, so I would like to filter them out in the SPARQL query (if possible).

How do you know that it's not resolvable? RDF doesn't have a concept of a "relative URI"; all URIs are resolved against some base (which base may be an implementation detail in some cases), so you end up with absolute URIs. In the HTML results from that endpoint I get http://linkeddata.systems:8890/hostncbitaxid/, and that could easily be resolvable.
That said, if you are ending up with results that include non-absolute URIs and you want to filter those out, you can use a heuristic: for instance, keep only URIs whose string form begins with http. Here's a query that returns two values for ?uri:
prefix : <urn:ex:>
select * where {
  values ?uri { <http://www.example.org/> </foobar> }
}
-----------------------------
| uri |
=============================
| <http://www.example.org/> |
| <file:///foobar> |
-----------------------------
(Notice that the relative URI /foobar got resolved as a file:// URI.) You can keep only http URIs with a filter:
prefix : <urn:ex:>
select * where {
  values ?uri { <http://www.example.org/> </foobar> }
  filter strstarts(str(?uri), "http")
}
-----------------------------
| uri |
=============================
| <http://www.example.org/> |
-----------------------------

The query returns (SPARQL results in JSON format):
"valor": { "type": "uri", "value": "/hostncbitaxid/" }}
This is bad data: in RDF the value must be an absolute URI, so presumably the dataset was published incorrectly. You can filter it out in the query as Joshua Taylor's answer shows.
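For reference, here is a sketch of that heuristic applied to the original query, assuming the bad values surface as IRIs that do not start with http (as in the JSON result above):
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?disn_1 ?label ?rel ?valor
WHERE {
  ?disn_1 ?rel ?valor .
  ?disn_1 rdfs:label ?label
  FILTER(?disn_1 = <http://linkeddata.systems/SemanticPHIBase/Resource/host/HOST_00561>)
  # drop bindings whose object is an IRI that does not start with "http"
  FILTER(!isIRI(?valor) || STRSTARTS(STR(?valor), "http"))
}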

Related

Karate: Using data-driven embedded template approach for API testing

I want to write data-driven tests, passing dynamic values read from an external file (CSV).
I'm able to pass dynamic values from the CSV for simple strings (account number and affiliate id below). But, using embedded expressions, how can I pass dynamic values from the CSV file for the "DealerReportFormats" JSON array below?
Any help is highly appreciated!
Scenario Outline: Dealer dynamic requests
Given path '/dealer-reports/retrieval'
And request read('../DealerTemplate.json')
When method POST
Then status 200
Examples:
| read('../DealerData.csv') |
DealerTemplate.json is below
{
  "DealerId": "FIXED",
  "DealerName": "FIXED",
  "DealerType": "FIXED",
  "DealerCredentials": {
    "accountNumber": "#(DealerCredentials_AccountNumber)",
    "affiliateId": "#(DealerCredentials_AffiliateId)"
  },
  "DealerReportFormats": [
    {
      "name": "SalesReport",
      "format": "xml"
    },
    {
      "name": "CustomerReport",
      "format": "txt"
    }
  ]
}
DealerData.csv:
DealerCredentials_AccountNumber,DealerCredentials_AffiliateId
testaccount1,123
testaccount2,12345
testaccount3,123456
CSV is only for "flat" structures, so trying to mix that with JSON is too ambitious in my honest opinion. Please look for another framework if needed :)
That said, I see two options:
a) use proper quoting and escaping in the CSV
b) refer to JSON files
Here is an example:
Scenario Outline:
* json foo = foo
* print foo
Examples:
| read('test.csv') |
And test.csv is:
foo,bar
"{ a: 'a1', b: 'b1' }",test1
"{ a: 'a2', b: 'b2' }",test2
I leave it as an exercise to you if you want to escape double-quotes. It is possible.
Option (b) is to refer to stand-alone JSON files and read them:
foo,bar
j1.json,test1
j2.json,test2
And you can do * def foo = read(foo) in your feature.
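A minimal sketch of option (b), assuming the CSV above is saved as files.csv (a name chosen here just for illustration) next to j1.json and j2.json:
Scenario Outline:
* def foo = read(foo)
* print foo
Examples:
| read('files.csv') |
Here each row's foo column is a file name, so read(foo) loads j1.json or j2.json as JSON before it is used.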

Kibana - find http-URLs

I have a field containing URLs and want to filter all URLs starting with "http://".
I'm unable to figure out how to do that.
I tried as a filter:
scan.domain.url : http\://*
scan.domain.url : "http\://*"
scan.domain.url : /^http\:\/\//
Then I also tried Query DSL
{
  "regexp": {
    "scan.domain.url": "^http://"
  }
}
I always get empty results.
In Elasticsearch, the regex anchors ^ and $ are not supported.
You also need to search on the whole string (the keyword sub-field): a text field is broken into tokens, so the full URL is not available for matching.
GET employer/_search
{
  "query": {
    "regexp": {
      "scan.domain.url.keyword": "http://.*"
    }
  }
}

Nginx rewrite by map with parameters

I would like to rewrite with map as
/oldpage?f=regist -> /signup
/oldpage?f=regist&a=1 -> /signup?a=1
/oldpage?f=confirm -> /signup?t=confirm
/oldpage?f=confirm&a=1 -> /signup?t=confirm&a=1
but the redirects I get from nginx (v1.12.2) are
/oldpage?f=regist -> /signup?f=regist
/oldpage?f=regist&a=1 -> Not Found
/oldpage?f=confirm -> /signup?t=confirm?f=confirm
/oldpage?f=confirm&a=1 -> Not Found
I set nginx.conf as:
map $request_uri $rewrite_uri {
    include conf.d/redirect.map;
}
server {
    ...
    if ($rewrite_uri) {
        rewrite ^ $rewrite_uri redirect;
    }
}
and redirect.map is
/oldpage?f=regist /signup;
/oldpage?f=confirm /signup?t=confirm;
It would be really appreciated if you could give me some advice.
If the a=1 parameter represents any other parameters, and you do not wish to add those combinations to the map file, you should change the syntax of your map file to use regular expressions.
The regular expressions in the map block can create named captures which can be used later in the configuration. In the example below, the $prefix and $suffix variables are named captures from the map block.
The example below has some caveats - because the $prefix and $suffix values may be empty, the generated URIs may contain a trailing ? or & - which should not affect the overall semantics.
All of the regular expressions and the mapped values have a common pattern to capture optional parameters and append them to the resulting value.
map $request_uri $new {
    default 0;
    ~*^/oldpage\?(?<prefix>.*&)?f=regist(&(?<suffix>.*))?$    /signup?;
    ~*^/oldpage\?(?<prefix>.*&)?f=confirm(&(?<suffix>.*))?$   /signup?t=confirm&;
}
server {
    ...
    if ($new) {
        return 301 $new$prefix$suffix;
    }
}
See this document for more.

Meteor: Iron.Router Catch All (Will match the rest of the URI)

Is it possible to define an Iron.Router route with a parameter that will match the rest of the URI?
For example
Router.route('results', {
  path: '/test/:domain'
});
This will match on entries like
/test/hello
/test/hello.com
What I really need is to also match entries such as
/test/hello.com/about
/test/hello.com/about?param=3
Thoughts?
Figured out how!
In this case the path will now be
Router.route('results', {
  path: '/test/(.*)'
});
To access the trailing part of the URI, read the parameter at index 0:
this.params[0]
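For instance, a sketch of a route handler that uses the captured value (the console.log and the 'results' template name are only for illustration):
Router.route('results', {
  path: '/test/(.*)',
  action: function () {
    // this.params[0] holds everything after /test/, e.g. "hello.com/about"
    console.log(this.params[0]);
    this.render('results');
  }
});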

HTTP Error 414 with query to DBPedia endpoint using SPARQLwrapper

I made a function that executes a SPARQL query on the DBpedia SPARQL endpoint. The function takes an array of 15 elements; each time it substitutes one element from the array into the query and executes it to get a result. The problem is that it processes the first 9 elements and then this error is raised:
results = sparql.query().convert()
File "build/bdist.linux-i686/egg/SPARQLWrapper/Wrapper.py", line 390, in query
return QueryResult(self._query())
File "build/bdist.linux-i686/egg/SPARQLWrapper/Wrapper.py", line 369, in _query
raise e
HTTPError: HTTP Error 414: Request-URI Too Large
My query is as follows:
sparql = SPARQLWrapper('http://mlode.nlp2rdf.org/sparql')
querystring="""
PREFIX dc:<http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX olia-ar: <http://purl.org/olia/arabic_khoja.owl#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX lexvo: <http://lexvo.org/id/iso639-3/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX gold: <http://purl.org/linguistics/gold/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX qvoc: <http://www.nlp2rdf.org/quranvocab#>
SELECT ?verseTextAr ?tafseer
WHERE
{
?verse a qvoc:Verse;
qvoc:chapterIndex 26;
qvoc:verseIndex WORD;
skos:prefLabel ?verseTextAr;
qvoc:descByJalalayn ?tafseer.
}
"""
The 414 error means that SPARQLWrapper is issuing an HTTP GET for the query, and the query is large enough that the resulting request URI is rejected by the server.
You need to get SPARQLWrapper to POST the query instead; the documentation states that this is possible, and the setMethod() method is the way to configure it.
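A minimal sketch, assuming a SPARQLWrapper version that exports the POST constant; verse_index here stands in for whatever array element replaces WORD in the query string:
from SPARQLWrapper import SPARQLWrapper, JSON, POST

sparql = SPARQLWrapper('http://mlode.nlp2rdf.org/sparql')
sparql.setQuery(querystring.replace("WORD", str(verse_index)))
sparql.setMethod(POST)          # send the query in the request body instead of the URI
sparql.setReturnFormat(JSON)
results = sparql.query().convert()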
