I have a Fuseki endpoint and want to use the python rdflib library to add graphs to it.
I connect to the Fuseki server as follows:
import rdflib
from rdflib import Graph, Literal, URIRef
from rdflib.plugins.stores import sparqlstore
query_endpoint = 'http://localhost:3030/dsone/query'
update_endpoint = 'http://localhost:3030/dsone/update'
store = sparqlstore.SPARQLUpdateStore()
store.open((query_endpoint, update_endpoint))
Then I set up an rdflib ConjunctiveGraph as follows:
g = rdflib.ConjunctiveGraph(identifier=URIRef("http://localhost/dsone/g3"))
g.parse(data="""
@base <http://example.com/base> .
@prefix : <#> .
<> a :document1 .
""", format='turtle', publicID = "http://example.com/g3")
g now contains one triple:
for r in g:
    print(r)
Which results in:
(rdflib.term.URIRef('http://example.com/base'), rdflib.term.URIRef('http://www.w3.org/1999/02/22-rdf-syntax-ns#type'), rdflib.term.URIRef('http://example.com/base#document1'))
Now I add the graph to the store with:
store.add_graph(g)
The server shows a 200 return code. When I do the insert directly over REST, it returns 204 - which could be a clue.
Now there is nothing in the store.
for r in store.query("""
select ?g (count(*) as ?count) {graph ?g {?s ?p ?o .}} group by ?g
"""):
    print(r)
Which has no output.
As far as I can tell store.add_graph(g) is doing nothing.
Any ideas?
store.add_graph(g)
just creates the graph itself; it won't take your graph's content and put it in there. So add_graph() really only uses the graph's identifier, not its data.
Try calling add_graph() and then adding the data to the graph.
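For example, something along these lines should work (a sketch reusing the store and identifier from your question; remote_g is just a name I picked): bind a Graph to the SPARQLUpdateStore so additions are sent to Fuseki, then copy the parsed triples into it.
# Sketch: remote_g is backed by the Fuseki store, so add() issues SPARQL updates
remote_g = Graph(store=store, identifier=URIRef("http://localhost/dsone/g3"))
store.add_graph(remote_g)
for triple in g:           # g is the locally parsed graph from the question
    remote_g.add(triple)   # inserts the triple into the named graph on the server
Something like store.addN((s, p, o, remote_g) for s, p, o in g) should also work if you prefer to push everything as quads in one call.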
So I have a graph of nodes -- "Papers" and relationships -- "Citations".
Nodes have properties: "x", a list with 0/1 entries corresponding to whether a word is present in the paper or not, and "y", an integer label (one of the classes 0-6).
I want to project the graph from Neo4j using GraphDataScience.
I've been using this documentation and I indeed managed to project the nodes and relationships of the graph:
Code
from graphdatascience import GraphDataScience
AURA_CONNECTION_URI = "neo4j+s://xxxx.databases.neo4j.io"
AURA_USERNAME = "neo4j"
AURA_PASSWORD = "my_code:)"
# Client instantiation
gds = GraphDataScience(
    AURA_CONNECTION_URI,
    auth=(AURA_USERNAME, AURA_PASSWORD),
    aura_ds=True
)

# Shorthand projection -- works
shorthand_graph, result = gds.graph.project(
    "short-example-graph",
    ["Paper"],
    ["Citation"]
)
When I do print(result) it shows
nodeProjection {'Paper': {'label': 'Paper', 'properties': {}}}
relationshipProjection {'Citation': {'orientation': 'NATURAL', 'aggre...
graphName short-example-graph
nodeCount 2708
relationshipCount 10556
projectMillis 34
Name: 0, dtype: object
However, no properties of the nodes are projected. I then use the extended syntax as described in the documentation:
# Project a graph using the extended syntax
extended_form_graph, result = gds.graph.project(
    "Long-form-example-graph",
    {'Paper': {properties: "x"}},
    "Citation"
)
print(result)
#Errors
I get the error:
NameError: name 'properties' is not defined
I tried various variations of this, with and without quotation marks, but none have worked so far (the documentation is also confusing, because one page always uses quotation marks and another does not).
Also, note that all my properties are integers in the Neo4j DB (in AuraDS), since I previously got an error saying that String properties are not supported.
Some clarification on the correct way of projecting node features (aka properties) would be very useful.
Thank you,
Dina
The keys in the Python dictionaries that you use with the GraphDataScience library should be enclosed in quotation marks. This is different from Cypher syntax, where map keys are not enclosed in quotation marks.
This should work for you.
extended_form_graph, result = gds.graph.project(
    "Long-form-example-graph",
    {'Paper': {"properties": "x"}},
    "Citation"
)
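If you also want "y" projected (assuming it is stored as a numeric property on your Paper nodes, as your question suggests), you can pass a list of property keys in the same place. A sketch along the same lines, not something I have run against your database, and the graph name is just an example:
extended_form_graph, result = gds.graph.project(
    "long-form-example-graph-xy",  # example name
    {"Paper": {"properties": ["x", "y"]}},
    "Citation"
)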
Best wishes,
Nathan
I have the following table in Lua:
local a = {orszag = {"Ausztria", "Albánia", "Azerbajdzsán"}, varos = {"Ankara", "Amszterdam", "Antwerpen"}, fiu = {"Arnold", "Andor", "Albert"}, lany = {"Anna", "Anasztázia", "Amanda"}}
I would like to do the following:
for i in a["orszag"] do etc. (for example, compare all the words in the value to the user input)
But when I do so, I get the following error: attempt to call a table value.
I know this works in Python, for example, but is it possible to do this in Lua as well?
The generic for needs an iterator function, and a plain table is not callable, which is why you get "attempt to call a table value". Use pairs:
for k, v in pairs(a["orszag"]) do
    -- compare v ("Ausztria", "Albánia", ...) to the user input here
end
I use a few different triplestores, and code in R and Scala. I think I'm seeing some differences in:
- whether the triplestores include triples other than the ones I explicitly loaded.
- the point at which these "background" triples might be added.
Are there any general rules for whether supporting vocabularies need to be added, independent of the implementation technology?
Using Jena in R, via rrdf, I usually only see what I loaded:
library(rrdf)
turtle.input.string <-
  "PREFIX prefix: <http://example.com/>
   PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
   prefix:subject rdf:type prefix:object ."
jena.model <-
  fromString.rdf(rdfContent = turtle.input.string, format = "TURTLE")
model.string <- asString.rdf(jena.model, format = "TURTLE")
cat(model.string)
This gives:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix prefix: <http://example.com/> .
prefix:subject a prefix:object .
But sometimes triples from RDF and RDFS seem to appear when I add or remove triples afterwards. That's what "bothers" me the most, but I'm having trouble finding an example right now. If nobody knows what I mean, I'll dig something up later today.
When I use Blazegraph in Scala, via the OpenRDF Sesame library, I think I always get RDF, RDFS, and OWL "for free":
import java.util.Properties
import org.openrdf.query.QueryLanguage
import org.openrdf.rio._
import com.bigdata.journal._
import com.bigdata.rdf.sail._
object InjectionTest {
  val jnl_fn = "sparql_tests.jnl"

  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put(Options.BUFFER_MODE, BufferMode.DiskRW)
    props.put(Options.FILE, jnl_fn)

    val sail = new BigdataSail(props)
    val repo = new BigdataSailRepository(sail)
    repo.initialize()
    val cxn = repo.getConnection()

    val resultStream = new java.io.ByteArrayOutputStream
    val resultWriter = Rio.createWriter(RDFFormat.TURTLE, resultStream)
    val ConstructString = "construct {?s ?p ?o} where {?s ?p ?o}"
    cxn.prepareGraphQuery(QueryLanguage.SPARQL, ConstructString).evaluate(resultWriter)

    val resString = resultStream.toString()
    println(resString)
  }
}
Even without adding any triples, the construct output includes blocks like this:
rdfs:isDefinedBy rdfs:domain rdfs:Resource ;
rdfs:range rdfs:Resource ;
rdfs:subPropertyOf rdfs:isDefinedBy , rdfs:seeAlso .
Are there any general rules for whether supporting vocabularies need to be added, independent of the implementation technology?
That depends on what inferencing scheme your triplestore claims to support. For a pure RDF store (no inferencing), no additional triples should be added at all.
Judging from the fragment you showed, the Blazegraph store you used has at least RDFS inferencing (and possibly partial OWL reasoning as well?) enabled. Note that this is store-specific, not framework-specific, so it's not a Jena vs. Sesame thing: both frameworks support stores that either do or do not do reasoning. Of course, if you use either framework with the "exclude inferred triples" option that they offer, the backing store should respect that option and not include inferred triples in the results.
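If I remember correctly, Blazegraph decides this from the journal/namespace properties, so (an assumption on my part, please double-check against the Blazegraph documentation) you should be able to get a plain, non-inferencing store by choosing the no-axioms configuration when the journal is created, along the lines of:
# Blazegraph journal properties (hedged example; verify the exact key in the docs)
com.bigdata.rdf.store.AbstractTripleStore.axiomsClass=com.bigdata.rdf.axioms.NoAxioms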
I'm trying to get information from a SQLite DB (HDBC.sqlite3) to feed to a web view using the Scotty framework. I'm currently trying to complete a "grab all" or rather select all the information from the table and then return it to display on my web page that is running via Scotty. I've encountered an error and I'm having some trouble figuring out how to fix it.
Here is my error:
Controllers/Home.hs:42:44:
Couldn't match expected type `Data.Text.Lazy.Internal.Text'
with actual type `IO [[(String, SqlValue)]]'
In the expression: getUsersDB
In the first argument of `mconcat', namely
`["<p>/users/all</p><p>", getUsersDB, "</p>"]'
In the second argument of `($)', namely
`mconcat ["<p>/users/all</p><p>", getUsersDB, "</p>"]'
Here is my code:
import Control.Monad
import Web.Scotty (ScottyM, ActionM, get, html, param, text)
import Data.Monoid (mconcat)
import Controllers.CreateDb (createUserDB)
import Database.HDBC
import Database.HDBC.Sqlite3
import Control.Monad.Trans ( MonadIO(liftIO) )
import Data.Convertible
getAllUsers :: ScottyM ()
getAllUsers = get "/users/all" $ do
    html $ mconcat ["<p>/users/all</p><p>", getUsersDB, "</p>"]

getUsersDB = do
    conn <- connectSqlite3 databaseFilePath
    stmt <- prepare conn "SELECT name FROM users VALUES"
    results <- fetchAllRowsAL stmt
    disconnect conn
    return results
run returns the number of rows modified:
https://hackage.haskell.org/package/HDBC-2.4.0.0/docs/Database-HDBC.html#v:run
I am new to Pig and I want to convert a bag of tuples to a map, using a specific value in each tuple as the key. Basically I want to change:
{(id1, value1),(id2, value2), ...} into [id1#value1, id2#value2]
I've been looking around online for a while, but I can't seem to find a solution. I've tried:
bigQMap = FOREACH bigQFields GENERATE TOMAP(queryId, queryStart);
but I end up with a bag of maps (e.g. {[id1#value1], [id2#value2], ...}), which is not what I want. How can I build up a map out of a bag of key-value tuples?
Below is the specific script I'm trying to run, in case it's relevant:
rawlines = LOAD '...' USING PigStorage('`');
bigQFields = FOREACH bigQLogs GENERATE
    GFV(*, 'queryId') as queryId,
    GFV(*, 'queryStart') as queryStart;
bigQMap = ?? how to make a map with queryId as key and queryStart as value ?? ;
TOMAP takes a series of pairs and converts them into a map, so it is meant to be used like this:
-- Schema: A:{foo:chararray, bar:int, bing:chararray, bang:int}
-- Data: (John, 27, Joe, 30)
B = FOREACH A GENERATE TOMAP(foo, bar, bing, bang) AS m ;
-- Schema: B:{m: map[]}
-- Data: (John#27,Joe#30)
So as you can see, the syntax does not support converting a bag to a map. As far as I know there is no way to convert a bag in the format you have to a map in pure Pig. However, you can definitely write a Java UDF to do this.
NOTE: I'm not too experienced with Java, so this UDF can easily be improved on (adding exception handling, deciding what happens if a key is added twice, etc.). However, it does accomplish what you need it to.
package myudfs;

import java.io.IOException;
import org.apache.pig.EvalFunc;
import java.util.Map;
import java.util.HashMap;
import java.util.Iterator;
import org.apache.pig.data.Tuple;
import org.apache.pig.data.DataBag;

public class ConvertToMap extends EvalFunc<Map> {
    public Map exec(Tuple input) throws IOException {
        DataBag values = (DataBag) input.get(0);
        Map<Object, Object> m = new HashMap<Object, Object>();
        for (Iterator<Tuple> it = values.iterator(); it.hasNext();) {
            Tuple t = it.next();
            m.put(t.get(0), t.get(1));
        }
        return m;
    }
}
Once you compile the class into a jar, it can be used like this:
REGISTER myudfs.jar ;
-- A is loading some sample data I made
A = LOAD 'foo.in' AS (foo:{T:(id:chararray, value:chararray)}) ;
B = FOREACH A GENERATE myudfs.ConvertToMap(foo) AS bar;
Contents of foo.in:
{(open,apache),(apache,hadoop)}
{(foo,bar),(bar,foo),(open,what)}
Output from B:
([open#apache,apache#hadoop])
([bar#foo,open#what,foo#bar])
Another approach is to use Python to create the UDF:
myudfs.py
#!/usr/bin/python

@outputSchema("foo:map[]")
def BagtoMap(bag):
    d = {}
    for key, value in bag:
        d[key] = value
    return d
Which is used like this:
Register 'myudfs.py' using jython as myfuncs;
-- A is still just loading some of my test data
A = LOAD 'foo.in' AS (foo:{T:(key:chararray, value:chararray)}) ;
B = FOREACH A GENERATE myfuncs.BagtoMap(foo) ;
And produces the same output as the Java UDF.
BONUS:
Since I don't like maps very much, here is how the functionality of a map can be replicated with just key-value pairs. Since your key-value pairs are in a bag, you'll need to do the map-like operations in a nested FOREACH:
-- A is a schema that contains kv_pairs, a bag in the form {(key, value)}
B = FOREACH A {
    temp = FOREACH kv_pairs GENERATE (key == 'foo' ? value : NULL) ;
    -- Output is like: ({(),(thevalue),(),()})
    -- MAX will pull the maximum value from the filtered bag, which is
    -- value (the chararray) if the key matched. Otherwise it will return NULL.
    GENERATE MAX(temp) AS kv_pairs_filtered ;
}
I ran into the same situation, so I submitted a patch that just got accepted: https://issues.apache.org/jira/browse/PIG-4638
This means that what you wanted is part of core Pig starting with version 0.16.