Measuring distances among classes in RDF/OWL graphs - graph

Maybe someone could give me a hint. Is it possible to measure the distance between 2 concepts/classes that belong to the same ontology?
For example, let's suppose I have an ontology with the
Astronomy class and the Telescope class. There is a link between both, but it is not a direct link. Astronomy has a parent class called Science, and Telescope has a parent class called Optical Instrument which belongs to its parent called Instrumentation, that is related to a class called Empirical Science that finally belongs to a class called Science.
So there is an indirect link between Telescope and Astronomy, and I want to find out the number of steps needed to reach one class starting from the other.
Is there an easy SPARQL query that answers that question? Or are there better ways to do that job? Or is it not possible to find that out using the Semantic Web paradigm?
Any hint would be much appreciated.

SPARQL provides the ability to search for arbitrary length paths in a graph but no mechanism to tell you the length of that path.
So you can do something like:
SELECT * WHERE { ?s ex:property+ ?o }
The syntax is very much like regex, so you can do alternatives, restricted cardinalities, etc.

As I understand it, SPARQL has no recursive construct that can measure the length of an indirect link of arbitrary length. The best you could do is prepare a set of queries distance_1(a, b), distance_2(a, b), ... that each check for a specific distance between two concepts.
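A minimal sketch of that "one query per distance" idea, assuming a single property (for example rdfs:subClassOf) is followed in one direction only. The helper names distance_query and first_distance, and the caller-supplied ask function that runs a query against your store and returns a boolean, are hypothetical and not part of any library.

```python
def distance_query(source, target, prop, n):
    """Build a SPARQL ASK query that holds when a directed chain of
    exactly n `prop` links leads from source to target (all full IRIs)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Nodes along the chain: source, ?x1 ... ?x(n-1), target.
    nodes = [f"<{source}>"] + [f"?x{i}" for i in range(1, n)] + [f"<{target}>"]
    triples = [f"{a} <{prop}> {b} ." for a, b in zip(nodes, nodes[1:])]
    return "ASK { " + " ".join(triples) + " }"


def first_distance(ask, source, target, prop, max_n=10):
    """Try distance 1, 2, ... and return the first n whose query succeeds.

    `ask` is a hypothetical callable: it takes a query string, runs it
    against your triple store, and returns True or False.
    """
    for n in range(1, max_n + 1):
        if ask(distance_query(source, target, prop, n)):
            return n
    return None  # no chain of at most max_n links was found
```

Because the queries are tried in increasing order, the first one that succeeds gives the shortest directed distance. A path that goes up to a common ancestor and back down, as in the Telescope/Astronomy example, would need a different pattern.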
Another alternative is to discover this information with non-SPARQL technology, for example by writing a graph traversal algorithm in Python with RDFLib.
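A rough sketch of such a traversal, written in plain Python so it runs without any library. It assumes the graph is available as an iterable of (subject, predicate, object) triples and that links may be followed in either direction. The class names come from the question; the predicate names are placeholders. With RDFLib you would feed it the triples read from your graph instead of the hard-coded list.

```python
from collections import deque

# Triples taken from the example in the question; predicate names are
# placeholders and are ignored by the search.
ONTOLOGY = [
    ("Astronomy", "subClassOf", "Science"),
    ("Telescope", "subClassOf", "OpticalInstrument"),
    ("OpticalInstrument", "subClassOf", "Instrumentation"),
    ("Instrumentation", "relatedTo", "EmpiricalScience"),
    ("EmpiricalScience", "subClassOf", "Science"),
]


def shortest_distance(triples, start, goal):
    """Breadth-first search: number of links on a shortest path, or None."""
    # Build an undirected adjacency map from the triples.
    neighbours = {}
    for s, _p, o in triples:
        neighbours.setdefault(s, set()).add(o)
        neighbours.setdefault(o, set()).add(s)

    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # the two nodes are not connected
```

For the example data, shortest_distance(ONTOLOGY, "Telescope", "Astronomy") counts the five links Telescope, OpticalInstrument, Instrumentation, EmpiricalScience, Science, Astronomy.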

Since you explicitly mentioned that you are talking about classes and they will be in the same ontology, it is safe to assume that they will be always connected (because ultimately both will be a subclass of "Thing", right?). On the other hand, the path I mentioned in the parentheses (Class1 -> ... -> Thing <- ... <- Class2) is a trivial one, so I assume you want to find... all of the existing paths between two classes, in other words, all of the existing paths between two vertices. Is that true? Or are you looking for the shortest path? Your question is not very clear in that aspect, can you clarify it?
As far as I know, there is no simple SPARQL construct that lists all the paths between two classes or the shortest path. However, some semantic web triple stores come with graph traversal algorithms such as breadth-first search or depth-first search; please refer to:
http://www.franz.com/agraph/support/documentation/current/lisp-reference.html#sna
You may also find the source code of the following project very useful:
RelFinder, Interactive Relationship Discovery in RDF Data, http://www.visualdataweb.org/relfinder.php

Related

Method to recursively fill in the ancestor properties of an item

I don't know whether this question is simple or not (or silly).
There are getItem methods to get an object by its ID (expressid).
I want to get, for example, the ancestor properties and calculate properties (m2, m3, etc.).
I can do it by recursively calling up to the IFC parent elements, again and again.
Is there any other way? Is there a dedicated function?
This also means learning about IFC itself, but I don't know whether I'm reinventing the wheel.
Thanks
P.S.
Could somebody with enough reputation create an ifc.js tag?
IFC.js just reflects the data within the IFC file. That means that the complexity / convolution of the data structures is not due to the library itself, but to the IFC schema. Indeed, getting a bit familiar with the schema is a requirement for performing specific operations.
Just for your information, it's quite uncommon to compute the quantity data of elements (surface, volume, etc.) yourself. What the vast majority of BIM software does is read the explicit data of the IFC file (generally expressed as QSets).
It could be interesting to improve the ifc.js library so that it returns a structured object of all properties for a construction element, given its expressid.
A construction element is a leaf element under ifcRoot in class hierarchy.

Represent HashMap in class diagram

I have a class diagram which has a dictionary (in python terms) as an attribute. This is the basic structure:
serverEntry = { creditCardObj1 : accountBalance1, creditCardobj2 : accountBalance2, ...}
To clarify, creditCardObj1 is an instance of a class CreditCard while accountBalance1 is an integer value and similarly for the other entries in the dictionary.
I read that in Java it is called a HashMap. In any case, I would like to represent it in my UML class diagram. Any tips on how to do that? I am using Visio 2007, so it would be even more helpful if someone could explain it in those terms.
I won't put much effort into this, as another answer has been chosen already. That answer does not cover the closest thing in UML to a HashMap: the qualifier. It is drawn as a rectangle between a class and an association. Inside the rectangle is a name and a type. What it means is that given an instance of that type, the association will yield some number of instances of things on the other end of the association. That number of instances is specified with a multiplicity.
For your purposes, you would put creditCard: CreditCard inside the qualifier rectangle, and Integer on the other end of the association, with a multiplicity of 1.
I believe that this question is more about UML than about the programming language. Thus, please allow me to use the language I feel more comfortable in.
C# calls it Dictionary too, Dictionary<string, T> for example with a string for the HashKey. The UML Class Diagram in Visual Studio (I am using the Community Edition of VS 2015 here) is fortunately close enough to code so that the UML Model basically is the language's meta-model and the diagram is just a different view to the real code.
This comes in handy because real programming problems - like the one you asked for - can be addressed directly in the diagram. (Also: no extra code generation or additional parsing necessary to get or maintain the diagram).
I like the way Visual Studio solved this. They offer two options, one that is focussed more on the technicalities (show as Association)
and one that is focussed more on the domain (show as Collection Association).
I usually use the first one only if I need the diagram to emphasize that the Dictionary class is involved; in every other case the second option is my preferred one.

Graph Traversing algorithms in Semantic web

I am asking about algorithms that would be useful for querying a Semantic Web DB to get all the RDF data related to an original object.
For example, if the original object is the movie "Inception", I want an algorithm that builds queries to get the RDF data for the cast of the movie, the studio, the country, etc., so that I can build a relationship graph.
The closest example is the answer to this question, especially this class. I want similar algorithms, or maybe titles to search for, in order to produce such an algorithm. I am thinking that some modifications of graph traversal algorithms might work, but I'm not sure.
NOTE: My project is in ASP.NET, so it would help to use existing .NET libraries.
You should be able to do a simple breadth-first-search to get all the objects that are a certain distance away from a given node.
You'll need to know something about the schema because some neighboring nodes are more meaningful than others. For example, in Freebase, we have intermediate nodes that link a film to an actor and a role. You need to know to go 2-ply deep to get at the actor and the role because just saying that the film is related to the intermediate nodes is not very interesting.
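A small sketch of that depth-limited breadth-first search, assuming the data has already been fetched as (subject, predicate, object) triples. The function name neighbourhood, the intermediate node "performance_1", and the predicate names are made up for illustration and do not reflect the real Freebase schema.

```python
from collections import deque

# Illustrative data only: "performance_1" stands in for the kind of
# intermediate node that links a film to an actor and a role.
TRIPLES = [
    ("Inception", "starring", "performance_1"),
    ("performance_1", "actor", "Leonardo DiCaprio"),
    ("performance_1", "character", "Cobb"),
    ("Inception", "studio", "Warner Bros."),
]


def neighbourhood(triples, start, max_depth):
    """Return {node: depth} for every node within max_depth hops of start."""
    adjacency = {}
    for s, _p, o in triples:
        adjacency.setdefault(s, set()).add(o)
        adjacency.setdefault(o, set()).add(s)

    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue  # do not expand beyond the requested distance
        for nxt in adjacency.get(node, ()):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return depth
```

With max_depth=1 only the intermediate node and the studio are reached; the actor and the character show up at depth 2, which is the "2-ply" point made above.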
Did you take a look at "property paths"?
Property Paths give a more succinct way to write parts of basic graph
patterns and also extend matching of triple pattern to arbitrary
length paths. Property paths do not invalidate or change any existing
SPARQL query.
Triple stores and SPARQL engines such as OWLIM and AllegroGraph support them.

Trees / Graphs, How to represent Multiple parents and children?

I'm hoping you can help me out with some technical questions on graphs/trees.
I'm trying to display the creation of objects in systems.
It's really a tree structure.
It has some interesting requirements.
a)
One node can have many children. Say 20. Maybe more.
i.e. one library can be used by many objects.
b)
A child node can have many parents. Say up to 20.
i.e. many libraries are used by one procedure or object.
c)
A particular node can appear in more than one place.
i.e. a generic print or logging function is called in many procedures.
Note: This is just an -example- in tech terms I expect you will understand.
It is NOT the issue I need to model. No need to discuss it.
As I've thought about it, I realized that it's not a simple binary tree, or a linked list.
1)
What kind of data structure could I save all the data in?
2)
How could I produce a graph of this in Java?
3)
What is a free, open-source graphing tool that could graph such a tree, such as Neo4j?
Perhaps in formats:
- as a tree, with a root, trunk, branches, and leaves?
- Like the graphs you see now, depicting social networks, with the root node in the center?
4)
Any good websites, or tutorials on this subject?
Thanks a lot!
Check out prefuse. It's old but it works. You'll have to invest a bit of time to learn how to use it, though. Once you get there, it's just a matter of creating a prefuse.data.Graph object, filling in your nodes and their neighbors, and then creating the visualization.
If you're open to other solutions, check out d3.js, which draws graphs with JavaScript on an SVG element in your browser.
If this is really about objects, then maybe UML can help. It's designed to generate graphs of object relationships. There are tons of free UML tools out there. I'd download one and see if you can shoehorn your application into it.
JGraphT can represent your graph structure and can use JGraph for visualisation.
For an example visualization, look at this.
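On the data structure part of the question (1), a structure where a node has many children and many parents is a directed graph rather than a tree. A minimal sketch in Python, assuming nodes are identified by hashable names; the class name DependencyGraph and its methods are illustrative and not taken from any of the libraries mentioned above.

```python
from collections import defaultdict


class DependencyGraph:
    """Directed graph where each node may have many parents and children."""

    def __init__(self):
        self.children = defaultdict(set)  # parent -> set of children
        self.parents = defaultdict(set)   # child  -> set of parents

    def add_edge(self, parent, child):
        # Each node is stored once, however many places it is used in.
        self.children[parent].add(child)
        self.parents[child].add(parent)

    def descendants(self, node):
        """Every node reachable by following child links from `node`."""
        found = set()
        stack = [node]
        while stack:
            current = stack.pop()
            for child in self.children.get(current, ()):
                if child not in found:
                    found.add(child)
                    stack.append(child)
        return found
```

A shared logging function is then a single node with several parents, so requirement (c) needs no duplication in storage; only the drawing decides whether it appears once or several times.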

How to serialize a graph?

This is an interview question: how do you serialize a graph? I saw this answer, but I am not sure it is enough.
It looks like a very confusing "open question", and candidates are probably expected to ask more questions about the requirements: what the nodes and edges are, how they are serialized themselves, whether the graph is weighted, directed, etc., and how many nodes/edges are in the graph. What about the infrastructure? Is it a plain file system, or should/can we use a database?
So, how would you answer this question?
I think the answer you provided is quite reasonable. IMO, you basically need to know the application background; I would ask at least:
is it directed or not?
what are the properties associated with the vertex, edge and graph itself?
is the graph sparse? (If so, we'd better not use an adjacency matrix.)
The simplest way is to store it as an edge list.
However, different applications have their own classical ways of doing it.
For example, if you are doing circuit simulation, the graph is sparse and the resulting graph/matrix can be stored in compressed-column form. If you are solving a (min-cost) max-flow problem, there is already a DIMACS format that public solvers can read and write. A structured format is also a good choice if you want something human-readable: XML can provide self-validation (GraphML already exists as a standard). By the way, the dot format is quite self-contained.
Meh. Whatever you store it in, it's basically:
Output each vertex in the graph. If you don't have all the vertices first, it's a PITA to rebuild the graph when you're reading it back in.
Now you can store edges between vertices. Hopefully your vertices have some form of ID, to uniquely identify them. The version of this I've seen is "store a (graph|tree) in a database". So, read in the nodes, store in a hashtable or similar for O(1) amortized lookup. Then, foreach edge, lookup ID-source and ID-dest, and link.
Voila, you've deserialized it. If it's not a DB, the same idea generally holds - serialize nodes first, then edges.
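The "nodes first, then edges" idea can be sketched as follows, assuming JSON as the storage format, a directed graph, and vertices that carry a unique ID plus a small dictionary of data. The function names and the in-memory layout are choices made for this example only.

```python
import json


def serialize(nodes, edges):
    """nodes: {vertex_id: data dict}; edges: iterable of (source_id, target_id)."""
    return json.dumps({
        # All vertices are written before any edge.
        "nodes": [{"id": vid, "data": data} for vid, data in nodes.items()],
        "edges": [[src, dst] for src, dst in edges],
    })


def deserialize(text):
    """Rebuild the graph as {vertex_id: {"data": ..., "out": [vertex, ...]}}."""
    doc = json.loads(text)
    # Pass 1: create every vertex and index it by ID in a hash table.
    table = {n["id"]: {"data": n["data"], "out": []} for n in doc["nodes"]}
    # Pass 2: look up both endpoints of each edge and link them.
    for src, dst in doc["edges"]:
        table[src]["out"].append(table[dst])
    return table
```

Because every vertex exists before any edge is processed, cycles need no special handling: an edge back to an earlier vertex is just another lookup in the table.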

Resources