Why does Grakn use noun semantics on relations instead of verbs?

I was going over some of the docs here.
I was curious why Grakn opts to indicate relations using noun semantics rather than verb semantics. In most of the other graph work and research I’ve covered, it usually makes sense to think of two entities (nouns) as linked by a verb, e.g. person worked at company. Indeed, for a few entities that I am dealing with, it is a bit difficult to reason about the relationships as nouns, for example artist remixed track.
I’m inclined to use verbs as relations but I wonder if that is not how I should be thinking about it in a Grakn setup. Are there eventual difficulties I can expect to face if I decide to use verb semantics?

Typically, graph databases use directed edges to represent binary relations. Under those circumstances it makes sense to use a verb to describe a relation, since verbs often indicate a directional action between a subject and an object.
Grakn is a knowledge graph, which works differently. This is because relations in Grakn act as hyperedges. This means that there can be more than two participants (called roleplayers) in a relation. This is great for flexible modelling, but it can break the verb naming convention.
To work from your example, rather than artist remixed track, we could (very conveniently in this case) use the noun remix as the relation. Taking a guess at the domain model, an artist remixes a track, and as a result they have created a new track. That’s a great opportunity for a ternary (3-way) relation in Grakn. The model for that would be as follows:
define
  remix sub relation,
    relates original-track,
    relates remixed-track,
    relates remixing-artist;
  track sub entity,
    plays original-track,
    plays remixed-track;
  artist sub entity,
    plays remixing-artist;
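Once the schema above has been defined in Grakn (together with a name attribute owned by track and artist, which the insert relies on but the schema above omits), we can add a remix instance connecting two new tracks and a new artist like so: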
insert
  $o isa track, has name "Brimful of Asha";
  $rt isa track, has name "Brimful of Asha (Norman Cook Remix)";
  $a isa artist, has name "Norman Cook";
  $r (original-track: $o, remixed-track: $rt, remixing-artist: $a) isa remix;
Using a noun for the relation proves useful here, because the relation doesn’t connect any of its 3 possible roleplayers in a binary way. Instead we have named the concept that sits in between the two tracks and the artist.
In this way we see that the relation nicely describes the (undirected) link between any pair of the roles:
original-track <-remix-> remixed-track
original-track <-remix-> remixing-artist
remixed-track <-remix-> remixing-artist
We can see that using remixed in place of remix wouldn't work so well: it would try to add direction to these links where there is none.
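To make the undirected-pairs point concrete, here is a minimal Python sketch. It is not Grakn's API: the `Relation` class and the `role_pairs` helper are hypothetical names used only for illustration. It models the remix relation as a hyperedge and lists every pair of roles it links, without imposing a direction on any of them.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Relation:
    """A hyperedge: a named relation with any number of roleplayers (illustrative only)."""
    name: str
    roleplayers: dict  # role name -> player


def role_pairs(relation):
    """Return every unordered pair of roles that the relation links."""
    return list(combinations(sorted(relation.roleplayers), 2))


remix = Relation(
    name="remix",
    roleplayers={
        "original-track": "Brimful of Asha",
        "remixed-track": "Brimful of Asha (Norman Cook Remix)",
        "remixing-artist": "Norman Cook",
    },
)

for left, right in role_pairs(remix):
    # The relation name sits between the roles; neither side is subject or object.
    print(f"{left} <-{remix.name}-> {right}")
```

Adding a fourth role to `roleplayers` simply yields more pairs, which is why a directional verb name stops fitting once a relation grows beyond two roles.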
Grakn's data model can be extended on-the-fly. Therefore even if you start with a binary relation, should you later add more roles, making it ternary or N-ary, verb naming will no longer make sense.
It’s not always easy to name relations with nouns.
My suggestions are:
first try using a past or present participle of a verb (used as an adjective) with a noun. For example the role remixing-artist uses this.
resort to verbs when using nouns is really awkward, and/or if you’re dealing with a relation that you expect always to be binary.

If you must use verbs as relation names, then use the gerund form (which, for all practical purposes, acts like a noun), e.g.
  faceting sub relation,
    relates facet-assignment,
    relates assigned-facet;
  listing sub relation,
    relates list-assignment,
    relates assigned-list;

Related

UML class diagram: Association or Composition?

I'm a little confused on what the relationship would be for the scenario below. When examples of composition are used they always tend to use simple ones such as rooms and a building.
This scenario is that doctor patient visits are recorded. Would it be an association, composition or a mix of both? I've included a picture below of the two different relationships I am stuck between. I am thinking composition because the visit belongs to each party?
Derived association
In general my rule of thumb is that when in doubt, always use association rather than composition/aggregation. My reasons for this are:
(1) In Object-oriented analysis and design for information systems, Wazlawick notes that the real advantage of composition and aggregation is that the attributes of the parts are often used to derive attributes of the whole. As an example he mentions that the total value of an order (whole) is derived from the value of each of its items (parts). However, to him this is a design concern rather than a conceptual modelling concern. From a conceptual modelling perspective, he believes that modellers often apply aggregation and composition inappropriately (that is, where whole-part relations are not present) and that their use seldom has real benefit. Hence he suggests avoiding or even abolishing their use.
(2) UML aims to provide a semi-formalization of part-whole relations through composition/aggregation. However, formalization of part-whole relations is a non-trivial task, to which the UML specification does not do justice. Indeed, a number of researchers have pointed out various aspects of aggregation and composition in which the UML specification is underspecified. All have proposed means of addressing these shortcomings, but to date these changes have not been incorporated into the UML specification. See for instance Introduction to part-whole relations.
When in doubt about which kind of association to use, use the more generic one. In your case especially, there is no real "consists of" relation. Further, in your EX2 you would have an instance of Visit whose existence is bound both to a Doctor instance and to a Patient instance. This is a problem when applying the composition rules, as it also implicitly introduces an existence relation between Doctor and Patient. Thus, this should not be done.
I guess the concept you are looking for is an association class. This is a class whose instances give the association between a Doctor instance and a Patient instance some further information.

NoSQL: new kind of relationships using Arrays?

I had to manage relationships between documents over a NoSQL engine (Couchbase), and I figured out this way to solve my problem. Here is my solution and the steps that let me realize it:
https://forums.couchbase.com/t/document-relationships-using-arrays-and-views-passing-though-graph-theory/3281
My questions are:
What do you think about this solution?
Have you ever used something like this? How is it working?
Are there any better ideas? Criticism of this solution would be helpful.
Thank you.
Interesting post, Matteo. After reading it I realized that you can possibly improve on a few aspects:
Consider 1-1 node relationships. In your post you focus on N-N node relationships (sure, one can argue that 1-1 is a subset of N-N). However, I think there is potential for a different (optimized) implementation for 1-1 relationships. For 1-1 I use the node key value as a field in my JSON doc (e.g. user: {name:string, dob:date, addressID:string}).
Node key design to address relationships: you can encode relationship information in the key value, e.g. key: "user#11", "user#11#address#123", "address#123#user#11", etc.
Data integrity aspects: take into consideration the lack of complex transactions, i.e. you can't mutate several documents in one transaction. The design should compensate for that.
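The key-design suggestion above can be sketched in a few lines of plain Python. This is only an illustration under assumptions: it uses the "#" separator from the examples above, makes no Couchbase SDK calls, and the helper names `make_key`, `relationship_keys` and `parse_key` are hypothetical.

```python
def make_key(doc_type, doc_id):
    """Build a document key such as 'user#11'."""
    return f"{doc_type}#{doc_id}"


def relationship_keys(left_key, right_key):
    """Build both lookup keys for a relationship, one per direction of lookup."""
    return [f"{left_key}#{right_key}", f"{right_key}#{left_key}"]


def parse_key(key):
    """Split a key back into (type, id) pairs; ids come back as strings."""
    parts = key.split("#")
    return list(zip(parts[0::2], parts[1::2]))


user_key = make_key("user", 11)
address_key = make_key("address", 123)

print(relationship_keys(user_key, address_key))
print(parse_key("user#11#address#123"))
```

Because the two relationship keys are separate documents, writing them is exactly the kind of multi-document mutation that the data integrity point warns about, so the application has to tolerate or repair a state where only one of them exists.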
I have used a similar solution in my model design for Couchbase in the past. It has been in production for several years already and it's performing just fine (load is about 250 tps). I was trying to avoid creating complex node relations as much as possible, and ended up having very few 1-1 and 1-N types.
I tested out this solution and it works well. I like the flexibility of the 'always possible' N-N relationships, because you can simply add the relationship document when you need it without changing the application logic. There is a drawback: you need to implement your own application logic constraints to avoid relationship abuse.
I noticed that arrays don't offer a great advantage over JSON objects, and sometimes it may be useful to store other relationship data, for example the weight (or cost) of the relationship. So I suggest you use a relationship document that has its own type:
{
  "type": "relationship",
  "documents": ["key1", "key2"],
  "all-the-data-you-need": { ... }
}
As for performance, there isn't much difference between using objects and using arrays.
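As a rough illustration of the relationship-document idea, here is a small Python sketch. It assumes an in-memory dict standing in for the bucket, and the names `make_relationship`, `related_keys` and the `data` field (a stand-in for "all-the-data-you-need") are hypothetical. A real deployment would presumably answer this lookup with a view or index rather than a full scan.

```python
def make_relationship(key1, key2, data=None):
    """Build a relationship document linking two document keys."""
    return {
        "type": "relationship",
        "documents": [key1, key2],
        "data": data or {},  # hypothetical field for extra data such as a weight
    }


def related_keys(store, key):
    """Scan the store for relationship documents mentioning `key` and return the other keys."""
    related = []
    for doc in store.values():
        if doc.get("type") == "relationship" and key in doc["documents"]:
            related.extend(k for k in doc["documents"] if k != key)
    return related


# Toy store: keys and contents are made up for the example.
store = {
    "user#11": {"type": "user", "name": "example user"},
    "address#123": {"type": "address", "city": "example city"},
    "rel#1": make_relationship("user#11", "address#123", {"weight": 2}),
}

print(related_keys(store, "user#11"))
```

Nothing in the store prevents a relationship document from pointing at a key that does not exist, which is the kind of constraint the application logic has to enforce itself.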
Hope this helps someone! ;)

freebase superclass subclass of a concept topic

Is there a way to get the recursive superclass concepts of a concept from Freebase? For example, I would call the topic "/games/game_publisher" a concept, and I would like to know if it has any superclasses (e.g., /organization/organization would make sense).
Many thanks!
Freebase Types (the equivalent of your "concept") don't have an inheritance structure. They do however have "included types". One key difference is that an included type only gets added to a topic when its main type is added, by convention of the web client (or other client), and after that it can be removed or re-added independently. For example, Deceased Person has Person as an included type and it's unlikely anyone would ever remove the latter. Author also has Person as an included type, because that's the case for the overwhelming majority of authors, but for so-called "corporate authors" one would remove Person and add Organization.
So, included types do carry some semantic information, but it's not as strong as a super/sub-class relationship.

Graph Traversing algorithms in Semantic web

I am asking about algorithms that would be useful for querying a Semantic Web database to get all the RDF data related to an original object.
For example, if the original object is the movie "Inception", I want an algorithm to build queries that get the RDF data for the cast of the movie, the studio, the country, etc., so that I can build a relationship graph.
The closest example is the answer to this question, especially this class. I want similar algorithms, or maybe terms to search for in order to produce such an algorithm. I am thinking that some modifications of graph traversal algorithms could work, but I'm not sure.
NOTE: My project is in ASP.NET, so it would help to use existing .NET libraries.
You should be able to do a simple breadth-first-search to get all the objects that are a certain distance away from a given node.
You'll need to know something about the schema because some neighboring nodes are more meaningful than others. For example, in Freebase, we have intermediate nodes that link a film to an actor and a role. You need to know to go 2-ply deep to get at the actor and the role because just saying that the film is related to the intermediate nodes is not very interesting.
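A depth-limited breadth-first search along these lines might look like the following Python sketch. It assumes the graph has already been loaded into an adjacency dict; the function name `neighbours_within` and the toy node names are hypothetical, and fetching the neighbours from an actual triple store is left out.

```python
from collections import deque


def neighbours_within(graph, start, max_depth):
    """Breadth-first search: map each node within max_depth hops of start to its distance."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_depth:
            continue  # do not expand past the depth limit
        for neighbour in graph.get(node, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


# Toy graph: a film links to an intermediate "performance" node,
# which in turn links to the actor and the role.
graph = {
    "inception": ["performance-1", "studio-1"],
    "performance-1": ["actor-1", "role-1"],
}

print(neighbours_within(graph, "inception", 1))
print(neighbours_within(graph, "inception", 2))
```

With a depth of 1 the search stops at the intermediate node, so the actor and the role only show up at depth 2, which is the schema knowledge the answer says you need.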
Did you take a look at "property paths"?
"Property paths give a more succinct way to write parts of basic graph patterns and also extend matching of triple pattern to arbitrary length paths. Property paths do not invalidate or change any existing SPARQL query."
Triple stores and SPARQL engines such as OWLIM and AllegroGraph support them.

Can Drupal Taxonomy module be used to categorize court records and briefs?

I'm currently working on a project that involves moving a database of documents for court records and briefs over to a Drupal environment. One of the problems that we are faced with is how to index these documents.
In our court district, records and briefs all have a docket number which is assigned to a case. The interesting thing is that when multiple cases merge, the docket numbers associated with the cases become synonymous:
Case 1, documents have Docket No. A
Case 2, documents have Docket No. B
If Case 1 and Case 2 merge, then Docket No. A = Docket No. B
My first inclination is to create a Docket Vocabulary and have the terms of this Taxonomy be the docket numbers. I am hoping to take advantage of the fact that terms can be synonymous.
I understand that there are several functions in the Taxonomy module that I may be able to take advantage of, including:
taxonomy_get_synonyms
taxonomy_get_related
But I'm having problems convincing my colleagues that this is the way to go, and frankly I'm not certain it's the right solution either. (Though one advantage I think is likely is that using Taxonomy in this way means we could take advantage of other taxonomy manipulating modules down the line).
If anyone has had a similar issue and can offer some guidance as to how to move forward, I would greatly appreciate it.
Thanks!
D
I've asked a related question (which I would also need to answer in order to move forward with this solution):
Can Drupal terms in different Taxonomies be synonymous?
This is a case for CCK. The Integer field type, most likely. If the dockets merge, edit the node, change the number. Revision history is tracked.
If you want to get fancy with the Docket Merging procedure, you will want to learn
How to create a custom Action.
How to use the Views Bulk Operations module.
Possibly, how to programmatically invoke the Bulk Operation via Rules.
It is a complex, but not difficult task, meaning there's a bunch to learn, but after that it shouldn't take long.
Trying to use the taxonomy module (and its related counterparts) to force this behavior is just not a good idea.
Taxonomies are intended to bring some form of order and meaning to the content.
A vocabulary of thousands of terms that consists of numbers is just not a taxonomy.
When I took a closer look at the taxonomy module code, I decided that although I could probably force the behavior I'm seeking, taxonomies are not intended to be used to solve problems of this nature.
Also, the use of taxonomy_get_synonyms as proposed in my question is plain wrong. Taking a look at the table where synonym relationships are kept in drupal-6, we see that synonyms are not terms.
Though it would be possible to come up with a similar solution using related terms, this would be foolhardy.

Resources