How to change multiplicity of an edge in TitanDb? - graph

How to change multiplicity of an edge in TitanDb? I've got an edge set to ONE2MANY but need to change it to MULTI.

Not possible.
When an edge is marked as unique - ONE2MANY or MANY2ONE , its raw data (as stored internally by Titan on the backend, for example, HBase) is different than non-unique ones, such as edges marked as MULTI.
So even if you make Titan consider your edges as MULTI ones, it will fail parsing the old ones which were once unique.
Ideally you would like to migrate the old data to the new form, but as far as I am concerned, there is no standard way to do it. The only solution is to re-insert the edges.

Related

Jagged Result Array Gremlin Query

Please may you help me to write a query that returns each source vertex in my traversal along with its associated edges and vertices as arrays on each such source vertex? In short, I need a result set comprising an array of 3-tuples with item 1 of each tuple being the source vertex and items 2 and 3 being the associated arrays.
Thanks!
EDIT 1: Expanded on the graph data and added my current problem query.
EDIT 2: Improved Gremlin sample graph code (apologies, didn't think anyone would actually run it.)
Sample Graph
g.addV("blueprint").property("name","Mall").
addV("blueprint").property("name","HousingComplex").
addV("blueprint").property("name","Airfield").
addV("architect").property("name","Tom").
addV("architect").property("name","Jerry").
addV("architect").property("name","Sylvester").
addV("buildingCategory").property("name","Civil").
addV("buildingCategory").property("name","Commercial").
addV("buildingCategory").property("name","Industrial").
addV("buildingCategory").property("name","Military").
addV("buildingCategory").property("name","Resnameential").
V().has("name","Tom").addE("designed").to(V().has("name","HousingComplex")).
V().has("name","Tom").addE("assisted").to(V().has("name","Mall")).
V().has("name","Jerry").addE("designed").to(V().has("name","Airfield")).
V().has("name","Jerry").addE("assisted").to(V().has("name","HousingComplex")).
V().has("name","Sylvester").addE("designed").to(V().has("name","Mall")).
V().has("name","Sylvester").addE("assisted").to(V().has("name","Airfield")).
V().has("name","Sylvester").addE("assisted").to(V().has("name","HousingComplex")).
V().has("name","Mall").addE("classification").to(V().has("name","Commercial")).
V().has("name","HousingComplex").addE("classification").to(V().has("name","Resnameential")).
V().has("name","Airfield").addE("classification").to(V().has("name","Civil"))
Please note that the above is a very simplified rendering of our data.
Needed Query Results
I need to bring back each blueprint vertex as a base with each of its associated edges / vertices as arrays.
My Current Solution
Currently I do this very cumbersome query that gets the blueprints and assigns a label, gets the architects and assigns a label, then selects both labels. The solution is ok; however, it gets messy when I need to include edges or I need to get blueprint classification vertices (industrial, military, residential, commercial, etc.). In effect, the more associated data that I need to pull back for each blueprint, the sloppier my solution becomes.
My current query looks something like this:
g.V().hasLabel("blueprint").as("blueprints").
outE().or(hasLabel("designed"),hasLabel("assisted")).inV().as("architects").
select("blueprints").coalesce(out("classification"),constant()).as("classifications").
select("blueprints","architects","classifications")
The above produces a lot of duplication. If the number of: blueprints is b, architects is a, and classifications is c, the result set comprises b * a * c results. I'd like one blueprint with an array of its associated architects and an array of its associated classifications, if any.
Complications
I'm trying to do this in one query so that I can get all blueprint data from the graph to populate a filtered list. Once I have the list comprising all of the vertices, edges, and their properties, users can then click links to blobs, browse to project sites, etc. Accordingly, I've got pagination as well as filtering to think about and I'd prefer to make one trip to the server each time I get a new page or the filters change.
I figured out an answer; however, it quadruples the compute charge for the query. Not sure if this can be optimized further.
g.V().hasLabel("blueprint").
project("blueprints","architects").
by().
by(outE().or(hasLabel("designed"),hasLabel("assisted")).inV().dedup().fold())
I just solved for blueprints and architects, but classifications just needs another by(...traversal...) and projection label.
I may have to just get the blueprints in one query, get each of their associated items in parallel queries, then put it all together in the API. That would be very bad design for the API data layer but may be necessary for performance reasons.

how to ensure there single edge in a graph for a given order_id?

My current scenario is like I have I have products,customer and seller nodes in my graph ecosystem. The problem I am facing is that I have to ensure uniqueness of
(customer)-[buys]->product
with order_item_id as property of the buys relation edge.I have to ensure that there is an unique edge with buys property for a given order_item_id. In this way I want to ensure that my graph insertion remains idempotent and no repeated buys edges are created for a given order_item_id.
creating a order_item_id property
if(!mgmt.getPropertyKey("order_item_id")){
order_item_id=mgmt.makePropertyKey("order_item_id").dataType(Integer.class).make();
}else{
order_item_id=mgmt.getPropertyKey("order_item_id");
}
What I have found so far is that building unique index might solve my problem. like
if(mgmt.getGraphIndex('order_item_id')){
ridIndexBuilder=mgmt.getGraphIndex('order_item_id')
}else{
ridIndexBuilder=mgmt.buildIndex("order_item_id",Edge.class).addKey(order_item_id).unique().buildCompositeIndex();
}
Or I can also use something like
mgmt.buildEdgeIndex(graph.getOrCreateEdgeLabel("product"),"uniqueOrderItemId",Direction.BOTH,order_item_id)
How should I ensure this uniqueness of single buys edge for a given
order_item_id. (I don't have a use-case to search based on
order_item_id.)
What is the basic difference in creating an index on edge using
buildIndex and using buildEdgeIndex?
You cannot enforce the uniqueness of properties at the edge-level, ie. between two vertices (see this question on the mailing list). If I understand your problem correctly, building a CompositeIndex on edge with a uniqueness constraint for a given property should address your problem, even though you do not plan to search these edges by the indexed property. However, this could lead to performance issues when inserting many edges simultaneously due to locking. Depending on the rate at which you insert data, you may have to skip the locking (= skip the uniqueness constraint) and risk duplicate edges, then handle the deduplication yourself at read time and/or run post-import batch jobs to clean up potential duplicates.
buildIndex() builds a global, graph-index (either CompositeIndex or MixedIndex). These kind of indexes typically allows you to quickly find starting points of your Traversal within a graph.
However, buildEdgeIndex() allows you to build a local, "vertex-centric index" which is meant to speed-up the traversal between vertices with a potentially high degree (= huge number of incident edges, incoming and/or outgoing). Such vertices are called "super nodes" (see A solution to the supernode problem blog post from Aurelius) - and although they tend to be quite rare, the likelihood of traversing them isn't that low.
Reference: Titan documentation, Ch. 8 - Indexing for better Performance.

Why multiplicity.SIMPLE edges cannot be included in vertex-centric indices?

I have an Edgelabel
ContainsAttribute which has Multiplicity.SIMPLE
These edges also have a property let's call it x that I want to make the vertex-centric index on.
PropertyKey propertyX = mgmt.getPropertyKey("x");
EdgeLabel containsAttributeLabel = mgmt.makeEdgeLabel(EdgeLabels.ContainsAttribute).multiplicity(Multiplicity.SIMPLE).make();
mgmt.buildEdgeIndex(containsAttributeLabel,"propXIndex",Direction.IN, propertyX);
So the edges represent Entity --containsAttribute --> Attribute. The query I am trying to make will try to search Entities given queries by filtering on the Property x.
I wonder why it doesn't allow me saying:
The relation type [ContainsAttribute] has a multiplicity or cardinality constraint in direction [IN] and can therefore not be indexed
I think my use case makes sense and I wouldn't want to relax my edge label multiplicity from SIMPLE to MANY2ONE,ONE2MANY or MULTI, to make it work.
Edit: According to the example http://s3.thinkaurelius.com/docs/titan/1.0.0/indexes.html Hercules battled a lot of monsters so the edges labeled 'battled' are found coming out of 'Hercules' multiple times connecting with different monsters. Then the edge index is on attribute 'time' so filtering can be done. I want to do something similar and I thought vertex-centric indices are the way.. Those edges are Multiplicity.SIMPLE because there is at most one edge labeled 'battled' between Hercules and each of the monsters.
Edit 2:
Similar to the given example again a SIMPLE graph:
I believe it would make sense to have a vertex-centeric index for Hercules and the out-going SIMPLE 'battled' edges. That would make queries like time >=20 faster when traversing from Hercules to the monsters.
I don't understand why we must have a MULTI graph (less strict) like below to leverage vertex-centric indices..
Any help would be appreciated!
Thanks!

What is wrong with light weight edges?

I created an edge without attribute and guess what? it was created but still can not query it but then i created the same edge again and now they both are having same rid>?
I suggest you to start using OrientDB from the tutorial. This is an extract:
Starting from OrientDB v1.4.x edges, by default, are managed as lightweight edges: they don't have own identities as record, but are physically stored as links inside vertices. OrientDB automatically uses Lightweight edges only when edges have no properties, otherwise regular edges are used. From the logic point of view, lightweight edges are edges at all the effects, so all the graph functions work correctly. This is to improve performance and reduce the space on disk. But as a consequence, since lightweight edges don't exist as separate records in the database, the following query will not return the lightweight edges:
SELECT FROM E
In most of the cases Edges are used from Vertices, so this doesn't cause any particular problem. In case you need to query Edges directly, even those with no properties, disable lightweight edge feature by executing this command once:
ALTER DATABASE CUSTOM useLightweightEdges=false
This will only take effect for new edges. For more information look at Troubleshooting.
You can query for a list of names of edges with:
select name from ( select expand(classes) from metadata:schema ) where superClass="E"

Getting extra relationships in the Neo4j query

I am new to Neo4j.
For a given node (say, node 'n'), I am trying to find all other nodes in the graph that are in some way dependent on it. In other words, find nodes in graph who have edges directed towards node 'n'. I am getting correct nodes(lets call them c,d,e) using the following query:
MATCH (depNode)-[r]->(n:AttributeNode)
WHERE n.name='testnode'
RETURN depNode
In the original graph, the nodes c and d are connected as well using a relationship. In the result of the above query, I am also receiving that relationship (edge between c and d). How do I get rid of that edge in my output?
If I get your question correctly, I think you're already getting the correct answer in tabular form but in the visualization form Neo4j shows the "extra edges". You should check out the tabular form and confirm whether it's correctly showing the desired output or not (which it would be).
What's happening here is the default way how the Neo4j browser works. Whenever you try to retrieve some nodes, it shows all the relationships between the nodes as well. If you want to just visualize the nodes, you cannot do that in the current version of the Neo4j browser. You will have to use visualization tools like Gephi on your database and filter your results accordingly.
As of Neo4j 2.2.0.RC1 you can disable the extra relationships being used by setting Autocomplete to Off. The toggle appears at the bottom-right of your result graph and seems to be remembered for future requests.

Resources