I have a database with a whole bunch of ODocument records. They have their own Class hierarchy and it does not extend V.
I am in the process of adding new collections, and to support some of the features we would like to use the graph database capabilities.
So I created a new Vertex with:
Vertex company = graph.addVertex(null);
I look up my existing ODocument and treat it as a vertex like this:
Vertex person = null;
for (Vertex v : graph.getVertices("Person.name", "Jay")) {
person = v;
}
and try to create an Edge
Edge sessionInIncident = graph.addEdge(null, company, person, "employs");
The edge creation leads to the following
Class 'Person' is not an instance of V
java.lang.IllegalArgumentException
at com.tinkerpop.blueprints.impls.orient.OrientElement.checkForClassInSchema(OrientElement.java:635)
at com.tinkerpop.blueprints.impls.orient.OrientVertex.addEdge(OrientVertex.java:905)
at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.addEdge(OrientBaseGraph.java:685)
In order to be a Vertex, class Person must extend the V class. Try this command:
alter class Person superclass V
Maybe you can help. I'm an Elm beginner and I'm struggling with a rather mundane problem. I'm quite excited with Elm and I've been rather successful with smaller things, so now I tried something more complex but I just can't seem to get my head around it.
I'm trying to build something in Elm that uses a graph-like underlying data structure. I create the graph with a fluent/factory pattern like this:
sample : Result String MyThing
sample =
MyThing.empty
|> addNode 1 "bobble"
|> addNode 2 "why not"
|> addEdge 1 2 "some data here too"
When this code returns Ok MyThing, then the whole graph has been set up in a consistent manner, guaranteed, i.e. all nodes and edges have the required data and the edges for all nodes actually exist.
The actual code has more complex data associated with the nodes and edges but that doesn't matter for the question. Internally, the nodes and edges are stored in dictionaries keyed by Int:
type alias MyThing =
{ nodes : Dict Int String
, edges : Dict Int { from : Int, to : Int, label : String }
}
Now, in the users of the module, I want to access the various elements of the graph. But whenever I access one of the nodes or edges with Dict.get, I get a Maybe. That's rather inconvenient because by the virtue of my constructor code I know the indexes exist etc. I don't want to clutter upstream code with Maybe and Result when I know the indexes in an edge exist. To give an example:
getNodeTexts : Edge -> MyThing -> Maybe (String, String)
getNodeTexts edge thing =
    case Dict.get edge.from thing.nodes of
        Nothing ->
            -- Yeah, actually this can never happen...
            Nothing
        Just fromNode ->
            case Dict.get edge.to thing.nodes of
                Nothing ->
                    -- Again, this can never actually happen because the builder code prevents it.
                    Nothing
                Just toNode ->
                    Just ( fromNode.label, toNode.label )
That's just a lot of boilerplate code to handle something I specifically prevented in the factory code. But what's even worse: Now the consumer needs extra boilerplate code to handle the Maybe--potentially not knowing that the Maybe will actually never be Nothing. The API is sort of lying to the consumer. Isn't that something Elm tries to avoid? Compare to the hypothetical but incorrect:
getNodeTexts : Edge -> MyThing -> (String, String)
getNodeTexts edge thing =
( Dict.get edge.from thing.nodes |> .label
, Dict.get edge.to thing.nodes |> .label
)
An alternative would be not to use Int IDs but use the actual data instead--but then updating things gets very tedious as connectors can have many edges. Managing state without the decoupling through Ints just doesn't seem like a good idea.
I feel there must be a solution to this dilemma using opaque ID types but I just don't see it. I would be very grateful for any pointers.
Note: I've also tried to use both drathier and elm-community elm-graph libraries but they don't address the specific question. They rely on Dict underneath as well, so I end up with the same Maybes.
There is no easy answer to your question. I can offer one comment and a coding suggestion.
You use the magic words "impossible state" but, as OOBalance has pointed out, you can create an impossible state in your modelling. The normal meaning of "impossible state" in Elm relates precisely to modelling, e.g. when you use two Bools to represent 3 possible states. In Elm you can use a custom type with exactly three variants instead, so that the unused fourth combination of Bools cannot be expressed in your code at all.
As for your code, you can reduce its length (and perhaps complexity) with
getNodeTexts : Edge -> MyThing -> Maybe ( String, String )
getNodeTexts edge thing =
Maybe.map2 (\ n1 n2 -> ( n1.label, n2.label ))
(Dict.get edge.from thing.nodes)
(Dict.get edge.to thing.nodes)
From your description, it looks to me like those states actually aren't impossible.
Let's start with your definition of MyThing:
type alias MyThing =
{ nodes : Dict Int String
, edges : Dict Int { from : Int, to : Int, label : String }
}
This is a type alias, not a type – meaning the compiler will accept MyThing in place of {nodes : Dict Int String, edges : Dict Int {from : Int, to : Int, label : String}} and vice-versa.
So rather than construct a MyThing value safely using your factory functions, I can write:
import Dict
myThing = { nodes = Dict.empty, edges = Dict.fromList [(0, {from = 0, to = 1, label = "Edge 0"})] }
… and then pass myThing to any of your functions expecting MyThing, even though the nodes connected by Edge 0 aren't contained in myThing.nodes.
You can fix this by changing MyThing to be a custom type:
type MyThing
= MyThing { nodes : Dict Int String
, edges : Dict Int { from : Int, to : Int, label : String }
}
… and exposing it using exposing (MyThing) rather than exposing (MyThing(..)). That way, no constructor for MyThing is exposed, and code outside of your module must use the factory functions to obtain a value.
The same applies to Edge, which I'm assuming is defined as:
type alias Edge =
{ from : Int, to : Int, label : String }
Unless it is changed to a custom type, it is trivial to construct arbitrary Edge values:
type Edge
= Edge { from : Int, to : Int, label : String }
Then however, you will need to expose some functions to obtain Edge values to pass to functions like getNodeTexts. Let's assume I have obtained a MyThing and one of its edges:
myThing : MyThing
-- created using factory functions
edge : Edge
-- an edge of myThing
Now I create another MyThing value, and pass it to getNodeTexts along with edge:
myOtherThing : MyThing
-- a different value of type MyThing
nodeTexts = getNodeTexts edge myOtherThing
This should return Maybe.Nothing or Result.Err String, but certainly not (String, String) – the edge does not belong to myOtherThing, so there is no guarantee its nodes are contained in it.
I have seen some graphs vertex signatures and even come up with my own:
module type VERTEX = sig
type t
type label
val equal : t -> t -> bool
val create : label -> t
val label : t -> label
end
But I have completely no idea how to implement it as a module. What types should t and label be? How can I create a t based on a label? And how do I get the label from a t?
I'm an author of Graphlib, so I can't pass by as this question hits me directly into my heart. Honestly, I was asked this question millions of times offline and never was able to provide a good answer.
The real problem is that the graph interfaces from the OCamlGraph library are all messed up. We started Graphlib as an attempt to fix them. However, OCamlGraph is a valuable repository of Graph algorithms, thus we have constrained ourselves to be compatible with the OCamlGraph interface. The main problem for us was and still is this Vertex interface that basically establishes a bijection between the set of labels and the set of nodes. People usually stumble on this, as this doesn't make sense - why do we need two different types, one for the label and another for the vertex, if they are the same?
Indeed, the simplest implementation of the VERTEX interface is the following module
module Int : VERTEX with type label = int = struct
  type t = int
  type label = int
  (* equal is required by the VERTEX signature *)
  let equal (x : int) (y : int) = x = y
  let create x = x
  let label x = x
end
In that case, we indeed have a trivial bijection (via the identity endofunctor) between the set of labels and the set of vertices.
However, a deeper look shows us that the signature
val create : label -> t
val label : t -> label
is not really a bijection, since a bijection is a one-to-one mapping, and that is neither required nor enforced by the type system. For example, the create function could be a surjection of label onto t, where label is some distinctive element of a family of vertices. Correspondingly, the label function could be a forgetful functor that returns the distinctive label and forgets everything else.
Given this approach, we can have another implementation:
module Labeled = struct
  type label = int
  type t = {
    label : label;
    data : string;  (* a field needs a type here, not a string literal *)
  }
  let create label = {label; data = ""}
  let label n = n.label
  let data n = n.data
  let with_data n data = {n with data}
  let compare x y = compare x.label y.label
end
In that implementation, we use the label as the identity of a node, and an arbitrary attribute can be attached to a node. In this interpretation, the create function partitions the set of all nodes into equivalence classes, where all members of a class share the same identity, i.e., they represent the same real-world entity at different points in time or space. For example,
type color = Red | Yellow | Green
module TrafficLight = struct
type label = int
type t = {
id : label;
color : color
}
let create id = {id; color=Red}
let label t = t.id
let compare x y = compare x.id y.id
let switch t color = {t with color}
let color t = t.color
end
In this model, we represent a traffic light with its id number. The color attribute doesn't affect an identity of a traffic light (if a traffic light switches to another color it is still the same traffic light, although in a functional programming language it is represented with two different objects).
The main problem with the above representation is that in all graph textbooks the label is used in the opposite meaning: as an opaque attribute. In a textbook, the color of a traffic light would be called a label, and the node itself would be represented as an int. That's why I'm saying that the OCamlGraph interfaces are messed up (and consequently the Graphlib interfaces). So, if you don't want to contradict the textbooks, you should use unlabeled graphs (int is probably the best representation of a node). And if you need to attach attributes to your nodes, you can use external finite maps, i.e., arrays, maps, association lists, or any other dictionaries. Otherwise, you need to keep in mind that your label is not a label but, vice versa, the node.
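To make the "external finite map" idea concrete, here is a minimal sketch in Python rather than OCaml, just to keep it short. The names `add_edge`, `graph`, and `color` are made up for illustration and are not part of OCamlGraph or Graphlib. Nodes are plain ints, and the attribute lives in a separate dictionary:

```python
from typing import Dict, Set

# Nodes are plain ints; the graph itself stores no attributes.
Graph = Dict[int, Set[int]]


def add_edge(graph: Graph, src: int, dst: int) -> None:
    """Add a directed edge, creating either node if it is missing."""
    graph.setdefault(src, set()).add(dst)
    graph.setdefault(dst, set())


# Hypothetical example data: three traffic lights identified by an int.
graph: Graph = {}
add_edge(graph, 1, 2)
add_edge(graph, 2, 3)

# External finite map from node to its attribute (the textbook "label").
color: Dict[int, str] = {1: "Red", 2: "Green", 3: "Red"}

# Switching a light only updates the map; node identity and edges are untouched.
color[1] = "Green"
```

The graph algorithms only ever see ints, so changing an attribute never changes which node you are talking about.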
With all this said, let's specify a better interface for a graph vertex:
module type VERTEX = sig
type id
type label
type t
val create : id -> t
val id : t -> id
val label : t -> label
val with_label : t -> label -> t
end
The proposed interface is compatible with your interface (and thus with the OCamlGraph), as it is isomorphic modulo renaming (i.e., we renamed label to id). It also allows us to create efficient unlabeled nodes, where id = t, as well as attach arbitrary information to a node without relying on external mappings.
Implementing a module based on a signature is like a mini puzzle. Here's how I would analyze it:
The first remark I have when reading that signature, is that there is no way in that signature to build values of type label. So, our implementation will need to be a bit larger, maybe by specifying type label = string.
Now, we have:
val create : label -> t
val label : t -> label
Which is a bijection (the types are "equivalent"). The simplest way to implement that is by defining type t = label, so that it's really only one type, but from the exterior of the module you don't know that.
The rest is
type t
val equal: t -> t -> bool
We said that label = string, and t = label. So t = string, and equal is the string equality.
Boom! here we are:
module String_vertex : VERTEX with type label = string = struct
type label = string
type t = string
let equal = String.equal
let create x = x
let label x = x
end
The VERTEX with type label = string part is just if you want to define it in the same file. Otherwise, you can do something like:
(* string_vertex.ml *)
type label = string
type t = string
let equal = String.equal
let create x = x
let label x = x
and any functor F that takes a VERTEX can be called with F(String_vertex).
It would be best practice to create string_vertex.mli with contents include VERTEX with type label = string, though.
I'm using ArangoDB 3.2.25. I want to extract neighbors from a starting node.
Here is what I tried:
FOR x IN 1..1
ANY "vert1/5001" Col_edge_L
RETURN x
but the vert2 vertices are missing from the result.
Here is the schema of the collection
{"_from":"vert1/560","_to":"vert2/5687768","id":771195,"score":218}
What you do in your query is to start at the vertex with key 5001 from the collection vert1 and follow all edges stored in collection Col_edge_L in any direction (so _from or _to equal to vert1/5001).
If there are edges in Col_edge_L like
{ "_from": "vert1/5001", "_to": "vert1/789" }
{ "_from": "vert2/44", "_to": "vert1/5001" }
then the result should be:
[
{ "_id": "vert2/44", ... },
{ "_id": "vert1/789", ... }
]
Exception: if the vertex collections exist, but not the vertices referenced in the _from and _to properties of the edges, the traversal will work but return null for the missing vertices (x variable).
The edge you posted in your question does not reference the starting vertex vert1/5001, so it wouldn't be followed and no vertex returned from this edge. If you miss vertices in the result, there might simply be no edges that link the starting vertex to another document.
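If it helps, the behaviour described above can be modelled outside the database. This is only a rough sketch in Python of what a depth-1 ANY traversal selects, not how ArangoDB implements it; `neighbors_any` and the sample documents are hypothetical, and the order of results in a real query is not guaranteed to match:

```python
def neighbors_any(start_id, edges, vertices):
    """Model a depth-1 traversal in ANY direction over edge documents."""
    result = []
    for edge in edges:
        if edge["_from"] == start_id:
            other = edge["_to"]
        elif edge["_to"] == start_id:
            other = edge["_from"]
        else:
            continue  # edge does not touch the start vertex, so it is not followed
        # A missing vertex document yields None, standing in for AQL null.
        result.append(vertices.get(other))
    return result


# Hypothetical data: the two edges from the answer plus the edge from the question.
edges = [
    {"_from": "vert1/5001", "_to": "vert1/789"},
    {"_from": "vert2/44", "_to": "vert1/5001"},
    {"_from": "vert1/560", "_to": "vert2/5687768"},
]
vertices = {
    "vert1/789": {"_id": "vert1/789"},
    "vert2/44": {"_id": "vert2/44"},
}

result = neighbors_any("vert1/5001", edges, vertices)
```

Here `result` contains the documents for vert1/789 and vert2/44. The third edge is skipped because neither end is vert1/5001, which is the situation with the edge posted in the question.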
I am able to build a graph using a vertexRDD and an edgeRDD via the GraphX API, no problem there. i.e.:
val graph: Graph[(String, Int), Int] = Graph(vertexRDD, edgeRDD)
However, I don't know where to start if I want to use two separate vertexRDDs instead of just one (a bipartite graph). For example, a graph containing shopper and product vertices.
My question is broad so I'm not expecting a detailed example, but rather a hint or nudge in the right direction. Any suggestions would be much appreciated.
For example to model users and products as a bipartite graph we might do the following:
trait VertexProperty
case class UserProperty(val name: String) extends VertexProperty
case class ProductProperty(val name: String,
val price: Double) extends VertexProperty
val users: RDD[(VertexId, VertexProperty)] = sc.parallelize(Seq(
(1L, UserProperty("user1")), (2L, UserProperty("user2"))))
val products: RDD[(VertexId, VertexProperty)] = sc.parallelize(Seq(
(1001L, ProductProperty("foo", 1.00)), (1002L, ProductProperty("bar", 3.99))))
val vertices = VertexRDD(users ++ products)
// The graph might then have the type:
val graph: Graph[VertexProperty, String] = null
I am working on a problem (from Algorithms by Sedgewick, section 4.1, problem 32) to help my understanding, and I have no idea how to proceed.
"Parallel edge detection. Devise a linear-time algorithm to count the parallel edges in a (multi-)graph.
Hint: maintain a boolean array of the neighbors of a vertex, and reuse this array by only reinitializing the entries as needed."
Two edges are considered to be parallel if they connect the same pair of vertices.
Any ideas what to do?
I think we can use BFS for this.
The main idea is to detect whether two or more edges connect the same pair of nodes. For each node, we put the entries of its adjacency list into a set and check whether an adjacent node is already in the set.
This uses O(n) extra space and takes O(n + m) expected time for n vertices and m edges.
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Queue;

// Assumes an undirected graph stored in a field
// ArrayList<ArrayList<Integer>> graph, and a field num_of_vertices.
// Returns true if a parallel edge is reachable from start.
boolean bfs(int start) {
    Queue<Integer> q = new ArrayDeque<Integer>(); // Queue is an interface, so use ArrayDeque
    boolean[] mark = new boolean[num_of_vertices];
    mark[start] = true;
    q.add(start); // put 1st node into Queue
    while (!q.isEmpty()) {
        int current = q.remove();
        // use a hashset for storing nodes of current adj. list
        HashSet<Integer> set = new HashSet<Integer>();
        ArrayList<Integer> adjacentlist = graph.get(current); // get adj. list
        for (int x : adjacentlist) {
            if (set.contains(x)) { // if it already had an edge current-->x
                return true; // then we have our parallel edge
            } else {
                set.add(x); // if not then we have a new edge
            }
            if (!mark[x]) { // normal bfs routine
                mark[x] = true;
                q.add(x);
            }
        }
    }
    return false; // no parallel edge found in the component of start
}
Assume that the vertices in your graph are integers 0 .. |V| - 1.
If your graph is directed, edges in the graph are denoted (i, j).
This allows you to produce a unique mapping of any edge to an integer (a hash function) which can be found in O(1).
h(i, j) = i * |V| + j
You can insert/lookup the tuple (i, j) in a hash table in amortised O(1) time. For |E| edges in the adjacency list, this means the total running time will be O(|E|) or linear in the number of edges in the adjacency list.
A Python implementation of this might look something like this:
def identify_parallel_edges(adj_list):
    # O(n) dict mapping edges to counts
    # The Python implementation of tuple hashing implements a more sophisticated
    # version of the approach described above, but is still O(1)
    edges = {}
    for edge in adj_list:
        if edge not in edges:
            edges[edge] = 0
        edges[edge] += 1
    # O(n) filter non-parallel edges
    res = []
    for edge, count in edges.items():
        if count > 1:
            res.append(edge)
    return res

edges = [(1,0),(2,1),(1,0),(3,4)]
print(identify_parallel_edges(edges))
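For comparison, the hint from the exercise (one boolean array, reset only where it was touched) avoids hashing altogether. Below is a rough sketch of how I read that hint, not the book's solution. It assumes an undirected multigraph without self-loops, stored as adjacency lists where every edge appears once in each endpoint's list, and it counts every edge beyond the first between a pair of vertices as one parallel edge. The name `count_parallel_edges` is made up.

```python
def count_parallel_edges(adj):
    """Count parallel edges in an undirected multigraph given as adjacency lists.

    adj[v] lists the neighbors of v, one entry per edge, so a parallel
    edge shows up as a repeated neighbor. Self-loops are assumed absent.
    """
    seen = [False] * len(adj)  # allocated once, reused for every vertex
    duplicates = 0
    for neighbors in adj:
        for w in neighbors:
            if seen[w]:
                duplicates += 1  # w already appeared in this adjacency list
            else:
                seen[w] = True
        # Reinitialize only the entries touched for this vertex.
        for w in neighbors:
            seen[w] = False
    # Each parallel edge is detected from both of its endpoints.
    return duplicates // 2


# Same edges as above, (1,0),(2,1),(1,0),(3,4), as adjacency lists for vertices 0..4.
adj = [[1, 1], [0, 2, 0], [1], [4], [3]]
parallel = count_parallel_edges(adj)
```

Every adjacency entry is visited twice (once to mark, once to reset), so the total work is proportional to |V| + |E| without relying on a hash table.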