I am new to Neo4j and I am trying to compare two graphs in Neo4j. My first question is how to create two different graphs in Neo4j. I created something like the one below, which is simply two disconnected graphs.
[Image: Neo4j graph]
Is this the best way to have two graphs in Neo4j for comparison, or is there another way?
My second question comes from my requirement to find the common vertices and common edges in both graphs. How can I get this information? I am using Java to connect to Neo4j and to create and run Cypher queries.
There are some questions asked on this platform about this but none seems to be quite what I am looking for.
If anyone can even suggest better ways to implement similarity algorithms on large graphs, I would really appreciate it.
Well, in the standard Neo4j browser, you can only have a single network view. You could use the concept of "virtual nodes", or use styling features that let you spot the differences visually. Perhaps this video will inspire you: https://youtu.be/7iMraBHtTqE
Disclaimer: I am the owner of Graphileon.
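On the second question (common vertices and common edges): once the nodes and relationships of each graph have been read out of Neo4j, the comparison itself comes down to set intersection. The following is only a minimal sketch in plain Python. It assumes that nodes in both graphs are matched by a shared key (a name property, for example) and that each edge is a directed (source, type, target) triple. The sample data and the function name common_elements are made up for illustration.

```python
def common_elements(edges_a, edges_b):
    """Return (common_nodes, common_edges) for two directed edge lists.

    Each edge is a (source, rel_type, target) tuple. Nodes are identified by
    the values used in the tuples, so both graphs must share one key scheme.
    """
    set_a, set_b = set(edges_a), set(edges_b)
    # Collect every endpoint that appears in at least one edge of each graph.
    nodes_a = {n for s, _, t in set_a for n in (s, t)}
    nodes_b = {n for s, _, t in set_b for n in (s, t)}
    return nodes_a & nodes_b, set_a & set_b


# Hypothetical sample data standing in for two graphs exported from Neo4j.
graph_1 = [("A", "KNOWS", "B"), ("B", "KNOWS", "C"), ("C", "KNOWS", "D")]
graph_2 = [("A", "KNOWS", "B"), ("B", "KNOWS", "D"), ("D", "KNOWS", "E")]

nodes, edges = common_elements(graph_1, graph_2)
print(sorted(nodes))  # ['A', 'B', 'D']
print(sorted(edges))  # [('A', 'KNOWS', 'B')]
```

Note that this sketch only sees nodes that take part in at least one edge. Isolated nodes would need to be passed in separately.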
I am trying to understand if I should use a GraphDB for my project. I am mapping a computer network and I use NetworkX. The relationships are physical or logical adjacency (L2 and L3). In the current incarnation my program scans the network and dumps the adjacency info into a Postgres RDB. From there I use Python to build my graphs using NetworkX.
I am trying to understand if I should change my approach and if there is any benefit in storing the info in a GraphDB. Postgres has AgensGraph, which seems to be built on top of Postgres as a GraphDB overlay or add-on. I do not know yet if installing this on top will make my life easier. I barely survived the migration from SQLite to Postgres :-) and to SQLAlchemy, so now, after not even 3 months, I am reconsidering things while I still can (the migration is not complete).
I could choose to use a mix, but I am not sure if it makes sense to use a GraphDB. From what I understand these have advantages such as not needing a schema (which helps a lot for a DB newbie like me).
I am also wondering if NetworkX (a Python library) and GraphDB overlap in any way. As far as I understand these things, NetworkX could be instrumental in analyzing the topology of the graph while GraphDB is mainly used to query the data stored in the DB. Do they overlap in any way? Can they be used together?
TLDR: Use Neo4j or OrientDB to store data and networkx for processing it (if you need complicated algorithms). It will be the best solution.
I strongly recommend against using GraphDB for your purposes. GraphDB is based on RDF, which is used for the semantic web and common knowledge storage. It is not meant for problems like yours. There are many graph databases that will suit you much better. I can recommend Neo4j (the most popular graph database, as you can see; free, but non-open-source) or OrientDB (the most popular open-source graph database).
I used a graph database when I had a similar problem (but I used HP UCMDB, which is corporate software and is not free). It was really MUCH better than the average relational DB. So the idea of using a graph database is good, and it fits this kind of problem naturally.
I am not sure you really need networkx to analyze the graph (you can use graph query languages to do it), but if you want, you can load the data from your DB into networkx with GraphML or some other method (OrientDB is similar) and process it using networkx.
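To make the "store in a database, analyze in memory" split concrete, here is a minimal sketch. It is not tied to any particular graph database: an in-memory SQLite table stands in for the stored adjacency data, and a hand-written breadth-first search stands in for the kind of algorithm a library such as NetworkX provides ready-made. The table name, column names, and device names are all assumptions made for illustration.

```python
import sqlite3
from collections import defaultdict, deque

# In-memory SQLite stands in for the real adjacency store (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE adjacency (src TEXT, dst TEXT, layer TEXT)")
conn.executemany(
    "INSERT INTO adjacency VALUES (?, ?, ?)",
    [
        ("sw1", "sw2", "L2"),
        ("sw2", "rtr1", "L3"),
        ("rtr1", "rtr2", "L3"),
        ("sw3", "sw4", "L2"),
    ],
)

# Build an undirected adjacency map from the stored rows.
neighbours = defaultdict(set)
for src, dst, _layer in conn.execute("SELECT src, dst, layer FROM adjacency"):
    neighbours[src].add(dst)
    neighbours[dst].add(src)


def shortest_path(start, goal):
    """Breadth-first search; returns the list of hops, or None if unreachable."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the predecessor links back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in neighbours[node]:
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


print(shortest_path("sw1", "rtr2"))  # ['sw1', 'sw2', 'rtr1', 'rtr2']
print(shortest_path("sw1", "sw4"))   # None
```

The database only has to answer "give me the edges"; everything topological happens on the in-memory structure.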
And a little question-and-answer quiz at the end:
As far as I understand these things NetworkX could be instrumental in analyzing the topology of the graph
Absolutely right.
while GraphDB is mainly used to query the data stored in the DB.
It is a database. And, yes, it is mainly used to query the data.
Do they overlap in any way?
They are both about graphs. Of course they overlap :)
Can they be used together?
Yes, they can. No, they should not be used together for your problem.
I am studying graph databases (Neo4j).
I know graph databases come from graph theory, which is based on Euler's paper solving the seven bridges problem (Eulerian path / Eulerian circuit).
I am trying to find an example, something, to show how graph databases use the Eulerian path to solve some problem.
Ideally I would need a real example from a source, but anything that helps me understand this better would do. Thanks!
I believe you have a misconception. Even though Euler wrote a paper on the 7 bridges problem (which was probably the first paper on graph theory), that does not mean that the design of Neo4j uses his solution to that specific problem in any way.
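For reference, Euler's result itself is easy to state and check: a connected graph has an Eulerian circuit only if no vertex has odd degree, and an Eulerian path only if zero or exactly two vertices have odd degree. The sketch below applies that criterion to the seven bridges in plain Python. It is a textbook illustration of the graph-theory result, not something taken from Neo4j; the land-mass labels A to D are arbitrary names chosen here.

```python
from collections import Counter

# The seven bridges of Koenigsberg as a multigraph: each pair is one bridge
# between two land masses (A and B are the river banks, C and D the islands).
bridges = [
    ("A", "C"), ("A", "C"),
    ("B", "C"), ("B", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]

# Count how many bridge ends touch each land mass (the vertex degree).
degree = Counter()
for u, v in bridges:
    degree[u] += 1
    degree[v] += 1

odd = sorted(node for node, d in degree.items() if d % 2 == 1)

# Euler's criterion for a connected graph.
has_circuit = len(odd) == 0
has_path = len(odd) in (0, 2)

print(odd)          # ['A', 'B', 'C', 'D']
print(has_circuit)  # False
print(has_path)     # False
```

All four land masses have odd degree, so no walk can cross every bridge exactly once.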
I'm in the process of developing a software system (a graph database) to study the interconnection between multiple components. It could end up with about 30 TB of data. I would like to know which factors to consider in choosing the right database.
Some of the options I'm looking at are Apache Giraph and TitanDB. I'm also wondering if a smaller-scale DB like Neo4j or OrientDB might itself work.
This is a very broad question, so I would define exactly what you are looking for, because size alone can be a bit vague.
I think any of the example graph DBs you mentioned can model data that large.
A few "more detailed" questions you could ask yourself include:
Do you care about horizontal scaling? If yes, then you should be looking at TitanDB, OrientDB or DSE Graph, because Neo4j (at the time of writing) does not scale horizontally, so it is limited by the size of the server.
Does a standardised query/traversal language matter? If yes, then maybe you should be looking more at TinkerPop vendors such as TitanDB, OrientDB, DSE Graph, and others. If no, then any option will suit you.
Does my data have super nodes? If yes, then you should see how each vendor deals with super nodes. Some vendors shard, others use clever graph partitioning algorithms.
How much support do you want? If you need a lot, then maybe you should look at strong enterprise solutions such as DSE, OrientDB or Neo4j. Neo4j is currently considered the most popular graph DB, and with that comes a large support base.
Do you want to use open source software? If yes, then TitanDB, Neo4j, or OrientDB may be for you.
These are just some of the things you can look into when making a better decision between all the vendors. Note: There are many other vendors you could consider, Blazegraph, HypergraphDB, just to name a few.
I'm hoping you can help me out with some technical questions on graphs/trees.
I'm trying to display the creation of objects in systems.
It's really a tree structure.
It has some interesting requirements.
a)
One node can have many children. Say 20. Maybe more.
i.e. one library can be used by many objects.
b)
A child node can have many parents. Say up to 20.
i.e. many libraries are used by one procedure or object.
c)
A particular node can appear in more than one place.
i.e. a generic print or logging function is called in many procedures.
Note: This is just an -example- in tech terms I expect you will understand.
It is NOT the issue I need to model. No need to discuss it.
As I've thought about it, I realized that it's not a simple binary tree, or a linked list.
1)
What kind of data structure could I save all the data in?
2)
How could I produce a graph of this in Java?
3)
What is a free open source graphing software that could graph such a tree?
Such as Neo4j
Perhaps in formats:
- as a tree, with a root, trunk, branches, and leaves?
- Like the graphs you see now, depicting social networks, with the root node in the center?
4)
Any good websites, or tutorials on this subject?
Thanks a lot!
Check out prefuse. It's old but it works. You'll have to invest a bit of time to learn how to use it though. Once you get there, it's just a matter of creating a prefuse.data.Graph object, filling in your nodes and their neighbors, and then creating the visualization.
If you're open to other solutions, check out d3.js, which draws graphs with JavaScript on an SVG element in your browser.
If this is really about objects, then maybe UML can help. It's designed to generate graphs of object relationships. There are tons of free UML tools out there. I'd download one and see if you can shoehorn your application into it.
JGraphT can represent your graph structure and can use JGraph for visualisation.
For an example visualization, look at this.
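On the data-structure part of the question: a structure where a node can have many children and many parents is a directed graph rather than a tree, and a pair of adjacency maps is usually enough to hold it. Below is a minimal sketch in plain Python; the same shape maps directly onto a Map of Sets in Java. The class name DependencyGraph and the sample procedure names are invented for illustration.

```python
from collections import defaultdict


class DependencyGraph:
    """Directed graph in which a node may have many parents and many children."""

    def __init__(self):
        self.children = defaultdict(set)  # node -> nodes it points to
        self.parents = defaultdict(set)   # node -> nodes pointing to it

    def add_edge(self, parent, child):
        self.children[parent].add(child)
        self.parents[child].add(parent)

    def descendants(self, node):
        """All nodes reachable from `node`, each reported once."""
        seen = set()
        stack = [node]
        while stack:
            current = stack.pop()
            for child in self.children.get(current, ()):
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen


# Hypothetical example: a shared logging function called from two procedures.
g = DependencyGraph()
g.add_edge("main", "parse")
g.add_edge("main", "report")
g.add_edge("parse", "log")
g.add_edge("report", "log")
g.add_edge("report", "print")

print(sorted(g.parents["log"]))        # ['parse', 'report']
print(sorted(g.descendants("main")))   # ['log', 'parse', 'print', 'report']
```

A node that "appears in more than one place" is stored once and simply has several parents; a drawing tool can then decide whether to render it once or repeat it.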
This is a fairly advanced topic related to directed graphs. I am looking into NoSQL technology for a project, in particular graph databases. They are a perfect fit for the rich model whose relationships I want to save, and the problem domain is itself a graph (vertices and edges). Naturally this made me look at Neo4j and other vendors in this space. I believe they have definitely solved, or at least closed the gap on, persisting data as a graph data structure, which is perfect.
However, my requirement goes further: I need a specification of a directed graph from which an actual instance of a directed graph is created, such that particular rules and constraints are adhered to when the graph is built. The graph database doesn't concern itself with this, which is correct, and I wouldn't want it to (I'm happy that it's agnostic of this). The problem is that this leaves open what ensures that the graph complies with your graph rules (i.e. that certain nodes can have certain relationships, or can have relationships to other nodes at all). What should I be using to specify the specification/metadata of the directed graph, such that an instance created at runtime adheres correctly to its specification?
Any help or suggestions on what is available, or on the standard way to approach this, would be appreciated.
I think you should take a look at Spring Data Graph, http://www.springsource.org/spring-data/neo4j. That is as close as you get to having a powerful mapping layer that can project rules etc., much like JPA or Hibernate.
WDYT?
/peter
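Independently of any particular mapping library, the general idea of checking a graph against a specification in the application layer can be sketched in a few lines. The example below is plain Python and is not Spring Data's API: it assumes the specification is a set of allowed (source label, relationship type, target label) triples, and the labels, relationship names, and the class ValidatedGraph are all hypothetical.

```python
# Hypothetical specification: which (source label, relationship, target label)
# combinations the graph is allowed to contain.
ALLOWED = {
    ("Person", "WORKS_FOR", "Company"),
    ("Person", "KNOWS", "Person"),
    ("Company", "OWNS", "Company"),
}


class ValidatedGraph:
    """In-memory graph that rejects edges not covered by the specification."""

    def __init__(self, allowed):
        self.allowed = set(allowed)
        self.labels = {}  # node id -> label
        self.edges = []   # (source id, relationship, target id)

    def add_node(self, node_id, label):
        self.labels[node_id] = label

    def add_edge(self, source, rel, target):
        triple = (self.labels[source], rel, self.labels[target])
        if triple not in self.allowed:
            raise ValueError(f"relationship not allowed by specification: {triple}")
        # In a real system the write to the graph database would go here.
        self.edges.append((source, rel, target))


g = ValidatedGraph(ALLOWED)
g.add_node("alice", "Person")
g.add_node("acme", "Company")

g.add_edge("alice", "WORKS_FOR", "acme")  # allowed by the specification

try:
    g.add_edge("acme", "WORKS_FOR", "alice")  # Company -> Person is not allowed
except ValueError as err:
    print(err)

print(g.edges)  # [('alice', 'WORKS_FOR', 'acme')]
```

Keeping the rules as data rather than code means the specification can itself be stored and versioned alongside the graph.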