how to compare similarity between two Multi Dimensional Arrays in Java - graph

I have made a program which builds 3 graphs(arc and node).
The graphs were built based on 2D arrays of 1 and 0.
The first graph is the Ideal Graph which the other two are going to be compared to. I was wandering if there is a way of comparing similarity between two graphs ie How many of the vertices in graph 2 are similar to the vertices in graph 1.
Alternative question:
The graphs were built based on 2D arrays of 1 and 0 so if there is a way of comparing the 2D array for Graph 1 with the array for graph 2 instead than this would solve my problem.
I don't expect you all to put codes up but I'm just trying to understand if there is a method that can do this. And if so how it can be done.
Any help is appreciated greatly!
Hi All,
Just to make it a bit more clear:
I have two 2Dimensional arrays like so:
double [][] MatrixOfOriginalGraph= {{1,1,1,1},
{0,1,0,0},
{0,0,1,0},
{0,0,0,1}};
double [][] MatrixToBeComparedWithOriginal= {{1,0,0,1},
{0,1,1,0},
{0,0,1,1},
{0,0,0,1};
The first array is used to draw the first graph and the second is used to draw the second graph. The first graph is basically a 4 node graph where the first node is connected to all the other nodes. The other nodes are not connected to each other. The second graph is mixed. I want to know how many of the nodes in vertices in graph two are the same as the ones in graph 1.
Thanks,
jetnori.

Hopefully I got you right. You have a graph, and your array MatrixOfOriginalGraph would be a presentation of the following adjacency matrix:
1111
0100
0010
0001
And similarity is defined by similar connections?
Maybe a good starting point could be the Hamming distance between pairs of the nodes, each from one graph?
So
double [][] MatrixOfOriginalGraph= {{1,1,1,1}, {0,1,0,0}, {0,0,1,0}, {0,0,0,1}};
double [][] MatrixToBeComparedWithOriginal= {{1,0,0,1}, {0,1,1,0}, {0,0,1,1}, {0,0,0,1};
would lead to something like
int node1 = computeDistance({0,1,0,0}, {0,1,1,0});
int node2 = computeDistance({1,1,1,1}, {1,0,0,1});
and so on. But keep in mind using the Hamming distance there have to be an equal amount of variables in both compared structures. I don't know whether this is an limitation for your problem. If this would be a limitation you could instead try the Levenshtein distance.
The lesser the distance the similar the nodes. Then you may sum up the Hamming distances of all nodes (well actually you could have just computed the distance over the whole sequence of ones an zeros) or compute mean and deviation and so on, depending what you need or how you define similarity.

Related

Finding all possible directed graphs given a number of vertices

Is it possible to find all possible directed graphs given a pair of vertices and the information that an edge exists between them? For example if we know the vertices with edge pairs like
1 2
2 3
1 3
The possible directed graphs will be:
1→2, 2→3, 1→3
1→2, 2→3, 3→1
1→2, 3→2, 1→3
1→2, 3→2, 3→1
2→1, 2→3, 1→3
2→1, 2→3, 3→1
2→1, 3→2, 1→3
2→1, 3→2, 3→1
What data-structure to be used here to work with? What can be the working logic?
I was thinking of using adjacency matrix data structure and compute all possible adjacency matrix. Each adjacency matrix will represent a graph. We can use the graph as and when needed for tasks like checking whether cycle is present or not etc.
Apologies that this is more of a discussion than a programming question, but any help will be appreciated
You could maintain one undirected graph data structure G and work with the knowledge that the existence of an edge (u,v) means that there is only one directed edge in a particular instance of digraph possibility D.
If you want to maintain all possible digraphs separately, you would need 2^m many of them, where m is the number of edges. If the vertices and edges are always the same and only the direction is the invariant, then you could maintain 2^m bit-strings, each of length m. Each bit has a 0 or 1 depending on whether the edge (u,v) it corresponds to is u-->v or v<--u. Use the bit string to give direction to the undirected graph suggested above. You just need to generate all 2^m bit strings. Not very efficient... but you can't do better if you need to work with all possibilities.
You could use the bit string to construct a digraph. But, it would be more memory efficient at least to maintain only one bit-string per 'graph' rather than repeating the entire graph data structure with only directional changes. Record bit strings in a hash table: use each edge as a key and then bit value 0/1 depending on direction. Any graph traversal of one of the many possible digraphs D works on undirected G. Then in constant time you can check for incident (undirected) edges of a vertex, which are outgoing/incoming in D. So traversals can occur just as quickly by maintaining only 1 graph object and 1 bit hash table of size 2^m (rather than 2^m graph objects).

algorithm for 'generalized' matching in complete graphs

My problem is a generalization of a task solved by [Blossom algorithm] by Edmonds. The original task is the following: given a complete graph with weighted undirected edges, find a set of edges such that
1) every vertex of the graph is adjacent to only one edge from this set (i.e. vertices are grouped into pairs)
2) sum over weights of edges in this set is minimal.
Now, I would like to modify the first goal into
1') vertices are grouped into sets of 3 vertices (or in general, d vertices), and leave condition 2) unchanged.
My questions:
Do you know if this 'generalised' problem has a name?
Do you know about an algorithm solving it in number of steps being polynomial of number of vertices (like Blossom algorithm for an original problem)? I don't see a straightforward generalisation of Blossom algorithm, as it is based on looking for augmenting paths on a graph compressed to a bipartite graph (and uses here Hungarian algorithm). But augmenting paths do not seem to point to groups of vertices different than pairs.
Best regards,
Paweł

graph similarity having multiple edges between two nodes

There are many theories about calculating of graph similarity such as vertex edge overlap, jacard, co-sine, edit distance, signature similarity, lambda distance, deltacon so on. These things are based on single edge of the graph. But there are many graphs having multiple edges in real world.
Given similar two graphs like above, how could we calculate graph similarity?
Using previous graph similarity, there are only 2-dimension vector and the entry is just scalar that is number, but in multiple edge's graph, the entry should be tuple. Because there are one more actions between nodes. For the previous method, it could be called who-knows-whom schem, but latter graph, it could be said who-knows-whom*-how*. I think the previous mothods could be used for the multiple edge's graph easily, so there aren't logic or methods about it.
Thanks in advance!
There is not "the" way yo compute graph similarity.
Depending on your data and problem, very different approaches may be good. In many cases, simply merging the two edges into one makes perfect sense. For example, if I have two roads of capacity x and y to go from A to B - for many analyses this is comparable to having just one rode, with the combined capacity.

Implementing Undirected Weighted Graph

I want to know what will be an efficient way to implement an undirected weighted graph. I want to perform Prims and Kruskal algorithms on it. I know about adjacency lists but wouldn't that waste memory; for eg. lets suppose I have two vertexes A and B connected by an edge with weight 'x', so I'll need to add two entries in the adjacency list:
A,B,x
B,A,x
Am I missing something?
Adjacency lists are the memory-efficient way of implementing graphs, rather than adjacency matrices.
Actually, you have two options here.
If you want less time and more memory, you should do what you've written.
If you want more time and less memory, you could implement your edges A,B,x where A>B. But then, you would spend a lot of time while getting the adjacent vertices of any vertex.
It's your call. But second bullet is not preferred if you're dealing with less than millions of nodes.
since the graph is undirected I guess you will need only one edge between the nodes A and B

Sequence of number of vertices in a graph

I want to generate a sequence of the number of vertices in all graphs which each edge has the same number of leaving edges. I dont have to generate the whole sequence. Let's say the first 50 if exists.
I want:
Input: the number of edges leaving each vertex
Output: a sequence of the number of vertices
So far, I have looked at complete graphs. Complete graphs with n vertices always have n-1 edges leaving each vertex. But there are other kinds of graphs that have this property. For example, some polyhedrons, such as snub dodecahedron and truncated icosidodecahedron have this property.
How should I approach my problem?
I think you mean regular graphs:
http://en.wikipedia.org/wiki/Regular_graph
http://mathworld.wolfram.com/RegularGraph.html
I made a regular graph generator which isn't flawless by the way:
once you generate the nodes, say from 1 to n. You want regularity r.
For node 1, make connections to the following nodes until you reach degree r for node 1.
For node 2 you already have degree 1 (because of node 1), you connect again to further nodes until you reach degree r for node 2 too. And this way till the last node.
The flaw is that you can't define an r-regular graph for any number of nodes. The algorithm mentioned doesn't detect that, so some errors may occur. Also, this isn't a random r-regular graph generator, but instead give one possible solution.
I'm not much of an explainer, so please ask if the description lacks somewhere.

Resources