I have a graph which is described as edges in a database.
Can anybody share an idea how to identify loops in the graph?
Like I have in the example the loop among nodes 3, 4 and 5.
P.S.
I'm solving this problem in Oracle.
Related
What I'm currently doing is:
Train a GNN and see which graphs are labelled wrongly compared to the ground truth.
Use a GNN-explainer model to help explain which minimum sub-graph is responsible for the mislabeling by checking the wrongly label instances.
Use the graph_edit_distance from networkx to see how much these graphs differentiate from another.
See if I can find clusters that help explain why the GNN might label some graphs wrongly.
Does this seem reasonable?
How would I go around step 4? Would I use something like sklearn_extra.cluster.KMedoids?
All help is appreciated!
Use the graph_edit_distance from networkx to see how much these graphs
differentiate from another.
Guessing this gives you a single number for any pair of graphs.
The question is: on what direction is this number? How many dimensions ( directions ) are there? Suppose two graphs have the same distance from a third. Does this mean that the two graphs are close together, forming a cluster at a distance from the third graph?
If you have answers to the questions in the previous paragraph, then the KMeans algorithm can find clusters for as many dimensions as you might have. It is fast and easy to code, usually giving satisfactory results. https://en.wikipedia.org/wiki/K-means_clustering
I am trying to create an igraph graph using the co-ordinates of nodes. I realized that there are no direct ways to do this through the igraph package for R.
There are ways to plot a graph on a 2D space using the layout() function, but I need to create a graph from known co-ordinates of nodes. I'll appreciate any help anyone can provide with this.
Why I'm trying to set up a graph this way? After parametrizing the graph nodes with co-ordinates, I want to connect the nodes using a probability measure that takes the distance between the nodes into account.
Thanks.
Hitaysh
# Initial Co-ordinates of nodes
n = 1000 # no. of nodes
nodes.coord <- data.frame(x=runif(n,min=0,max=n),
y=runif(n,min=0,max=n))
# Set up a graph with nodes on above co-ordinates...
P.S. This is my first post on StackOverflow. Any constructive feedback on how to ask questions better, is also welcome.
After reading your question afew more times, i'm guessing that something like this will work
n = 20 # no. of nodes
set.seed(15)
nodes.coord <- data.frame(
x=runif(n,min=0,max=n),
y=runif(n,min=0,max=n)
)
gg <- graph.empty(n)
plot(gg, layout=as.matrix(nodes.coord[,c("x","y")]))
But if you are going to connect nodes based on distance, it probably makes sense to find the connections prior to creating the igraph object since you already know the locations.
I want to know what will be an efficient way to implement an undirected weighted graph. I want to perform Prims and Kruskal algorithms on it. I know about adjacency lists but wouldn't that waste memory; for eg. lets suppose I have two vertexes A and B connected by an edge with weight 'x', so I'll need to add two entries in the adjacency list:
A,B,x
B,A,x
Am I missing something?
Adjacency lists are the memory-efficient way of implementing graphs, rather than adjacency matrices.
Actually, you have two options here.
If you want less time and more memory, you should do what you've written.
If you want more time and less memory, you could implement your edges A,B,x where A>B. But then, you would spend a lot of time while getting the adjacent vertices of any vertex.
It's your call. But second bullet is not preferred if you're dealing with less than millions of nodes.
since the graph is undirected I guess you will need only one edge between the nodes A and B
I have been trying to create an algorithm that can create a graph. It is not a tree graph as nodes can have multiple parents, more like an activity diagram. My problem is with placing nodes on the x axis, making sure that they do not overlap each other. I have been looking around for months now, but I have been unable to find any information relevant to this kind of graph. So I where wondering if some of you people might know of an algorithm that can solve this problem, or an idea on what approach I should take.
Here you see my problem: The red nodes are overlapping other nodes
My best approach right now is where i add it all to row:
With this approach will the tree above look like this.
I have started to work with gephi to help me display a dataset.
The dataset contains:
tags (terms for a certain picture) as nodes
Normalized Google Similarity Distance between those tags as edges with a weight (between 0 und 1)
Every tag is connected to every other tag, as long as they both belong to the same picture. So I have one cluster of nodes and edges for every picture.
I have now imported this dataset to gephi in the following format:
nodes: id, label
edges: target, source, weight (between 0 and 1)
Like 500 nodes and 6000 edges.
My problem now is that after importing all those nodes and edges the graph looks kind of bunched with no real order. Every cluster of every picture is mixed into other clusters of other pictures.
Now using Modularity as Partition algorithm (which should use the Louvain method) the graph is getting colored, each color represent a picture. Now I can split this mess, using the Force Atlas 2 Layout.
I now have a colored graph with something like 15 clusters (every cluster represent 1 picture)
Now I want to cluster those clusters again using tags (nodes) according to their Normalized google distance (weight of the edges), which should then be tags which are somewhat equal in their meaning.
I hope you guys understand what I want to accomplish.
I can also upload a picture to clarify it.
Thanks a lot
I don't think you can do that with the standard version of Gephi. You would need to develop a plugin to implement the very last step of your process.
Gephi is good for visualizing and browsing graphs, but (for now) there are more complete tools when it comes to processing topological properties. for instance, the igraph library (available in C, R and python) might be more appropriate for you. And note that you can use a file format compatible with both Gephi and igraph, which allows you to use both tools on the same data.
I was able to solve my problem. I had to import every one of these 15 clusters on their own. In this way i could use the Modularity method on just those few.