The igraph package allows us to identify cliques within a graph fairly simply (https://igraph.org/r/doc/cliques.html). It returns lists of vertices. However, I need to simply calculate the size of the largest clique. In the documentation it mentions that the size of the largest clique can be calculated but no function is given for this task.
Other threads on the topic of cliques seem to focus on identifying the largest clique, finding maximal cliques that meet certain criteria, counting non-overlapping cliques of a certain size, and so on. But I haven't found anything about simply reporting the size of the largest clique.
Does anyone know how to calculate the size (number of vertices) of the largest clique within a graph?
I found the function I was looking for. It's simply clique_num(): calling clique_num(graph) returns the size of the largest clique.
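For readers who want to see what such a function has to compute, here is a minimal library-independent sketch in Python. It is not igraph's implementation; it assumes the graph is given as a dict mapping each vertex to the set of its neighbours, and the name clique_number is made up for this illustration. It enumerates maximal cliques with a pivoting Bron-Kerbosch search and keeps only the largest size seen.

```python
def clique_number(adj):
    """Size of the largest clique; adj maps vertex -> set of neighbours (assumed format)."""
    best = 0

    def expand(r_size, p, x):
        # r_size: size of the current clique, p: candidates, x: already-tried vertices
        nonlocal best
        if not p and not x:
            best = max(best, r_size)  # current clique is maximal
            return
        if r_size + len(p) <= best:
            return  # even using every candidate cannot beat the best so far
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r_size + 1, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(0, set(adj), set())
    return best
```

Only a single integer is kept, so no list of cliques is ever stored; the running time can still be exponential in the worst case.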
Related
My problem is a generalization of the task solved by Edmonds' Blossom algorithm. The original task is the following: given a complete graph with weighted undirected edges, find a set of edges such that
1) every vertex of the graph is incident to exactly one edge from this set (i.e. vertices are grouped into pairs)
2) the sum of the weights of the edges in this set is minimal.
Now, I would like to modify the first condition into
1') vertices are grouped into sets of 3 vertices (or in general, d vertices), and leave condition 2) unchanged.
My questions:
Do you know if this 'generalised' problem has a name?
Do you know of an algorithm that solves it in a number of steps polynomial in the number of vertices (as the Blossom algorithm does for the original problem)? I don't see a straightforward generalisation of the Blossom algorithm, as it is based on looking for augmenting paths on a graph compressed to a bipartite graph (and uses the Hungarian algorithm there). But augmenting paths do not seem to extend to groups of more than two vertices.
Best regards,
Paweł
I have a large graph (100000 nodes) and I want to find its cliques of size 5.
I use this command for that:
cliques(graph, min=5, max=5)
This operation takes a very long time. It seems that it first tries to find all of the maximal cliques of the graph and then picks the cliques of size 5; I guess this because of the huge difference in run time between these two commands, even though both of them do the same job:
adjacent.triangles(graph) # takes about 30s
cliques(graph, min=3, max=3) # takes more than an hour
I am looking for a command like adjacent.triangles that finds cliques of size 5 efficiently.
Thanks
There is a huge difference between adjacent.triangles() and cliques(). adjacent.triangles() only needs to count the triangles, while cliques() needs to store them all. This could easily account for the time difference if there are lots of triangles. (Another factor is that the algorithm in cliques() has to be generic and not limited to triangles - it could be the case that adjacent.triangles() contains some optimizations since we know that we are interested in triangles only).
For what it's worth, cliques() does not find all the maximal cliques; it starts from 2-cliques (i.e. edges) and then merges them into 3-cliques, 4-cliques etc until it reaches the maximum size that you specified. But again, if you have lots of 3-cliques in your graph, this could easily become a bottleneck as there is one point in the algorithm where all the 3-cliques have to be stored (even if you are not interested in them) since we need them to find the 4-cliques.
You are probably better off running maximal.cliques() first to get a rough idea of how large the maximal cliques are in your graph. The idea here is that if you have a maximal clique of size k, then all its subsets of size 5 are 5-cliques. This means that it is enough to search for the maximal cliques, keep the ones of size at least 5, and then enumerate all their subsets of size 5. But then you have a different problem, as some 5-cliques may be counted more than once when they lie in several maximal cliques.
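One way to handle the double counting is to collect the subsets in a set. The sketch below is in Python rather than R and assumes the maximal cliques have already been computed elsewhere and are passed in as collections of sortable vertex ids; the function name is hypothetical.

```python
from itertools import combinations

def k_cliques_from_maximal(maximal_cliques, k):
    """Collect every distinct k-clique contained in the given maximal cliques."""
    found = set()
    for clique in maximal_cliques:
        if len(clique) >= k:
            # Sorting gives each subset one canonical tuple, so a k-clique
            # shared by several maximal cliques is stored only once.
            for subset in combinations(sorted(clique), k):
                found.add(subset)
    return found
```

Keep in mind that a maximal clique of size s contributes "s choose k" subsets, so this can still blow up in memory when the maximal cliques are large.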
Update: I have checked the source code for adjacent.triangles and basically all it does is that it loops over all vertices, and for each vertex v it enumerates all the (u, w) pairs of its neighbors, and checks whether u and w are connected. If so, there is a triangle adjacent to vertex v. This is an O(n d^2) operation if you have n vertices and the average degree is d, but it does not generalize to groups of vertices of arbitrary size (because you would need to hardcode k-1 nested for loops in the code for a group of size k).
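The loop described above can be sketched in a few lines of Python. This is only an illustration of the idea, not igraph's code; it assumes the graph is a dict mapping each vertex to the set of its neighbours, and triangles_per_vertex is a made-up name.

```python
from itertools import combinations

def triangles_per_vertex(adj):
    """For each vertex v, count neighbour pairs (u, w) that are themselves connected."""
    counts = {}
    for v, neighbours in adj.items():
        counts[v] = sum(1 for u, w in combinations(neighbours, 2) if w in adj[u])
    return counts
```

Each vertex only produces a number, nothing is stored per triangle, which is where the speed difference from cliques() comes from.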
I'm trying to test some models of graph partitioning (these come from the real world, where a graph slowly self-partitions). To do this, I need to be able to uniformly randomly partition this graph into contiguous components (we are given the graph is initially connected, as well). Were the contiguity criterion not required I believe this would be the problem of randomly partitioning a set, which can be combinatorially analyzed. Does anyone know of any way to randomly partition graphs into subgraphs (i.e. randomly sample one partition), or, if no such method is known, to randomly sample a set of elements? The method of randomizing the number of partitions and then randomizing membership won't work because there are different numbers of possible partitions for each partition size.
You have to differentiate between edge-cut partitioning and vertex-cut partitioning, where you divide the graph along the edges or the vertices, respectively. This significantly impacts your problem, as the number of different vertex-cuts is much larger than the number of edge-cuts. The reason is that in vertex-cut you assign edges exclusively to partitions - as opposed to edge-cut, where you assign vertices to partitions - and there are many more edges than vertices (e.g. O(n^2) edges for n vertices). Hence, the combinatorially larger vertex-cut leads to a larger number of subgraphs that have to be checked for connectivity. A naive method for randomization is to enumerate all partitionings, pick them one at a time in random order, and check the connectivity of all subgraphs in the selected partitioning. You then take the first one that passes. In this case, all solutions have equal probability (uniformly random).
I have come across the same problem in work I am doing. I have two solutions to randomly partition a graph into m contiguous components:
Spanning tree approach. Randomly choose a spanning tree of your graph (e.g. using Wilson's algorithm, which chooses uniformly amongst all spanning trees). Then randomly select m-1 edges (without replacement) and remove them from the spanning tree. This will give m components which are each connected in the original graph.
Edge contraction approach. Randomly choose an edge and contract it, renaming the (new) vertex as the union of the two previous vertices. Repeat until you have only m vertices left. Identify each vertex with the subset of (original) vertices that were contracted into it.
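A minimal Python sketch of the edge contraction approach, under a few assumptions of mine: the graph is given as a vertex list plus a list of (u, w) edge pairs, parallel edges created by a contraction are kept, and the function name is hypothetical. Walking through the edges in a random order and skipping those whose endpoints are already merged is the same as repeatedly picking a random remaining edge; a union-find structure plays the role of the contracted vertices. No claim is made here about the resulting distribution over partitions.

```python
import random

def random_contiguous_partition(vertices, edges, m, rng=random):
    """Contract random edges until m blocks remain; returns a list of vertex sets."""
    parent = {v: v for v in vertices}
    if not 1 <= m <= len(parent):
        raise ValueError("m must be between 1 and the number of vertices")

    def find(v):
        # Representative of the contracted vertex that contains v.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    order = list(edges)
    rng.shuffle(order)
    components = len(parent)
    for u, w in order:
        if components == m:
            break
        ru, rw = find(u), find(w)
        if ru != rw:  # edge still joins two different contracted vertices
            parent[ru] = rw
            components -= 1
    if components != m:
        raise ValueError("graph has too many connected components for this m")

    blocks = {}
    for v in parent:
        blocks.setdefault(find(v), set()).add(v)
    return list(blocks.values())
```

Every block is connected in the original graph because vertices are only ever merged along an existing edge.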
I don't want to find all the minimum spanning trees, but I want to know how many of them there are. Here is the method I considered:
Find one minimum spanning tree using Prim's or Kruskal's algorithm, then find the weights of all the spanning trees and increment a running counter whenever a weight equals the weight of the minimum spanning tree.
I couldn't find any method for finding the weights of all the spanning trees, and the number of spanning trees might be very large, so this method might not be suitable for the problem.
As the number of minimum spanning trees can be exponential, counting them one by one won't be a good idea.
All the weights will be positive.
We may also assume that no weight will appear more than three times in the graph.
The number of vertices will be less than or equal to 40,000.
The number of edges will be less than or equal to 100,000.
There is only one minimum spanning tree in a graph where the weights of the edges are all different. I think the best way of finding the number of minimum spanning trees must be something that uses this property.
EDIT:
I found a solution to this problem, but I am not sure why it works. Can anyone please explain it?
Solution: The problem of finding the length of a minimal spanning tree is fairly well-known; two simplest algorithms for finding a minimum spanning tree are Prim's algorithm and Kruskal's algorithm. Of these two, Kruskal's algorithm processes edges in increasing order of their weights. There is an important key point of Kruskal's algorithm to consider, though: when considering a list of edges sorted by weight, edges can be greedily added into the spanning tree (as long as they do not connect two vertices that are already connected in some way).
Now consider a partially-formed spanning tree using Kruskal's algorithm. We have inserted some number of edges with lengths less than N, and now have to choose several edges of length N. The algorithm states that we must insert these edges, if possible, before any edges with length greater than N. However, we can insert these edges in any order that we want. Also note that, no matter which edges we insert, it does not change the connectivity of the graph at all. (Let us consider two possible graphs, one with an edge from vertex A to vertex B and one without. The second graph must have A and B as part of the same connected component; otherwise the edge from A to B would have been inserted at one point.)
These two facts together imply that our answer will be the product of the number of ways, using Kruskal's algorithm, to insert the edges of length K (for each possible value of K). Since there are at most three edges of any length, the different cases can be brute-forced, and the connected components can be determined after each step as they would be normally.
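The quoted solution can be turned into a short Python sketch. Assumptions of mine, not from the solution text: vertices are numbered 0..n-1, edges are (weight, u, v) tuples, the graph is connected, and the function names are made up. For each weight class it counts the edge subsets of the size Kruskal's algorithm would add that create no cycle among the current components, multiplies the counts, and then merges the components.

```python
from itertools import combinations, groupby

def _merges(pairs):
    """Number of edges in pairs that join two previously separate groups."""
    local = {}

    def find(x):
        local.setdefault(x, x)
        while local[x] != x:
            local[x] = local[local[x]]
            x = local[x]
        return x

    count = 0
    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            local[ra] = rb
            count += 1
    return count

def count_msts(n, edges):
    """Count minimum spanning trees of a connected graph (brute force per weight class)."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    total = 1
    for _, group in groupby(sorted(edges), key=lambda e: e[0]):
        # Express each edge of this weight in terms of the current components.
        pairs = [(find(u), find(v)) for _, u, v in group]
        pairs = [(a, b) for a, b in pairs if a != b]
        need = _merges(pairs)  # how many of these edges Kruskal would insert
        total *= sum(1 for s in combinations(pairs, need) if _merges(s) == need)
        for a, b in pairs:  # connectivity afterwards is the same for every choice
            parent[find(a)] = find(b)
    return total
```

With at most three edges per weight there are at most eight subsets to try per class, so the brute force is cheap; for large weight classes it would be exponential.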
Looking at Prim's algorithm, it says to repeatedly add the edge with the lowest weight. What happens if there is more than one edge with the lowest weight that can be added? Possibly choosing one may yield a different tree than when choosing another.
If you use Prim's algorithm, run it with every edge as a starting edge, and also explore every tie you encounter, then you'll have a forest containing all the minimum spanning trees Prim's algorithm is able to find. I don't know if that equals the forest containing all possible minimum spanning trees.
This does still come down to finding all minimum spanning trees, but I can see no simple way to determine whether a different choice would yield the same tree or not.
MST and their count in a graph are well-studied. See for instance: http://www14.informatik.tu-muenchen.de/konferenzen/Jass08/courses/1/pieper/Pieper_Paper.pdf.
I have found this paper so far. Is it outdated? Are there any faster and better implementations?
By the way, Wikipedia says that there can be n^{n-2} spanning trees in an undirected graph. How many spanning trees can there be in a directed graph?
If you use the terms from the paper you mentioned and define a spanning tree of a directed graph as a tree rooted in vertex r with a unique path from r to every other vertex, then:
It's obvious that the worst case, where a directed graph has the greatest number of spanning trees, is the complete graph (there are a->b and b->a edges for every pair).
If we "forget" about directions we get n^{n-2} spanning trees, as in the case of undirected graphs. For any of these spanning trees we have n options for choosing a root, and this choice uniquely defines the directions of the edges we need to use. It is not hard to see that all the trees we get are spanning and distinct, and that there are no other options. So we get n^{n-1} spanning trees. A strict proof would take time; I hope this simple explanation is enough.
So in the worst case this task takes time exponential in the vertex count. Considering the size of the output (all spanning trees), I conclude that for an arbitrary graph an algorithm cannot be significantly faster or better. I think you need to reformulate your original problem so that it does not deal with all spanning trees, and maybe search only for those that meet some criteria.
This is for undirected graphs only.
n^{n-2} spanning trees are possible only for a complete graph. To find the total number of spanning trees of any graph you can apply this method:
Build a matrix from the graph (like the adjacency matrix, but with the entries described below).
Let the columns be indexed by i and the rows by j.
If i = j, the entry is the degree of the vertex.
If there is a single edge between vertices v1 and v2, the corresponding entry is -1; if there are two edges it is -2, and so on. If there is no edge, it is 0.
After constructing the matrix, exclude any one row and the corresponding column, e.g. the Nth row and Nth column.
The determinant of the remaining matrix is the total number of spanning trees.
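The recipe above is Kirchhoff's matrix-tree theorem: the number of spanning trees equals the determinant of the matrix (degrees on the diagonal, minus the edge multiplicities elsewhere) after one row and the matching column are removed. A small Python sketch, assuming vertices numbered 0..n-1 and edges given as (u, v) pairs; the function name is made up, and exact fractions are used so the determinant comes out as an exact integer.

```python
from fractions import Fraction

def count_spanning_trees(n, edges):
    """Spanning trees of an undirected (multi)graph via the matrix-tree theorem."""
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        if u == v:
            continue  # self-loops never belong to a spanning tree
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1

    # Drop the last row and column, then take the determinant by elimination.
    minor = [row[:-1] for row in lap[:-1]]
    size = n - 1
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if minor[r][col] != 0), None)
        if pivot is None:
            return 0  # singular matrix: the graph is disconnected
        if pivot != col:
            minor[col], minor[pivot] = minor[pivot], minor[col]
            det = -det
        det *= minor[col][col]
        for r in range(col + 1, size):
            factor = minor[r][col] / minor[col][col]
            for c in range(col, size):
                minor[r][c] -= factor * minor[col][c]
    return int(det)
```

For the complete graph on 4 vertices this gives 16, which agrees with n^{n-2} = 4^2.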