delete all edges with 0 jaccard similarity - r

I am not familiar with R syntax.
How can I delete all edges with weight = 0 from a graph?
For example, all edges with zero similarity.

Here's an example of how to delete edges between vertices with zero Jaccard similarity:
library(igraph)
g <- make_ring(5) + edges(4,1,2,2)
par(mfrow = c(1,2))
plot(g)
(s <- similarity(g, method = "jaccard"))
idx <- which(s == 0, arr.ind = T)
g2 <- g - edges(as.vector(t(idx)))
plot(g2)
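If the graph already carries a numeric edge attribute, the zero-valued edges can also be removed directly with delete_edges(). This is a minimal sketch, assuming a hypothetical weight attribute that is assigned by hand here purely for illustration:

```r
library(igraph)

g <- make_ring(5)                      # 5 vertices, 5 edges
E(g)$weight <- c(0, 1, 0, 2, 3)        # hypothetical weights for illustration

# which() returns the ids of the edges whose weight is exactly zero
zero_edges <- which(E(g)$weight == 0)
g_clean <- delete_edges(g, zero_edges)

ecount(g_clean)                        # number of remaining edges
E(g_clean)$weight                      # remaining weights
```

Selecting by edge id keeps the step independent of how the weights were computed, so the same call works for similarity scores stored as an edge attribute.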

Related

How to define the mapping for a vertex contraction?

I have a graph $G=(V, E)$. I need to perform a vertex contraction by the following rule:
Find all articulation vertices with degree greater than 2 and contract into each of them all vertices that can only be reached through it.
The original graph is on the left and the expected graph is on the right. Note that instead of the leaves "14", "8" and "3" there can be subgraphs with more than one vertex.
First, I found the bridge edges (red) and two types of articulation vertices: a) on a chain (red), b) not on a chain (green). The criterion for classifying an articulation vertex is its degree (2 or not 2).
My attempt is:
library(igraph)
set.seed(44)
n = 20
m = 35
G <- sample_gnm(n=n, m=m)
V(G)$group <- 1:n
V(G)$color <- "black"
E(G)$color <- "black"
ind <- articulation.points(G)
V(G)$color[ind] <- ifelse(degree(G, V(G)[ind])==2, "red", "green")
V(G)$group[as.integer(ind)[degree(G, ind) == 2]] <- 0  # vectorized: mark chain articulation vertices
num_comp <- length(decompose.graph(G))
for (i in 1:m) {
G_sub <- delete.edges(G, i)
if (length(decompose.graph(G_sub)) > num_comp) E(G)$color[i] <- "red"
}
plot(G, layout = layout.fruchterman.reingold,
vertex.size = 15, vertex.color= V(G)$color,
vertex.label.color = "white" )
g2 <- contract(G, mapping = factor(V(G)$group),
vertex.attr.comb=toString)
plot(g2, layout = layout.fruchterman.reingold,
vertex.size = 15, vertex.color= V(G)$color,
vertex.label.color = "white" )
Question. How to define the mapping?
Edit. After ThomasIsCoding's answer, I am adding an example for the remark above: instead of the leaves "14", "8" and "3" there can be subgraphs with more than one vertex. For instance, I can have the case:
set.seed(44)
n <- 20
m <- 35
G <- sample_gnm(n = n, m = m) %>%
add_vertices(1) %>%
add_vertices(1) %>%
add_edges(c(3,21, 3,22, 21,22))
plot(G)
In the figure below one can see the five bridges. The degrees of vertices 8 and 14 are equal to one, but vertex 3 is not a leaf now.
My problem is how to distinguish the chain case from the non-chain case.
for (k in ind) {
nbs <- neighbors(G, k)
if (degree(G, k) == 2) # chain
V(G)$group <- replace(V(G)$group,
match(nbs[degree(G, nbs) == 1], V(G)), match(k, V(G)))
else # no chain
V(G)$group <- ...
}
Another weak point: to which subgraph (A or B) should the vertex contraction be applied? In the original case only single vertices were contracted. The original task comes from simplifying a big graph for further analysis, and I think I can base the simplification on bridges and cut vertices. But now I am thinking about how to select the subgraph for the vertex contraction. My current point of view: apply the vertex contraction to the subgraph with the minimal geodesic spanning tree.
You can try the code below to produce the mapping argument (see the for loop part)
library(igraph)
set.seed(44)
n <- 20
m <- 35
G <- sample_gnm(n = n, m = m)
V(G)$group <- 1:n
ind <- articulation.points(G)
for (k in ind) {
nbs <- neighbors(G, k)
V(G)$group <- replace(V(G)$group, match(nbs[degree(G, nbs) == 1], V(G)), match(k, V(G)))
}
g2 <- contract(G, mapping = factor(V(G)$group))
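To see what the loop does to the mapping, it can help to run the same idea on a graph small enough to check by hand. The sketch below uses a hypothetical toy graph (a triangle with one leaf attached), not the random graph from the question, and adds simplify() to drop the self-loop that the contraction creates:

```r
library(igraph)

# Hypothetical toy graph: triangle 1-2-3 with a leaf 4 hanging off vertex 3
g <- make_graph(c(1, 2, 2, 3, 3, 1, 3, 4), directed = FALSE)
V(g)$group <- seq_len(vcount(g))

# For every articulation vertex, move its degree-1 neighbors into its group
for (k in as.integer(articulation_points(g))) {
  nbs <- neighbors(g, k)
  leaves <- as.integer(nbs[degree(g, nbs) == 1])
  V(g)$group[leaves] <- k
}

# contract() expects consecutive integer ids, hence the factor round trip
mapping <- as.integer(factor(V(g)$group))
g_small <- simplify(contract(g, mapping = mapping))

mapping            # vertex 4 now shares the id of vertex 3
vcount(g_small)
```

This only handles leaves attached directly to an articulation vertex, as in the answer above; larger pendant subgraphs would need a different grouping rule.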

How to use the names from original graph on quotient graph?

I have a graph g1 and I need to find the quotient graph g2.
My attempt is:
library(igraph)
n = 8
m <- t(matrix(c(
0,0,0,0,0,0,0,8,
3,0,0,0,0,0,0,0,
5,0,0,5,1,0,0,0,
0,0,6,0,0,7,1,0,
0,6,2,0,0,0,0,0,
0,0,0,0,0,0,0,0,
7,4,0,0,8,0,0,3,
0,3,0,0,0,9,0,0),ncol=n))
g1 <- graph_from_adjacency_matrix(m, weighted=TRUE, mode="directed")
V(g1)$names <- letters[1:n]
V(g1)$label <- V(g1)$names
g2 <- contract(g1, components(g1, mode = "strong")$membership, vertex.attr.comb=toString)
g2 <- simplify(g2)
The graph g2 has three components: {a, b, h}, {c, d, e, g} and {f}, and I need to use the first letter of each component as the vertex label.
plot(g2, vertex.label = substr(toupper(V(g2)$label), 1, 1))
The result is correct for me:
Question. Is it possible to solve the task without using the additional attribute V(g1)$label?
You should use $name (instead of $names) to add vertex name attribute, e.g.,
g1 <- graph_from_adjacency_matrix(m, weighted = TRUE, mode = "directed")
V(g1)$name <- letters[1:n]
g2 <- contract(g1, components(g1, mode = "strong")$membership, vertex.attr.comb = toString)
g2 <- simplify(g2)
Then, when you run plot(g2), the vertex labels are taken from the combined name attribute.
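To show only the first letter of each component, as the question asks, one option is to take the first character of the combined name attribute at plot time. A minimal sketch on a small hypothetical graph (two strongly connected components), not the 8-vertex matrix from the question:

```r
library(igraph)

# Hypothetical directed graph: a <-> b form one strong component, c is alone
el <- matrix(c("a", "b",
               "b", "a",
               "b", "c"), ncol = 2, byrow = TRUE)
g1 <- graph_from_edgelist(el, directed = TRUE)

memb <- components(g1, mode = "strong")$membership
g2 <- simplify(contract(g1, memb, vertex.attr.comb = toString))

V(g2)$name                                   # combined names, e.g. "a, b"
first <- toupper(substr(V(g2)$name, 1, 1))   # first letter of each component
first

# plot(g2, vertex.label = first)
```

Because contract() combines the built-in name attribute, no extra label attribute has to be created on g1.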

How to restore the weight attribute after the union of n igraph objects?

Let's say I have n igraph objects g1, g2, ..., gn. They are undirected, weighted graphs. I'd like to take the union of the n graphs as a weighted graph g, i.e. a new weight attribute should be added to g.
According to the documentation (see ?graph.union), if the n graphs have a weight attribute, it is renamed by adding a _1, _2 (and _3, etc.) suffix, i.e. weight_1, weight_2, ..., weight_n.
I have seen the answer and wrote the code for n=3 graphs (see below).
Edited:
library(igraph)
rm(list=ls(all=TRUE)) # delete all objects
g1 <- graph_from_literal(A1-B1-C1)
g2 <- graph_from_literal(A2-B2-C2)
g3 <- graph_from_literal(A3-B3-C3)
E(g1)$weight <- c(1, 2)
E(g2)$weight <- c(3, 4)
E(g3)$weight <- c(5, 6)
g <- union(g1, g2, g3)
new_attr <- as.list(list.edge.attributes(g))
k <- length(new_attr) # number of new attributes
value_new_attr <- lapply(list.edge.attributes(g),
function(x) get.edge.attribute(g,x))
df <- data.frame()
for (i in 1:k) {df <- rbind(df, value_new_attr[[i]])}
E(g)$weight <- colSums(df, na.rm=TRUE)
g <- delete_edge_attr(g, "weight_1") # 1
g <- delete_edge_attr(g, "weight_2") # 2
g <- delete_edge_attr(g, "weight_3") # 3
Question. How to rewrite the last three commands with the lapply() function?
My attempt does not work:
g <- lapply(value_new_attr, function(x) {g <- delete_edge_attr(g, x)})
I have found a solution with a for loop:
# delete the suffixed edge attributes
for (i in 1:k) {g <- delete_edge_attr(g, new_attr[[i]])}
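The lapply() attempt cannot work as written because lapply() returns a list with one modified copy of g per attribute, rather than a single graph with all attributes removed. A functional alternative is Reduce(), which threads the graph through successive delete_edge_attr() calls. The sketch below is one possible way to do this, assuming the suffixed attributes all match the pattern weight_<number>:

```r
library(igraph)

g1 <- graph_from_literal(A1-B1-C1)
g2 <- graph_from_literal(A2-B2-C2)
g3 <- graph_from_literal(A3-B3-C3)
E(g1)$weight <- c(1, 2)
E(g2)$weight <- c(3, 4)
E(g3)$weight <- c(5, 6)
g <- union(g1, g2, g3)

# names of the suffixed attributes created by union()
w_names <- grep("^weight_[0-9]+$", edge_attr_names(g), value = TRUE)

# one column per suffixed attribute, one row per edge
w <- sapply(w_names, function(a) edge_attr(g, a))
E(g)$weight <- rowSums(w, na.rm = TRUE)

# Reduce calls delete_edge_attr(g, "weight_1"), then feeds the result on
g <- Reduce(delete_edge_attr, w_names, g)

edge_attr_names(g)
```

The same Reduce() call works for any n, since w_names is derived from the graph instead of being typed out.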

second order neighbors of graph nodes in R

I am looking for an efficient way to find the neighborhoods of an exact order (for example, nodes at distance exactly 2) for all nodes in a large graph. Even though it stores graphs as sparse matrices, igraph::ego blows up:
require(Matrix)
require(igraph)
require(ggplot2)
N <- 10^(1:5)
runtimes <- function(N) {
g <- erdos.renyi.game(N, 1/N)
system.time(ego(g, 2, mindist = 2))[3]
}
runtime <- sapply(N, runtimes)
qplot(log10(N), runtime, geom = "line")
Is there a more efficient way?
Using adjacency matrices directly provides a significant improvement.
# sparse adjacency-matrix calculation of indirect neighbors -------------------
library(Matrix)
library(data.table)  # needed for as.data.table() and setkeyv()
diff_sparse_mat <- function(A, B) {
  # Difference between sparse matrices.
  # Input: sparse matrices A and B
  # Output: C = (A & !B), using element-wise diffing, treating B as logical
  stopifnot(identical(dim(A), dim(B)))
  A <- as(A, "generalMatrix")
  AT <- as.data.table(summary(as(A, "TsparseMatrix")))
  setkeyv(AT, c("i", "j"))
  B <- drop0(B)
  B <- as(B, "generalMatrix")
  BT <- as.data.table(summary(as(B, "TsparseMatrix")))
  setkeyv(BT, c("i", "j"))
  C <- AT[!BT]  # anti-join: keep entries of A that have no match in B
  if (length(C) == 2) {
    # pattern matrix: only (i, j) columns, no values
    return(sparseMatrix(i = C$i, j = C$j, dims = dim(A)))
  } else {
    return(sparseMatrix(i = C$i, j = C$j, x = C$x, dims = dim(A)))
  }
}
distance2_peers <- function(adj_mat) {
  # Returns a matrix of indirect neighbors, excluding the diagonal
  # Input: adjacency matrix A (assumed symmetric)
  # Output: (A %*% A & !A) with zero diagonal
  indirect <- forceSymmetric(adj_mat %*% adj_mat)
  indirect <- diff_sparse_mat(indirect, adj_mat) # excl. direct neighbors
  indirect <- diff_sparse_mat(indirect, Diagonal(n = dim(indirect)[1])) # excl. diag.
  return(indirect)
}
For the Erdős-Rényi example, a network of 10^7 nodes (instead of 10^5) can now be analyzed in about half a minute:
N <- 10 ^ (1:7)
runtimes <- function(N) {
g <- erdos.renyi.game(N, 1 / N, directed = FALSE)
system.time(distance2_peers(as_adjacency_matrix(g)))[3]
}
runtime <- sapply(N, runtimes)
qplot(log10(N), runtime, geom = "line")
The resulting matrix contains at (i, j) the number of paths of length 2 from i to j, for pairs that are not directly connected (the diagonal, i.e. paths that return to i itself, is excluded).
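The identity behind this approach can be checked on a graph small enough to inspect by hand. The sketch below deliberately uses a dense base-R matrix on a hypothetical 6-vertex ring, so it illustrates the logic only and is not a substitute for the sparse implementation above:

```r
library(igraph)

g <- make_ring(6)                              # vertices 1..6 on a cycle
A <- as_adjacency_matrix(g, sparse = FALSE)    # dense 0/1 matrix

paths2 <- A %*% A                              # paths2[i, j] = number of length-2 walks
second <- (paths2 > 0) & (A == 0)              # reachable in 2 steps, not adjacent
diag(second) <- FALSE                          # drop i -> k -> i round trips

from_matrix <- which(second[1, ])              # second-order neighbors of vertex 1
from_ego <- sort(as.integer(ego(g, order = 2, nodes = 1, mindist = 2)[[1]]))

from_matrix
from_ego
```

Both results should list the same vertices, which is a quick sanity check before switching to the sparse version on large graphs.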

Using igraph, how to force curvature when arrows point in opposite directions

autocurve.edges does an amazing job of curving edges in igraph plots so that they don't overlap when they point in the same direction. However, when they point in opposite directions, no curvature is applied.
d <- data.frame(start=c("a","a","b","c"),end=c("b","b","c","b"))
graph <- graph.data.frame(d, directed=T)
plot(graph,
vertex.color="white")
The issue is for the arrows between b and c (or c and b).
Other than specifying curvature manually, any suggestions?
I would use the edge.curved option with the same seq call that autocurve.edges uses.
plot(graph,
vertex.color="white", edge.curved=seq(-0.5, 0.5, length = ecount(graph)))
EDIT:
As Étienne pointed out, this solution also curves edges for unique observations. The solution is then to modify the autocurve.edges function. This is my modified function called autocurve.edges2. Basically, it generates a vector, which curves only non-unique edges.
autocurve.edges2 <- function (graph, start = 0.5)
{
  cm <- count.multiple(graph)
  mut <- is.mutual(graph)  # are connections mutual?
  el <- apply(get.edgelist(graph, names = FALSE), 1, paste,
              collapse = ":")
  ord <- order(el)
  res <- numeric(length(ord))
  p <- 1
  while (p <= length(res)) {
    m <- cm[ord[p]]
    mut.obs <- mut[ord[p]]  # are the connections mutual for this point?
    idx <- p:(p + m - 1)
    if (m == 1 & mut.obs == FALSE) {  # no mutual conn = no curve
      r <- 0
    }
    else {
      r <- seq(-start, start, length = m)
    }
    res[ord[idx]] <- r
    p <- p + m
  }
  res
}
And here's the result when adding a single, non-mutual edge (c->d):
library(igraph)
d <- data.frame(start=c("a","a","b","c","c"),end=c("b","b","c","b","d"))
graph <- graph.data.frame(d, directed=T)
curves <-autocurve.edges2(graph)
plot(graph, vertex.color="white", edge.curved=curves)

Resources