How to randomly select 2 vertices from a graph in R?

I'm new to R, and I'm trying to randomly select 2 vertices from a graph.
What I've done so far is:
First, set up a graph
edgePath <- "./project1/data/smalledges.csv"
edgesMatrix <- as.matrix(read.csv(edgePath, header = TRUE, colClasses = "character"))
graph <- graph.edgelist(edgesMatrix)
The file smalledges.csv looks like this:
from to
4327231 2587908
Then I get all the vertices from the graph into a list:
vList <- as.list(get.data.frame(graph, what = c("vertices")))
After that, I try to use:
sample(vList, 2)
But I get an error:
cannot take a sample larger than the population when 'replace = FALSE'
I guess it's because R thinks what I want is 2 random lists, so I tried this:
sample(vList, 2, replace = TRUE)
That gives me 2 large lists, which is not what I wanted. So how can I randomly select 2 vertices from my graph? Thanks!

It's not clear from your question whether you want just the vertices or a sub-graph containing those vertices. Here's an example of each.
library(igraph)
set.seed(1) # for reproducible example
g <- erdos.renyi.game(10, 0.3)
par(mfrow=c(1,3), mar=c(1,1,1,1))
set.seed(1) # for reproducible plot
plot(g)
# random sample of vertices
smpl <- sample(1:vcount(g),5)
V(g)[smpl] # 5 random vertices
# Vertex sequence:
# [1] 9 5 7 2 4
# change the color of only those vertices
V(g)[smpl]$color="lightgreen" # make them light green
set.seed(1) # for reproducible plot
plot(g)
# create a sub-graph with only those vertices, retaining edge structure
sub.g <- induced.subgraph(g,V(g)[smpl])
plot(sub.g)
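For the setup in the question itself, a minimal sketch along the following lines may help. It assumes the graph is built from a character edge list, so that igraph keeps the ids in the name vertex attribute; only the first edge comes from the question, the other two are made up for illustration. The original sample(vList, 2) most likely failed because as.list() on a data frame returns one element per column, so the population being sampled had size 1.

```r
library(igraph)

# Hypothetical edge list; only the first row appears in the question.
edges <- matrix(c("4327231", "2587908",
                  "2587908", "1111111",
                  "1111111", "4327231"),
                ncol = 2, byrow = TRUE)
g <- graph_from_edgelist(edges)

set.seed(42)                     # only to make the example reproducible
picked <- sample(V(g)$name, 2)   # two distinct vertex names
picked_vs <- V(g)[picked]        # the same two vertices as a vertex sequence
picked
```

Sampling from V(g)$name (or from V(g) directly) draws individual vertices rather than whole columns of a data frame.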

Related

How do I see the two nodes of most weighted edges in R igraph?

I have an igraph network object constructed in R and have generated weight information for each edge. I want to see the nodes of the most heavily weighted edges (in descending order). What code should I use to do that? Thank you!
# create an igraph object of the user interaction network and check descriptives.
library(igraph)
#edge list
EL = read.csv("(file path omitted)user_interaction_structure.csv")
head(EL)
#node list: I do not have a node list
#construct an igraph object
g <- graph_from_data_frame(EL, directed = TRUE, vertices = NULL)
#check the edge and node number of the network
gsize(g)
vcount(g)
#check nodes based on degree (descending)
deg <- igraph::degree(g)
dSorted <-sort.int(deg,decreasing=TRUE,index.return=FALSE)
dSorted
#check edges based on weight
E(g)
#the network will contain loop edges and multiple edges
#simplify multiple edges
g_simple <- graph.adjacency(get.adjacency(g),weighted=TRUE)
#check edge weight
E(g_simple)$weight
#igraph can generate a matrix
g_simple[]
Then I wanted to see who was interacting heavily with whom (the nodes of the edges with the largest weight), so I tried
e_top_weights <- order(order(E(g_simple))$weight, decreasing=TRUE)
but it did not work.

I think what you want is the igraph function strength(), which gives the sum of the weights of the edges incident to each node. Here's an example:
library(igraph)
# A small graph we can visualize
g <- make_ring(5)
# Assign each edge an increasing weight, to make things
# easy
edgeweights<- 1:ecount(g)
E(g)$weight <- edgeweights
# The strength() function sums the weights of edges incident
# to each node
strengths <- strength(g)
# We can collect the top two strengths by sorting the
# strengths vector, then asking for which elements of the
# strengths vector are equal to or greater than the second
# largest element.
toptwo <- which(strengths >= sort(strengths, decreasing = TRUE)[2])
## [1] 4 5
# Assign nodes a color blue that is more saturated when nodes
# have greater strength.
cr <- colorRamp(c(rgb(0,0,1,.1), rgb(0,0,1,1)), alpha = TRUE)
colors <- cr(strengths/max(strengths))
V(g)$color <- apply(colors, 1, function(row) rgb(row[1], row[2], row[3], row[4], maxColorValue = 255))
# Plot to confirm
plot(g, edge.width = edgeweights)
Edit
Here are two different ways to find the two nodes (the "from" node and the "to" node) which are the ends of the edge with the maximum weight:
## 1
edge_df <- as_data_frame(g, "edges")
edge_df[which(edge_df$weight == max(edge_df$weight)), c("from", "to")]
## 2
max_weight_edge <- E(g)[which(E(g)$weight == max(E(g)$weight))]
ends(g, es = max_weight_edge)
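To list every edge by descending weight rather than only the maximum, one possible sketch is to sort the edge data frame. It reuses the ring graph from the example above; the variable name edge_df_sorted is only illustrative.

```r
library(igraph)

g <- make_ring(5)
E(g)$weight <- 1:ecount(g)

# One row per edge, with columns from, to and weight
edge_df <- igraph::as_data_frame(g, what = "edges")

# Heaviest edges first
edge_df_sorted <- edge_df[order(edge_df$weight, decreasing = TRUE), ]
head(edge_df_sorted, 2)   # the two heaviest edges and their end nodes
```

The explicit igraph:: prefix is there because other packages (dplyr, tibble) also export a function called as_data_frame.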

How to use delete_vertices to remove, from a cluster_louvain, the group with the largest number of nodes?

I'm using UKfaculty data available in the library igraphdata
library(igraph)
library(igraphdata)
library(RColorBrewer)
data("UKfaculty")
I used the function as.undirected() to transform the directed graph to an undirected graph.
UND_UKfaculty <- as.undirected(UKfaculty)
And I applied the cluster_louvain function to get grouping:
lc <- cluster_louvain(UND_UKfaculty)
Then I checked how many nodes there are in each group:
sizes(lc)
Community sizes
1 2 3 4 5
18 19 13 25 6
I am trying to remove from the initial graph the group with the highest number of nodes (group 4) with the function delete_vertices.
My question is focused on this last part. I don't get how to use the function delete_vertices.
I tried:
ld<-delete_vertices(lc, 4)
but I get the error "Not a graph object".

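delete_vertices expects a graph and a set of vertices, not a community object and a cluster id. So first determine which vertices belong to the cluster (e.g. cluster 4), then remove those vertices from the graph.
Note that igraph always renumbers vertex ids from 1 to the number of vertices after a deletion.
Therefore the vertex names of the source graph have to be stored as an attribute if they are to be preserved.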
require(igraph)
##
## step 1
## calculate cluster, show in plot
## community object lc: vertices indexed by cluster id
g1 <- make_tree(20, children = 2, mode = "undirected")
V(g1)$names <- as.character(V(g1))
lc <- cluster_louvain(g1, resolution = quantile(degree(g1))[4] / (ecount(g1) - 1))
plot(g1, mark.groups=lc)
## step2
## calculate the ids of the most densely populated clusters
## and determine the corresponding vertices
sz <- sizes(lc)
max_clusters <- ( which(sz == max(sz)) )
V_to_delete <- unlist(lc[max_clusters])
## step 3
## delete vertices and plot new graph
g2 <- delete_vertices(g1, V_to_delete)
V(g2)$label <- V(g2)$names
dev.new()
plot(g2)
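The same idea can be written a little more compactly with membership(), as in the sketch below. The toy graph (two disjoint cliques) is an assumption made for illustration, and the vertex names are stored in the standard name attribute so that they survive the deletion.

```r
library(igraph)

# Toy graph: a 5-clique and a 3-clique that are not connected to each other
g <- disjoint_union(make_full_graph(5), make_full_graph(3))
V(g)$name <- as.character(seq_len(vcount(g)))   # keep the original ids as names

lc <- cluster_louvain(g)

# id of the largest community (the first one if there is a tie)
largest <- unname(which.max(sizes(lc)))

# vertices that belong to that community
to_delete <- unname(which(membership(lc) == largest))

g2 <- delete_vertices(g, to_delete)
V(g2)$name   # original names of the remaining vertices
```

With the question's data, the same three lines after cluster_louvain() should apply to UND_UKfaculty unchanged.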

How to calculate the number of vertices contracted into one graph?

I have a few large igraph objects that represent social networks. All nodes have various attributes, among them sector, which is a factor variable. I have contracted this large network into a small one where vertices represent groups and each edge carries the sum of the individual edges in the original network. The label attribute in the second network represents the sector attribute in the first.
groupnet <- contract(g, as.integer(as.factor(V(g)$sector)), "ignore")
E(groupnet)$weight <- 1
groupnet <- simplify(groupnet, edge.attr.comb = list(weight = "sum"))
V(groupnet)$label <- levels(as.factor(V(g)$sector))
I would like to add another attribute to the second object V(groupnet)$groupsize that represents the number of original vertices that were contracted into groupnet. I have tried it with the following code but it did not work:
V(groupnet)$groupsize <- length(V(g)$sector[V(g)$sector == V(groupnet)$label])
How can I do this properly?

table() could be helpful here. Try out:
set.seed(1234)
library(igraph)
g <- make_ring(1000)
V(g)$sector <- factor(sample(LETTERS, vcount(g), replace = TRUE)) # one sector per vertex
V(g)$sector
## contracted network
groupnet <- contract(g, as.integer(as.factor(V(g)$sector)), "ignore")
E(groupnet)$weight <- 1
V(groupnet)$label <- levels(as.factor(V(g)$sector))
## number of original vertices that were contracted into groupnet
# the tip is to see that table(V(g)$sector) provides the number of vertices per sector and
# its output is also arranged like V(groupnet)
table(V(g)$sector)
V(groupnet)
# solution
V(groupnet)$groupsize <- as.numeric(table(V(g)$sector))
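A small hand-checkable case may make the alignment between table() and the contracted vertices easier to see. The six-vertex ring and the sector labels "a", "b", "c" below are made up for illustration.

```r
library(igraph)

g <- make_ring(6)
V(g)$sector <- c("a", "b", "a", "c", "b", "a")

# Use one factor for both the mapping and the counts, so the order agrees
sector <- factor(V(g)$sector)

groupnet <- contract(g, as.integer(sector), vertex.attr.comb = "ignore")
V(groupnet)$label <- levels(sector)
V(groupnet)$groupsize <- as.integer(table(sector))

data.frame(label = V(groupnet)$label, groupsize = V(groupnet)$groupsize)
```

Because both the vertex mapping and table() follow the level order of the same factor, the i-th count belongs to the i-th contracted vertex.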

How to complete a graph assigning 0 as weight attribute of new edges

Please consider the following graph
library(igraph)
g <- erdos.renyi.game(100, 2/100)
E(g)$weight <- sample(1:10, ecount(g), replace=TRUE)
I am interested in "completing" the graph by adding all missing edges (so that each pair of vertices is connected by an edge) while making sure the new edges are assigned a weight of 0.
Is it possible?

This should work.
olde = E(g) # saving edges
g[V(g), V(g)] <- TRUE # adding all possible edges
E(g)$weight <- 0 # set every weight to 0
E(g)[olde]$weight <- olde$weight # restore the weights of the original edges
g <- simplify(g) # removing loops
Following the comments, here is a more robust version that relies on an extra edge attribute:
ids = E(g) # saving old ids
E(g)$oldids = ids # assigning to specific edge as extra attribute
olde = E(g) # saving edges
g[V(g), V(g)] <- TRUE # adding all possible edges
E(g)$weight <- 0 # set every weight to 0
# More robust, because it matches oldids against the ids of the old graph
E(g)[match(ids, oldids)]$weight <- olde$weight[ids] # restore the weights of the original edges
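A different route from the answers above, sketched here under the assumption that the graph is simple and undirected, is to take the missing edges from complementer() and append them with add_edges(), which accepts edge attributes as named arguments. The graph size and seed are arbitrary.

```r
library(igraph)

set.seed(1)
g <- sample_gnp(20, 0.2)
E(g)$weight <- sample(1:10, ecount(g), replace = TRUE)

# Graph that contains exactly the edges missing from g (no loops)
missing <- complementer(g, loops = FALSE)

# Flatten the edge list to c(from1, to1, from2, to2, ...) as add_edges() expects
new_edges <- as.vector(t(as_edgelist(missing, names = FALSE)))

# Existing edges keep their ids and weights; the new ones get weight 0
g_full <- add_edges(g, new_edges, weight = 0)
```

Since add_edges() appends to the end of the edge list, the original edges and their weights are left untouched and no matching step is needed afterwards.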

2nd Degree Connections in igraph

I think I have this working correctly, but I am looking to mimic something similar to Facebook's friend suggestions. Simply put, I want to find 2nd degree connections (friends of your friends that you are not connected to). I do want to keep this as a directed graph and identify the 2nd degree outward connections (the people your friends connect to).
I believe my dummy code achieves this, but since the result refers to indices and not vertex labels, I was hoping you could help me modify the code to return usable names.
### create some fake data
library(igraph)
from <- sample(LETTERS, 50, replace=T)
to <- sample(LETTERS, 50, replace=T)
rel <- data.frame(from, to)
head(rel)
### lets plot the data
g <- graph.data.frame(rel)
summary(g)
plot(g, vertex.label=LETTERS, edge.arrow.size=.1)
## find the 2nd degree connections
d1 <- unlist(neighborhood(g, 1, nodes="F", mode="out"))
d2 <- unlist(neighborhood(g, 2, nodes="F", mode="out"))
d1;d2;
setdiff(d2,d1)
Returns
> setdiff(d2,d1)
[1] 13
Any help you can provide will be great. Obviously I am looking to stay within R.

You can index back into the graph vertices like:
> V(g)[setdiff(d2,d1)]
Vertex sequence:
[1] "B" "W" "G"
Also check out ?V for ways to get at this type of info through direct indexing.

You can use the adjacency matrix G of the graph g. One property of the adjacency matrix is that its nth power counts the walks of length n between each pair of vertices (a walk, unlike a path, may revisit vertices).
G <- get.adjacency(g)
G2 <- G %*% G # G2 contains 2-walks
diag(G2) <- 0 # take out loops
G2[G2!=0] <- 1 # normalize G2, not interested in multiplicity of walks
g2 <- graph.adjacency(G2)
An edge in graph g2 represents a "friend-of-a-friend" bond.
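To get names rather than indices from the neighborhood approach in the question, a sketch like the following may be enough. The edge list is a small made-up example, and ego() is used here as the current name for neighborhood().

```r
library(igraph)

# Hypothetical directed relations: F follows A and B, A follows C and F, B follows D
rel <- data.frame(from = c("F", "F", "A", "A", "B"),
                  to   = c("A", "B", "C", "F", "D"))
g <- graph_from_data_frame(rel)

# Vertices reachable from F in at most 1 and at most 2 outgoing steps
d1 <- ego(g, order = 1, nodes = "F", mode = "out")[[1]]
d2 <- ego(g, order = 2, nodes = "F", mode = "out")[[1]]

# Compare by name, so the result is directly usable
second_degree <- setdiff(d2$name, d1$name)
second_degree
```

Unlike the squared adjacency matrix, this set difference also drops vertices that are already direct connections of F.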
