How to extract a membership vector for my GN graph in R? - r

I want to use NMI to compare my community detection algorithm with other methods. I am generating graphs with sample_sbm(), which I set up to give me 10 nodes; the block.sizes=c(3,3,4) argument defines three communities with 3, 3, and 4 members.
Now I want a membership vector for them. It should be: 1 1 1 2 2 2 3 3 3 3
What is the best way to do this? I was thinking of taking three arguments c1, c2, c3, passing them to block.sizes, and then using a for loop to build the membership vector. But that looks a bit dirty, because the number of communities should be arbitrary.
I would be thankful if you could suggest something nicer.
library(igraph)
p<-cbind( c(1, 0,0), c(0, 1,0) ,c(0,0,1))
g <- sample_sbm(10, pref.matrix=p, block.sizes=c(3,3,4) )
# community detection algorithm
wc <- cluster_walktrap(g)
modularity(wc)
a=membership(wc)

UPDATE following the original question-asker's comments:
I store the sizes of the blocks in a my_block_sizes vector. Then I use the rep.int function and the seq_along function to create the membership vector according to the sizes of the blocks.
library(NMI)
library(igraph)
my_block_sizes <- c(3,3,4)
# make a membership vector
membership_vector <- rep.int(seq_along(my_block_sizes), my_block_sizes)
membership_vector
[1] 1 1 1 2 2 2 3 3 3 3
p <- cbind(c(1,0,0), c(0,1,0), c(0,0,1))
g <- igraph::sample_sbm(10, pref.matrix=p, block.sizes=my_block_sizes)
# community detection algorithm
wc <- cluster_walktrap(g)
modularity(wc)
a <- membership(wc)
Original answer:
I'm not 100% sure this is what you're after, but based on the information you've provided, this may solve your problem.
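I use the length of the wc object to determine the number of communities detected by the community detection algorithm, and the rep.int function to repeat each community number according to the size of the blocks, which I store in advance in the my_block_sizes object. Note that this only works when the algorithm detects exactly as many communities as there are blocks; otherwise rep.int fails because the two vectors differ in length.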
library(NMI)
library(igraph)
my_block_sizes <- c(3,3,4)
p <- cbind(c(1,0,0), c(0,1,0), c(0,0,1))
g <- igraph::sample_sbm(10, pref.matrix=p, block.sizes=my_block_sizes)
# community detection algorithm
wc <- cluster_walktrap(g)
modularity(wc)
a <- membership(wc)
# make a membership vector
membership_vector <- rep.int(1:length(wc), my_block_sizes)
membership_vector
[1] 1 1 1 2 2 2 3 3 3 3
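Since the goal is an NMI comparison, here is a minimal sketch of that last step. It assumes igraph's own compare() with method = "nmi" is an acceptable substitute for the NMI package; the names truth and detected and the seed are mine, not from the question.

```r
library(igraph)

set.seed(42)  # arbitrary seed, only so the sketch is repeatable
my_block_sizes <- c(3, 3, 4)

# ground-truth membership: block k is repeated my_block_sizes[k] times
truth <- rep.int(seq_along(my_block_sizes), my_block_sizes)

# identity preference matrix, one row/column per block
p <- diag(length(my_block_sizes))
g <- sample_sbm(sum(my_block_sizes), pref.matrix = p,
                block.sizes = my_block_sizes)

# membership found by the algorithm under test
detected <- as.vector(membership(cluster_walktrap(g)))

# normalized mutual information between the two labelings
nmi <- compare(truth, detected, method = "nmi")
nmi
```

Because truth is built from my_block_sizes alone, it works for any number of blocks and does not depend on how many communities the algorithm finds.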

Related

How to create all non-isomorphic trees with n=6 nodes?

I need to create all non-isomorphic trees with n=6 nodes. I have found the degree sequences and tried to generate the trees with the degree.sequence.game() function:
library(igraph)
set.seed(46)
par(mfrow=c(2, 3))
degs <- matrix(c(1,1,1,2,2,3,
1,1,1,3,2,2,
1,1,2,2,2,2,
1,1,1,1,2,4,
1,1,1,1,1,5,
1,1,1,1,3,3), nrow=6, byrow=T)
for(i in 1:6){
g6 <- degree.sequence.game(degs[i,], method="vl")
plot(g6, vertex.label=NA)
}
The output is:
One can see graphs A and B in left figure are isomorphic.
Expected result in right figure.
Question. What is an alternative method to create non-isomorphic trees?
Update
It seems I misunderstood your objective. One possible solution is to try the simple.no.multiple.uniform method of degree.sequence.game, i.e.,
g6 <- degree.sequence.game(degs[i, ], method = "simple.no.multiple.uniform")
and we can obtain
BTW, the version of igraph I am using is igraph_1.3.5 (you can see it when typing sessionInfo() in the console) and you can try with this version, which hopefully helps to address your problem as well.
Previous Answer
I think the pain point in your problem is "How to find all distinct degree sequences with given number of vertices in a tree graph?".
We can break this primary problem into two sub-problems:
What is the sum of degrees given n vertices (if we want to generate a tree)? The answer is: 2*(n-1)
How do we partition 2*(n-1) into n positive integers, treating reorderings as the same partition? The answer is: use partitions::restrictedparts
library(partitions)
n <- 6
degs <- t(restrictedparts(2*(n-1), n, include.zero = FALSE))
and you will see
> degs
[1,] 1 1 1 1 1 5
[2,] 1 1 1 1 2 4
[3,] 1 1 1 1 3 3
[4,] 1 1 1 2 2 3
[5,] 1 1 2 2 2 2
then you can use degree.sequence.game(degs[i,], method="vl") by iterating i through 1 to nrow(degs).
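A different route, in case sampling from degree sequences keeps returning isomorphic duplicates: enumerate every labelled tree through its Prüfer sequence and keep one representative per isomorphism class. This is a brute-force sketch, not the method from the answers above. It assumes igraph's make_from_prufer() and isomorphic() are available in your version, and it is only practical for small n, since there are n^(n-2) sequences (1296 for n = 6). The names prufer, trees and candidate are mine.

```r
library(igraph)

n <- 6

# every sequence of length n - 2 over 1..n encodes exactly one labelled tree
prufer <- as.matrix(expand.grid(rep(list(seq_len(n)), n - 2)))

trees <- list()  # one representative per isomorphism class
for (i in seq_len(nrow(prufer))) {
  candidate <- make_from_prufer(as.integer(prufer[i, ]))
  # keep the candidate only if it matches none of the trees kept so far
  seen <- any(vapply(trees, function(tr) isomorphic(candidate, tr),
                     logical(1)))
  if (!seen) trees[[length(trees) + 1]] <- candidate
}

length(trees)
```

For n = 6 this should leave 6 trees, the known number of non-isomorphic trees on six vertices. That is one more than the number of degree sequences, because a single degree sequence can belong to more than one tree.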

Getting the biggest connected component in R igraph

How do I get a subgraph of the biggest component of a graph?
Say for example I have a graph g.
size_components_g <-clusters(g, mode="weak")$csize
size_components_g
#1 2 3 10 25 2 2 1
max_size <- max(size_components_g)
max_size
#25
So 25 is the biggest size.
I want to extract the component that has these 25 vertices. How do I do that?
Well, a detailed explanation of the return value of any function in an R package can be found in its documentation. In this case igraph::clusters returns a named list in which csize stores the sizes of the clusters, while membership contains the id of the cluster to which each vertex belongs.
g <- igraph::sample_gnp(20, 1/20)
components <- igraph::clusters(g, mode="weak")
biggest_cluster_id <- which.max(components$csize)
# ids
vert_ids <- igraph::V(g)[components$membership == biggest_cluster_id]
# subgraph
igraph::induced_subgraph(g, vert_ids)
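To check the approach on a graph whose answer is known in advance, here is a small sketch. It uses components(), the current igraph name for clusters(); the toy graph and the names comp, biggest and giant are my own.

```r
library(igraph)

# toy graph: a 5-vertex ring and a 3-vertex ring with no edges between them
g <- disjoint_union(make_ring(5), make_ring(3))

comp <- components(g, mode = "weak")
biggest <- which.max(comp$csize)  # id of the largest component

# keep only the vertices that belong to that component
giant <- induced_subgraph(g, which(comp$membership == biggest))

vcount(giant)  # 5, the size of the larger ring
```

If several components tie for the largest size, which.max() picks the first one.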

how to write a loop of the number of for loops in R?

this is probably a simple one, but I somehow got stuck...
I need many loops to get the result for every sample in my support, like the usual nested loops:
for (a in 1:N1){
  for (b in 1:N2){
    for (c in 1:N3){
      ...
    }
  }
}
but the number of for loops needed in this messy system depends on another random variable, let's say,
for(f in 1:N.for)
So how can I write a for loop to deal with this? Or are there more elegant ways to do it?
Note that the difference is that the variables of the nested for loops above (a, b, c, ...) do matter in my calculations, but the variable f of the for loop that controls the number of for loops does not enter any of my calculations; all it does is ensure that the number of for loops is correct.
Did I make that clear?
What I am actually trying to do is generate all the possible combinations of a number of people's preferences towards others.
Let's say I have 6 people (the simplest case for my purpose): Abi, Bob, Cath, Dan, Eva, Fay.
Abi and Bob have preference lists of C D E F ( 4!=24 possible permutations for each of them);
Cath and Dan have preference lists of A B and E F, respectively (2! * 2! = 4 possible permutations for each of them);
Eva and Fay have preference lists of A B C D (4!=24 possible permutations for each of them);
So all together there should be 24*24*4*4*24*24 possible permutations of preferences when taking all six of them together.
I am just wondering what is a clear, easy and systematic way to generate them all at once?
I'd want them in the format such as
c.prefs <- as.matrix(data.frame(Abi = c("Eva", "Fay", "Dan", "Cath"), Bob = c("Dan", "Eva", "Fay", "Cath")))
but any clear format is fine...
Thank you so much!!
I'll assume you have a list of each loop variable and its maximum value, ordered from the outermost to innermost variable.
loops <- list(a=2, b=3, c=2)
You could create a data frame with all the loop variable values in the correct order with:
(indices <- rev(do.call(expand.grid, lapply(rev(loops), seq_len))))
# a b c
# 1 1 1 1
# 2 1 1 2
# 3 1 2 1
# 4 1 2 2
# 5 1 3 1
# 6 1 3 2
# 7 2 1 1
# 8 2 1 2
# 9 2 2 1
# 10 2 2 2
# 11 2 3 1
# 12 2 3 2
If the code run at the innermost point of the nested loop doesn't depend on the previous iterations, you could use something like apply to process each iteration independently. Otherwise you could loop through the rows of the data frame with a single loop:
for (i in seq_len(nrow(indices))) {
# You can get "a" with indices$a[i], "b" with indices$b[i], etc.
}
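For the independent case, the apply route mentioned above could look like the sketch below. body_fn is a hypothetical stand-in for whatever runs at the innermost point of the nested loops; it just encodes a, b and c into one number so the order of the results is easy to read.

```r
loops <- list(a = 2, b = 3, c = 2)

# one row per iteration, in the same order as the nested loops
indices <- rev(do.call(expand.grid, lapply(rev(loops), seq_len)))

# hypothetical loop body: receives one row as a named vector
body_fn <- function(row) {
  row[["a"]] * 100 + row[["b"]] * 10 + row[["c"]]
}

# call the body once per row; no explicit nesting needed
results <- unname(apply(indices, 1, body_fn))
results
```

Adding or removing an entry in loops changes the nesting depth without touching the rest of the code.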
To do the calculation, one option is the Reduce function or some other higher-order function.
Since your data is not inherently ordered (an individual is part of a set, and its preferences are part of the set), I would keep individuals in a factor and store, e.g., preferences in lists named after the individuals. If you have large data you can store it in an environment.
The first part of the code only sets up a reproducible example. The problem domain lends itself to graph-oriented naming. You only need to change the first line and the runif call to change the behavior.
#people
verts <- factor(c(LETTERS[1:10]))
#relations, disallow preferring yourself
edges <- lapply(seq_along(verts), function(ind) {
  levels(verts)[-ind]
})
names(edges) <- levels(verts)
#directions
#say you have these stored in a list or something
pool <- levels(verts)
directions <- lapply(pool, function(vert) {
  relations <- pool[unique(round(runif(5, 1, 10)))]
  # drop the person themselves from their own preferences
  relations[relations != vert]
})
names(directions) <- pool
num_prefs <- lapply(directions, length)
names(num_prefs) <- names(directions)
#First take factorial of each persons preferences,
#then reduce that with multiplication
combinations <-
  Reduce(`*`,
         sapply(num_prefs, factorial)
  )
I hope this answers your question!
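The code above only counts the combinations. If you want the preference matrices themselves, here is a base R sketch for two of the people. The helper permutations() and the names prefs, perms, combos and nth_prefs are mine, and I left Cath and Dan out because I am not sure how their split lists should be read.

```r
# all orderings of a vector, returned as a list (plain recursion)
permutations <- function(x) {
  if (length(x) <= 1) return(list(x))
  out <- list()
  for (i in seq_along(x)) {
    for (rest in permutations(x[-i])) {
      out[[length(out) + 1]] <- c(x[i], rest)
    }
  }
  out
}

prefs <- list(Abi = c("Cath", "Dan", "Eva", "Fay"),
              Bob = c("Cath", "Dan", "Eva", "Fay"))

# every possible ranking for each person
perms <- lapply(prefs, permutations)

# each row picks one ranking per person, by index
combos <- expand.grid(lapply(perms, seq_along))

# build the preference matrix for the k-th combination on demand
nth_prefs <- function(k) {
  sapply(names(perms), function(person) perms[[person]][[combos[k, person]]])
}

nrow(combos)   # 24 * 24 combinations
nth_prefs(1)   # a 4 x 2 character matrix with columns Abi and Bob
```

Storing only indices keeps memory small; with all six people the index table already has millions of rows, so building matrices on demand is probably safer than materializing them all.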

R igraph - Convert a weighted adjacency matrix into weighted edgelist

I have an n x m adjacency matrix, where entry (i,j) represents the association score between i and j. I need to convert it into the following format:
i j <score1>
using R's igraph package and write it to a text file.
I can derive the edge list, but it shows up without the weights. I used the following code:
library(igraph)
g <- graph.adjacency(myAdjacencymatrix)
get.edgelist(g)
However, it does not show the weights.
library(igraph)
set.seed(1) # for reproducible example
myAdjacencyMatrix <- matrix(runif(400),nc=20,nr=20)
g <- graph.adjacency(myAdjacencyMatrix,weighted=TRUE)
df <- get.data.frame(g)
head(df)
# from to weight
# 1 1 1 0.2655087
# 2 1 2 0.9347052
# 3 1 3 0.8209463
# 4 1 4 0.9128759
# 5 1 5 0.4346595
# 6 1 6 0.6547239
You need to use weighted=TRUE in the call to graph.adjacency(...) to have weights assigned to the edges. Then, get.data.frame(...) will return a data frame of the edges with all edge attributes by default. You can use the what=... argument to return, e.g., the vertex list with attributes.
In future: provide an example, rather than forcing us to create one for you!!!
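The question also asks for a text file, which the answer stops short of. A minimal sketch of that step follows, using the current igraph names graph_from_adjacency_matrix() and as_data_frame() in place of graph.adjacency() and get.data.frame(). The tiny matrix m and the temporary output path are my own placeholders.

```r
library(igraph)

# small weighted adjacency matrix; zero means "no edge"
m <- matrix(c(0,   0.5, 0,
              0,   0,   2,
              1.5, 0,   0), nrow = 3, byrow = TRUE)

g <- graph_from_adjacency_matrix(m, mode = "directed", weighted = TRUE)

# one row per edge: from, to, weight
edges <- igraph::as_data_frame(g, what = "edges")

# write "i j score" lines, tab separated, without header or row names
out_file <- tempfile(fileext = ".txt")
write.table(edges, out_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

readLines(out_file)
```

Replace tempfile() with a real path to keep the file.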

Dynamic Network In R

I am currently working on a dynamic temporal network.
Header: Time Sender Receiver
1 1 2
1 1 3
2 2 1
2 2 1
3 1 2
3 1 2
The above is a sample of my dataset.
There are 3 time periods (sessions), each with an edge list between nodes.
I want to compute centrality measures for each time period.
I am thinking about writing a script that computes centrality measures within each time period.
However, I am wondering whether there are R libraries that can handle this problem.
Does anyone know of one?
Jinie
I tried to write the code for subsetting data based on Time as follows:
uniq <-unique(unlist(df$Time))
uniq
[1] 1 2 3
for (i in 1:length(uniq)){
t[i]<-subset(df, Time==uniq[i])
net[i] <-as.matrix(t[i])
netT[i]<-net[i][,-3] #removing time column
#### getting edgelist
netT[i][,1]=as.character(net[i][,1])
netT[i][,2]=as.character(net[i][,2])
g [i]=graph.edgelist(netT [i], directed=T)
g[i]
}
however, I've got a error message ( Error in t[i] <- subset(df, Time == uniq[i]) :
object of type 'closure' is not subsettable)
Do you know why? I am kind of new to R so it is hard to figure it out.
I guess t[i] is the problem. I don't know how to assign t[i] as a data frame.
The networkDynamic R library is helpful for this sort of thing (disclaimer: I'm a package maintainer)
library(networkDynamic)
# a data frame with your input data
raw<-data.frame(time=c(1,1,2,2,3,3),
sender=c(1,1,2,2,1,1),
receiver=c(2,3,1,1,2,2))
# add another time column to define a start and end time for each edge spell
raw2<-cbind(raw$time,raw$time+1,raw$sender,raw$receiver)
# create a networkDynamic object using this edge timing info
nd<-networkDynamic(edge.spells=raw2)
# load the sna library with static network measures
library(sna)
# apply degree measure to static networks extracted at default time points
lapply(get.networks(nd),degree)
[[1]]
[1] 2 1 1
[[2]]
[1] 1 1 0
[[3]]
[1] 1 1 0
You could try the igraph library. I'm not familiar with it, but I find this question interesting enough to code up an answer, so here we go:
Because you've got a directed network (senders and receivers) you're going to need two measures of centrality: indegree and outdegree.
Calculating this is fairly simple, the complication is splitting by time points. So for each time point we need to do the following:
Create an adjacency matrix indicating for each row (sender) the number of connections to each column (receiver).
From that we can simply add up the connections in the rows to get the outdegree, and the connections in the columns for the indegree.
Assuming your data is stored in a data.frame named df we can use split to split your data.frame by time point:
nodes <- unique(c(unique(df$Sender), unique(df$Receiver)))
centrality <- lapply(split(df, df$Time), function(time.df) {
  adj <- matrix(0, length(nodes), length(nodes), dimnames=list(nodes, nodes))
  for (i in 1:nrow(time.df)) {
    # index by name, so numeric node ids are not mistaken for positions
    sender <- as.character(time.df[i, "Sender"])
    receiver <- as.character(time.df[i, "Receiver"])
    adj[sender, receiver] <- adj[sender, receiver] + 1
  }
  list(indegree=colSums(adj), outdegree=rowSums(adj))
})
names(centrality) <- paste0("Time.Point.", 1:length(centrality))
If we run the code on your data (I've replaced the Senders and Receivers with letters for clarity):
> centrality
$Time.Point.1
$Time.Point.1$indegree
a b c
0 1 1
$Time.Point.1$outdegree
a b c
2 0 0
$Time.Point.2
$Time.Point.2$indegree
a b c
2 0 0
$Time.Point.2$outdegree
a b c
0 2 0
$Time.Point.3
$Time.Point.3$indegree
a b c
0 2 0
$Time.Point.3$outdegree
a b c
2 0 0
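For completeness, the same per-period split can be handed to igraph directly. This is a sketch assuming the data frame from the question with columns Time, Sender and Receiver; the names graphs, out_degree and in_degree are mine. As an aside on the error in the question: t is the base R transpose function, so t[i] <- ... tries to subset a function, which is what "object of type 'closure' is not subsettable" means. Collecting the pieces in a list avoids that.

```r
library(igraph)

df <- data.frame(Time     = c(1, 1, 2, 2, 3, 3),
                 Sender   = c(1, 1, 2, 2, 1, 1),
                 Receiver = c(2, 3, 1, 1, 2, 2))

# use one shared vertex set so every period has the same nodes
nodes <- sort(unique(c(df$Sender, df$Receiver)))

# one directed graph per time period, stored in a named list
graphs <- lapply(split(df, df$Time), function(period) {
  graph_from_data_frame(period[, c("Sender", "Receiver")],
                        directed = TRUE,
                        vertices = data.frame(name = nodes))
})

# degree centrality per period; igraph:: avoids a clash if sna is loaded
out_degree <- lapply(graphs, function(g) igraph::degree(g, mode = "out"))
in_degree  <- lapply(graphs, function(g) igraph::degree(g, mode = "in"))

out_degree
```

Other measures can be swapped in the same way, by replacing igraph::degree in the last two lapply() calls.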
