R igraph - Convert a weighted adjacency matrix into weighted edgelist - r

I have a nxm adjacency matrix, where (i,j) represent the score of association between i and j. I need to convert this into the following format like :
i j <score1>
using R' igraph package and output it into a text file.
I can derive the edgelist, but its showing up without the weights. I used the following code:
library(igraph)
g <- graph.adjacency(myAdjacencymatrix)
get.edgelist(g)
However, it does not show the weights.

library(igraph)
set.seed(1) # for reproducible example
myAdjacencyMatrix <- matrix(runif(400),nc=20,nr=20)
g <- graph.adjacency(myAdjacencyMatrix,weighted=TRUE)
df <- get.data.frame(g)
head(df)
# from to weight
# 1 1 1 0.2655087
# 2 1 2 0.9347052
# 3 1 3 0.8209463
# 4 1 4 0.9128759
# 5 1 5 0.4346595
# 6 1 6 0.6547239
You need to use weighted=TRUE in the call to graph.adjacency(...) to have weights assigned to the edges. Then, get.data.frame(...) will return a data frame of the edges with all edge attributes by default. You can use the what=... argument to return, e.g., the vertex list with attributes.
In future: provide an example, rather than forcing us to create one for you!!!

Related

How to create all non-isomorphic trees with n=6 nodes?

I need to create all non-isomorphic trees with n=6 nodes. I have found the degree sequence and try to generate trees this degree.sequence.game() function:
library(igraph)
set.seed(46)
par(mfrow=c(2, 3))
degs <- matrix(c(1,1,1,2,2,3,
1,1,1,3,2,2,
1,1,2,2,2,2,
1,1,1,1,2,4,
1,1,1,1,1,5,
1,1,1,1,3,3), nrow=6, byrow=T)
for(i in 1:6){
g6 <- degree.sequence.game(degs[i,], method="vl")
plot(g6, vertex.label=NA)
}
The output is:
One can see graphs A and B in left figure are isomorphic.
Expected result in right figure.
Question. What is an alternative method to create non-isomorphic trees?
Update
It seems I misunderstood your objective. Below might be one solution if you try simple.no.multiple.uniform option with in degree.sequence.game, i.e.,
g6 <- degree.sequence.game(degs[i, ], method = "simple.no.multiple.uniform")
and we can obtain
BTW, the version of igraph I am using is igraph_1.3.5 (you can see it when typing sessionInfo() in the console) and you can try with this version, which hopefully helps to address your problem as well.
Previous Answer
I think the pain point in your problem is "How to find all distinct degree sequences with given number of vertices in a tree graph?".
We can break this primary problem into two sub-problems:
What is the sum of degrees given n vertices (if we want generate a tree)? The answer is: 2*(n-1)
How to partition the 2*(n-1) into n non-isomorphic groups that consist of positive integers? the answer is: Using partitions::restrictedparts
library(partitions)
n <- 6
degs <- t(restrictedparts(2*(n-1), n, include.zero = FALSE)
and you will see
> degs
[1,] 1 1 1 1 1 5
[2,] 1 1 1 1 2 4
[3,] 1 1 1 1 3 3
[4,] 1 1 1 2 2 3
[5,] 1 1 2 2 2 2
then you can use degree.sequence.game(degs[i,], method="vl") by iterating i through 1 to nrow(degs).

Adding edge attribute to a network constructed from a co-occurrence matrix in R

I have a list of binary coded observations regarding whether one entity is present or absent at one given time, e.g.,
date a b c d e f g
07-07-2021 0 1 1 0 0 0 0
07-08-2021 1 0 0 0 1 1 1
07-10-2021 0 0 1 1 1 1 0
07-11-2021 1 1 1 0 0 1 1
I have created a network object from a co-occurrence matrix calculated by using crossprod(). I would like to add the observation date to the network as an edge attribute, but I'm not sure how to do that. I'm wondering how I can achieve my goal using R package(s). Thank you!
This is a very inelegant solution and I'm sure someone smarter than me could do better. Call it the brute force approach. The basic idea is that rather than using crossprod() to get a single adjacency matrix, create separate adjacency matrices for every date. You can do this by turning the initial data into a matrix that replicates each row by the size of the data, and then multiplying by the transpose of itself. Then turn each adjacency matrix into an edgelist and add the date as an attribute of every edge. Then combine all the edgelists into one. Create an igraph object from the edgelist. Then add the dates as an edge attribute (as far as I know, igraph requires that you do these last two as separate steps). I told you it was inelegant.
library(igraph)
dates <- paste("day",1:4) # I simplified the dates
data <-matrix(c(0,1,1,0,0,0,0,1,0,0,0,1,1,1,0,0,1,1,1,1,0,1,1,1,0,0,1,1),
ncol = 7, nrow = 4, byrow =T) # your data
colnames(data) <- letters[1:7]
rownames(data) <- dates
data <- as.data.frame(t(data)) # turn the data on its side
edgelists <- mapply(function(x, dates){
m <- matrix(x,nrow = length(x), ncol = length(x)) #turn each ROW of original data (now each COLUMN) into a matrix
rownames(m) <- colnames(m) <- rownames(data) # it will help to keep track of the names
n <- as.data.frame(as_edgelist(graph_from_adjacency_matrix(m*t(m)))) #create adjacency matrix and then turn it back into an edgelist
n$date <- dates # asign date
return(n)
},
x = data,
dates = as.list(dates),
SIMPLIFY =F)
el <- do.call("rbind", edgelists) # combine all edgelists into one
ig <- graph_from_edgelist(as.matrix(el[,1:2])) # make igraph object
E(ig)$date <- el$date # add the date as edge attribute
plot(ig, edge.label = E(ig)$date)) #check result

how extract membership vector for my GNgraph in r?

i want to use NMI to compare my algorithm in community detection with other methods.so i am making some graphs with sample_sbm() which i define to give me 10 nodes ,and in block.sizes=c(3,3,4) part i define to have communities,first one has 3members,second 3,third 4 members.
now i want a membership vector of them.it should be : 1 1 1 2 2 2 3 3 3 3
what is the best way to do it?i was thinking of taking 3 arguments c1,c2,c3 and then using them in block.sizes(),so i can use a for loop to build the membership vector.but looks a bit dirty.cause the number of communities should be arbitrary.
i will be thankful if you suggest me something nicer
library(igraph)
p<-cbind( c(1, 0,0), c(0, 1,0) ,c(0,0,1))
g <- sample_sbm(10, pref.matrix=p, block.sizes=c(3,3,4) )
#comunity detection algorithm
wc <- cluster_walktrap(g)
modularity(wc)
a=membership(wc)
UPDATE following the original question-asker's comments:
I store the sizes of the blocks in a my_block_sizes vector. Then I use the rep.int function and the seq_along function to create the membership vector according to the sizes of the blocks.
library(NMI)
library(igraph)
my_block_sizes <- c(3,3,4)
# make a membership vector
membership_vector <- rep.int(seq_along(my_block_sizes), my_block_sizes)
membership_vector
[1] 1 1 1 2 2 2 3 3 3 3
p <- cbind(c(1,0,0), c(0,1,0), c(0,0,1))
g <- igraph::sample_sbm(10, pref.matrix=p, block.sizes=my_block_sizes)
# comunity detection algorithm
wc <- cluster_walktrap(g)
modularity(wc)
a <- membership(wc)
Original answer:
I'm not 100% sure this is what you're after, but based on the information you've provided, this may solve your problem.
I use the length of the wc object to determine the number of communities detected by the community detection algorithm, and the rep.int function to repeat each community number according to the size of the blocks, which I store in advance in the my_block_sizes object.
library(NMI)
library(igraph)
my_block_sizes <- c(3,3,4)
p <- cbind(c(1,0,0), c(0,1,0), c(0,0,1))
g <- igraph::sample_sbm(10, pref.matrix=p, block.sizes=my_block_sizes)
#comunity detection algorithm
wc <- cluster_walktrap(g)
modularity(wc)
a <- membership(wc)
# make a membership vector
membership_vector <- rep.int(1:length(wc), my_block_sizes)
membership_vector
[1] 1 1 1 2 2 2 3 3 3 3

Dynamic Network In R

I am currently working on dynamic temporal network.
Header: Time Sender Receiver
1 1 2
1 1 3
2 2 1
2 2 1
3 1 2
3 1 2
The above is a sample of my dataset.
There are 3 time periods (sessions) and the edgelists between nodes.
I want to compute centrality measures by each time period.
I am thinking about writing a script that compute centrality measures within the same period of the time.
However I am just wondering whether there might be R libraries that can handle this problem.
Is there anyone who knows about?
Jinie
I tried to write the code for subsetting data based on Time as follows:
uniq <-unique(unlist(df$Time))
uniq
[1] 1 2 3
for (i in 1:length(uniq)){
t[i]<-subset(df, Time==uniq[i])
net[i] <-as.matrix(t[i])
netT[i]<-net[i][,-3] #removing time column
#### getting edgelist
netT[i][,1]=as.character(net[i][,1])
netT[i][,2]=as.character(net[i][,2])
g [i]=graph.edgelist(netT [i], directed=T)
g[i]
}
however, I've got a error message ( Error in t[i] <- subset(df, Time == uniq[i]) :
object of type 'closure' is not subsettable)
Do you know why? I am kind of new to R so it is hard to figure it out.
I guess t[i] is the problem. I don't know how to assign t[i] as a data frame.
The networkDynamic R library is helpful for this sort of thing (disclaimer: I'm a package maintainer)
library(networkDynamic)
# a data frame with your input data
raw<-data.frame(time=c(1,1,2,2,3,3),
sender=c(1,1,2,2,1,1),
receiver=c(2,3,1,1,2,2))
# add another time column to define a start and end time for each edge spell
raw2<-cbind(raw$time,raw$time+1,raw$sender,raw$receiver)
# create a networkDynamic object using this edge timing info
nd<-networkDynamic(edge.spells=raw2)
# load the sna library with static network measures
library(sna)
# apply degree measure to static networks extracted at default time points
lapply(get.networks(nd),degree)
[[1]]
[1] 2 1 1
[[2]]
[1] 1 1 0
[[3]]
[1] 1 1 0
You could try the igraph library. I'm not familiar with it, but i find this question interesting enough to code up an answer, so here we go:
Because you've got a directed network (senders and receivers) you're going to need to two measures of centrality: indegree and outdegree.
Calculating this is fairly simple, the complication is splitting by time points. So for each time point we need to do the following:
Create an adjacency matrix indicating for each row (sender) the number of connections to each column (receiver).
From that we can simply add up the connections in the rows to get the outdegree, and the connections in the columns for the indegree.
Assuming your data is stored in a data.frame named df we can use split to split your data.frame by time point:
nodes <- unique(c(unique(df$Sender), unique(df$Receiver)))
centrality <- lapply(split(df, df$Time), function(time.df) {
adj <- matrix(0, length(nodes), length(nodes), dimnames=list(nodes, nodes))
for (i in 1:nrow(time.df)) {
sender <- time.df[i, "Sender"]
receiver <- time.df[i, "Receiver"]
adj[sender, receiver] <- adj[sender, receiver] + 1
}
list(indegree=colSums(adj), outdegree=rowSums(adj))
})
names(centrality) <- paste0("Time.Point.", 1:length(centrality))
If we run the code on your data (I've replaced the Senders and Receivers with letters for clarity):
> centrality
$Time.Point.1
$Time.Point.1$indegree
a b c
0 1 1
$Time.Point.1$outdegree
a b c
2 0 0
$Time.Point.2
$Time.Point.2$indegree
a b c
2 0 0
$Time.Point.2$outdegree
a b c
0 2 0
$Time.Point.3
$Time.Point.3$indegree
a b c
0 2 0
$Time.Point.3$outdegree
a b c
2 0 0

Count the number of instances where a variable or a combination of variables are TRUE

I'm an enthusiastic R newbie that needs some help! :)
I have a data frame that looks like this:
id<-c(100,200,300,400)
a<-c(1,1,0,1)
b<-c(1,0,1,0)
c<-c(0,0,1,1)
y=data.frame(id=id,a=a,b=b,c=c)
Where id is an unique identifier (e.g. a person) and a, b and c are dummy variables for whether the person has this feature or not (as always 1=TRUE).
I want R to create a matrix or data frame where I have the variables a, b and c both as the names of the columns and of the rows. For the values of the matrix R will have to calculate the number of identifiers that have this feature, or the combination of features.
So for example, IDs 100, 200 and 400 have feature a then in the diagonal of the matrix where a and a cross, R will input 3. Only ID 100 has both features a and b, hence R will input 1 where a and b cross, and so forth.
The resulting data frame will have to look like this:
l<-c("","a","b","c")
m<-c("a",3,1,1)
n<-c("b",1,2,1)
o<-c("c",1,1,2)
result<-matrix(c(l,m,n,o),nrow=4,ncol=4)
As my data set has 10 variables and hundreds of observations, I will have to automate the whole process.
Your help will be greatly appreciated.
Thanks a lot!
With base R:
crossprod(as.matrix(y[,-1]))
# a b c
# a 3 1 1
# b 1 2 1
# c 1 1 2
This is called an adjacency matrix. You can do this pretty easily with the qdap package:
library(qdap)
adjmat(y[,-1])$adjacency
## a b c
## a 3 1 1
## b 1 2 1
## c 1 1 2
It throws a warning because you're feeding it a dataframe. Not a big deal and can be ignored. Also noticed I dropped the first column (ID's) with negative indexing y[, -1].
Note that because you started out with a Boolean matrix you could have gotten there with:
Y <- as.matrix(y[,-1])
t(Y) %*% Y

Resources