(R igraph) Using affiliation to create subgraph from adjacency matrix - r

I need to create a subgraph from an adjacency matrix selecting by affiliation data. How do I match an adjacency and an affiliation matrix?
Take the following adjacency matrix:
A B C D E F G
A 0 1 0 1 0 1 0
B 1 0 1 1 0 1 0
C 0 1 0 0 0 0 0
D 1 1 0 0 1 1 0
E 0 0 0 1 0 1 0
F 1 1 0 1 1 0 1
G 0 0 0 0 0 1 0
And the following affiliation matrix:
P R Q
A 1 1 0
B 1 0 1
C 1 1 0
D 0 1 0
E 1 0 1
F 0 0 1
G 1 1 0
How do I create a subgraph from the adjacency matrix only with the nodes corresponding to P in the affiliation matrix?

If your goal is to:
1. keep only the nodes of your adjacency matrix whose corresponding P is 1 in the affiliation matrix
2. convert the filtered adjacency matrix to an igraph object
then you can accomplish that with the following:
# names(which(...)) returns the node names; a logical index would work just as well for subsetting adj
p_nodes <- names(which(aff[,"P"] == 1))
p_adj <- adj[p_nodes, p_nodes]
# build the graph from the filtered matrix p_adj
p_graph <- igraph::graph.adjacency(p_adj)
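For a run that does not depend on objects defined elsewhere, here is a minimal sketch that builds both matrices from the question in base R and applies the same subsetting. The names adj and aff follow the answer. The igraph step is guarded so the rest runs without the package, and mode = "undirected" is an assumption based on the question's matrix being symmetric.

```r
# Adjacency and affiliation matrices as given in the question
nodes <- LETTERS[1:7]
adj <- matrix(c(
  0, 1, 0, 1, 0, 1, 0,
  1, 0, 1, 1, 0, 1, 0,
  0, 1, 0, 0, 0, 0, 0,
  1, 1, 0, 0, 1, 1, 0,
  0, 0, 0, 1, 0, 1, 0,
  1, 1, 0, 1, 1, 0, 1,
  0, 0, 0, 0, 0, 1, 0),
  nrow = 7, byrow = TRUE, dimnames = list(nodes, nodes))
aff <- matrix(c(
  1, 1, 0,
  1, 0, 1,
  1, 1, 0,
  0, 1, 0,
  1, 0, 1,
  0, 0, 1,
  1, 1, 0),
  nrow = 7, byrow = TRUE, dimnames = list(nodes, c("P", "R", "Q")))

# Nodes affiliated with P, then the matching rows and columns of adj
p_nodes <- rownames(aff)[aff[, "P"] == 1]
p_adj <- adj[p_nodes, p_nodes]
print(p_adj)

# Optional: turn the filtered matrix into a graph if igraph is installed
if (requireNamespace("igraph", quietly = TRUE)) {
  p_graph <- igraph::graph_from_adjacency_matrix(p_adj, mode = "undirected")
  print(p_graph)
}
```

With the question's data, the P nodes are A, B, C, E and G, and only the A-B and B-C ties remain in p_adj.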

Related

How to quantify observation pairs in individuals

I'm looking for a way to count pairs of observations that occur in the same individuals (patients). In this example, each patient has had two different diseases. The disease pair "a" and "b" occurs 4 times (in patients "G", "H", "I" and "J"), and the pair "k" and "o" occurs twice (patients "D" and "E" have each had both diseases).
Patient_ID<- c("A","A","B","B","C","C","D","D","E","E","F","F",
"G","G","H","H","I","I","J","J")
Disease<-c("v","s","s","v","s","v" ,"k","o","k","o","o","s","a","b",
"a","b","b","a","b","a")
DATA<-data.frame(Patient_ID,Disease)
print(DATA)
Patient_ID Disease
1 A v
2 A s
3 B s
4 B v
5 C s
6 C v
7 D k
8 D o
9 E k
10 E o
11 F o
12 F s
13 G a
14 G b
15 H a
16 H b
17 I b
18 I a
19 J b
20 J a
From these data I would like to generate a table like the one below.
a b k o v s
a 0 4 0 0 0 0
b 4 0 0 0 0 0
k 0 0 0 2 0 0
o 0 0 2 0 0 1
v 0 0 0 0 0 3
s 0 0 0 1 3 0
Then I would like to generate a table containing only the levels that have a count above a certain threshold (for example, 2), as in the second table below.
a b v s
a 0 4 0 0
b 4 0 0 0
v 0 0 0 3
s 0 0 3 0
Here is a base R option using table+crossprod, i.e.,
res <- `diag<-`(crossprod(table(DATA)),0)
which gives
> res
Disease
Disease a b k o s v
a 0 4 0 0 0 0
b 4 0 0 0 0 0
k 0 0 0 2 0 0
o 0 0 2 0 1 0
s 0 0 0 1 0 3
v 0 0 0 0 3 0
To subset by a given threshold, you can use
th <- 2
inds <- rowSums(res > th)>0
subset_res <- subset(res,inds,inds)
which gives
> subset_res
Disease
Disease a b s v
a 0 4 0 0
b 4 0 0 0
s 0 0 0 3
v 0 0 3 0
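The two steps above can be wrapped into small helpers, which makes the threshold easy to change. This is only a sketch: pair_counts and filter_pairs are hypothetical names, and it assumes each patient-disease combination appears at most once in DATA, so that the cross-product counts patients rather than repeated rows.

```r
Patient_ID <- c("A","A","B","B","C","C","D","D","E","E","F","F",
                "G","G","H","H","I","I","J","J")
Disease <- c("v","s","s","v","s","v","k","o","k","o","o","s","a","b",
             "a","b","b","a","b","a")
DATA <- data.frame(Patient_ID, Disease)

# Co-occurrence counts: patient x disease indicator table X, then t(X) %*% X
pair_counts <- function(d) {
  m <- crossprod(unclass(table(d$Patient_ID, d$Disease)))
  diag(m) <- 0  # a disease paired with itself is not a pair
  m
}

# Keep only diseases that take part in at least one pair counted more than th times
filter_pairs <- function(m, th) {
  keep <- rowSums(m > th) > 0
  m[keep, keep, drop = FALSE]
}

res <- pair_counts(DATA)
subset_res <- filter_pairs(res, th = 2)
print(subset_res)
```

Note that the comparison is strict (>), which is why the "k"/"o" pair with a count of exactly 2 is dropped; use >= inside filter_pairs if the threshold should be inclusive.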
First, use unstack() to reshape Disease into a data frame with 2 columns, one row per patient. Make sure both columns have the same factor levels, so that no level is dropped in the next step. Then pass the data frame to table() to create a contingency table. In this table, "a & b" and "b & a" are counted separately, so to get the total count for each pair you need tab + t(tab).
pair <- data.frame(t(unstack(DATA, Disease ~ Patient_ID)))
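# take the levels from the values themselves, so this works whether Disease is character (the default since R 4.0) or a factor
pair[] <- lapply(pair, factor, levels = sort(unique(as.character(DATA$Disease))))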
tab <- table(pair)
tab + t(tab)
# X2
# X1 a b k o s v
# a 0 4 0 0 0 0
# b 4 0 0 0 0 0
# k 0 0 0 2 0 0
# o 0 0 2 0 1 0
# s 0 0 0 1 0 3
# v 0 0 0 0 3 0
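To see why tab + t(tab) is needed, here is a small sketch that makes the ordered counts visible. It swaps in duplicated() instead of unstack() to pick each patient's first and second disease, which assumes exactly two rows per patient as in the question's data. The names first, second and lv are illustrative.

```r
Patient_ID <- c("A","A","B","B","C","C","D","D","E","E","F","F",
                "G","G","H","H","I","I","J","J")
Disease <- c("v","s","s","v","s","v","k","o","k","o","o","s","a","b",
             "a","b","b","a","b","a")
DATA <- data.frame(Patient_ID, Disease)

# Shared levels so both margins of the table have the same six diseases
lv <- sort(unique(as.character(DATA$Disease)))

# First and second recorded disease per patient (assumes two rows per patient)
is_second <- duplicated(DATA$Patient_ID)
first  <- factor(DATA$Disease[!is_second], levels = lv)
second <- factor(DATA$Disease[is_second],  levels = lv)

tab <- table(first, second)   # ordered pairs: (a, b) and (b, a) are separate cells
total <- tab + t(tab)         # unordered pairs: both orders added together

print(c(a_then_b = tab["a", "b"], b_then_a = tab["b", "a"], either = total["a", "b"]))
```

Here patients G and H are recorded as a then b, while I and J are recorded as b then a, so each ordered cell holds 2 and only the sum gives the 4 shown in the table.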

Calculate degree of a subgraph using r igraph

I know the degree of my global graph, but now I need to find the degrees of nodes within a subgraph. So, John has 4 friends in his school, but three friends in his class. How do I instruct igraph to count those three friends in his class, but not the rest in his school?
My global graph
library(igraph)
school <- read.table(text="
A B C D E F G
A 0 1 0 1 0 1 1
B 1 0 1 1 0 1 0
C 0 0 0 0 0 0 1
D 1 1 0 0 1 0 0
E 0 0 0 1 0 1 1
F 0 1 0 0 1 0 1
G 1 0 1 0 1 1 0", header=TRUE)
mat <- as.matrix(school)
g <- graph.adjacency(mat, mode="undirected", add.rownames = T)
My affiliation matrix for classes P, Q, and R
x <- read.table(text="
P Q R
A 1 1 0
B 0 0 1
C 0 0 0
D 1 0 1
E 1 1 0
F 0 1 0
G 1 1 1", header=TRUE)
inc <- as.matrix(x)
ginc <- graph.incidence(inc)
My subgraph for class P
class_nodes <- names(which(inc[,"P"] == 1))
class_adj <- mat[class_nodes, class_nodes]
class_graph <- graph.adjacency(class_adj, mode = "undirected")
I need to calculate the degree of nodes in subgraph "class_graph", but counting only their ties within the subgraph, not the global graph.
You can find all the nodes in class P with the following. We extract the names specifically so that we can look them up in a different graph object:
V(ginc)[.nei("P")]$name
Then you can extract just that subset of connections from the main graph with
subg <- induced.subgraph(g, V(ginc)[.nei("P")]$name)
and you can calculate the degree of those nodes with
degree(subg)
# A D E G
# 2 2 2 2
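As a cross-check that does not need igraph, the same within-class degrees can be read straight off the matrices. This sketch assumes a tie counts if either direction is 1 in the (non-symmetric) school matrix, which is how the undirected graph above was treated; sym, school_degree and class_degree are illustrative names.

```r
nodes <- LETTERS[1:7]
mat <- matrix(c(
  0, 1, 0, 1, 0, 1, 1,
  1, 0, 1, 1, 0, 1, 0,
  0, 0, 0, 0, 0, 0, 1,
  1, 1, 0, 0, 1, 0, 0,
  0, 0, 0, 1, 0, 1, 1,
  0, 1, 0, 0, 1, 0, 1,
  1, 0, 1, 0, 1, 1, 0),
  nrow = 7, byrow = TRUE, dimnames = list(nodes, nodes))
inc <- matrix(c(
  1, 1, 0,
  0, 0, 1,
  0, 0, 0,
  1, 0, 1,
  1, 1, 0,
  0, 1, 0,
  1, 1, 1),
  nrow = 7, byrow = TRUE, dimnames = list(nodes, c("P", "Q", "R")))

# Symmetrize: a tie exists if it is recorded in either direction (assumption)
sym <- pmax(mat, t(mat))

class_nodes <- rownames(inc)[inc[, "P"] == 1]

# Degree in the whole school vs. degree counting only ties inside class P
school_degree <- rowSums(sym)[class_nodes]
class_degree  <- rowSums(sym[class_nodes, class_nodes])
print(rbind(school = school_degree, class = class_degree))
```

Symmetrizing explicitly also avoids relying on how a given igraph version treats a non-symmetric matrix passed with mode = "undirected".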

Assign colour to chosen communities in a plot

In a plot, I need to colour two specific communities. Take the following data frame:
A B C D E F G
A 0 1 0 1 0 1 0
B 1 0 1 1 0 1 0
C 0 1 0 0 0 0 0
D 1 1 0 0 1 1 0
E 0 0 0 1 0 1 0
F 1 1 0 1 1 0 1
G 0 0 0 0 0 1 0
ob <- read.csv("...ties.csv",sep = ",", header = TRUE, row.names = 1)
m <- as.matrix(ob)
g <- graph.adjacency(m, mode="undirected", weighted = T, add.rownames = T)
First, I detect the communities (com) of my graph g using edge.betweenness:
com <- edge.betweenness.community(g)
V(g)$memb <- com$membership
This operation produces a number of communities, com[[1]], com[[2]], etc. I plot the resulting graph, with each community in its own colour, using the following code:
plot(g, vertex.color=membership(com))
Now, how do I colour only two chosen communities, say com[[1]] and com[[2]], keeping the rest of the nodes homogeneous?
I had to tweak your adjacency matrix so that more than 1 community showed up.
library(igraph)
ob <- read.table(text="
A B C D E F G
A 0 1 0 1 0 1 0
B 1 0 1 1 0 1 0
C 0 0 0 0 0 0 0
D 1 1 0 0 1 0 0
E 0 0 0 1 0 1 1
F 0 1 0 0 1 0 1
G 0 0 0 0 1 1 0", header=TRUE)
m <- as.matrix(ob)
g <- graph.adjacency(m, mode="undirected", weighted = T, add.rownames = T)
com <- edge.betweenness.community(g)
V(g)$memb <- com$membership
cols <- membership(com)
# recode every community except community 3 to colour 1, so that only community 3 stands out
cols[cols!=3] <- 1
plot(g, vertex.color=cols)
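The same recoding idea extends to two chosen communities. Below is a minimal sketch on a made-up membership vector (memb is hypothetical and stands in for membership(com); the colour names are arbitrary): start every node on a neutral colour, then overwrite the nodes of each chosen community.

```r
# Hypothetical membership vector, standing in for membership(com)
memb <- c(A = 1, B = 1, C = 2, D = 1, E = 3, F = 3, G = 3)

# Everyone starts with the same neutral colour
cols <- rep("grey80", length(memb))
names(cols) <- names(memb)

# Overwrite only the two communities of interest
cols[memb == 1] <- "tomato"
cols[memb == 2] <- "steelblue"

print(cols)
# The vector can then be passed to the plot, e.g. plot(g, vertex.color = cols)
```

This relies on cols being in the same order as the vertices of g, which holds when it is built from membership(com) for that graph.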

Measure weight of communities for different subgraphs

I detect communities in my adjacency matrix. In parallel, I create an affiliation matrix using the vertices of the same matrix. How do I measure the weight of the communities in each column of the affiliation matrix?
Take the following adjacency matrix:
A B C D E F G
A 0 1 0 1 0 1 0
B 1 0 1 1 0 1 0
C 0 1 0 0 0 0 0
D 1 1 0 0 1 1 0
E 0 0 0 1 0 1 0
F 1 1 0 1 1 0 1
G 0 0 0 0 0 1 0
I identify the communities:
com <- edge.betweenness.community(g)
V(g)$memb <- com$membership
Now take the following affiliation matrix:
P R Q
A 1 1 0
B 1 0 1
C 1 1 0
D 0 1 0
E 1 0 1
F 0 0 1
G 1 1 0
How do I count the number of vertices in community [[1]] that are affiliated with "P" in the affiliation matrix?
You can do sum(m[com[[1]],"P"]>0), given that m holds your affiliation matrix. Or lapply(com, function(x) colSums(m[x, ])) for all communities.
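Here is a small sketch of both expressions on the question's affiliation matrix. The community split is made up for illustration (a plain named list standing in for com), since the real groups depend on the detection result. The drop = FALSE is an addition: without it, a community with a single vertex would turn the subset into a plain vector and colSums() would fail.

```r
nodes <- LETTERS[1:7]
aff <- matrix(c(
  1, 1, 0,
  1, 0, 1,
  1, 1, 0,
  0, 1, 0,
  1, 0, 1,
  0, 0, 1,
  1, 1, 0),
  nrow = 7, byrow = TRUE, dimnames = list(nodes, c("P", "R", "Q")))

# Hypothetical communities: vertex names per group, standing in for com
com <- list("1" = c("A", "B", "D"), "2" = "C", "3" = c("E", "F", "G"))

# Vertices of community 1 that are affiliated with P
n_in_P <- sum(aff[com[["1"]], "P"] > 0)

# All communities at once: rows are affiliations, columns are communities
counts <- sapply(com, function(x) colSums(aff[x, , drop = FALSE]))
print(n_in_P)
print(counts)
```

Dividing each column of counts by the community size would give shares rather than raw counts, if that is closer to what "weight" means here.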

How could I calculate the sparsity of a data.frame in R?

I have a data.frame structured like this:
A B C D E
F 1 0 7 0 0
G 0 0 0 1 1
H 1 1 0 0 0
I 1 2 1 0 0
L 1 0 0 0 0
and I want to calculate the sparsity (i.e. the proportion of 0 values) of this data.frame.
How could I do this?
sum(df == 0)/(dim(df)[1]*dim(df)[2])
[1] 0.6
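An equivalent and slightly shorter form is the mean of the logical comparison, since the mean of TRUE/FALSE values is the share of TRUEs. A minimal sketch that rebuilds the data.frame from the question (the name sparsity is illustrative):

```r
df <- data.frame(
  A = c(1, 0, 1, 1, 1),
  B = c(0, 0, 1, 2, 0),
  C = c(7, 0, 0, 1, 0),
  D = c(0, 1, 0, 0, 0),
  E = c(0, 1, 0, 0, 0),
  row.names = c("F", "G", "H", "I", "L"))

# df == 0 is a logical matrix; its mean is the fraction of zero cells
sparsity <- mean(df == 0)
print(sparsity)

# Same value as counting zeros and dividing by the number of cells
print(sum(df == 0) / (nrow(df) * ncol(df)))
```

If the data can contain NA values, mean(df == 0, na.rm = TRUE) would ignore them, which changes the denominator to the number of non-missing cells.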
