R/Igraph Display edge weights in an edge list? - r

Is there any way to display edge weights when viewing the graph object as an edge list?
I want to do something in the spirit of:
get.edgelist(graph, attr='weight')
so as to view the edge pairings with the weights listed alongside the nodes, but that does not seem to be allowed. The only way I know to view the weights is to view the network data as an adjacency matrix. I'm hoping that's not the only way.

Using the example in the help page for function get.edgelist in pkg:igraph:
> cbind( get.edgelist(g) , round( E(g)$weight, 3 ))
     [,1] [,2] [,3]
[1,] "a"  "b"  "0.342"
[2,] "b"  "d"  "0.181"
[3,] "b"  "e"  "0.403"
[4,] "b"  "f"  "0.841"
[5,] "d"  "f"  "0.997"
[6,] "e"  "g"  "0.029"
[7,] "a"  "h"  "0.17"
[8,] "b"  "j"  "0.69"
[9,] "g"  "j"  "0.422"
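Note that cbind() on a character edge list turns the weights into strings, as the quoted values above show. A minimal sketch of one way to keep them numeric, assuming a small made-up graph (not the help-page graph) and using as_edgelist(), the newer igraph name for get.edgelist():

```r
library(igraph)

# Toy weighted graph; the vertex names and weights are made up for illustration.
g <- graph_from_data_frame(
  data.frame(from = c("a", "b", "b"),
             to = c("b", "d", "e"),
             weight = c(0.342, 0.181, 0.403)),
  directed = FALSE
)

# A data frame keeps the weight column numeric; a character matrix cannot.
el <- data.frame(as_edgelist(g), weight = E(g)$weight, stringsAsFactors = FALSE)
names(el)[1:2] <- c("from", "to")
el
```

Because the weights stay numeric, the result can be sorted or filtered directly, for example el[order(el$weight), ].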

Another option is to use get.data.frame() from the igraph package:
# create a random graph with weighted edges
g <- erdos.renyi.game(5, 5/10, directed = TRUE)
E(g)$weight <- runif(length(E(g)), 1, 5)
# pull nodes and edge weights
get.data.frame(g)
   from to   weight
1     1  5 4.716679
2     2  1 4.119414
3     1  2 4.535791
4     2  5 2.486553
5     3  2 4.932118
6     5  2 3.353693
7     1  3 3.003062
8     2  3 3.350118
9     1  4 2.929069
10    2  4 4.929474
11    5  4 4.333134

Related

give names to values in a matrix in R

I'm not sure if this is possible, but it seems like it should be. I want a matrix whose elements have names, just as you can do with a vector like this:
v = 1:10
names(v) = LETTERS[1:10]
result:
A B C D E F G H I J
1 2 3 4 5 6 7 8 9 10
I've tried to create a matrix and use the same syntax:
m = matrix(v, nrow=2, ncol=5)
names(m) = letters[1:10]
but the result is not what I hoped for.
result:
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 5 7 9
[2,] 2 4 6 8 10
attr(,"names")
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
I don't want the values and the names to be two separate entities. Is there a way to do this without any libraries, or at all?
Thank you
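For what it's worth, the names attribute printed above can still be used for lookups even though print() shows it separately. A minimal base-R sketch, assuming the 2 x 5 matrix from the question; the labels matrix is a hypothetical alternative, not something from the post:

```r
v <- 1:10
m <- matrix(v, nrow = 2, ncol = 5)
names(m) <- letters[1:10]      # names follow column-major order

# Look up a single element by its name.
m[["c"]]

# Hypothetical alternative: a same-shaped character matrix of labels,
# which prints in the same layout as the values.
labels <- matrix(letters[1:10], nrow = 2, ncol = 5)
labels
m[labels == "c"]
```

Keeping a parallel labels matrix makes the name of each cell visible in grid form, at the cost of maintaining two objects.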

R function for manipulative field experiment design - Plant ecology

I am a plant ecologist and I would like to design a field manipulative experiment. To achieve a good degree of randomisation in the disposition of plants inside experimental plots, I would like to use R (with which I'm familiar).
The picture is a schematisation of the design: in total there would be 6 plots, each containing 24 different plant species. In total I have 36 plant species, from which I would like to randomly sample 24 species per plot (with high variation between plots). Each species should be present in 4 out of the 6 plots but in different positions, twice along the border (in two of the warm colours) and twice in the core area (once per cold colour); in the picture I visually explain what I mean using 4 different species (A, B, C, D).
Could anyone suggest a way to do it or some insight to write the function?
I borrowed some insight from this thread and this thread. Basically, you make a circular matrix and shuffle groups.
coords <- list(c(1,1),c(1,2),c(2,1),c(3,1),
c(1,3),c(1,4),c(2,4),c(3,4),
c(2,2),c(3,2),c(2,3),c(3,3),
c(4,1),c(5,1),c(6,1),c(6,2),
c(4,2),c(5,2),c(4,3),c(5,3),
c(4,4),c(5,4),c(6,3),c(6,4))
Matrix <- matrix(c(LETTERS,0:9)[1:36][matrix(1:36,36+1,36+1,byrow=T)[c(1,36:2),1:36]],36,36)
PlotLayouts <- Matrix[(1:6*6),1:24][,unlist(lapply(split(1:24,rep(1:6,each=4)),sample,4))]
PlotLayouts <- split(PlotLayouts,sample(1:6,6))
Result <- lapply(PlotLayouts,function(Vector){
  Layout <- matrix(NA,nrow=6,ncol=4)
  for(i in 1:24){
    Layout[coords[[i]][1],coords[[i]][2]] <- Vector[i]
  }
  Layout
})
#Species Counts
table(unlist(Result))
0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
Result
$`1`
[,1] [,2] [,3] [,4]
[1,] "I" "H" "M" "L"
[2,] "K" "R" "S" "N"
[3,] "J" "P" "Q" "O"
[4,] "U" "Z" "0" "2"
[5,] "V" "Y" "X" "1"
[6,] "T" "W" "4" "3"
$`2`
[,1] [,2] [,3] [,4]
[1,] "0" "Z" "4" "3"
[2,] "2" "9" "A" "5"
[3,] "1" "7" "8" "6"
[4,] "C" "H" "I" "K"
[5,] "D" "G" "F" "J"
[6,] "B" "E" "M" "L"
...

Creating an edgelist from Patent data in R

I am trying to create an edgelist out of patent data of the form:
PatentID InventorIDs CoinventorIDs
1 A ; B C,D,E ; F,G,H,C
2 J ; K ; L M,O ; N ; P, Q
What I would like is the edgelist below showing the connections between inventors and patents. (the semicolons separate the coinventors associated with each primary inventor):
1 A B
1 A C
1 A D
1 A E
1 B F
1 B G
1 B H
1 B C
2 J K
2 J L
2 J M
2 J O
2 K N
2 L P
2 L Q
Is there an easy way to do this with igraph in R?
I'm confused by the edges going between the InventorIDs. But here is a kind of brute-force function that you could just apply by row. igraph is a massive library, so there may be a better way within it, but once you have the data in this form it should be simple to convert to an igraph data structure.
Note that this leaves out the edges between primary inventors.
## A function to make the edges for each row
rowFunc <- function(row) {
  # split both ID columns on semicolons, dropping surrounding whitespace
  tmp <- lapply(row[2:3], strsplit, '\\s*;\\s*')
  # split each co-inventor group on commas, dropping surrounding whitespace
  tmp2 <- lapply(tmp[[2]], strsplit, '\\s*,\\s*')
  do.call(rbind, mapply(cbind, row[[1]], unlist(tmp[[1]]), unlist(tmp2, recursive=FALSE)))
}
## Apply the function by row
do.call(rbind, apply(dat, 1, rowFunc))
# [,1] [,2] [,3]
# [1,] "1" "A" "C"
# [2,] "1" "A" "D"
# [3,] "1" "A" "E"
# [4,] "1" "B" "F"
# [5,] "1" "B" "G"
# [6,] "1" "B" "H"
# [7,] "1" "B" "C"
# [8,] "2" "J" "M"
# [9,] "2" "J" "O"
# [10,] "2" "K" "N"
# [11,] "2" "L" "P"
# [12,] "2" "L" "Q"
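If the edges from the first primary inventor to the other primary inventors are also wanted, as in the question's expected output, something along these lines might work. This is a base-R sketch, not the answerer's method: dat is a reconstruction of the question's table, and split_trim() and patent_edges() are hypothetical helper names.

```r
# Reconstruction of the question's data (assumed layout, one row per patent).
dat <- data.frame(
  PatentID      = c(1, 2),
  InventorIDs   = c("A ; B", "J ; K ; L"),
  CoinventorIDs = c("C,D,E ; F,G,H,C", "M,O ; N ; P, Q"),
  stringsAsFactors = FALSE
)

# Split one string on a literal separator and trim whitespace from the pieces.
split_trim <- function(x, sep) trimws(strsplit(x, sep, fixed = TRUE)[[1]])

# Build all edges for one patent.
patent_edges <- function(id, inventors, coinventors) {
  prim <- split_trim(inventors, ";")
  co   <- lapply(split_trim(coinventors, ";"), split_trim, sep = ",")
  # edges from the first primary inventor to the remaining primary inventors
  lead <- data.frame(from = rep(prim[1], length(prim) - 1), to = prim[-1],
                     stringsAsFactors = FALSE)
  # edges from each primary inventor to that inventor's co-inventors
  pairs <- do.call(rbind, Map(function(p, cs) {
    data.frame(from = p, to = cs, stringsAsFactors = FALSE)
  }, prim, co))
  data.frame(PatentID = id, rbind(lead, pairs), row.names = NULL)
}

edges <- do.call(rbind, Map(patent_edges, dat$PatentID,
                            dat$InventorIDs, dat$CoinventorIDs))
rownames(edges) <- NULL
edges
```

The sketch assumes each primary inventor has exactly one semicolon-separated co-inventor group, in the same order, as in the two example rows.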

Table of all intersections in two data frames

I have two data frames. Each row of the data frames has a different number of elements (actually gene names). I used read.csv("file.csv",fill=TRUE) to read them in, so there is some NA padding in some of the rows.
Each of the data frames have the same elements, only they've been clustered differently, so they are in different groups. I want to output a table of the intersections from the two dataframes.
So if
df1<-data.frame(c("a","b","NA","NA"),c("c","d","e","f"),c("g","h","i","NA" ),c("j","NA","NA","NA"))
df2<-data.frame(c("c","e","i","NA"),c("f","g","h","NA"),c("a","b","d","j" ))
then I want to get to something like this:
df1[1,] df1[2,] df1[3,] df1[4,]
df2[1,] 0 2 1 0
df2[2,] 0 1 2 0
df2[3,] 2 1 0 1
It seems like something I should be able to do with intersect() and an apply function of some sort, but I can't get my head around it. Using my google-fu, the nearest I can find is "Finding an efficient way to count the number of overlaps between interval sets in two tables?", but as best I can tell that deals with data tables and numerical overlaps in line segments, not lists of names.
Does anyone have any idea how to do this?
You could do this by looping through the rows of each data frame and then calculating the length of the intersection of the rows, omitting missing values:
apply(df1, 1, function(i) apply(df2, 1, function(j) length(na.omit(intersect(i, j)))))
# [,1] [,2] [,3] [,4]
# [1,] 0 2 1 0
# [2,] 0 1 2 0
# [3,] 2 1 0 1
Sample data:
(df1<-rbind(c("a","b", NA, NA),c("c","d","e","f"),c("g","h","i", NA),c("j", NA, NA, NA)))
# [,1] [,2] [,3] [,4]
# [1,] "a" "b" NA NA
# [2,] "c" "d" "e" "f"
# [3,] "g" "h" "i" NA
# [4,] "j" NA NA NA
(df2<-rbind(c("c","e","i", NA),c("f","g","h", NA),c("a","b","d","j")))
# [,1] [,2] [,3] [,4]
# [1,] "c" "e" "i" NA
# [2,] "f" "g" "h" NA
# [3,] "a" "b" "d" "j"
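The data frames in the question are laid out differently from the sample data above: each group sits in a column, and the padding is the string "NA" rather than a real missing value. A sketch that works directly on that layout, assuming base R only; to_sets() is a hypothetical helper name:

```r
# Data exactly as written in the question: one group per column, "NA" strings as padding.
df1 <- data.frame(c("a","b","NA","NA"), c("c","d","e","f"),
                  c("g","h","i","NA"), c("j","NA","NA","NA"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(c("c","e","i","NA"), c("f","g","h","NA"),
                  c("a","b","d","j"), stringsAsFactors = FALSE)

# One character vector per group, with both "NA" strings and real NAs dropped.
to_sets <- function(df) {
  lapply(unname(as.list(df)), function(x) setdiff(x, c("NA", NA)))
}
s1 <- to_sets(df1)
s2 <- to_sets(df2)

# Rows index the groups of df2, columns the groups of df1.
counts <- sapply(s1, function(a) sapply(s2, function(b) length(intersect(a, b))))
dimnames(counts) <- list(paste("df2 group", seq_along(s2)),
                         paste("df1 group", seq_along(s1)))
counts
```

Labelling the dimensions makes it easier to see which clustering each axis refers to.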

get connected components using igraph in R

I would like to find all the connected components of a graph where the components have more than one element.
Using clusters only gives the membership of each vertex in the different clusters, and using cliques does not give connected components.
This is a follow up from
multiple intersection of lists in R
My main goal was to find all the groups of lists which have elements in common with each other.
Thanks in advance!
You can use the results from components to subset your nodes according to the component size.
library(igraph)
# example graph
set.seed(1)
g <- erdos.renyi.game(20, 1/20)
V(g)$name <- letters[1:20]
par(mar=rep(0,4))
plot(g)
# get components
cl <- components(g)
cl
# $membership
# [1] 1 2 3 4 5 4 5 5 6 7 8 9 10 3 5 11 5 3 12 5
#
# $csize
# [1] 1 1 3 2 6 1 1 1 1 1 1 1
#
# $no
# [1] 12
# loop through to extract common vertices
lapply(seq_along(cl$csize)[cl$csize > 1], function(x)
V(g)$name[cl$membership %in% x])
# [[1]]
# [1] "c" "n" "r"
#
# [[2]]
# [1] "d" "f"
#
# [[3]]
# [1] "e" "g" "h" "o" "q" "t"
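The same filtering can also be written with split(). A small sketch on a deterministic toy graph (the vertex names and edges are made up for illustration), assuming only components() and graph_from_data_frame() from igraph:

```r
library(igraph)

# Toy graph: a-b-c form one component, d-e another, f is isolated.
edges <- data.frame(from = c("a", "b", "d"), to = c("b", "c", "e"))
g <- graph_from_data_frame(edges, directed = FALSE,
                           vertices = data.frame(name = letters[1:6]))

cl <- components(g)

# Group vertex names by component id, then keep components with more than one vertex.
groups_by_component <- split(V(g)$name, cl$membership)
big <- Filter(function(x) length(x) > 1, groups_by_component)
unname(big)
```

Here split() does the grouping in one step, so no loop over component ids is needed.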
