Let's say I have n igraph objects g1, g2, ..., gn. They are undirected, weighted graphs, i.e. each has a weight edge attribute. I'd like to take the union of the n graphs to obtain a single weighted graph g, which requires adding a new weight attribute.
It is known from the documentation (see ?graph.union) that if the n graphs have a weight attribute, it is renamed by adding a _1, _2 (and _3, etc.) suffix, i.e. weight_1, weight_2, ..., weight_n.
I have seen the answer and wrote the code for n = 3 graphs (see below).
Edited:
library(igraph)
rm(list=ls(all=TRUE)) # delete all objects
g1 <- graph_from_literal(A1-B1-C1)
g2 <- graph_from_literal(A2-B2-C2)
g3 <- graph_from_literal(A3-B3-C3)
E(g1)$weight <- c(1, 2)
E(g2)$weight <- c(3, 4)
E(g3)$weight <- c(5, 6)
g <- union(g1, g2, g3)
new_attr <- as.list(list.edge.attributes(g))
k <- length(new_attr) # number of new attributes
value_new_attr <- lapply(list.edge.attributes(g),
function(x) get.edge.attribute(g,x))
df <- data.frame()
for (i in 1:k) {df <- rbind(df, value_new_attr[[i]])}
E(g)$weight <- colSums(df, na.rm=TRUE)
g <- delete_edge_attr(g, "weight_1") # 1
g <- delete_edge_attr(g, "weight_2") # 2
g <- delete_edge_attr(g, "weight_3") # 3
Question. How can I rewrite the last three commands with the lapply() function?
My attempt does not work:
g <- lapply(value_new_attr, function(x) {g <- delete_edge_attr(g, x)})
I have found a solution with a for loop:
# delete edge attributes with suffix
for (i in 1:k) {g <- delete_edge_attr(g, new_attr[[i]])}
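One reason the lapply() attempt cannot work is that lapply() returns a list with one result per element, and every call starts from the original g, so the deletions never accumulate (the attempt also passes attribute values instead of attribute names). A minimal sketch of an alternative is Reduce(), which feeds the result of each call into the next one. The name suffixed is made up for this illustration, and the sketch assumes a recent igraph version that provides edge_attr_names() and edge_attr().

```r
library(igraph)

g1 <- graph_from_literal(A1-B1-C1)
g2 <- graph_from_literal(A2-B2-C2)
g3 <- graph_from_literal(A3-B3-C3)
E(g1)$weight <- c(1, 2)
E(g2)$weight <- c(3, 4)
E(g3)$weight <- c(5, 6)
g <- union(g1, g2, g3)

# names of the suffixed attributes created by union(): weight_1, weight_2, ...
suffixed <- grep("^weight_", edge_attr_names(g), value = TRUE)

# one column per suffixed attribute, one row per edge
w <- sapply(suffixed, function(a) edge_attr(g, a))
E(g)$weight <- rowSums(w, na.rm = TRUE)

# Reduce() threads the graph through successive delete_edge_attr() calls
g <- Reduce(delete_edge_attr, suffixed, g)
edge_attr_names(g)
```

The same Reduce() call works unchanged for any n, because the attribute names are read from the graph rather than typed in.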
I have a random graph with 10 nodes, 4 of which have degree zero.
I need to obtain a connected graph by 1) selecting a zero-degree node and the edge with the minimal value of some feature (for example, a random number from a uniform distribution) computed for each edge, then connecting the node to the graph by creating two edges incident to the node and deleting the third (original) edge; 2) repeating step 1 for all zero-degree nodes.
The original graph is on the left, the resulting one is on the right.
My attempt is:
library(igraph)
######################################################################
set.seed(5)
g <- sample_gnm(10, 4)
xy <- cbind(runif(10), runif(10))
par(mfrow=c(1,2))
plot(g, vertex.size=5, layout=xy)
num_point <- length(V(g)[degree(g)==0])
for(k in 1:num_point){
points = V(g)[degree(g)==0]
for(i in 1:length(E(g))) { # loop over all edges
head <- get.edgelist(g)[i,][1]; h <- c(V(g)[head]$x, V(g)[head]$y)
tail <- get.edgelist(g)[i,][2]; t <- c(V(g)[tail]$x, V(g)[tail]$y)
d <- NULL
# loop over all points
for(j in points) d <- c(d, runif(1))
E(g)[i]$d <- min(d) # local min
E(g)[i]$p <- points[which(d == min(d))]
} # i
ei = which.min(E(g)$d) # edge with the global min
vi = E(g)[ei]$p
# head and tail of edge with global min
head <- get.edgelist(g)[E(g)[ei],][1]; tail <- get.edgelist(g)[E(g)[ei],][2]
g <- add_edges(g, c(head, V(g)[vi],
V(g)[vi],
tail));
g <- delete_edges(g, get.edge.ids(g, c(head, tail) ))
}
plot(g, vertex.size=5, layout=xy)
Question. How should I organize the loop over all edges when the number of edges increases by 1 and the number of points decreases by 1 at every step? As one can see, I don't use the k variable explicitly.
Instead of a for loop, I think you can use repeat plus a termination condition, i.e., stop when there are no isolated vertices any more:
repeat {
points <- V(g)[degree(g) == 0]
for (i in 1:length(E(g))) { # loop over all edges
head <- get.edgelist(g)[i, ][1]
h <- c(V(g)[head]$x, V(g)[head]$y)
tail <- get.edgelist(g)[i, ][2]
t <- c(V(g)[tail]$x, V(g)[tail]$y)
d <- NULL
# loop over all points
for (j in points) d <- c(d, runif(1))
E(g)[i]$d <- min(d) # local min
E(g)[i]$p <- points[which(d == min(d))]
} # i
ei <- which.min(E(g)$d) # edge with the global min
vi <- E(g)[ei]$p
# head and tail of edge with global min
head <- get.edgelist(g)[E(g)[ei], ][1]
tail <- get.edgelist(g)[E(g)[ei], ][2]
g <- add_edges(g, c(
head, V(g)[vi],
V(g)[vi],
tail
))
g <- delete_edges(g, get.edge.ids(g, c(head, tail)))
if (sum(degree(g) == 0) == 0) {
break
}
}
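A repeat block always runs its body at least once, even if the graph has no isolated vertices to begin with. As a hedged sketch of the same idea, a while loop checks the condition before each pass. To keep the example short, the edge to split is picked at random, which is a simplification made here and not the minimum-feature rule from the question; the names n_isolated, v, ei and ht are illustrative.

```r
library(igraph)

set.seed(5)
g <- sample_gnm(10, 4)
n_isolated <- sum(degree(g) == 0)   # isolated vertices at the start

while (any(degree(g) == 0) && ecount(g) > 0) {
  v  <- which(degree(g) == 0)[1]                # next isolated vertex
  ei <- sample(ecount(g), 1)                    # simplification: random edge
  ht <- as.vector(ends(g, ei, names = FALSE))   # endpoints of that edge
  g  <- delete_edges(g, ei)                     # remove head-tail ...
  g  <- add_edges(g, c(ht[1], v, v, ht[2]))     # ... add head-v and v-tail
}

sum(degree(g) == 0)
```

Each pass removes one edge and adds two, so the edge count grows by one per isolated vertex, and no explicit counter such as k is needed.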
I would recommend using recursion for this and dropping the for loop; using recursion for tree and graph structures will definitely make your life easier.
Answer:
Maintain a stack of all the leaf nodes.
Every time you iterate, empty your stack by matching the leaf node values.
If there's a new value and the count of the stack is not equal to the old count, iterate again.
I've got a dataset with 4 binary variables, one per column. How do I create a 4 x 4 grid with the tally of each pairwise combination of the variables?
Here's an example data frame:
Person <- c("Bob", "Jim", "Sarah", "Dave")
A <- c(1,0,1,1)
B <- c(1,1,1,0)
C <- c(0,0,0,1)
D <- c(1,0,0,0)
So in the 4x4 grid, the intersection of A and B would have a 2 because Bob and Sarah have 1 for A and B.
For two 0/1 vectors A and B, the tally is their inner product (which is what R's crossprod computes):
res <- A %*% B
or
res <- crossprod(A, B)
To make a matrix of all combinations, use a two-level for loop or apply:
data <- list(A = A, B = B, C = C, D = D)
n <- length(data)
res <- matrix(NA, nrow = n, ncol = n, dimnames = list(names(data), names(data)))
for(i in 1:n) {
for(j in 1:i) {
res[i,j] <- crossprod(data[[i]], data[[j]])
}
}
Here I fill only one half of the matrix. You then can copy the values across like this:
res[upper.tri(res)] <- t(res)[upper.tri(res)]
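As a possible shortcut, the whole grid can also be obtained in one step by binding the vectors into a matrix and calling crossprod() on it, since crossprod(m) is t(m) %*% m. This is a sketch using the example data from the question; the matrix name m is chosen here for illustration.

```r
Person <- c("Bob", "Jim", "Sarah", "Dave")
A <- c(1, 0, 1, 1)
B <- c(1, 1, 1, 0)
C <- c(0, 0, 0, 1)
D <- c(1, 0, 0, 0)

# one row per person, one column per variable
m <- cbind(A, B, C, D)
rownames(m) <- Person

# entry [i, j] counts the people who have a 1 in both variable i and variable j
res <- crossprod(m)
res["A", "B"]   # 2 (Bob and Sarah)
```

The diagonal holds the number of ones in each variable, and the matrix is symmetric, so no copying between triangles is needed.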
Suppose we have dataset G2:
data(iris)
G2 <- iris[1:5, -5]
We need to calculate Euclidean distance between x (row in G2) and G2 (excluding x) for all x's in G2, formally
I wonder what the best way to do this is. Here is my initial attempt:
D <- dist(G2)
m1 <- as.matrix(D)
(1 / (5 - 1)) * colSums(m1)
Your notation is a bit confusing because you use D differently in the code and in the formula. How about:
m <- as.matrix(dist(G2, upper = TRUE))
D <- apply(m, 2, mean)
n <- length(D)
D <- n/(n-1)*D
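As a quick sanity check (a sketch, not part of either post), the rescaled column means should equal the column sums divided by n - 1, because the diagonal of the distance matrix is zero. The names d_mean and d_sum are made up for this comparison.

```r
G2 <- iris[1:5, -5]
m <- as.matrix(dist(G2))
n <- nrow(m)

# answer's approach: column means rescaled by n / (n - 1)
d_mean <- n / (n - 1) * colMeans(m)

# question's approach: column sums divided by (n - 1)
d_sum <- colSums(m) / (n - 1)

all.equal(d_mean, d_sum)   # TRUE
```

Both versions give the average distance from each row to the other n - 1 rows.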
Let me explain:
My data is a 5x5 grid of points (so n = 25). Now say I want to choose J points. I can work out all possible combinations, combo, using the function combn. But this is a very large matrix, and given what I want to achieve in the end, I can define an equivalence class by rotation (90, 180, 270 degrees) and reflection. So, for example, p1 is equivalent to p2, p3, p4, p5, ..., p8:
data<-expand.grid(1:5,1:5)
J=5 # for example
combo<-combn(25,J)
# rotation symmetry
p1=c(1,6,15,20,25)
p2=c(3,4,5,21,22)
p3=c(1,6,11,20,25)
p4=c(4,5,21,22,23)
# reflection symmetry
p5=c(5,10,11,16,21)
p6=c(1,2,23,24,25)
p7=c(5,10,15,16,21)
p8=c(1,2,3,24,25)
# to help you visualize
par(mfrow=c(4,2))
equiv<-rbind(p1,p2,p3,p4,p5,p6,p7,p8)
fn<-function(x){
p.col=rep(1,25);p.col[x]=2
plot(expand.grid(1:5,1:5),col=p.col,asp=1)}
apply(equiv,1,fn)
After this, I can simply eliminate the equivalent rows, so that my combo is a much smaller matrix.
So basically, I am looking for a script that ultimately gives me the compact version of combo.
Any help is appreciated.
Thanks.
edit: I haven't tried anything yet. I was hoping there will be some R package for graph theory/combinatorics that does this.
For each combination, you can enumerate the other elements of the equivalence class,
compute some numeric quantity that identifies them (say, an MD5 checksum),
and only keep the combination if it has the smallest value.
# Enumerate the transformations (the dihedral group of order 8)
k <- 5
d1 <- expand.grid( 1:k, 1:k )
d2 <- expand.grid( k:1, 1:k )
d3 <- expand.grid( 1:k, k:1 )
d4 <- expand.grid( k:1, k:1 )
o1 <- order(d1[,1], d1[,2])
o2 <- order(d2[,1], d2[,2])
o3 <- order(d3[,1], d3[,2])
o4 <- order(d4[,1], d4[,2])
o5 <- order(d1[,2], d1[,1])
o6 <- order(d2[,2], d2[,1])
o7 <- order(d3[,2], d3[,1])
o8 <- order(d4[,2], d4[,1])
g1 <- function(p) o1[p]
g2 <- function(p) o2[p]
g3 <- function(p) o3[p]
g4 <- function(p) o4[p]
g5 <- function(p) o5[p]
g6 <- function(p) o6[p]
g7 <- function(p) o7[p]
g8 <- function(p) o8[p]
transformations <- list(g1,g2,g3,g4,g5,g6,g7,g8)
# Check that we have all the transformations
op <- par(mfrow=c(3,3), las=2, mar=c(1,1,1,1))
for( f in transformations ) {
plot( d1 )
lines( d1[f(1:10),] )
}
par(op)
# Function to decide whether to keep a value
library(digest)
keep <- function(p, d) {
q0 <- digest( d[ sort(p), , drop=FALSE] )
q <- sapply( transformations, function(f) digest( d[ sort(f(p)), , drop=FALSE ] ) )
q0 == sort(q)[1]
}
# Apply the function on each column
i <- apply(combo, 2, keep, d=d1) # Long...
length(i) / sum(i) # Around 8 (not exactly, because some of those combinations are symmetric)
result <- combo[,i]
In your example, we only keep one of the 8 elements:
apply( equiv, 1, keep, d=d1 )
# p1 p2 p3 p4 p5 p6 p7 p8
# FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
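A variation on the same idea that avoids the digest package is sketched below. It is not the code above: it builds the 8 symmetries by rotating and flipping a matrix of cell numbers, turns each combination into a canonical text key (the smallest key over the 8 images), and keeps the first combination seen for each key. The names idx, rot90, flip, layouts and canonical are hypothetical helpers, and J = 3 is used only to keep the demo fast.

```r
k <- 5
# cell numbers laid out like expand.grid(1:k, 1:k): row = Var1, column = Var2
idx <- matrix(seq_len(k * k), nrow = k)

rot90 <- function(m) t(m[nrow(m):1, , drop = FALSE])   # quarter turn
flip  <- function(m) m[nrow(m):1, , drop = FALSE]      # mirror image

# the 8 symmetric layouts: 4 rotations, each with and without a flip
layouts <- list()
m <- idx
for (r in 1:4) {
  layouts <- c(layouts, list(as.vector(m), as.vector(flip(m))))
  m <- rot90(m)
}

# canonical key of a combination p: smallest key among its 8 images
canonical <- function(p) {
  keys <- vapply(layouts, function(s) {
    paste(sprintf("%02d", sort(match(p, s))), collapse = "-")
  }, character(1))
  min(keys)
}

combo  <- combn(k * k, 3)
keys   <- apply(combo, 2, canonical)
result <- combo[, !duplicated(keys), drop = FALSE]     # one column per class
ncol(combo) / ncol(result)
```

Because equivalent combinations share the same key, !duplicated(keys) keeps exactly one representative per equivalence class; which representative is kept differs from the digest-based version.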
(Very) amateur coder and statistician working on a problem in R.
I have four integer lists: A, B, C, D.
A <- 1:133
B <- 1:266
C <- 1:266
D <- c(1:133, 267:400)
I want R to generate all of the permutations from picking 1 item from each of these lists (I know this code will take forever to run), and then take the mean of each of those permutations. So, for instance, [1, 100, 200, 400] -> 175.25.
Ideally what I would have at the end is a list of all of these means then.
Any ideas?
Here's how I'd do this for a smaller but similar problem:
A <- 1:13
B <- 1:26
C <- 1:26
D <- c(1:13, 27:40)
mymat <- expand.grid(A, B, C, D)
names(mymat) <- c("A", "B", "C", "D")
mymat <- as.matrix(mymat)
mymeans <- rowSums(mymat)/4
You'll probably crash R if you just up all the indices, but you could probably set up a loop, something like this (not tested):
B <- 1:266
C <- 1:266
D <- c(1:133, 267:400)
for(A in 1:133) {
mymat <- expand.grid(A, B, C, D)
names(mymat) <- c("A", "B", "C", "D")
mymat <- as.matrix(mymat)
mymeans <- rowSums(mymat)/4
write.table(mymat, file = paste("matrix", A, "txt", sep = "."))
write.table(mymeans, file = paste("means", A, "txt", sep = "."))
rm(mymat, mymeans)
}
to get them all. That still might be too big, in which case you could do a nested loop, or loop over D (since it's the biggest).
Alternatively,
n <- 1e7
A <- sample(133, size = n, replace= TRUE)
B <- sample(266, size = n, replace= TRUE)
C <- sample(266, size = n, replace= TRUE)
D <- sample(x = c(1:133, 267:400), size = n, replace= TRUE)
mymeans <- (A+B+C+D)/4
will give you a large sample of the means and take no time at all.
hist(mymeans)
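If only the distribution of the means is needed, and not the individual combinations, another option is to count how many combinations produce each possible sum. The sketch below is an alternative to both the loop and the sampling approach, not something from the original posts; count_values, add_counts, ways and mean_table are names invented here. It builds the counts one list at a time, so nothing close to the full set of combinations is ever stored.

```r
A <- 1:133
B <- 1:266
C <- 1:266
D <- c(1:133, 267:400)

# counts[v] = how many times the value v occurs in x
count_values <- function(x) tabulate(x, nbins = max(x))

# combine two count vectors: out[s] = number of pairs whose values sum to s
add_counts <- function(f, g) {
  out <- numeric(length(f) + length(g))
  for (i in which(f > 0)) {
    idx <- i + seq_along(g)
    out[idx] <- out[idx] + f[i] * g
  }
  out
}

# ways[s] = number of (A, B, C, D) picks whose sum is s
ways <- Reduce(add_counts, lapply(list(A, B, C, D), count_values))

sums <- which(ways > 0)
mean_table <- data.frame(mean = sums / 4, count = ways[sums])
head(mean_table)
```

The resulting table has one row per distinct mean, so it can be plotted directly, for example with plot(mean_table$mean, mean_table$count, type = "h").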
Even creating a vector of means as large as the number of your permutations will likely use up all of your memory. You will have to split this into smaller problems; look up how to write objects to Excel and how to remove objects from memory (both are covered on SO).
As for the code to do this, I've tried to keep it as simple as possible so that it's easy to 'grow' your knowledge:
#this is how to create vectors of sequential integers in R
a <- c(1:33)
b <- c(1:33)
c <- c(1:33)
d <- c(1:33,267:300)
#this is how to create an empty vector
means <- rep(NA,length(a)*length(b)*length(c)*length(d))
#set up for a loop
i <- 1
#how you run a loop to perform this operation
for(j in 1:length(a)){
for(k in 1:length(b)){
for(l in 1:length(c)){
for(m in 1:length(d)){
y <- c(a[j],b[k],c[l],d[m])
means[i] <- mean(y)
i <- i+1
}
}
}
}
#and to graph your output
hist(means, col='brown')
#lets put a mean line through the histogram
abline(v=mean(means), col='white', lwd=2)