Find equivalence class of graphs from a matrix of coordinates - r

Let me explain:
My data is a 5x5 grid of points (so n=25). Now say I want to choose J points. I can work out all possible combinations (combo) using the function combn. But this is a very large matrix, and for what I want to achieve in the end, I can define an equivalence class by rotation (90, 180, 270 degrees) and reflection. So, for example, p1 is equivalent to p2, p3, p4, p5, ..., p8.
data<-expand.grid(1:5,1:5)
J=5 # for example
combo<-combn(25,J)
# rotation symmetry
p1=c(1,6,15,20,25)
p2=c(3,4,5,21,22)
p3=c(1,6,11,20,25)
p4=c(4,5,21,22,23)
# reflection symmetry
p5=c(5,10,11,16,21)
p6=c(1,2,23,24,25)
p7=c(5,10,15,16,21)
p8=c(1,2,3,24,25)
# to help you visualize
par(mfrow=c(4,2))
equiv<-rbind(p1,p2,p3,p4,p5,p6,p7,p8)
fn<-function(x){
p.col=rep(1,25);p.col[x]=2
plot(expand.grid(1:5,1:5),col=p.col,asp=1)}
apply(equiv,1,fn)
After this, I can simply eliminate the equivalent rows, so that my combo is a much smaller matrix.
So basically, I am looking for a script that ultimately gives me the compact version of combo.
Any help is appreciated.
Thanks.
Edit: I haven't tried anything yet. I was hoping there would be some R package for graph theory/combinatorics that does this.

For each combination, you can enumerate the other elements of the equivalence class,
compute some numeric quantity that identifies them (say, an MD5 checksum),
and only keep the combination if it has the smallest value.
# Enumerate the transformations (the dihedral group of order 8)
k <- 5
d1 <- expand.grid( 1:k, 1:k )
d2 <- expand.grid( k:1, 1:k )
d3 <- expand.grid( 1:k, k:1 )
d4 <- expand.grid( k:1, k:1 )
o1 <- order(d1[,1], d1[,2])
o2 <- order(d2[,1], d2[,2])
o3 <- order(d3[,1], d3[,2])
o4 <- order(d4[,1], d4[,2])
o5 <- order(d1[,2], d1[,1])
o6 <- order(d2[,2], d2[,1])
o7 <- order(d3[,2], d3[,1])
o8 <- order(d4[,2], d4[,1])
g1 <- function(p) o1[p]
g2 <- function(p) o2[p]
g3 <- function(p) o3[p]
g4 <- function(p) o4[p]
g5 <- function(p) o5[p]
g6 <- function(p) o6[p]
g7 <- function(p) o7[p]
g8 <- function(p) o8[p]
transformations <- list(g1,g2,g3,g4,g5,g6,g7,g8)
# Check that we have all the transformations
op <- par(mfrow=c(3,3), las=2, mar=c(1,1,1,1))
for( f in transformations ) {
plot( d1 )
lines( d1[f(1:10),] )
}
par(op)
# Function to decide whether to keep a value
library(digest)
keep <- function(p, d) {
q0 <- digest( d[ sort(p), , drop=FALSE] )
q <- sapply( transformations, function(f) digest( d[ sort(f(p)), , drop=FALSE ] ) )
q0 == sort(q)[1]
}
# Apply the function on each column
i <- apply(combo, 2, keep, d=d1) # Long...
length(i) / sum(i) # Around 8 (not exactly, because some of those combinations are symmetric)
result <- combo[,i]
In your example, we only keep one of the 8 elements:
apply( equiv, 1, keep, d=d1 )
# p1 p2 p3 p4 p5 p6 p7 p8
# FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
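If you would rather avoid the checksums, one possible variant of the same idea is to build the 8 symmetries as permutations of the point indices, map every combination to a canonical key (the smallest of its 8 sorted images, written as a string), and keep the first combination seen for each key. This is only a sketch under the assumption that the points are numbered as in expand.grid(1:k, 1:k); the helper names symmetry_perms, canonical_key and orbit_reps are made up for this illustration.

```r
# The 8 symmetries of a k x k grid, each as a permutation of 1:k^2.
# Point i has coordinates expand.grid(1:k, 1:k)[i, ], i.e. index = x + k * (y - 1).
symmetry_perms <- function(k) {
  g   <- expand.grid(x = 1:k, y = 1:k)
  idx <- function(x, y) x + k * (y - 1)   # coordinates -> point index
  r   <- function(v) k + 1 - v            # mirror a coordinate
  with(g, list(
    idx(x, y),        # identity
    idx(y, r(x)),     # rotation by 90 degrees
    idx(r(x), r(y)),  # rotation by 180 degrees
    idx(r(y), x),     # rotation by 270 degrees
    idx(r(x), y),     # reflection (left-right)
    idx(x, r(y)),     # reflection (up-down)
    idx(y, x),        # reflection (main diagonal)
    idx(r(y), r(x))   # reflection (anti-diagonal)
  ))
}

# Canonical key of a combination p: the smallest of its 8 sorted images.
# The choice of "smallest" is arbitrary; it only has to be the same for the whole class.
canonical_key <- function(p, perms) {
  keys <- vapply(perms, function(g) paste(sort(g[p]), collapse = ","), character(1))
  min(keys)
}

# One representative per equivalence class of J-point subsets of the k x k grid
orbit_reps <- function(k, J) {
  perms <- symmetry_perms(k)
  combo <- combn(k^2, J)
  keys  <- apply(combo, 2, canonical_key, perms = perms)
  combo[, !duplicated(keys), drop = FALSE]
}

result <- orbit_reps(3, 2)   # small case; orbit_reps(5, 5) matches the question
```

Compared with the digest approach, this keeps the first member of each class rather than the one with the smallest checksum, so the representative chosen may differ, but the number of classes is the same.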

Related

How do I add a conditional column based on an inequality with matrix multiplication (in R)?

I have a data frame of values and want to add a column based on an inequality condition that involves matrix multiplication.
The data frame looks like this
# Set possible values for variables
b0 <- b1 <- b2 <- seq(0, 2, by=0.1)
# Create data frame of all the different combos of these variables
df <- setNames(expand.grid(b0, b1, b2), c("b0", "b1", "b2"))
There are a lot of precursor objects I have to define before adding this column:
##### Set n
n = 100
#### Generate (x1i, x2i)
# Install and load the 'MASS' package
#install.packages("MASS")
library("MASS")
# Input univariate parameters
rho <- 0.5
mu1 <- 0; s1 <- 1
mu2 <- 0; s2 <- 1
# Generate parameters for bivariate normal distribution
mu <- c(mu1, mu2)
sigma <- matrix(c(s1^2, s1*s2*rho, s1*s2*rho, s2^2), nrow=2, ncol=2)
# Generate draws from bivariate normal distribution
bvn <- mvrnorm(n, mu=mu, Sigma=sigma ) # from MASS package
x1 <- bvn[, 1]
x2 <- bvn[, 2]
##### Generate error
error <- rnorm(n)
##### Generate dependent variable
y <- 0.5 + x1 + x2 + error
##### Create the model
lm <- lm(y ~ x1 + x2)
# Setup parameters
n <- 100
K <- 3
c <- qf(.95, K, n - K)
# Define necessary objects
sigma_hat_sq <- 1
b0_hat <- summary(lm)$coefficients[1, 1]
b1_hat <- summary(lm)$coefficients[2, 1]
b2_hat <- summary(lm)$coefficients[3, 1]
x <- cbind(1, x1, x2)
I am trying to add this conditional column like this:
# Add a column to the data frame that says whether the condition holds
df <- transform(df, ueq = (
(1/(K*sigma_hat_sq))*
t(matrix(c(b0_hat-b0, b1_hat-b1, b2_hat-b2)))%*%
t(x)%*%x%*%
matrix(c(b0_hat-b0, b1_hat-b1, b2_hat-b2))
<= c
))
...but doing so generates the error message
Error in t(matrix(c(b0_hat - b0, b1_hat - b1, b2_hat - b2))) %*% t(x) :
non-conformable arguments
Mathematically, the condition is [1/(K*sigmahat^2)] (Bhat-B)' X'X (Bhat-B) <= c, where, for each triple (b0, b1, b2), (Bhat-B) is a 3x1 matrix with elements {B0hat-b0, B1hat-b1, B2hat-b2}. I'm just not sure how to write this condition in R.
Any help would be greatly appreciated!
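Inside transform(), b0, b1 and b2 are the full columns of df (21^3 = 9261 values each), so c(b0_hat-b0, b1_hat-b1, b2_hat-b2) is a vector of length 27783 rather than 3, and it cannot be multiplied by the 3 x 100 matrix t(x). To work with only one row of df at a time (and get a separate answer for each 1 x 3 matrix), you need a loop.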
A simple way to do this in R is mapply.
df <- transform(df, ueq = mapply(\(b0, b1, b2)
(1/(K*sigma_hat_sq)) *
t(c(b0_hat-b0, b1_hat-b1, b2_hat-b2)) %*%
t(x) %*% x %*%
c(b0_hat-b0, b1_hat-b1, b2_hat-b2)
<= c,
b0 = b0, b1 = b1, b2 = b2
))
With the data I generated, this leads to 91 TRUE rows (no seed is set in the question, so your count will differ).
sum(df$ueq)
[1] 91
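For what it's worth, the quadratic form can also be evaluated for all rows at once, without a loop. The sketch below is not the code from the answer above: it assumes a fixed seed, generates the correlated regressors by hand instead of with MASS::mvrnorm, and uses the names fit, crit, XtX and D (all introduced here only for illustration, partly to avoid masking lm and c). The trick is that row i of (D %*% XtX) * D sums to d_i' X'X d_i.

```r
set.seed(1)                                   # assumption: seed added for reproducibility
n <- 100; K <- 3
x1 <- rnorm(n)
x2 <- 0.5 * x1 + sqrt(0.75) * rnorm(n)        # correlation of about 0.5 with x1
y  <- 0.5 + x1 + x2 + rnorm(n)
fit   <- lm(y ~ x1 + x2)
b_hat <- coef(fit)                            # (b0_hat, b1_hat, b2_hat)
XtX   <- crossprod(cbind(1, x1, x2))          # X'X, a 3 x 3 matrix
crit  <- qf(0.95, K, n - K)
sigma_hat_sq <- 1

b  <- seq(0, 2, by = 0.1)
df <- setNames(expand.grid(b, b, b), c("b0", "b1", "b2"))

# Each row of D is (b - b_hat); the sign does not matter for a quadratic form
D <- sweep(as.matrix(df), 2, b_hat)
# Row-wise quadratic form d' X'X d, computed for all 9261 rows at once
q <- rowSums((D %*% XtX) * D) / (K * sigma_hat_sq)
df$ueq <- q <= crit
```

This does the same arithmetic as the mapply version, just as one matrix product, which tends to matter once the grid of candidate values gets large.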

How to restore attribute after union n igraphs?

Let's say I have n igraph objects g1, g2, ..., gn. They are undirected and weighted, so a new weight attribute has to be added to the result. I'd like to take the union of the n graphs to get a single weighted graph g.
According to the documentation (see ?graph.union), if the n graphs all have a weight attribute, it is renamed by adding a suffix _1, _2 (and _3, etc.), i.e. weight_1, weight_2, ..., weight_n.
I have seen the answer and wrote the code for n=3 graphs (see below).
Edited:
library(igraph)
rm(list=ls(all=TRUE)) # delete all objects
g1 <- graph_from_literal(A1-B1-C1)
g2 <- graph_from_literal(A2-B2-C2)
g3 <- graph_from_literal(A3-B3-C3)
E(g1)$weight <- c(1, 2)
E(g2)$weight <- c(3, 4)
E(g3)$weight <- c(5, 6)
g <- union(g1, g2, g3)
new_attr <- as.list(list.edge.attributes(g))
k <- length(new_attr) # number of new attributes
value_new_attr <- lapply(list.edge.attributes(g),
function(x) get.edge.attribute(g,x))
df <- data.frame()
for (i in 1:k) {df <- rbind(df, value_new_attr[[i]])}
E(g)$weight <- colSums(df, na.rm=TRUE)
g <- delete_edge_attr(g, "weight_1") # 1
g <- delete_edge_attr(g, "weight_2") # 2
g <- delete_edge_attr(g, "weight_3") # 3
Question. How can I rewrite the last three commands with the lapply() function?
My attempt does not work:
g <- lapply(value_new_attr, function(x) {g <- delete_edge_attr(g, x)})
I have found the solution with for-loop
# delete edge attributes with suffix
for (i in 1:k) {g <- delete_edge_attr(g, new_attr[[i]])}
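As a side note on why the lapply() attempt cannot work: lapply() calls the function once per element and returns a list of results, so at best you would get a list of graphs, each with a single attribute removed (and the attempt passes the attribute values rather than their names). Removing the attributes one after another is a fold, which in base R is Reduce(). A minimal sketch, assuming the suffixed attributes are exactly those matching weight_<number>; the names suffixed and w are introduced here for illustration only.

```r
library(igraph)

g1 <- graph_from_literal(A1-B1-C1); E(g1)$weight <- c(1, 2)
g2 <- graph_from_literal(A2-B2-C2); E(g2)$weight <- c(3, 4)
g3 <- graph_from_literal(A3-B3-C3); E(g3)$weight <- c(5, 6)
g  <- union(g1, g2, g3)

# Names of the suffixed attributes created by union(): weight_1, weight_2, ...
suffixed <- grep("^weight_[0-9]+$", edge_attr_names(g), value = TRUE)

# One column per suffixed attribute, one row per edge; NA where an edge is absent
w <- do.call(cbind, lapply(suffixed, function(a) edge_attr(g, a)))
E(g)$weight <- rowSums(w, na.rm = TRUE)

# Fold over the names: each step takes the graph returned by the previous step
g <- Reduce(delete_edge_attr, suffixed, g)
```

Reduce(f, x, init) computes f(f(f(init, x[1]), x[2]), x[3]), which is what the for-loop does.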

Euclidean distance for each row in dataset

Suppose we have dataset G2:
data(iris)
G2 <- iris[1:5, -5]
We need to calculate the average Euclidean distance between x (a row in G2) and G2 (excluding x) for all x's in G2, formally $D(x) = \frac{1}{|G_2| - 1} \sum_{y \in G_2 \setminus \{x\}} d(x, y)$.
I wonder what is the best way to do this. Here is my initial attempt:
D <- dist(G2)
m1 <- as.matrix(D)
(1 / (5 - 1)) * colSums(m1)
Your notation is a bit confusing because you use D differently in the code and formula. How about
m <- as.matrix(dist(G2, upper=T))
D <- apply(m, 2, mean)
n <- length(D)
D <- n/(n-1)*D
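Both versions give the same numbers, since the diagonal of the distance matrix is zero: the column mean times n/(n-1) is just the column sum divided by n-1. Here is a small check against an explicit leave-one-out computation; the names avg_dist and by_hand are only for this illustration.

```r
G2 <- iris[1:5, -5]
m  <- as.matrix(dist(G2))        # symmetric matrix of pairwise Euclidean distances
n  <- nrow(m)

# Average distance from each row to the other n - 1 rows
avg_dist <- colSums(m) / (n - 1)

# The same quantity computed directly from the definition, one row at a time
by_hand <- sapply(seq_len(n), function(i) {
  others <- as.matrix(G2[-i, ])                 # all rows except row i
  diffs  <- sweep(others, 2, unlist(G2[i, ]))   # subtract row i from every other row
  mean(sqrt(rowSums(diffs^2)))
})

all.equal(unname(avg_dist), by_hand)
```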

Speeding up this tricky matrix calculation

As of now I am computing some features from a large matrix and doing it all in a for-loop. As expected it's very slow. I have been able to vectorize part of the code, but I'm stuck on one part.
I would greatly appreciate some advice/help!
s1 <- MyMatrix #dim = c(5167,256)
fr <- MyVector #vector of length 256
tw <- 5
fw <- 6
# For each point S(t,f) we need the sub-matrix of points S_hat(i,j),
# i in [t - tw, t + tw], j in [f - fw, f + fw] for the feature vector.
# To avoid edge effects, I pad the original matrix with zeros,
# resulting in a matrix of size nobs+2*tw x nfreqs+2*fw
nobs <- dim(s1)[1] #note: this is 5167
nf <- dim(s1)[2] #note: this is 256
sp <- matrix(0, nobs+2*tw, nf+2*fw)
t1 <- tw+1; tn <- nobs+tw
f1 <- fw+1; fn <- nf+fw
sp[t1:tn, f1:fn] <- s1 # embed the actual matrix into the padding
nfeatures <- 1 + (2*tw+1)*(2*fw+1) + 1
fsp <- array(NaN, c(dim(sp),nfeatures))
for (t in t1:tn){
for (f in f1:fn){
fsp[t,f,1] <- fr[(f - f1 + 1)] #this part I can vectorize
fsp[t,f,2:(nfeatures-1)] <- as.vector(sp[(t-tw):(t+tw),(f-fw):(f+fw)]) #this line is the problem
fsp[t,f,nfeatures] <- var(fsp[t,f,2:(nfeatures-1)])
}
}
fsp[t1:tn, f1:fn, 1] <- t(matrix(rep(fr,(tn-t1+1)),ncol=(tn-t1+1)))
#vectorized version of the first feature ^
return(fsp[t1:tn, f1:fn, ]) #this is the returned matrix
I assume that the var feature will be easy to vectorize after the 2nd feature is vectorized
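One way to vectorize the window feature is to swap the loops: instead of visiting every point (t, f) and copying its window, loop over the (2*tw+1)*(2*fw+1) window offsets and copy a whole shifted block of the padded matrix per offset. The following is a sketch only. window_features is a name made up here, it returns the unpadded nobs x nf x nfeatures array directly, and the variance uses the sum-of-squares shortcut, which can be less accurate than var() when the values are large relative to their spread.

```r
window_features <- function(s1, fr, tw, fw) {
  nobs <- nrow(s1); nf <- ncol(s1)
  # Zero-padded copy of s1, as in the question
  sp <- matrix(0, nobs + 2 * tw, nf + 2 * fw)
  sp[tw + seq_len(nobs), fw + seq_len(nf)] <- s1

  nwin <- (2 * tw + 1) * (2 * fw + 1)
  out  <- array(NaN, c(nobs, nf, nwin + 2))
  out[, , 1] <- rep(fr, each = nobs)            # feature 1: fr[f] for every t

  # Features 2..(nwin + 1): same order as as.vector() of the window,
  # i.e. the row offset varies fastest, the column offset slowest
  k <- 1
  for (dj in -fw:fw) {
    for (di in -tw:tw) {
      k <- k + 1
      out[, , k] <- sp[tw + di + seq_len(nobs), fw + dj + seq_len(nf)]
    }
  }

  # Last feature: sample variance over the window, for all points at once
  win <- out[, , 2:(nwin + 1), drop = FALSE]
  mu  <- rowMeans(win, dims = 2)
  out[, , nwin + 2] <- (rowSums(win^2, dims = 2) - nwin * mu^2) / (nwin - 1)
  out
}

# Small made-up input, just to show the call
set.seed(1)
feat <- window_features(matrix(rnorm(20 * 8), 20, 8), fr = 1:8, tw = 5, fw = 6)
dim(feat)
```

With tw = 5 and fw = 6 that is 143 block copies instead of 5167 * 256 small ones. Note that the full array for the real data holds roughly 5167 * 256 * 145 doubles (about 1.5 GB), and win is a second copy of most of it, so memory may become the limiting factor.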

Linear Independence of Large Sparse Matrices in R

I have three large matrices: I, G, and G^2. These are 4 million x 4 million matrices and they are sparse. I would like to check whether they are linearly independent, and I would like to do this in R.
For small matrices, a way to do this is to vectorize each matrix: stack the columns on top of each other and test whether the matrix formed by the three stacked vectors has rank three.
However, due to the size of my problem I am not sure how to proceed.
(1) Is there a way to vectorize a Large Sparse Matrix into a Very Large Sparse Vector in R?
(2) Is there any other solution to the problem that could make this test efficient ?
Thanks in advance
When converting your matrices to vectors, you can keep only the non-zero elements.
# Sample data
n <- 4e6
k <- n
library(Matrix)
I <- spMatrix(n, n, 1:n, 1:n, rep(1,n))
G <- spMatrix(n, n,
sample(1:n, k, replace=TRUE),
sample(1:n, k, replace=TRUE),
sample(0:9, k, replace=TRUE)
)
G2 <- G %*% G
G2 <- as(G2, "dgTMatrix") # For the j slot
# Only keep elements that are non-zero in one of the 3 matrices
i <- as.integer( c(G@i, G2@i, I@i) + 1 )
j <- as.integer( c(G@j, G2@j, I@j) + 1 )
ij <- cbind(i,j)
rankMatrix( cbind( G2[ij], G[ij], I[ij] ) ) # 3
# Another example
m <- ceiling(n/2)-1
G <- spMatrix(n, n,
c(1:n, 2*(1:m)),
c(1:n, 2*(1:m)+1),
rep(1, n+m)
)
G2 <- as(G %*% G, "dgTMatrix")
i <- c(G@i, G2@i, I@i) + 1
j <- c(G@j, G2@j, I@j) + 1
ij <- cbind(i,j)
rankMatrix( cbind( G2[ij], G[ij], I[ij] ) ) # 2
(To speed things up, you could take only a small part of those vectors:
if the rank is already 3, you know that they are independent,
if it is 2, you can check if the linear dependence relation also holds for the large vectors.)
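To see the idea on something you can inspect by eye, here is the same approach on 6 x 6 matrices, wrapped in two small helpers. nonzero_positions and stacked_rank are names made up for this sketch, and the duplicate positions are dropped with unique(), which is fine at this size but is an extra cost at 4 million rows.

```r
library(Matrix)

# Hypothetical helper: union of the non-zero positions of several sparse matrices,
# as a two-column matrix of 1-based (row, column) indices
nonzero_positions <- function(...) {
  ij <- lapply(list(...), function(m) {
    tm <- as(m, "TsparseMatrix")     # triplet form exposes the 0-based @i and @j slots
    cbind(tm@i, tm@j) + 1L
  })
  unique(do.call(rbind, ij))
}

# Hypothetical helper: rank of the matrices restricted to those positions,
# one column per matrix
stacked_rank <- function(...) {
  ij <- nonzero_positions(...)
  V  <- sapply(list(...), function(m) m[ij])
  as.integer(rankMatrix(V))
}

n <- 6
I <- sparseMatrix(i = 1:n, j = 1:n, x = rep(1, n))

# Cyclic shift: I, G and G^2 have disjoint supports, so they are independent
G <- sparseMatrix(i = 1:n, j = c(2:n, 1), x = rep(1, n))
stacked_rank(I, G, G %*% G)   # 3

# Same pattern as the second example above: here H^2 = 2*H - I
H <- sparseMatrix(i = c(1:n, 2, 4), j = c(1:n, 3, 5), x = rep(1, n + 2))
stacked_rank(I, H, H %*% H)   # 2
```

Restricting to the union of non-zero positions loses nothing, because every other position contributes a row of zeros to the stacked matrix.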
