Compute degree of each vertex from data frame - r

I have the following dataset:
V1 V2
2 1
3 1
3 2
4 1
4 2
4 3
5 1
6 1
7 1
7 5
7 6
I tried to compute the degree of each vertex with the code
e<-read.table("ex.txt")
library(igraph)
g1<-graph.data.frame(e, directed=FALSE)
adj<- get.adjacency(g1,type=c("both", "upper", "lower"),attr=NULL, names=TRUE, sparse=FALSE)
d<-rowSums(adj)
e$degreeOfV1<-d[e$V1]
e$degofV2<-d[e$V2]
the degree given by this code is not correct.

The problem with this code is that the nodes have be inputted into your graph in a different order than you expected:
V(g1)
# + 7/7 vertices, named:
# [1] 2 3 4 5 6 7 1
The first node in the graph (corresponding to element 1 of your d object) is actually node number 2 in e, element 2 is node number 3 in e, etc.
You can deal with this by using the node names instead of the node numbers when calculating the degrees:
d <- degree(g1)
e$degreeOfV1 <- d[as.character(e$V1)]
e$degreeOfV2 <- d[as.character(e$V2)]
# V1 V2 degreeOfV1 degreeOfV2
# 1 2 1 3 6
# 2 3 1 3 6
# 3 3 2 3 3
# 4 4 1 3 6
# 5 4 2 3 3
# 6 4 3 3 3
# 7 5 1 2 6
# 8 6 1 2 6
# 9 7 1 3 6
# 10 7 5 3 2
# 11 7 6 3 2
Basically the way this works is that degree(g1) returns a named vector of the degrees of each node in your graph:
(d <- degree(g1))
# 2 3 4 5 6 7 1
# 3 3 3 2 2 3 6
When you index by strings (as.character(e$V1) instead of e$V1), then you get the node by name instead of by index number.

Related

Transforming a looping factor variable into a sequence of numerics

I have a factor variable with 6 levels, which simplified looks like:
1 1 2 2 2 3 3 3 4 4 4 4 5 5 5 6 6 6 1 1 1 2 2 2 2... 1 1 1 2 2... (with n = 78)
Note, that each number is repeated mostly but not always three times.
I need to transform this variable into the following pattern:
1 1 2 2 2 3 3 3 4 4 4 4 5 5 5 6 6 6 7 7 7 8 8 8 8...
where each repetition of the 6 levels continuous counting ascending.
Is there any way / any function that lets me do that?
Sorry for my bad description!
Assuming that you have a numerical vector that represents your simplified version you posted. i.e. x = c(1,1,1,2,2,3,3,3,1,1,2,2), you can use this:
library(dplyr)
cumsum(x != lag(x, default = 0))
# [1] 1 1 1 2 2 3 3 3 4 4 5 5
which compares each value to its previous one and if they are different it adds 1 (starting from 1).
Maybe you can try rle, i.e.,
v <- rep(seq_along((v<-rle(x))$values),v$lengths)
Example with dummy data
x = c(1,1,1,2,2,3,3,3,4,4,5,6,1,1,2,2,3,3,3,4,4)
then we can get
> v
[1] 1 1 1 2 2 3 3 3 4 4 5 6 7 7 8 8 9 9
[19] 9 10 10
In base you can use diff and cumsum.
c(1, cumsum(diff(x)!=0)+1)
# [1] 1 1 2 2 2 3 3 3 4 4 4 4 5 5 5 6 6 6 7 7 7 8 8 8 8
Data:
x <- c(1,1,2,2,2,3,3,3,4,4,4,4,5,5,5,6,6,6,1,1,1,2,2,2,2)

Sequentially remove vector elements

I want to replicate a vector with one value within this vector is missing (sequentially).
For example, my vector is
value <- 1:7
First, the series is without 1, second without 2, and so on. In the end, the series is in one vector.
The intended output looks like
2 3 4 5 6 7 1 3 4 5 6 7 1 2 4 5 6 7 1 2 3 5 6 7 1 2 3 4 6 7 1 2 3 4 5 6
Is there any smart way to do this?
You could use the diagonal matrix to set up a logical vector, using it to remove the appropriate values.
n <- 7
rep(1:n, n)[!diag(n)]
# [1] 2 3 4 5 6 7 1 3 4 5 6 7 1 2 4 5 6 7 1 2 3 5 6 7 1 2 3 4 6 7 1 2 3 4 5
# [36] 7 1 2 3 4 5 6
Well, you can certainly do it as a one-liner but I am not sure it qualifies as smart. For example:
x <- 1:7
do.call("c", lapply(as.list(-1:-length(x)), function(a)x[a]))
This simple uses lapply to create a list of copies of x with each of its entries deleted, and then concatenates them using c. The do.call function applies its first argument (a function) to its second argument (a list of arguments to the function).
For fun, it's also possible to just use rep:
> n <- 7
> rep(1:n, n)[rep(c(FALSE, rep(TRUE, n)), length.out=n^2)]
[1] 2 3 4 5 6 7 1 3 4 5 6 7 1 2 4 5 6 7 1 2 3 5 6 7 1 2 3 4 6 7 1 2 3 4 5 7 1 2
[39] 3 4 5 6
But lapply is cleaner, I think.
You could also do:
n <- 7
rep(seq(n), n)[-seq(1,n*n,n+1)]
#[1] 2 3 4 5 6 7 1 3 4 5 6 7 1 2 4 5 6 7 1 2 3 5 6 7 1 2 3 4 6 7 1 2 3 4 5 7 1 2 3 4 5 6

Create set of matrices from concatenating columns of another matrix in r

I have two matrices A and B of dimension 5 by 3 and 5 by 2, respectively. I want to produce series of matrices combining each column of matrix B to A. The dimensions of the resulting matrices would be 5 by 4
Let A be
1 2 3
4 5 6
7 8 9
2 3 1
4 1 5
and B be
1 2
2 5
3 8
6 3
2 1
Then the resulting matrices are
1 2 3 1
4 5 6 2
7 8 9 3
2 3 1 6
4 1 5 2
and
1 2 3 2
4 5 6 5
7 8 9 8
2 3 1 3
4 1 5 1
Use our old friend the assignment operator. Assigning 1st column of B to 4th of A:
A[, 4] <- B[, 1]
> A
V1 V2 V3 V4
1 1 2 3 1
2 4 5 6 2
3 7 8 9 3
4 2 3 1 6
5 4 1 5 2
Then A[, 4] <- B[, 2], etc.

How to reverse a column in R

I have a dataframe as described below. Now I want to reverse the order of column B without hampering the total order of the dataframe. So now the column B has 5,4,3,2,1. I want to change it to 1,2,3,4,5. I don't want to sort as it will hamper the total ordering.
A B C
1 5 6
2 4 8
3 3 5
4 2 5
5 1 3
You can replace just that column:
x$B <- rev(x$B)
On your data:
> x$B <- rev(x$B)
> x
A B C
1 1 1 6
2 2 2 8
3 3 3 5
4 4 4 5
5 5 5 3
transform is also handy for this:
> transform(x, B = rev(B))
A B C
1 1 1 6
2 2 2 8
3 3 3 5
4 4 4 5
5 5 5 3
This doesn't modify x so you need to assign the result to something (perhaps back to x).

Interpreting the result of 'cutree' from hclust/heatmap.2

I have the following code that perform hiearchical clustering and plot
them in heatmap.
set.seed(538)
# generate data
y <- matrix(rnorm(50), 10, 5, dimnames=list(paste("g", 1:10, sep=""),
paste("t", 1:5, sep="")))
# the actual data is much larger that the above
# perform hiearchical clustering and plot heatmap
test <- heatmap.2(y)
What I want to do is to print the cluster member from each hierarchy
of in the plot. I'm not sure what's the good way to do it.
I tried this:
cutree(as.hclust(test$rowDendrogram), 1:dim(y)[1])
But having problem in interpreting the result.
What's the meaning of each value in the matrix?
For example g9-9=8 . What does 8 mean here?
1 2 3 4 5 6 7 8 9 10
g1 1 1 1 1 1 1 1 1 1 1
g2 1 2 2 2 2 2 2 2 2 2
g3 1 2 2 3 3 3 3 3 3 3
g4 1 2 2 2 2 2 2 2 2 4
g5 1 1 1 1 1 1 1 4 4 5
g6 1 2 3 4 4 4 4 5 5 6
g7 1 2 2 2 2 5 5 6 6 7
g8 1 2 3 4 5 6 6 7 7 8
g9 1 2 3 4 4 4 7 8 8 9
g10 1 2 3 4 5 6 6 7 9 10
Your expert advice will be greatly appreciated.
Column j tells you how your gs should be grouped if you wanted exactly j groups.
Columns 1 and 10 are not very useful, but maybe column 2 is a good example. It is telling you that if you wanted exactly two groups then they would be:
group1: {g1, g5}
group2: {g2, g3, g4, g6, g7, g8, g9, g10}

Resources