Produce a triangular matrix of integers increasing by 1 - r

I am trying to produce a matrix of variable dimensions of the form below (i.e. integers increasing by 1 at a time, with a lower triangle of NAs)
NA 1 2 3 4
NA NA 5 6 7
NA NA NA 8 9
NA NA NA NA 10
NA NA NA NA 11
I have used the below code
sample_vector <- c(1:(total_nodes^2))
sample_matrix <- matrix(sample_vector, nrow=total_nodes, byrow=FALSE)
sample_matrix[lower.tri(sample_matrix, diag = TRUE)] <- NA
However the matrix I get with this method is of the form:
NA 2 3 4 5
NA NA 8 9 10
NA NA NA 14 15
NA NA NA NA 20
NA NA NA NA 25

How about this
total_nodes <- 5
sample_matrix <- matrix(NA, nrow=total_nodes, ncol=total_nodes)
sample_matrix[lower.tri(sample_matrix)]<-1:sum(lower.tri(sample_matrix))
sample_matrix <- t(sample_matrix)
sample_matrix
# [,1] [,2] [,3] [,4] [,5]
# [1,] NA 1 2 3 4
# [2,] NA NA 5 6 7
# [3,] NA NA NA 8 9
# [4,] NA NA NA NA 10
# [5,] NA NA NA NA NA

I'm using the diag function to construct a matrix and upper.tri to turn it into a "target" as well as a logical indexing tool:
upr5 <- upper.tri(diag(5))
upr5
upr5[upr5] <- 1:sum(upr5)
upr5[upr5==0] <- NA # would otherwise have been zeroes
upr5
[,1] [,2] [,3] [,4] [,5]
[1,] NA 1 2 4 7
[2,] NA NA 3 5 8
[3,] NA NA NA 6 9
[4,] NA NA NA NA 10
[5,] NA NA NA NA NA
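Note that the two answers number the cells in different orders: the transpose trick counts along rows (as the question asks), while direct upper.tri indexing counts down columns. As a minimal sketch, the row-wise variant can be wrapped in a small function; the name upper_seq is made up here for illustration and does not come from either answer.

```r
# Hypothetical helper: number the strict upper triangle of an n x n matrix
# row by row, leaving the diagonal and lower triangle as NA.
upper_seq <- function(n) {
  m <- matrix(NA_integer_, nrow = n, ncol = n)
  # R fills column by column, so fill the LOWER triangle first ...
  m[lower.tri(m)] <- seq_len(n * (n - 1) / 2)
  # ... and transpose so the numbers run along the rows of the upper triangle.
  t(m)
}

m <- upper_seq(5)
m[1, ]  # NA 1 2 3 4
```

Because the size is an argument, the same call should work for any total_nodes.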

Related

How to incorporate adjacency matrix into equation (network simulation)

I've generated a small world network with 16 agents with igraph:
myNetwork <- sample_smallworld(dim = 1, nei = 1, size = 16, p = 0.1) #generate small world
plot(myNetwork, vertex.size=20, vertex.label=c(1:16), layout=layout_in_circle) #inspect the network
In a separate dataframe, stack, I have each of these agents' opinion (opinion1):
> stack
agent opinion1
1 1 0.71979146
2 2 0.25040406
3 3 0.50866647
4 4 0.53713674
5 5 0.53954982
6 6 0.23903034
7 7 0.03989347
8 8 0.29350197
9 9 0.85441826
10 10 0.44565889
11 11 0.28223782
12 12 0.39748249
13 13 0.17488017
14 14 0.08804374
15 15 0.61174168
16 16 0.30949636
I now want to calculate each agent's updated opinion (let's call it opinion2) by applying this equation, where networkNeighborsOpinion1 refers to the opinion1s of the agents that are connected in myNetwork:
opinion2 <- 0.5 * opinion1 * 0.5 * (mean(networkNeighborsOpinion1))
Given myNetwork and stack$opinion1, how can I efficiently apply this equation to each agent?
Here's my thinking so far...
From myNetwork, the corresponding adjacency matrix can be retrieved like so:
adjMatrix <- as.matrix(as_adjacency_matrix(myNetwork, names = TRUE, edges = FALSE))
adjMatrix[adjMatrix == 0] <- NA #turn all 0s into NAs
> adjMatrix
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15]
[1,] NA 1 NA NA NA NA NA NA NA NA NA NA NA NA NA
[2,] 1 NA 1 NA NA NA NA NA NA NA NA NA NA NA NA
[3,] NA 1 NA NA NA NA NA NA NA NA NA NA NA 1 NA
[4,] NA NA NA NA 1 NA NA NA NA NA NA NA NA NA NA
[5,] NA NA NA 1 NA NA NA NA NA NA NA NA NA NA NA
[6,] NA NA NA NA NA NA 1 NA NA NA NA NA NA 1 NA
[7,] NA NA NA NA NA 1 NA 1 NA NA NA NA NA NA NA
[8,] NA NA NA NA NA NA 1 NA 1 NA NA NA NA NA NA
[9,] NA NA NA NA NA NA NA 1 NA 1 NA NA NA NA NA
[10,] NA NA NA NA NA NA NA NA 1 NA 1 NA NA NA NA
[11,] NA NA NA NA NA NA NA NA NA 1 NA 1 NA NA NA
[12,] NA NA NA NA NA NA NA NA NA NA 1 NA 1 NA NA
[13,] NA NA NA NA NA NA NA NA NA NA NA 1 NA 1 NA
[14,] NA NA 1 NA NA 1 NA NA NA NA NA NA 1 NA 1
[15,] NA NA NA NA NA NA NA NA NA NA NA NA NA 1 NA
[16,] 1 NA NA NA NA NA NA NA NA NA NA NA NA NA 1
Each agent is represented by a row in adjMatrix, and each network connection is indicated by a value of 1.
Then, it seems like there should be a way to use each row of adjMatrix to call the appropriate values from stack$opinion1 and generate a vector of networkNeighborsOpinion1, which could then be used to compute an opinion2 for each agent. Note that I've changed the 0s in adjMatrix to NAs, which follows my thinking that each row could be multiplied by the corresponding values in stack$opinion1 (i.e., each opinion1 is multiplied by either 1 or NA, and the result could then be passed to mean(networkNeighborsOpinion1, na.rm = TRUE)).
Any direction on this would be appreciated. Perhaps a for loop or function?
Multiply the adjacency matrix by opinion1 and divide by the sum of corresponding rows in the adjacency matrix. Then average that with opinion1.
adjMatrix <- as.matrix(as_adjacency_matrix(myNetwork, names = TRUE, edges = FALSE))
0.5 * stack$opinion1 + 0.5 * (adjMatrix %*% stack$opinion1) / rowSums(adjMatrix)
Note
stack in reproducible form is:
Lines <- " agent opinion1
1 1 0.71979146
2 2 0.25040406
3 3 0.50866647
4 4 0.53713674
5 5 0.53954982
6 6 0.23903034
7 7 0.03989347
8 8 0.29350197
9 9 0.85441826
10 10 0.44565889
11 11 0.28223782
12 12 0.39748249
13 13 0.17488017
14 14 0.08804374
15 15 0.61174168
16 16 0.30949636"
stack <- read.table(text = Lines)
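To see what the matrix expression does, here is a small sketch on a hand-built network that needs no igraph. The 3-agent line graph and the opinion values are made up for illustration, and the update follows the answer's reading of the rule (an equal-weight average of the agent's own opinion and the mean of its neighbours' opinions).

```r
# Toy network 1 - 2 - 3 (made-up data, not the question's small-world graph).
# The adjacency matrix keeps its 0s; they must NOT be turned into NA here.
adj <- matrix(c(0, 1, 0,
                1, 0, 1,
                0, 1, 0), nrow = 3, byrow = TRUE)
opinion1 <- c(0.2, 0.4, 0.8)

# adj %*% opinion1 sums the neighbours' opinions for each agent;
# dividing by rowSums(adj) (the number of neighbours) turns the sum into a mean.
neighbour_mean <- as.vector(adj %*% opinion1) / rowSums(adj)

# Average each agent's own opinion with its neighbours' mean.
opinion2 <- 0.5 * opinion1 + 0.5 * neighbour_mean
opinion2  # 0.30 0.45 0.60
```

One thing to watch for: an agent with no neighbours has a row sum of 0, so its neighbour mean comes out as NaN and would need separate handling.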

Shift each row in dataframe increasingly to the right

I would like to have a dataframe like this for example:
example=data.frame(a=c(1,2,3,4,5,6,7,8), b=c(1,2,3,4,5,6,7,8), c=c(1,2,3,4,5,6,7,8), d = c(1,2,3,4,5,6,7,8))
a b c d
1 1 1 1 1
2 2 2 2 2
3 3 3 3 3
4 4 4 4 4
5 5 5 5 5
6 6 6 6 6
7 7 7 7 7
8 8 8 8 8
I would like it to be transformed so that the first row stays in the same position, but each row after that shifts one column to the right from the previous row, for example:
a b c d X X.1 X.2 X.3 X.4 X.5 X.6
1 1 1 1 1 NA NA NA NA NA NA NA
2 NA 2 2 2 2 NA NA NA NA NA NA
3 NA NA 3 3 3 3 NA NA NA NA NA
4 NA NA NA 4 4 4 4 NA NA NA NA
5 NA NA NA NA 5 5 5 5 NA NA NA
6 NA NA NA NA NA 6 6 6 6 NA NA
7 NA NA NA NA NA NA 7 7 7 7 NA
8 NA NA NA NA NA NA NA 8 8 8 8
The purpose is to sum each column afterwards (that is, for each new column, the values across rows are added together), so the column names don't matter and can move freely.
Any help would be much appreciated as I've yet to stumble across anything to achieve this kind of data transformation.
EDIT: Thanks to everyone who answered, all the solutions worked great!
Here is another base R for loop. I construct the matrix first, and then fill it in.
# build matrix of missing values
myMat <- matrix(NA, nrow(example), ncol(example) + nrow(example) - 1)
# fill it in by row with vector pulled from row of example and simplified with `unlist`
for(i in seq_len(nrow(myMat))) {
myMat[i, i:(ncol(example) + i - 1)] <- unlist(example[i,], use.names=FALSE)
}
This returns
myMat
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
[1,] 1 1 1 1 NA NA NA NA NA NA NA
[2,] NA 2 2 2 2 NA NA NA NA NA NA
[3,] NA NA 3 3 3 3 NA NA NA NA NA
[4,] NA NA NA 4 4 4 4 NA NA NA NA
[5,] NA NA NA NA 5 5 5 5 NA NA NA
[6,] NA NA NA NA NA 6 6 6 6 NA NA
[7,] NA NA NA NA NA NA 7 7 7 7 NA
[8,] NA NA NA NA NA NA NA 8 8 8 8
A simple approach is to create a matrix of NA and fill in the values you want.
Shifted = matrix(NA, nrow=nrow(example),
ncol=nrow(example) + ncol(example) - 1)
for(i in 1:nrow(example)) {
Shifted[i, i:(i+ncol(example)-1)] = unlist(example[i,])
}
If you really want a data.frame, you can finish off with
as.data.frame(Shifted)
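Both loops above place element [i, j] of example at position [i, i + j - 1] of the wider matrix. As a loop-free sketch of the same idea (an alternative, not what either answer does), that mapping can be written with a two-column index matrix, followed by the column sums the question is ultimately after. The variable names x, shifted and col_totals are chosen here for illustration.

```r
example <- data.frame(a = 1:8, b = 1:8, c = 1:8, d = 1:8)
x <- as.matrix(example)

# Wide matrix of NA: one extra column for every row after the first.
shifted <- matrix(NA_real_, nrow(x), nrow(x) + ncol(x) - 1)

# Each row of the index matrix is a (row, column) target position:
# element [i, j] of x goes to [i, i + j - 1].
target <- cbind(as.vector(row(x)), as.vector(row(x) + col(x) - 1))
shifted[target] <- x

# Column sums, ignoring the NA padding.
col_totals <- colSums(shifted, na.rm = TRUE)
col_totals  # 1 3 6 10 14 18 22 26 21 15 8
```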

Index matrix with randomly sampled columns

I'm trying to fill the columns of an index matrix with samples from 1:whatever using a for loop. The purpose of this is for a bootstrap coding problem. The issue I'm getting is that the for loop won't run correctly once it reaches a number that is not a divisor of the number of rows. For some reason it thinks I want an equal representation of numbers in each column. How do I get this to stop?
index.mat=matrix(NA,nr=12,nc=10,byrow=FALSE)
for(i in 1:5)
{
index.mat[,i] <- sample(1:i, i, replace=TRUE)
print(index.mat)
}
will print
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 1 2 1 NA NA NA NA NA NA
[2,] 1 1 2 4 NA NA NA NA NA NA
[3,] 1 1 2 1 NA NA NA NA NA NA
[4,] 1 1 2 2 NA NA NA NA NA NA
[5,] 1 1 2 1 NA NA NA NA NA NA
[6,] 1 1 2 4 NA NA NA NA NA NA
[7,] 1 1 2 1 NA NA NA NA NA NA
[8,] 1 1 2 2 NA NA NA NA NA NA
[9,] 1 1 2 1 NA NA NA NA NA NA
[10,] 1 1 2 4 NA NA NA NA NA NA
[11,] 1 1 2 1 NA NA NA NA NA NA
[12,] 1 1 2 2 NA NA NA NA NA NA
as the final matrix before giving the error
Error in index.mat[, i] <- sample(1:i, i, replace = TRUE) :
number of items to replace is not a multiple of replacement length
Just use sample(i, size = 12, replace = TRUE).
Your LHS is index.mat[,i] which has length 12.
Your RHS is sample(1:i, i, replace = TRUE), which has length i.
By nature, R will recycle the RHS when the lengths don't match up -- this means, when i=2, your RHS is length 2 and it will simply be repeated 6 times to match the LHS length 12.
In this particular case, if the RHS length isn't a divisor of the LHS length, you get an error -- which happens first when, you guessed it, i=5 (since 1, 2, 3, and 4 all divide 12 evenly).
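A minimal sketch of the corrected loop, assuming the goal is 12 draws per column with column i sampling from 1:i. The seed is arbitrary and only there to make the run repeatable, and the loop is extended to all 10 columns for illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

index.mat <- matrix(NA_integer_, nrow = 12, ncol = 10)

for (i in seq_len(ncol(index.mat))) {
  # Draw as many values as there are rows, so the right-hand side always
  # has length 12 and nothing needs to be recycled.
  index.mat[, i] <- sample(i, size = nrow(index.mat), replace = TRUE)
}

anyNA(index.mat)  # FALSE: every column is filled
```

Asking for nrow(index.mat) values rather than a hard-coded 12 keeps the loop correct if the matrix dimensions change.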

Create a Triangular Matrix from a Vector performing sequential operations

I have been trying to solve the following problem.
Suppose I have the following vector:
aux1<-c(0,0,0,4,5,0,7,0,0,10,11,12) where the numbers represent the number of the row.
I want to calculate the distance between the different elements of this vector, fixing the first component, then the second, and so on.
If the element is zero, I do not want to count it, so I put a NA instead. The output I want should look like this:
NA NA NA NA NA
NA NA NA NA NA
NA NA NA NA NA
NA NA NA NA NA
1 NA NA NA NA
NA NA NA NA NA
3 2 NA NA NA
NA NA NA NA NA
NA NA NA NA NA
6 5 3 NA NA
7 6 4 1 NA
8 7 5 2 1
In the first column, I have the difference between the first element different from zero and all other elements, i.e., Matrix[5,1]=5-4=1 and Matrix[12,1]=12-4=8. Also, Matrix[7,2]=7-5=2, where 5 is the second element in the vector non-equal to zero. Notice that Matrix[10,3]=10-7=3, where 7 is third element non-equal to zero, but the seventh element in my vector.
I have tried to do this in a loop. My current code looks like this:
M=matrix(nrow=N-1, ncol=N-1)
for (i in 1:N-1){
for (j in 1:N-1){
if(j<=i)
next
else
if(aux1[j]>0)
M[j,i]=aux1[j]-aux1[i]
else
M[j,i]=0
}
}
Unfortunately, I have not been able to solve my problem. Any help would be greatly appreciated.
You could try something like the following (with generous help from @thela)
res <- outer(aux1, head(aux1[aux1 > 0], -1), `-`)
is.na(res) <- res <= 0
# [,1] [,2] [,3] [,4] [,5]
# [1,] NA NA NA NA NA
# [2,] NA NA NA NA NA
# [3,] NA NA NA NA NA
# [4,] NA NA NA NA NA
# [5,] 1 NA NA NA NA
# [6,] NA NA NA NA NA
# [7,] 3 2 NA NA NA
# [8,] NA NA NA NA NA
# [9,] NA NA NA NA NA
# [10,] 6 5 3 NA NA
# [11,] 7 6 4 1 NA
# [12,] 8 7 5 2 1
Using sapply and ifelse :
sapply(head(aux1[aux1>0],-1),function(y)ifelse(aux1-y>0,aux1-y,NA))
You loop over the positive values (dropping the last one), then subtract each of them from the original vector. I used ifelse to replace the non-positive differences with NA.
# [,1] [,2] [,3] [,4] [,5]
# [1,] NA NA NA NA NA
# [2,] NA NA NA NA NA
# [3,] NA NA NA NA NA
# [4,] NA NA NA NA NA
# [5,] 1 NA NA NA NA
# [6,] NA NA NA NA NA
# [7,] 3 2 NA NA NA
# [8,] NA NA NA NA NA
# [9,] NA NA NA NA NA
# [10,] 6 5 3 NA NA
# [11,] 7 6 4 1 NA
# [12,] 8 7 5 2 1
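As a quick cross-check, the two answers can be run side by side on the question's vector. This is only a sketch; the names refs, res_outer and res_sapply are introduced here for illustration.

```r
aux1 <- c(0, 0, 0, 4, 5, 0, 7, 0, 0, 10, 11, 12)

# Reference values: every positive entry except the last one.
refs <- head(aux1[aux1 > 0], -1)

# Approach 1: all pairwise differences at once, then blank out the
# non-positive ones (zeros in aux1 and entries at or before the reference).
res_outer <- outer(aux1, refs, `-`)
is.na(res_outer) <- res_outer <= 0

# Approach 2: one column per reference value, built with sapply and ifelse.
res_sapply <- sapply(refs, function(y) ifelse(aux1 - y > 0, aux1 - y, NA))

all.equal(res_outer, res_sapply)
```

Both give a 12 x 5 matrix; the outer version avoids the explicit per-column function.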

How to order columns and rows to create a relatively dense submatrix

I have a large matrix which comprises 1,2 and missing (coded as NA) values. The matrix has 500000 rows by 10000 columns. There are approximately 0.05% 1- or 2-values, and the remaining values are NA.
I would like to reorder the rows and columns of the matrix so that the top left corner of the matrix comprises a relatively high number of 1s and 2s compared to the rest of the matrix. In other words, I would like to create a relatively datarich subset of the matrix, by reordering the matrix rows and columns.
Is there an efficient method of achieving this in R?
In particular, I'm interested in solutions where sorting by the number of non-NA values in each row and column is not sufficient to produce a dense corner.
In addition, I'll add a constraint. The size of the dense corner will be pre-defined.
In the following example, the goal is to reorder the rows and columns so that the top leftmost 3x3 submatrix is relatively dense (i.e. few or no NA values).
m1 <- matrix(c(rep(c(rep(NA, 3), rep(1, 7)), 1),
rep(c(rep(2, 3), rep(NA, 7)), 7),
rep(c(rep(NA, 3), rep(1, 7)), 2)
), nrow=10, byrow=TRUE)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] NA NA NA 1 1 1 1 1 1 1
[2,] 2 2 2 NA NA NA NA NA NA NA
[3,] 2 2 2 NA NA NA NA NA NA NA
[4,] 2 2 2 NA NA NA NA NA NA NA
[5,] 2 2 2 NA NA NA NA NA NA NA
[6,] 2 2 2 NA NA NA NA NA NA NA
[7,] 2 2 2 NA NA NA NA NA NA NA
[8,] 2 2 2 NA NA NA NA NA NA NA
[9,] NA NA NA 1 1 1 1 1 1 1
[10,] NA NA NA 1 1 1 1 1 1 1
The rows and columns are ordered by the number of non-NA values (using code from an answer below):
m1 <- m1[order(rowSums(is.na(m1))), order(colSums(is.na(m1)))]
However, this does not result in a dense 3x3 top leftmost corner:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] NA NA NA 1 1 1 1 1 1 1
[2,] NA NA NA 1 1 1 1 1 1 1
[3,] NA NA NA 1 1 1 1 1 1 1
[4,] 2 2 2 NA NA NA NA NA NA NA
[5,] 2 2 2 NA NA NA NA NA NA NA
[6,] 2 2 2 NA NA NA NA NA NA NA
[7,] 2 2 2 NA NA NA NA NA NA NA
[8,] 2 2 2 NA NA NA NA NA NA NA
[9,] 2 2 2 NA NA NA NA NA NA NA
[10,] 2 2 2 NA NA NA NA NA NA NA
I thought that there may be a set of optimisation procedures that I could implement, as my working matrix is too large to do the reorganisation by eye.
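To make "relatively dense" measurable, the share of non-NA cells in the top-left k x k block can be computed for any proposed ordering. The sketch below only scores orderings on the example matrix; it is not a solution to the optimisation problem, and the helper name corner_density is made up for illustration.

```r
m1 <- matrix(c(rep(c(rep(NA, 3), rep(1, 7)), 1),
               rep(c(rep(2, 3), rep(NA, 7)), 7),
               rep(c(rep(NA, 3), rep(1, 7)), 2)),
             nrow = 10, byrow = TRUE)

# Hypothetical helper: fraction of non-NA cells in the top-left k x k block
# after reordering rows and columns by the given index vectors.
corner_density <- function(m, rows, cols, k = 3) {
  mean(!is.na(m[rows[seq_len(k)], cols[seq_len(k)], drop = FALSE]))
}

# Ordering by the number of NA values per row and column, as in the question.
by_counts <- corner_density(m1, order(rowSums(is.na(m1))),
                            order(colSums(is.na(m1))))

# A hand-picked alternative: any three of the "2" rows with columns 1 to 3.
by_hand <- corner_density(m1, 2:4, 1:3)

c(by_counts = by_counts, by_hand = by_hand)  # 0 and 1
```

A score like this could serve as the objective for whatever search or optimisation procedure is eventually used.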
