How to transform an item set matrix in R

How can I transform a matrix like
A 1 2 3
B 3 6 9
C 5 6 9
D 1 2 4
into a form like:
1 2 3 4 5 6 7 8 9
1 0 2 1 1 0 0 0 0 0
2 0 0 1 1 0 0 0 0 0
3 0 0 0 0 0 1 0 0 1
4 0 0 0 0 0 0 0 0 0
5 0 0 0 0 0 1 0 0 1
6 0 0 0 0 0 0 0 0 2
7 0 0 0 0 0 0 0 0 0
8 0 0 0 0 0 0 0 0 0
9 0 0 0 0 0 0 0 0 0
I have an implementation of this, but it uses a for loop.
I wonder if there is a built-in function in R (for example "apply") that can do it.
Added:
Sorry for the confusion. The rows of the first matrix are item sets, and every set is expanded into pairs. For example, the first set "1 2 3" becomes (1,2), (1,3), (2,3), and each pair increments the corresponding cell of the second matrix.
Another question:
If the matrix is very large (10000000*10000000) and sparse,
should I use a sparse matrix or big.matrix?
Thanks!

Removing the row names from M gives this:
m <- matrix(c(1,3,5,1,2,6,6,2,3,9,9,4), nrow=4)
> m
## [,1] [,2] [,3]
## [1,] 1 2 3
## [2,] 3 6 9
## [3,] 5 6 9
## [4,] 1 2 4
# The indices that you want to increment in x; some pairs are repeated
# combn() is used to compute the combinations of columns
indices <- matrix(t(m[,combn(1:3,2)]),,2,byrow=TRUE)
# Count repeated rows
ones <- rep(1,nrow(indices))
cnt <- aggregate(ones, by=as.data.frame(indices), FUN=sum)
# Set each value to the appropriate count
x <- matrix(0, 9, 9)
x[as.matrix(cnt[,1:2])] <- cnt[,3]
x
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
## [1,] 0 2 1 1 0 0 0 0 0
## [2,] 0 0 1 1 0 0 0 0 0
## [3,] 0 0 0 0 0 1 0 0 1
## [4,] 0 0 0 0 0 0 0 0 0
## [5,] 0 0 0 0 0 1 0 0 1
## [6,] 0 0 0 0 0 0 0 0 2
## [7,] 0 0 0 0 0 0 0 0 0
## [8,] 0 0 0 0 0 0 0 0 0
## [9,] 0 0 0 0 0 0 0 0 0
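
On the follow-up question about very large, sparse results: a minimal sketch, assuming the Matrix package (a recommended package in standard R installations) is available. It builds the same pair list as above and relies on sparseMatrix() adding up the values of duplicated (i, j) entries, so no separate counting step is needed. The names cols, pairs and x_sparse are illustrative only, and whether this scales to 10000000*10000000 depends on how many distinct pairs actually occur.

```r
library(Matrix)  # assumption: the Matrix package is installed

m <- matrix(c(1,3,5,1, 2,6,6,2, 3,9,9,4), nrow = 4)

# All column pairs: (1,2), (1,3), (2,3)
cols <- combn(ncol(m), 2)

# One row per item pair; each column pair contributes nrow(m) rows
pairs <- cbind(as.vector(m[, cols[1, ]]), as.vector(m[, cols[2, ]]))

# Duplicated (i, j) entries have their x values summed,
# so repeated pairs are counted automatically
x_sparse <- sparseMatrix(i = pairs[, 1], j = pairs[, 2],
                         x = rep(1, nrow(pairs)), dims = c(9, 9))
x_sparse
```

Only the non-zero cells are stored, which is the main reason to prefer this representation over a dense matrix when most pairs never occur.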

Related

Create a 10x10 matrix with the same value on the main diagonal and anti-diagonal and zero elsewhere, using a for loop in R

I would like to create a square matrix in R where the values on the main diagonal and the anti-diagonal are the same, namely 2. All other values are 0.
I have to use a "for" loop, but I have no idea how to apply it.
This produces what I want, but it is not acceptable because I must use a "for" loop:
a <- matrix(0 , 10,10)
diag(a) <- 2
a <- data.frame(a)
a <- as.matrix(data.frame(lapply(a , rev)))
diag(a) <- 2
colnames(a) <- NULL
a
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,] 2 0 0 0 0 0 0 0 0 2
#> [2,] 0 2 0 0 0 0 0 0 2 0
#> [3,] 0 0 2 0 0 0 0 2 0 0
#> [4,] 0 0 0 2 0 0 2 0 0 0
#> [5,] 0 0 0 0 2 2 0 0 0 0
#> [6,] 0 0 0 0 2 2 0 0 0 0
#> [7,] 0 0 0 2 0 0 2 0 0 0
#> [8,] 0 0 2 0 0 0 0 2 0 0
#> [9,] 0 2 0 0 0 0 0 0 2 0
#> [10,] 2 0 0 0 0 0 0 0 0 2
Here's possibly the quickest way to do it with a for-loop.
m <- matrix(0, 10, 10)
for(i in 0:9) m[11*i+1] <- m[10+i*9] <- 2
m
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,] 2 0 0 0 0 0 0 0 0 2
#> [2,] 0 2 0 0 0 0 0 0 2 0
#> [3,] 0 0 2 0 0 0 0 2 0 0
#> [4,] 0 0 0 2 0 0 2 0 0 0
#> [5,] 0 0 0 0 2 2 0 0 0 0
#> [6,] 0 0 0 0 2 2 0 0 0 0
#> [7,] 0 0 0 2 0 0 2 0 0 0
#> [8,] 0 0 2 0 0 0 0 2 0 0
#> [9,] 0 2 0 0 0 0 0 0 2 0
#> [10,] 2 0 0 0 0 0 0 0 0 2
This works because a matrix can be indexed with a single number representing the entries in column 1 (1:10), then column 2 (11:20) and so on. The diagonal starts at position 1 and repeats every 11 entries. The anti-diagonal starts at 10 and repeats every 9 entries.
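
To illustrate the index arithmetic described above for an arbitrary size, here is a small sketch. The variable n and the names diag_idx and anti_idx are introduced here for illustration; the idea is the same as in the one-line loop, just written with seq() so the two step sizes (n + 1 and n - 1) are visible.

```r
n <- 10
m <- matrix(0, n, n)

# Main diagonal: starts at position 1, then every n + 1 entries
diag_idx <- seq(1, n * n, by = n + 1)

# Anti-diagonal: starts at position n, then every n - 1 entries,
# ending at the first entry of the last column
anti_idx <- seq(n, n * n - n + 1, by = n - 1)

m[c(diag_idx, anti_idx)] <- 2
m
```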
If this is a homework assignment, your teacher will probably want you to use the [row, column] notation for subsetting and use nested for loops, so you would be safer submitting this:
m <- matrix(0, 10, 10)
for(i in 1:10) {
  for(j in 1:10) {
    if(i == j || i == 11 - j) {
      m[i, j] <- 2
    }
  }
}
m
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,] 2 0 0 0 0 0 0 0 0 2
#> [2,] 0 2 0 0 0 0 0 0 2 0
#> [3,] 0 0 2 0 0 0 0 2 0 0
#> [4,] 0 0 0 2 0 0 2 0 0 0
#> [5,] 0 0 0 0 2 2 0 0 0 0
#> [6,] 0 0 0 0 2 2 0 0 0 0
#> [7,] 0 0 0 2 0 0 2 0 0 0
#> [8,] 0 0 2 0 0 0 0 2 0 0
#> [9,] 0 2 0 0 0 0 0 0 2 0
#> [10,] 2 0 0 0 0 0 0 0 0 2
Though it would be fun watching your teacher getting their head around the first version...
Created on 2022-06-08 by the reprex package (v2.0.1)

Is there an R function to count the number of matching pairs in 2 vectors

I have 2 vectors A and B. Both are column vectors and both contain 1000 values.
For example:
A = c(1,2,5,1,6,2,8,2,9)
B = c(3,4,2,3,7,4,5,4,8)
I wish to count, in matrix form, how many times each pair (A[i], B[i]) occurs,
with the elements of vector A as rows and the elements of vector B as columns.
E.g. entry [1,3] is 2 because 1 from vector A is paired with 3 from vector B two times:
A\B 1 2 3 4 5 6 7 8
1 0 0 2 0 0 0 0 0
2 0 0 0 3 0 0 0 0
3 0 0 0 0 0 0 0 0
4 0 0 0 0 0 0 0 0
5 0 1 0 0 0 0 0 0
6 0 0 0 0 0 0 1 0
7 0 0 0 0 0 0 0 0
8 0 0 0 0 1 0 0 0
9 0 0 0 0 0 0 0 1
I know that I could write a for loop, and I was hoping the unique command would help, but so far I have been unable to produce working code.
MATLAB has the accumarray() function; does R have something similar?
You can use a loop if you want, but it is slower than Rui Barradas' solution in the comments. Perhaps others have faster loop options.
# Size the matrix by the largest value, not by length(A), so that every index fits
n <- max(A, B)
Am <- matrix(0, nrow = n, ncol = n)
for (i in seq_along(A)) {
  Am[A[i], B[i]] <- Am[A[i], B[i]] + 1
}
Output:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 0 0 2 0 0 0 0 0 0
[2,] 0 0 0 3 0 0 0 0 0
[3,] 0 0 0 0 0 0 0 0 0
[4,] 0 0 0 0 0 0 0 0 0
[5,] 0 1 0 0 0 0 0 0 0
[6,] 0 0 0 0 0 0 1 0 0
[7,] 0 0 0 0 0 0 0 0 0
[8,] 0 0 0 0 1 0 0 0 0
[9,] 0 0 0 0 0 0 0 1 0
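
A loop-free way to get the same counts, offered as a sketch (it is not necessarily the solution from the comments, which is not reproduced here): table() cross-tabulates two factors, and fixing the factor levels to 1..max keeps rows and columns for values that never occur. The names lev and counts are illustrative.

```r
A <- c(1, 2, 5, 1, 6, 2, 8, 2, 9)
B <- c(3, 4, 2, 3, 7, 4, 5, 4, 8)

# Use a common set of levels so empty rows/columns are kept
lev <- seq_len(max(A, B))

# Cross-tabulate: cell [a, b] counts how often (a, b) occurs as a pair
counts <- table(factor(A, levels = lev), factor(B, levels = lev))
counts
```

This plays roughly the role that accumarray() plays in MATLAB for the counting case.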

Matrix generation in R without loop

I am trying to create a matrix of the following kind in R: the number of rows is equal to n (supplied); in row i, for all i=1:n, the elements at positions n(i-1)+1 through n(i-1)+n inclusive are 1, all other elements are 0.
For example, if n=3, the matrix looks like
1 1 1 0 0 0 0 0 0
0 0 0 1 1 1 0 0 0
0 0 0 0 0 0 1 1 1
Or for n=4:
1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1
Is there any way of constructing this matrix in R, for general n, without using for loops (or any other kind of loop preferably)?
The simplest / most efficient method (in base R) would be ideal.
Solution 1: diag(3) creates a 3x3 identity matrix. Repeat each of its elements 3 times and reshape the result into a matrix, filling by row:
matrix(rep(diag(3), each=3), nrow=3, byrow=TRUE)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
#> [1,] 1 1 1 0 0 0 0 0 0
#> [2,] 0 0 0 1 1 1 0 0 0
#> [3,] 0 0 0 0 0 0 1 1 1
Solution 2: table interprets the two vectors as factors and counts the combinations of their levels. Since each combination only exists once, you get the same result:
table(rep(1:3, each = 3), 1:9)
#>
#> 1 2 3 4 5 6 7 8 9
#> 1 1 1 1 0 0 0 0 0 0
#> 2 0 0 0 1 1 1 0 0 0
#> 3 0 0 0 0 0 0 1 1 1
Created on 2021-02-21 by the reprex package (v1.0.0)
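
For general n, one further option is a Kronecker product, shown here as a sketch rather than as part of the original answer: each 1 on the diagonal of the identity matrix is expanded into a row of n ones, and each 0 into a row of n zeros. The name block_rows is illustrative.

```r
n <- 4

# kronecker() replaces every entry of diag(n) by that entry times a 1 x n row of ones
block_rows <- kronecker(diag(n), matrix(1, nrow = 1, ncol = n))
block_rows

# Solution 1 generalised to n, for comparison
via_rep <- matrix(rep(diag(n), each = n), nrow = n, byrow = TRUE)
all(block_rows == via_rep)
```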

Populate vector down or up with unique element value (like na.locf)

I have a large dataframe in which each column contains one flag from the set {-1,1}; all the other values are zero. I want to fill the column entries below or above the flag with a value corresponding to that flag. For example, given a vector representing one column, I have
v <- rep(0,15)
v[12] <- 1
# I'd want a function that is something like:
f <- function(v, flag) {
  for (i in 2:length(v)) {
    if (v[i-1] == flag) v[i] <- flag
  }
  v
}
> v
[1] 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
> f(v,1)
[1] 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1
The example works fine for filling forward some v and a flag 1. I'd also want to be able to fill backwards with 1 based on a -1 flag. The obvious solution that comes to mind is na.locf, except I can't get it to work with a 1 in the middle and filling forward and backwards. Even if I populate the 0 elements with NA, it will still not partially fill up or down based on a flag.
Are there any simple and fast vectorized functions that could do this with a matrix or zoo object populated with all zeros, except where there is one element with 1 or -1 in each column, telling it to fill down or up with 1s depending on the value?
edit: thinking about it a bit more, I came up with a possible solution, that along with an illustration, (hopefully) makes it more clear what I want.
Also, the overall goal is to create a mask for additions/deletions to a fund index, by date, that fills forward for additions (+1) and fills backward for removals (-1). That is also why I thought of na.locf right away. I am still not sure if this is the best approach for this block, though. Any thoughts appreciated.
# generate random matrix of flags
v.mtx <- matrix(0,15,10)
for(i in 1:10){
  v.mtx[sample(1:15,1),i] <- sample(c(-1,1),1)
}
fill.flag <- function(v) {
  if(any(-1 %in% v)) {v[1:which(v!=0)] <- 1}
  else
    if(any(1 %in% v)) {v[which(v!=0):length(v)] <- 1}
  v
}
> v.mtx
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 0 0 0 0 1 0 0 0 0
[2,] 0 0 0 0 0 0 0 0 0 0
[3,] 0 0 0 0 0 0 0 0 0 0
[4,] 0 0 0 0 0 0 0 0 0 0
[5,] 0 0 0 0 0 0 0 0 0 0
[6,] 0 0 0 0 1 0 -1 0 0 0
[7,] 0 0 0 -1 0 0 0 0 0 0
[8,] 0 0 0 0 0 0 0 0 0 0
[9,] 0 0 0 0 0 0 0 1 0 -1
[10,] 0 0 0 0 0 0 0 0 -1 0
[11,] 0 0 0 0 0 0 0 0 0 0
[12,] 0 0 0 0 0 0 0 0 0 0
[13,] 0 0 1 0 0 0 0 0 0 0
[14,] 0 0 0 0 0 0 0 0 0 0
[15,] 1 -1 0 0 0 0 0 0 0 0
> apply(v.mtx,2,fill.flag)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 1 0 1 0 1 1 0 1 1
[2,] 0 1 0 1 0 1 1 0 1 1
[3,] 0 1 0 1 0 1 1 0 1 1
[4,] 0 1 0 1 0 1 1 0 1 1
[5,] 0 1 0 1 0 1 1 0 1 1
[6,] 0 1 0 1 1 1 1 0 1 1
[7,] 0 1 0 1 1 1 0 0 1 1
[8,] 0 1 0 0 1 1 0 0 1 1
[9,] 0 1 0 0 1 1 0 1 1 1
[10,] 0 1 0 0 1 1 0 1 1 0
[11,] 0 1 0 0 1 1 0 1 0 0
[12,] 0 1 0 0 1 1 0 1 0 0
[13,] 0 1 1 0 1 1 0 1 0 0
[14,] 0 1 1 0 1 1 0 1 0 0
[15,] 1 1 1 0 1 1 0 1 0 0
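As @G. Grothendieck commented, you can try cummax and cummin, i.e.
f1 <- function(x){
  if(sum(x) == 1){
    # flag is 1: fill forward from the flag
    return(cummax(x))
  }else{
    # flag is -1: fill backward from the flag, then flip the sign to get 1s
    return(rev(cummin(rev(x)))* -1)
  }
}
# apply as usual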
apply(v.mtx, 2, f1)
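
As a quick check of what the two branches do, here is a small sketch on a hand-made matrix (flags is a made-up example, not the random v.mtx above): the first column holds a 1 flag and should fill downward, the second holds a -1 flag and should fill upward.

```r
# f1 as defined above
f1 <- function(x) {
  if (sum(x) == 1) {
    cummax(x)
  } else {
    rev(cummin(rev(x))) * -1
  }
}

# Made-up example: column 1 has a +1 flag in row 3, column 2 a -1 flag in row 2
flags <- cbind(c(0, 0, 1, 0, 0),
               c(0, -1, 0, 0, 0))

filled <- apply(flags, 2, f1)
filled
```

The first column comes out as 0 0 1 1 1 and the second as 1 1 0 0 0.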

multiple adjacency matrices for one edgelist R

I have the following edge list, in which a number associates each edge with a path. It is given by the following matrix, which I call Totallist:
Begin edge   End edge   Path number
1            3          1
3            4          1
4            5          1
6            3          2
3            2          2
I want to construct adjacency matrices for each of the paths. In this example, I want two matrices, but there could be more. I have written the following but it only finds the matrix for the first path. I am unsure how to write something that will work for any number of paths that I throw at it:
X<-as.data.frame(table(Totallist[,3]))
nlines<-nrow(X)
nlines
freq<-X[1,2]
diameterofmatrix<-max(Totallist)
X1<-get.adjacency(graph.edgelist(as.matrix(Totallist[1:X[1,2],1:2]), directed=FALSE))
X1<-rbind(X1, 0)
X1<-cbind(X1, 0)
X1
I also need the matrices to all be the same dimension so that is why I added an extra row and column. I could continue using my method but it seems quite ugly. Thank you very much for any help.
To extract the adjacency matrices into a list you can do the following (I generate some fake data):
set.seed(42)
df <- data.frame(beginEdge = sample(1:10, 10, replace = TRUE),
                 endEdge = sample(1:10, 10, replace=TRUE),
                 pathNum = rep(c(1,2), each=5))
df
beginEdge endEdge pathNum
1 10 5 1
2 10 8 1
3 3 10 1
4 9 3 1
5 7 5 1
6 6 10 2
7 8 10 2
8 2 2 2
9 7 5 2
10 8 6 2
paths <- unique(df$pathNum) # get the paths to iterate through
If we make the nodes factors and set the factor levels to all the nodes in the population, the adjacency matrices will cover the whole population of your network. I am assuming here that the network has ten actors. If your observed data contains all the nodes you want to work with, set the levels to unique(c(df$beginEdge,df$endEdge)), or to whatever set of nodes you prefer.
df$beginEdge <- factor(df$beginEdge, levels=1:10)
df$endEdge <- factor(df$endEdge, levels=1:10)
We now iterate over the paths and create the matrices, storing them in a list:
list.of.adj.mats <- lapply(paths, function(i){
  matrix(as.numeric((
    table(df$beginEdge[df$pathNum==i],
          df$endEdge[df$pathNum==i])+
    table(df$endEdge[df$pathNum==i],
          df$beginEdge[df$pathNum==i]))>0),
    nrow=length(levels(df$beginEdge)))})
list.of.adj.mats
[[1]]
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 0 0 0 0 0 0 0 0 0
[2,] 0 0 0 0 0 0 0 0 0 0
[3,] 0 0 0 0 0 0 0 0 1 1
[4,] 0 0 0 0 0 0 0 0 0 0
[5,] 0 0 0 0 0 0 1 0 0 1
[6,] 0 0 0 0 0 0 0 0 0 0
[7,] 0 0 0 0 1 0 0 0 0 0
[8,] 0 0 0 0 0 0 0 0 0 1
[9,] 0 0 1 0 0 0 0 0 0 0
[10,] 0 0 1 0 1 0 0 1 0 0
[[2]]
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 0 0 0 0 0 0 0 0 0
[2,] 0 1 0 0 0 0 0 0 0 0
[3,] 0 0 0 0 0 0 0 0 0 0
[4,] 0 0 0 0 0 0 0 0 0 0
[5,] 0 0 0 0 0 0 1 0 0 0
[6,] 0 0 0 0 0 0 0 1 0 1
[7,] 0 0 0 0 1 0 0 0 0 0
[8,] 0 0 0 0 0 1 0 0 0 1
[9,] 0 0 0 0 0 0 0 0 0 0
[10,] 0 0 0 0 0 1 0 1 0 0
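
The same idea can be applied to the edge list from the question. This is a sketch under assumptions: Totallist is rebuilt here as a data frame with hypothetical column names (begin, end, path), the nodes are taken to be 1 up to the largest node number, and the graph is treated as undirected, as in the question's directed=FALSE. Matrix indexing with a two-column index replaces the table() calls.

```r
# Hypothetical reconstruction of the asker's Totallist; column names are assumptions
Totallist <- data.frame(begin = c(1, 3, 4, 6, 3),
                        end   = c(3, 4, 5, 3, 2),
                        path  = c(1, 1, 1, 2, 2))

# All matrices share the same dimension: one row/column per node 1..6
n_nodes <- max(Totallist$begin, Totallist$end)

adj_by_path <- lapply(split(Totallist, Totallist$path), function(p) {
  a <- matrix(0, n_nodes, n_nodes)
  a[cbind(p$begin, p$end)] <- 1  # mark each edge
  a[cbind(p$end, p$begin)] <- 1  # mirror it, since the graph is undirected
  a
})

adj_by_path[["1"]]
adj_by_path[["2"]]
```

Because split() groups by the path column, this works for any number of paths without counting rows per path by hand.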
