Matrix generation in R without loop - r

I am trying to create a matrix of the following kind in R: the number of rows is equal to n (supplied); in row i, for all i=1:n, the elements at positions n(i-1)+1 through n(i-1)+n inclusive are 1, all other elements are 0.
For example, if n=3, the matrix looks like
1 1 1 0 0 0 0 0 0
0 0 0 1 1 1 0 0 0
0 0 0 0 0 0 1 1 1
Or for n=4:
1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1
Is there any way of constructing this matrix in R, for general n, without using for loops (or any other kind of loop preferably)?
The simplest / most efficient method (in base R) would be ideal.

Solution 1: diag(3) creates a 3 x 3 identity matrix. Repeat each element 3 times and reshape the result into a matrix, filling by row:
matrix(rep(diag(3), each=3), nrow=3, byrow=TRUE)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
#> [1,] 1 1 1 0 0 0 0 0 0
#> [2,] 0 0 0 1 1 1 0 0 0
#> [3,] 0 0 0 0 0 0 1 1 1
Solution 2: table interprets the two vectors as factors and counts the combinations of their levels. Since each combination only exists once, you get the same result:
table(rep(1:3, each = 3), 1:9)
#>
#> 1 2 3 4 5 6 7 8 9
#> 1 1 1 1 0 0 0 0 0 0
#> 2 0 0 0 1 1 1 0 0 0
#> 3 0 0 0 0 0 0 1 1 1
Created on 2021-02-21 by the reprex package (v1.0.0)
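Both solutions above hard-code n = 3. A minimal sketch of a general-n version follows. It assumes a hypothetical helper name block_rows and swaps in a Kronecker product, which is a different technique from the two answers: every 1 on the diagonal of the identity matrix is expanded into a 1 x n block of ones, every 0 into a block of zeros.

```r
# Hypothetical helper (not from the original answer): n x n^2 block matrix.
block_rows <- function(n) {
  # kronecker() replaces each entry of diag(n) by that entry times a 1 x n row of ones
  kronecker(diag(n), matrix(1, nrow = 1, ncol = n))
}

block_rows(3)

# Solution 1 generalises the same way by replacing each 3 with n:
block_rows_rep <- function(n) {
  matrix(rep(diag(n), each = n), nrow = n, byrow = TRUE)
}
```

Neither function uses an explicit loop, and both should give the same matrix for any positive n.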

Related

Is there an R function to count the number of matching pairs in 2 vectors

I have 2 vectors A and B. Both are column vectors and both contain 1000 values.
For example:
A = c(1,2,5,1,6,2,8,2,9)
B = c(3,4,2,3,7,4,5,4,8)
I wish to count, in matrix form, how many times each pair (A[i], B[i]) occurs,
with the elements of vector A as rows and the elements of vector B as columns.
E.g. entry [1,3] is 2, because a 1 in vector A is paired with a 3 in vector B two times:
A\B 1 2 3 4 5 6 7 8
1 0 0 2 0 0 0 0 0
2 0 0 0 3 0 0 0 0
3 0 0 0 0 0 0 0 0
4 0 0 0 0 0 0 0 0
5 0 1 0 0 0 0 0 0
6 0 0 0 0 0 0 1 0
7 0 0 0 0 0 0 0 0
8 0 0 0 0 1 0 0 0
9 0 0 0 0 0 0 0 1
I know that I need to create a for loop and I was hoping the unique command would work but as yet I have been unable to generate a working code.
In MATLAB there is accumarray() function, does R have something similar?
You can use a loop if you want, but it is slower than Rui Barradas's solution in the comments. Perhaps others have faster loop options.
# size the matrix by the largest values, not by length(A)
Am <- matrix(0, nrow = max(A), ncol = max(B))
for (i in seq_along(A)) {
  Am[A[i], B[i]] <- Am[A[i], B[i]] + 1  # tally the pair (A[i], B[i])
}
Output:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 0 0 2 0 0 0 0 0
[2,] 0 0 0 3 0 0 0 0
[3,] 0 0 0 0 0 0 0 0
[4,] 0 0 0 0 0 0 0 0
[5,] 0 1 0 0 0 0 0 0
[6,] 0 0 0 0 0 0 1 0
[7,] 0 0 0 0 0 0 0 0
[8,] 0 0 0 0 1 0 0 0
[9,] 0 0 0 0 0 0 0 1
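For a loop-free count, one base R option is table() with explicit factor levels, which plays a role similar to MATLAB's accumarray() for this task. The sketch below is an illustration, not necessarily the solution referred to in the comments. The explicit levels are there so that values which never occur (such as 3, 4 and 7 in A) still get an all-zero row.

```r
A <- c(1, 2, 5, 1, 6, 2, 8, 2, 9)
B <- c(3, 4, 2, 3, 7, 4, 5, 4, 8)

# Cross-tabulate the pairs (A[i], B[i]); levels 1..max keep the empty rows/columns
counts <- table(factor(A, levels = seq_len(max(A))),
                factor(B, levels = seq_len(max(B))))

counts[1, 3]  # how often a 1 in A is paired with a 3 in B
```

The result is a 9 x 8 table that can be indexed like a matrix; unclass(counts) drops the table class if a plain matrix is needed.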

Build a matrix to identify a step-wise change in integers for a vector

I'd like to build a matrix that records the change from one integer value to another for a vector.
Example Vector
a <- c(NA,1,3,4,2,6,5,3,7,7,NA,3,NA,5,5,NA,2,3,1,4)
Conceptual Matrix Design
Where I would tally every time a value in the vector a changes (or doesn't change) from one integer to another.
             To
       1 2 3 4 5 6 7
     1
     2
     3
From 4
     5
     6
     7
Desired Output
Note that NA's matter. E.g., 7,NA,3 in a does not count for from 7 to 3.
             To
       1 2 3 4 5 6 7
     1 0 0 1 1 0 0 0
     2 0 0 1 0 0 1 0
     3 1 0 0 1 0 0 1
From 4 0 1 0 0 0 0 0
     5 0 0 1 0 1 0 0
     6 0 0 0 0 1 0 0
     7 0 0 0 0 0 0 1
Using table
table(dplyr::lag(a),a)
a
1 2 3 4 5 6 7
1 0 0 1 1 0 0 0
2 0 0 1 0 0 1 0
3 1 0 0 1 0 0 1
4 0 1 0 0 0 0 0
5 0 0 1 0 1 0 0
6 0 0 0 0 1 0 0
7 0 0 0 0 0 0 1
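The same tally can be sketched without dplyr by pairing each element with its successor. This assumes, as in the table() call above, that pairs containing an NA should simply be dropped, which is table()'s default behaviour. The names from, to and trans are illustrative.

```r
a <- c(NA, 1, 3, 4, 2, 6, 5, 3, 7, 7, NA, 3, NA, 5, 5, NA, 2, 3, 1, 4)

lev  <- sort(unique(a))                   # sort() drops the NA
from <- factor(head(a, -1), levels = lev) # every element except the last
to   <- factor(tail(a, -1), levels = lev) # every element except the first

# Rows are the "from" value, columns the "to" value; NA pairs are not counted
trans <- table(From = from, To = to)
trans
```

Fixing the factor levels keeps the table square even if some value never appears as a "from" or as a "to".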
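# Base R alternative: flags whether each transition occurs at least once (0/1) rather than counting it
dict = sapply(2:length(a), function(i) toString(a[(i-1):i]))
unq = sort(unique(a))
+t(sapply(unq, function(x) sapply(unq, function(y) toString(c(x, y)) %in% dict)))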
# [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#[1,] 0 0 1 1 0 0 0
#[2,] 0 0 1 0 0 1 0
#[3,] 1 0 0 1 0 0 1
#[4,] 0 1 0 0 0 0 0
#[5,] 0 0 1 0 1 0 0
#[6,] 0 0 0 0 1 0 0
#[7,] 0 0 0 0 0 0 1
An option with tidyverse
library(tidyverse)
tibble(a, a1 = lag(a)) %>%
  dplyr::count(a1, a) %>%
  filter(!is.na(a), !is.na(a1)) %>%
  spread(a, n, fill = 0) %>%    # columns: the "to" value (a)
  column_to_rownames('a1')      # rows: the "from" value (a1)
# 1 2 3 4 5 6 7
#1 0 0 1 1 0 0 0
#2 0 0 1 0 0 1 0
#3 1 0 0 1 0 0 1
#4 0 1 0 0 0 0 0
#5 0 0 1 0 1 0 0
#6 0 0 0 0 1 0 0
#7 0 0 0 0 0 0 1

Populate vector down or up with unique element value (like na.locf)

I have a large dataframe in which each column contains exactly one flag from the set {-1,1}; all other values are zero. I want to fill the rest of the column, up or down, according to that flag's value. For example, given a vector representing one column, I have
v <- rep(0,15)
v[12] <- 1
# I'd want a function that is something like:
f <- function(v, flag) {
  for (i in 2:length(v)) {
    if (v[i - 1] == flag) v[i] <- flag
  }
  v
}
> v
[1] 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
> f(v,1)
[1] 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1
The example works fine for filling forward some v and a flag 1. I'd also want to be able to fill backwards with 1 based on a -1 flag. The obvious solution that comes to mind is na.locf, except I can't get it to work with a 1 in the middle and filling forward and backwards. Even if I populate the 0 elements with NA, it will still not partially fill up or down based on a flag.
Are there any simple and fast vectorized functions that could do this with a matrix or zoo object populated with all zeros, except where there is one element with 1 or -1 in each column, telling it to fill down or up with 1s depending on the value?
edit: thinking about it a bit more, I came up with a possible solution, that along with an illustration, (hopefully) makes it more clear what I want.
Also, the overall goal is to create a mask for additions/deletions to a fund index, by date, that fills forward for additions (+1) and fills backward for removals (-1), which is why I thought of na.locf right away. I'm still not sure this is the best approach for this block, though. Any thoughts appreciated.
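# generate a random matrix of flags
v.mtx <- matrix(0, 15, 10)
for (i in 1:10) {
  v.mtx[sample(1:15, 1), i] <- sample(c(-1, 1), 1)
}
# build the mask for a single column
fill.flag <- function(v) {
  if (-1 %in% v) {
    v[1:which(v != 0)] <- 1            # removal (-1): fill backward, from the top down to the flag
  } else if (1 %in% v) {
    v[which(v != 0):length(v)] <- 1    # addition (+1): fill forward, from the flag to the bottom
  }
  v
}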
> v.mtx
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 0 0 0 0 1 0 0 0 0
[2,] 0 0 0 0 0 0 0 0 0 0
[3,] 0 0 0 0 0 0 0 0 0 0
[4,] 0 0 0 0 0 0 0 0 0 0
[5,] 0 0 0 0 0 0 0 0 0 0
[6,] 0 0 0 0 1 0 -1 0 0 0
[7,] 0 0 0 -1 0 0 0 0 0 0
[8,] 0 0 0 0 0 0 0 0 0 0
[9,] 0 0 0 0 0 0 0 1 0 -1
[10,] 0 0 0 0 0 0 0 0 -1 0
[11,] 0 0 0 0 0 0 0 0 0 0
[12,] 0 0 0 0 0 0 0 0 0 0
[13,] 0 0 1 0 0 0 0 0 0 0
[14,] 0 0 0 0 0 0 0 0 0 0
[15,] 1 -1 0 0 0 0 0 0 0 0
> apply(v.mtx,2,fill.flag)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 1 0 1 0 1 1 0 1 1
[2,] 0 1 0 1 0 1 1 0 1 1
[3,] 0 1 0 1 0 1 1 0 1 1
[4,] 0 1 0 1 0 1 1 0 1 1
[5,] 0 1 0 1 0 1 1 0 1 1
[6,] 0 1 0 1 1 1 1 0 1 1
[7,] 0 1 0 1 1 1 0 0 1 1
[8,] 0 1 0 0 1 1 0 0 1 1
[9,] 0 1 0 0 1 1 0 1 1 1
[10,] 0 1 0 0 1 1 0 1 1 0
[11,] 0 1 0 0 1 1 0 1 0 0
[12,] 0 1 0 0 1 1 0 1 0 0
[13,] 0 1 1 0 1 1 0 1 0 0
[14,] 0 1 1 0 1 1 0 1 0 0
[15,] 1 1 1 0 1 1 0 1 0 0
As @G. Grothendieck commented, you can try cummax and cummin, i.e.
f1 <- function(x) {
  if (sum(x) == 1) {
    return(cummax(x))                 # +1 flag: fill forward
  } else {
    return(rev(cummin(rev(x))) * -1)  # -1 flag: fill backward, then flip the sign to 1
  }
}
# apply as usual
apply(v.mtx, 2, f1)
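To see why the cumulative functions do the job, here is a small worked sketch on two 5-element columns. The name fill_flag_sketch and the vectors up and down are illustrative, and the logic assumes exactly one non-zero flag per column, as in the question.

```r
# Same logic as f1 above, written compactly (illustrative name)
fill_flag_sketch <- function(x) {
  if (sum(x) == 1) cummax(x) else -rev(cummin(rev(x)))
}

up   <- c(0, 0,  1, 0, 0)  # addition flag: fill forward
down <- c(0, 0, -1, 0, 0)  # removal flag: fill backward

fill_flag_sketch(up)    # 0 0 1 1 1
fill_flag_sketch(down)  # 1 1 1 0 0
```

cummax() carries the 1 forward from the flag. For the -1 case the vector is reversed so that cummin() carries the -1 towards what was the top, then it is reversed back and negated to give a mask of 1s.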

Matrix from rows with delimited items in R

I have a database like this, with semicolon-delimited values in each row:
A;1;3;5;7;9
B;1;2;3
C;1;3;5
D;2;4;8
The number of items differs from row to row. Each item appears at most once in a row (no repeats).
I'd like to build a matrix for item-based collaborative filtering. The first column with letters is dropped and the numbers are transformed like this:
1 2 3 4 5 6 7 8 9
-----------------
1 0 1 0 1 0 1 0 1
1 1 1 0 0 0 0 0 0
1 0 1 0 1 0 0 0 0
0 1 0 1 0 0 0 1 0
Can you please advise me how to do this?
Here is an option. We read in the string into a character vector, strsplit on ;, initialize the empty matrix, and then assign for each row using a matrix index of the row with all the column values:
DAT <- readLines(textConnection("A;1;3;5;7;9
B;1;2;3
C;1;3;5
D;2;4;8"))
DAT.NUM <- lapply(strsplit(DAT, ";"), function(x) as.integer(x[-1]))
RES <- matrix(0L, length(DAT), max(unlist(DAT.NUM)))
for(i in seq_along(DAT)) RES[cbind(i, DAT.NUM[[i]])] <- 1L
Produces:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 0 1 0 1 0 1 0 1
[2,] 1 1 1 0 0 0 0 0 0
[3,] 1 0 1 0 1 0 0 0 0
[4,] 0 1 0 1 0 0 0 1 0
Alternatively, inspired by @user227710, you can:
t(table(stack(setNames(DAT.NUM, seq_along(DAT.NUM)))))
Which produces:
values
ind 1 2 3 4 5 7 8 9
1 1 0 1 0 1 1 0 1
2 1 1 1 0 0 0 0 0
3 1 0 1 0 1 0 0 0
4 0 1 0 1 0 0 1 0
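The for loop in the first approach can also be replaced by a single matrix-index assignment. The sketch below builds all (row, column) pairs up front; the names items, row_idx, col_idx and RES2 are illustrative, and it assumes the items are positive integers as in the example.

```r
DAT <- c("A;1;3;5;7;9", "B;1;2;3", "C;1;3;5", "D;2;4;8")

# Drop the leading letter and convert the remaining fields to integers
items <- lapply(strsplit(DAT, ";", fixed = TRUE), function(x) as.integer(x[-1]))

row_idx <- rep(seq_along(items), lengths(items))  # row number repeated once per item
col_idx <- unlist(items)                          # item value = column number

RES2 <- matrix(0L, length(items), max(col_idx))
RES2[cbind(row_idx, col_idx)] <- 1L               # set every (row, item) cell at once
RES2
```

Unlike the table() variant, this keeps an all-zero column for items that never occur (column 6 here).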

How to transform an item set matrix in R

How to transform a matrix like
A 1 2 3
B 3 6 9
C 5 6 9
D 1 2 4
into form like:
1 2 3 4 5 6 7 8 9
1 0 2 1 1 0 0 0 0 0
2 0 0 1 1 0 0 0 0 0
3 0 0 0 0 0 1 0 0 1
4 0 0 0 0 0 0 0 0 0
5 0 0 0 0 0 1 0 0 1
6 0 0 0 0 0 0 0 0 2
7 0 0 0 0 0 0 0 0 0
8 0 0 0 0 0 0 0 0 0
9 0 0 0 0 0 0 0 0 0
I have an implementation of this, but it uses a for loop.
I wonder if there is a built-in function in R (for example "apply") that can do it.
Addition:
Sorry for the confusion. The first matrix just lists item sets, and every set of items yields pairs. For example, the first set "1 2 3" becomes (1,2), (1,3), (2,3), which are the cells counted in the second matrix.
Another question:
If the matrix is very large (10000000*10000000) and sparse,
should I use a sparse matrix or big.matrix?
Thanks!
Removing the row names from M gives this:
m <- matrix(c(1,3,5,1,2,6,6,2,3,9,9,4), nrow=4)
> m
## [,1] [,2] [,3]
## [1,] 1 2 3
## [2,] 3 6 9
## [3,] 5 6 9
## [4,] 1 2 4
# The indices that you want to increment in x, but some are repeated
# combn() is used to compute the combinations of columns
indices <- matrix(t(m[,combn(1:3,2)]),,2,byrow=TRUE)
# Count repeated rows
ones <- rep(1,nrow(indices))
cnt <- aggregate(ones, by=as.data.frame(indices), FUN=sum)
# Set each value to the appropriate count
x <- matrix(0, 9, 9)
x[as.matrix(cnt[,1:2])] <- cnt[,3]
x
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
## [1,] 0 2 1 1 0 0 0 0 0
## [2,] 0 0 1 1 0 0 0 0 0
## [3,] 0 0 0 0 0 1 0 0 1
## [4,] 0 0 0 0 0 0 0 0 0
## [5,] 0 0 0 0 0 1 0 0 1
## [6,] 0 0 0 0 0 0 0 0 2
## [7,] 0 0 0 0 0 0 0 0 0
## [8,] 0 0 0 0 0 0 0 0 0
## [9,] 0 0 0 0 0 0 0 0 0
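As a cross-check, the same counts can be sketched with combn() applied per row followed by table(), instead of aggregate(). This is an alternative illustration rather than the answer's method; the names pairs, lev and x2 are hypothetical, and it assumes every row of m has at least two items.

```r
m <- matrix(c(1, 3, 5, 1,  2, 6, 6, 2,  3, 9, 9, 4), nrow = 4)

# All unordered pairs within each row, stacked into a two-column matrix
pairs <- do.call(rbind, lapply(seq_len(nrow(m)), function(r) t(combn(m[r, ], 2))))

# Cross-tabulate first vs second element; levels 1..max(m) keep the empty rows/columns
lev <- seq_len(max(m))
x2  <- unclass(table(factor(pairs[, 1], levels = lev),
                     factor(pairs[, 2], levels = lev)))
x2
```

For the example data this should reproduce the 9 x 9 matrix x shown above.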
