Row sums for a matrix over randomly specified subsets of columns in R

I have this matrix
mu<-1:100
sigma<-100:1
sample.size<-10
toy.mat<-mapply(function(x,y){rnorm(x,y,n=sample.size)},x=mu,y=sigma)
colnames(toy.mat) <- rep(1:10, each = 10)
For the 10 columns named 1, I would like to randomly select 5 pairs and take the row sums of each pair to generate 5 columns named 1a, 1b, 1c, 1d, 1e. I will do the same for the columns named 2, 3, and so on up to 10.
Is there a data.table method to do this?

I'm still unsure about what you're trying to do.
This is what I understood.
I first split toy.mat into a list of 10 matrices (chunks). This is for convenience.
# Split toy.mat into list of matrices
lst <- lapply(seq(1, 100, by = 10), function(i) toy.mat[, i:(i+9)]);
Next, generate 5 random pairs by sampling all 10 numbers from the sequence 1:10 without replacement and arranging them in a 2x5 matrix, so that each column holds one pair. Repeat for all 10 matrix chunks.
# Generate 5 random pairs
set.seed(2017); # For reproducibility of results
rand <- replicate(10, matrix(sample(1:10, 10), ncol = 5), simplify = FALSE);
head(rand, n = 2);
#[[1]]
# [,1] [,2] [,3] [,4] [,5]
#[1,] 10 4 9 1 6
#[2,] 5 3 8 2 7
#
#[[2]]
# [,1] [,2] [,3] [,4] [,5]
#[1,] 7 9 3 5 10
#[2,] 1 4 2 6 8
Select corresponding columns based on pairs from rand and calculate the rowSums. Do that for every matrix chunk.
# Select column pairs and calculate rowSums
lst.rand <- lapply(1:10, function(i)
sapply(as.data.frame(rand[[i]]), function(w) rowSums(lst[[i]][, w])));
Bind list elements into matrix, and set column names.
# Bind into a single matrix
mat <- do.call(cbind, lst.rand);
colnames(mat) <- as.vector(sapply(1:10, function(i) paste0(i, letters[1:5])));
mat[1:5, 1:6];
# 1a 1b 1c 1d 1e 2a
#[1,] 21.410826 34.90337 -11.297396 -50.56332 -115.82456 51.32369
#[2,] 5.323713 -144.26640 169.697538 -58.35540 96.25637 -78.95717
#[3,] -78.925937 -45.32790 -177.546469 251.69348 -52.85132 123.38741
#[4,] -33.673704 -95.64937 3.561921 -253.95046 -136.88182 -10.20650
#[5,] 51.080564 -180.87033 -161.861342 108.41120 188.07454 52.34226
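The same steps can also be written so that the column groups are taken from the column names instead of fixed positions. This is only a sketch in base R, not a data.table solution; `pair_sums`, `n_pairs` and the demo matrix `toy` are names made up for the illustration, and it assumes every group has at least 2 * n_pairs columns.

```r
# Demo data shaped like toy.mat: 10 rows, 100 columns named 1..10 (10 each)
set.seed(1)
toy <- matrix(rnorm(10 * 100), nrow = 10)
colnames(toy) <- rep(1:10, each = 10)

# Hypothetical helper: within each group of equally named columns, draw
# n_pairs disjoint random pairs and add the two columns of every pair
pair_sums <- function(m, n_pairs = 5) {
  # keep groups in order of first appearance ("1", "2", ..., "10")
  grp <- factor(colnames(m), levels = unique(colnames(m)))
  groups <- split(seq_len(ncol(m)), grp)
  out <- Map(function(idx, nm) {
    pairs <- matrix(sample(idx, 2 * n_pairs), nrow = 2)  # one pair per column
    s <- m[, pairs[1, ], drop = FALSE] + m[, pairs[2, ], drop = FALSE]
    colnames(s) <- paste0(nm, letters[seq_len(n_pairs)])
    s
  }, groups, names(groups))
  do.call(cbind, unname(out))
}

res <- pair_sums(toy)
dim(res)
colnames(res)[1:6]
```

Because the five pairs use each of the ten columns exactly once, the five new columns of a group add up to the row sums of the original ten columns, which makes a convenient sanity check.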

Related

R: Row and column mean produces the same matrix as used for input

I randomly tried some commands in R and came across this one:
m <- matrix(c(1,2,3,4), nrow = 2)
#[,1] [,2]
#[1,] 1 3
#[2,] 2 4
apply(m, c(1,2), mean)
#[,1] [,2]
#[1,] 1 3
#[2,] 2 4
According to the documentation:
c(1, 2) indicates rows and columns
Why would this produce the same matrix as used as input?
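One way to see what is going on, as a small sketch: with MARGIN = c(1, 2) the function is called once for every (row, column) combination, that is once per cell, and the mean of a single number is that number. The comparison with MARGIN = 1 and MARGIN = 2 and the length check are additions for illustration.

```r
m <- matrix(c(1, 2, 3, 4), nrow = 2)

apply(m, 1, mean)          # one call per row: means of (1, 3) and (2, 4)
apply(m, 2, mean)          # one call per column: means of (1, 2) and (3, 4)
apply(m, c(1, 2), length)  # one call per cell: every call sees 1 value
apply(m, c(1, 2), mean)    # so each "mean" is just the cell itself
```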

Reshape each row of a data.frame to be a matrix in R

I am working with the hand-written zip codes dataset. I have loaded the dataset like this:
digits <- read.table("./zip.train",
quote = "",
comment.char = "",
stringsAsFactors = F)
Then I get only the ones:
ones <- digits[digits$V1 == 1, -1]
Right now, ones has 442 rows and 256 columns. I need to transform each row of ones into a 16x16 matrix. I think what I am looking for is a list of 16x16 matrices like the ones in this question:
How to create a list of matrix in R
But I tried this with my data and it did not work.
At first I tried ones <- apply(ones, 1, matrix, nrow = 16, ncol = 16), but it does not work as I expected. I also tried lapply with no luck.
An alternative is to just change the dims of your matrix.
Consider the following matrix "M":
M <- matrix(1:12, ncol = 4)
M
# [,1] [,2] [,3] [,4]
# [1,] 1 4 7 10
# [2,] 2 5 8 11
# [3,] 3 6 9 12
We are looking to create a three dimensional array from this, so you can specify the dimensions as "row", "column", "third-dimension". However, since the matrix is constructed by column, you first need to transpose it before changing the dimensions.
`dim<-`(t(M), c(2, 2, nrow(M)))
# , , 1
#
# [,1] [,2]
# [1,] 1 7
# [2,] 4 10
#
# , , 2
#
# [,1] [,2]
# [1,] 2 8
# [2,] 5 11
#
# , , 3
#
# [,1] [,2]
# [1,] 3 9
# [2,] 6 12
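Applied to the digits data, the same idea might look like the sketch below. `ones_demo` is a made-up stand-in for `ones` (3 rows instead of 442), and the code assumes each row should fill its 16x16 matrix column by column, as matrix() does by default; if the pixels are stored row by row, each slice would need to be transposed.

```r
# Made-up stand-in for `ones`: 3 rows, 256 numeric columns
ones_demo <- as.data.frame(matrix(seq_len(3 * 256), nrow = 3))

# Transpose so that each original row becomes one contiguous block of 256
# values, then cut the blocks into 16 x 16 slices
arr <- array(t(as.matrix(ones_demo)), dim = c(16, 16, nrow(ones_demo)))

dim(arr)
arr[1:3, 1:3, 2]  # top-left corner of the matrix built from row 2
```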
Although there are probably simpler ways, you can try lapply:
# unlist() turns the one-row data.frame into a plain vector before reshaping
ones_matrix <- lapply(seq_len(nrow(ones)), function(i){matrix(unlist(ones[i, ]), nrow=16)})

Matrix without the matrix function

I'm trying to create a matrix in R without using the matrix function. I tried
this, but it works only for 2 rows. How do I specify the number of rows? I have no idea.
matrix2<-function(n)
{
d<-n/2
v1<-c(1:d)
v2<-c((d+1):n) # parentheses needed: d +1:n is read as d + (1:n)
x<- rbind(v1,v2)
return(x)
}
I want to create a matrix without using the matrix function, filled by row rather than by column.
Example:
A function that takes the number of columns and the dimension N (8 in my example) and returns a matrix:
[,1] [,2]
[1,] 1 2
[2,] 3 4
[3,] 5 6
[4,] 7 8
for example.
This will give you a matrix for a specified number of columns. I wasn't sure what you meant with dimension N.
matrix2 <- function(N, columns){
d<-ceiling(N/columns) # number of rows, rounded up to the next integer
x <- c()
i <- 1
for(row in 1:d){
x <- rbind(x, c(i:(i+columns-1)))
i <- i+columns
}
return(x)
}
> matrix2(8,2)
[,1] [,2]
[1,] 1 2
[2,] 3 4
[3,] 5 6
[4,] 7 8
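The loop can also be avoided. As a sketch of one alternative (the name `matrix3` is made up here), the `dim<-` replacement function can shape a plain vector without calling matrix(); because R fills column by column, the vector is shaped as columns x rows first and then transposed.

```r
matrix3 <- function(N, columns) {
  rows <- ceiling(N / columns)   # same rounding as in matrix2 above
  x <- seq_len(rows * columns)   # 1, 2, ..., rows * columns
  dim(x) <- c(columns, rows)     # column-wise fill of a columns x rows layout
  t(x)                           # transpose to get the row-wise fill
}

matrix3(8, 2)
```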
You can also go via a list. This lets you set both the number of rows and the number of columns, as well as whether the matrix is filled row-wise or column-wise.
matrix2<-function(n,m,V,byRow=T){
if(length(V) != n*m ) warning("length of the vector and rows*columns differs" )
# split the vector
if(byRow) r <- n
if(!byRow) r <- m
fl <- 1:r # how often you have to split
fg <- sort(rep_len(fl,length(V))) # create a "splitting-vector"
L <- split(V,fg)
# convert the list to a matrix
if(byRow) res <- do.call(rbind,L)
if(!byRow) res <- do.call(cbind,L)
rownames(res) <- colnames(res) <- NULL
res #output
}
matrix2(2,4,1:8,F)
[,1] [,2] [,3] [,4]
[1,] 1 3 5 7
[2,] 2 4 6 8
matrix2(2,4,1:8,T)
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 5 6 7 8

Create a special matrix with data from two lists in R

Here is a minimal sample data set to illustrate my desired final matrix:
test <- list( c(1, 2, 3, 4) )
test2 <- list( c(2, 3) )
and my matrix should be:
2 4 6 8
3 6 9 12
It's like a nested for loop: I go over each row, and in each row I take its value and multiply it by each column value.
After a few hours I have this:
sapply(2, function(j) lapply(seq_along(test), function(i) test[[i]] * test2[[i]][j]))
It gives row two of the final matrix (the row index is the '2' passed to sapply):
[[1]]
[1] 3 6 9 12
Going over the rows could be done with seq_along(test2), but I don't know how to save the data after each row. This was my last attempt, and it failed:
a=matrix(data=0, nrow=2, ncol=4)
lapply(seq_along(test2), function(k) a[k,]<-unlist(sapply(2, function(j) lapply(seq_along(test), function(i) test[[i]] * test2[[i]][j])) ) )
output:
[1] 3 6 9 12
Later on, I would like to have more vectors in the input lists and repeat the whole procedure described above.
We can use outer after unlisting the list
t(outer(unlist(test), unlist(test2)))
# [,1] [,2] [,3] [,4]
#[1,] 2 4 6 8
#[2,] 3 6 9 12
You mean matrix multiplication? Quick example:
> t(matrix(unlist(test)) %*% matrix(unlist(test2), nrow = 1))
[,1] [,2] [,3] [,4]
[1,] 2 4 6 8
[2,] 3 6 9 12
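If the lists later hold several vectors, the outer() idea extends pairwise with Map(). This is a sketch only: the second pair of vectors is invented, and it assumes that the i-th element of test should be combined with the i-th element of test2, giving one matrix per pair.

```r
test  <- list(c(1, 2, 3, 4), c(10, 20))   # second vector is made up
test2 <- list(c(2, 3),       c(1, 5, 7))  # second vector is made up

# outer(y, x) puts one row per value of y, so no transpose is needed
res <- Map(function(x, y) outer(y, x), test, test2)
res[[1]]
dim(res[[2]])
```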

Selecting rows conditional on column value from multiple matrices

I have
mat1 = matrix(c(2, 4, 3, 6, 7, 8), nrow=2, ncol=3)
mat2 = matrix(c(5, 6, 7, 1, 2, 3), nrow=2, ncol=3)
mat3 = matrix(c(8, 5, 8, 6, 7, 9), nrow=2, ncol=3)
which gives me 3 matrices:
[,1] [,2] [,3]
[1,] 2 3 7
[2,] 4 6 8
[,1] [,2] [,3]
[1,] 5 7 2
[2,] 6 1 3
[,1] [,2] [,3]
[1,] 8 8 7
[2,] 5 6 9
What I would like to do is compare the three matrices per row per first column, and select the row of the matrix that has the highest value on the first column.
For example: in row 1 column 1, matrix3 has the highest value (8) compared to matrix1 (2) and matrix2 (5). In row 2 column 1, matrix2 has the highest value (6). I would like to create a new matrix that copies the row of the matrix that has that highest value, resulting in:
[,1] [,2] [,3]
[1,] 8 8 7 <- From mat3
[2,] 6 1 3 <- From mat2
I know how to get a vector with the highest values from column 1, but I cannot get the whole row of the matrix copied into a new matrix. I have:
mat <- (mat1[1,])
which just copies the first row of the first matrix
[1] 2 3 7
I can select which number is the maximum number:
max(mat1[,1],mat2[,1],mat3[,1])
[1] 8
But I cannot seem to combine the two to return a matrix with the whole row.
Getting the code to loop for each row will be no problem, but I cannot seem to get it to work for the first row and as such, I am missing the essential code. Any help would be greatly appreciated. Thank you.
Are you working interactively? Do you manipulate multiple matrices spread in your workspace? A straightforward answer to your problem could be:
#which matrices have the largest element of column 1 in each row?
max.col(cbind(mat1[, 1], mat2[, 1], mat3[, 1]))
#[1] 3 2
rbind(mat3[1, ], mat2[2, ]) #use the above information to get your matrix
# [,1] [,2] [,3]
#[1,] 8 8 7
#[2,] 6 1 3
For a more general use case, one approach is:
mat_ls = list(mat1, mat2, mat3) #put your matrices in a "list"
which_col = 1 #compare column 1
which_mats = max.col(do.call(cbind, lapply(mat_ls, function(x) x[, which_col])))
do.call(rbind, lapply(seq_along(which_mats),
function(i) mat_ls[[which_mats[i]]][i, ]))
# [,1] [,2] [,3]
#[1,] 8 8 7
#[2,] 6 1 3
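The general approach can be wrapped in a small function. The sketch below uses a made-up name, `pick_rows`, and adds one assumption that is not in the answer above: ties are resolved in favour of the first matrix via ties.method = "first", since max.col() breaks ties at random by default.

```r
mat1 <- matrix(c(2, 4, 3, 6, 7, 8), nrow = 2, ncol = 3)
mat2 <- matrix(c(5, 6, 7, 1, 2, 3), nrow = 2, ncol = 3)
mat3 <- matrix(c(8, 5, 8, 6, 7, 9), nrow = 2, ncol = 3)

# Hypothetical helper: for each row, copy that row from the matrix whose
# value in column `col` is largest
pick_rows <- function(mats, col = 1) {
  keys <- do.call(cbind, lapply(mats, function(m) m[, col]))
  winner <- max.col(keys, ties.method = "first")
  do.call(rbind, lapply(seq_along(winner), function(i) mats[[winner[i]]][i, ]))
}

pick_rows(list(mat1, mat2, mat3))
```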
Probably not the prettiest solution
temp <- rbind(mat1, mat2, mat3)
rbind(temp[c(T,F),][which.max(temp[c(T,F),][, 1]),],
temp[c(F,T),][which.max(temp[c(F,T),][, 1]),])
## [,1] [,2] [,3]
## [1,] 8 8 7
## [2,] 6 1 3
You may also try:
a2 <- aperm(simplify2array( mget(ls(pattern="mat"))),c(3,2,1)) #collects every workspace object whose name contains "mat", so make sure only mat1, mat2 and mat3 match
t(sapply(1:(dim(a2)[3]), function(i) {x1 <- a2[,,i]; x1[which.max(x1[,1]),]}))
# [,1] [,2] [,3]
#[1,] 8 8 7
#[2,] 6 1 3
