How do I split a matrix with different column sizes? - r

n <- c(12,24)
mu<-c(6.573,6.5)
sigma<-sqrt(0.25)
Diseased.Data<-round(rnorm(n[1], mu[1], sigma), 4)
Healthy.Data<-round(rnorm(n[2], mu[2], sigma), 4)
g <- c(2,3,4)
for(i in 1:3){
pool.dis.data <- matrix(NA,n[1]/g[i],n[1]/g[i])
for(j in n[1]/g[i]){
pool.dis.data <- replicate(n[1]/g[i],mean(sample(Diseased.Data,g[i])))
}
}
When I run the code above, I get only the answer from the last element of g. What I need is each column in the matrix to have an element from g. For example, the matrix should look like:
m <- cbind(a1,a2,a3,a4,a5,a6)
m1 <- cbind(b1,b2,b3,b4)
m2 <- cbind(c1,c2,c3)
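One possible reading is that you want a separate matrix for each pool size in g, with n[1]/g[i] columns each. A single matrix cannot have 6, 4 and 3 columns at once, so the sketch below keeps the results in a list. The name pooled, the seed, and the square n[1]/g[i] by n[1]/g[i] shape (taken from the matrix(NA, ...) call in the question) are assumptions.

```r
set.seed(1)  # only for reproducibility
n <- c(12, 24)
mu <- c(6.573, 6.5)
sigma <- sqrt(0.25)
Diseased.Data <- round(rnorm(n[1], mu[1], sigma), 4)
g <- c(2, 3, 4)

# One matrix per pool size; each entry is the mean of a random pool of size gi
pooled <- lapply(g, function(gi) {
  k <- n[1] / gi                      # 6, 4 and 3 for g = 2, 3, 4
  replicate(k, replicate(k, mean(sample(Diseased.Data, gi))))
})
names(pooled) <- paste0("g", g)

sapply(pooled, dim)                   # dimensions of the three matrices
```

The original loop overwrote pool.dis.data on every pass, which is why only the result for the last element of g survived.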

Related

How can I fill a matrix with a repeat loop and delete columns by condition in R?

My R-Code:
l <- list()
for(i in 1:5){
n <- 1
mat <- matrix(0L,500,10)
repeat{
a <- rnorm(10)
b <- rnorm(10)
c <- a+b
mat[n,] <- c
mat <- mat[mat[,10] >= 0 + (i/10) & mat[,1] >= 0 +(i/10),]
n <- n +1
if(mat[500,] != 0){
break
}
}
l[[i]] <- mat
}
l
I would like to get 5 Matrices, which are stored in a list. Each matrix should have exactly 500 rows and should not have negative values in its rows at position [,1] or [,10].
I tried to build a repeat loop:
1. Calculate a vector
2. Store the vector in the matrix
3. Delete the row if the condition is met
4. Repeat if there aren't 500 rows yet
Unfortunately, there's something wrong and it doesn't work. What can I do? Thanks!
If you add an if-clause that tests your condition before adding the line to your matrix, it should work:
l <- list()
for(i in 1:5){
n <- 1
mat <- matrix(0L,500,10)
repeat{
a <- rnorm(10)
b <- rnorm(10)
c <- a+b
if(!any(c[c(1,10)] < 0 + i/10)){
mat[n,] <- c
n <- n +1
}
if(n==501){
break
}
}
l[[i]] <- mat
}
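If drawing one row at a time turns out to be slow, a batch version of the same idea may help: draw many candidate rows at once, keep the ones that pass the condition, and stop once 500 are collected. This is only a sketch; the helper name make_mat and its arguments are hypothetical, not part of the question.

```r
set.seed(42)  # only for reproducibility

# Hypothetical helper: returns n_rows rows whose first and last
# entries are both >= threshold
make_mat <- function(threshold, n_rows = 500, n_cols = 10) {
  kept <- matrix(numeric(0), 0, n_cols)
  while (nrow(kept) < n_rows) {
    # a + b for a whole batch of candidate rows
    cand <- matrix(rnorm(n_rows * n_cols) + rnorm(n_rows * n_cols),
                   n_rows, n_cols)
    ok <- cand[, 1] >= threshold & cand[, n_cols] >= threshold
    kept <- rbind(kept, cand[ok, , drop = FALSE])
  }
  kept[seq_len(n_rows), , drop = FALSE]  # trim any surplus rows
}

l <- lapply(1:5, function(i) make_mat(i / 10))
sapply(l, dim)
```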

Get correlations for all combinations between two differently sized dataframes

Is there an R function that calculates all possible correlations, with p-values, between the rows of two data frames (same number of columns but different numbers of rows), similar to the cor() function in R?
I found cor.test(), but it only accepts inputs of the same size.
To the best of my knowledge, the function cor.test only accepts vectors of numeric values that have the same length.
You can achieve what you are looking for with, e.g., the function corrplot::cor.mtest.
Here is a reproducible example. First load the library and create the fake data...
library(corrplot)
nbgene1 <- 100
nbgene2 <- 200
n <- 10
df1 <- matrix(rnorm(nbgene1 * n), nbgene1, n)
rownames(df1) <- paste0("Df1_gene", 1:nbgene1)
colnames(df1) <- paste0("Subject", 1:n)
df2 <- matrix(rnorm(nbgene2 * n), nbgene2, n)
rownames(df2) <- paste0("Df2_gene", 1:nbgene2)
colnames(df2) <- paste0("Subject", 1:n)
The function cor.mtest only accepts a single data-frame, with individuals as rows and variables as columns, so you need to combine the two data-frames...
df_combined <- rbind(df1, df2)
... and input the transposed data-frame to cor.mtest (because in your case, rows are genes and columns are individuals).
res_cortest <- cor.mtest(t(df_combined))
Then all you need to do is extract the correct p-values from the result.
pval <- res_cortest$p[1:nbgene1, (nbgene1+1):(nbgene1+nbgene2)]
You may want to rename the rows and columns of this matrix for a more interpretable result.
dimnames(pval) <- list(rownames(df1), rownames(df2))
Also, don't forget to correct for multiple testing!
# For example with Benjamini and Hochberg's method
padj <- matrix(p.adjust(pval, "BH"), nbgene1, nbgene2, dimnames = dimnames(pval))
What's even more interesting than using cor.mtest is to look at what's inside!
> corrplot::cor.mtest
function (mat, ...)
{
mat <- as.matrix(mat)
n <- ncol(mat)
p.mat <- lowCI.mat <- uppCI.mat <- matrix(NA, n, n)
diag(p.mat) <- 0
diag(lowCI.mat) <- diag(uppCI.mat) <- 1
for (i in 1:(n - 1)) {
for (j in (i + 1):n) {
tmp <- cor.test(x = mat[, i], y = mat[, j], ...)
p.mat[i, j] <- p.mat[j, i] <- tmp$p.value
if (!is.null(tmp$conf.int)) {
lowCI.mat[i, j] <- lowCI.mat[j, i] <- tmp$conf.int[1]
uppCI.mat[i, j] <- uppCI.mat[j, i] <- tmp$conf.int[2]
}
}
}
list(p = p.mat, lowCI = lowCI.mat, uppCI = uppCI.mat)
}
It's a simple for loop!
An equivalent of this loop in the context of our reproducible example would be...
pval <- matrix(NA, nbgene1, nbgene2,
dimnames = list(rownames(df1),
rownames(df2)))
for (i in 1:nbgene1) {
for (j in 1:nbgene2) {
pval[i, j] <- cor.test(df1[i, ], df2[j, ])$p.value
}
}
The multiple-testing correction step is the same.
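For Pearson correlations the same p-values can also be obtained without a loop, because cor() accepts two matrices and the test statistic is t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom. The sketch below assumes a two-sided Pearson test and no missing values; the small matrix sizes and the names r, tstat and pval_fast are made up for illustration.

```r
set.seed(1)
n <- 10                                  # number of subjects (columns)
df1 <- matrix(rnorm(5 * n), 5, n)        # 5 rows in the first data set
df2 <- matrix(rnorm(7 * n), 7, n)        # 7 rows in the second data set

# cor() correlates columns, so transpose to correlate rows with rows
r <- cor(t(df1), t(df2))                 # 5 x 7 correlation matrix

# t statistic and two-sided p-value of the Pearson test
tstat <- r * sqrt((n - 2) / (1 - r^2))
pval_fast <- 2 * pt(-abs(tstat), df = n - 2)

# Spot check one entry against cor.test()
all.equal(unname(pval_fast[2, 3]),
          unname(cor.test(df1[2, ], df2[3, ])$p.value))
```

This only computes the cross-correlations between the two data sets, whereas combining them first also tests every pair within each data set.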

printing intermediate multiplication using loop

I have a data frame 'df' with 8 rows and 8 columns.
With the code below I get the result of the 5th multiplication directly, but I want the results of all the intermediate multiplications.
I also want the code to run in a loop 15 times, so that there are 15 intermediate multiplication outputs.
Code:
p <- eigen(df)$vector
d <- eigen(df)$values
n <- 5
p %*% diag(d^n) %*% solve(p)
Expected output: if I multiply n = 15 times, there should be 15 matrices, one for each intermediate multiplication.
Please help.
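Assuming you mean the matrix power X^n, you can do the following:
mat <- matrix(1:9, nrow=3)
n <- 5
# pows[[k]] holds mat^k, so every intermediate power is kept
pows <- list()
pows[[1]] <- mat
for (i in 2:n) {
  pows[[i]] <- pows[[i - 1]] %*% mat
}
# compare the last power with the eigendecomposition result
e <- eigen(mat)
p <- e$vectors
d <- e$values
res <- p %*% diag(d^n) %*% solve(p)
# abs() is needed so that large negative differences are not accepted
all(abs(res - pows[[n]]) < 1e-6)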
Can also use:
library(expm)
mat %^% n
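If you would rather keep the eigendecomposition formula from the question, the exponent can be looped over directly. A minimal sketch, assuming the matrix is diagonalisable with real eigenvalues; the symmetric example matrix A and the list name powers are made up for illustration.

```r
set.seed(1)
A <- crossprod(matrix(rnorm(16), 4, 4)) / 4   # symmetric, so eigenvalues are real

e <- eigen(A)      # decompose once and reuse
p <- e$vectors
d <- e$values
p_inv <- solve(p)

n <- 15
# powers[[k]] is A^k computed as p %*% diag(d^k) %*% p_inv
powers <- lapply(seq_len(n), function(k) p %*% diag(d^k) %*% p_inv)

length(powers)                            # one matrix per exponent
all.equal(powers[[3]], A %*% A %*% A)     # check against direct multiplication
```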

For loop returning just one output in R

I am reading a 49*49 matrix from a CSV file and trying to print the sum as a 49*49 matrix, but I get just one value as output for the sum.
w <- read.csv(file="ma.csv", header=F, sep=",");
sum <- 4
for(i in 1:49){
for(j in 1:49)
{
sum = sum + w[i,j];
}
}
Maybe something like this (a cumulative sum stored in a matrix of the same shape):
w1 <- matrix(NA, ncol=3, nrow=3)
sum1 <- 4
for(i in 1:3){
for(j in 1:3){
w1[i,j] = sum1 + w[i,j];
}
}
w1[] <- cumsum(w1)
Or without any loop
w2 <- w
w2[] <- cumsum(w+sum1)
identical(w2, w1)
#[1] TRUE
data
set.seed(24)
w <- matrix(sample(0:20, 3*3, replace=TRUE), ncol=3)
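If what you want is instead the running value of sum at each step of your own loop (start at 4, then add the elements row by row), that can be sketched as below. This is an assumption about the intended output; the names start, running and chk are made up.

```r
set.seed(24)
w <- matrix(sample(0:20, 3 * 3, replace = TRUE), ncol = 3)
start <- 4

# t(w) read column by column is w read row by row, which is the loop's order
running <- matrix(start + cumsum(t(w)),
                  nrow = nrow(w), ncol = ncol(w), byrow = TRUE)

# The same thing with the explicit double loop, for comparison
chk <- matrix(NA_real_, nrow(w), ncol(w))
total <- start
for (i in seq_len(nrow(w))) {
  for (j in seq_len(ncol(w))) {
    total <- total + w[i, j]
    chk[i, j] <- total      # store the running total instead of overwriting it
  }
}
all(running == chk)
```

The original loop returned a single value because sum is a scalar that is overwritten at every step; nothing was ever written into a matrix.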

R: Generate matrix from function

In R I'm interested in the general case to generate a matrix from a formula such as:
X = some other matrix
Y(i, j) = X(i, j) + Y(i - 1, j - 1)
Unfortunately I can't find how to account for the matrix self-referencing.
Obviously order of execution and bounds checking are factors here, but I imagine these could be accounted for by the matrix orientation and formula respectively.
Thanks.
This solution assumes that the first row and first column of Y are copied from X, i.e. Y[1,n] == X[1,n] and Y[n,1] == X[n,1] for every n. If not, you can apply the same solution to the sub-matrix X[-1,-1] to fill in the values of Y[-1,-1]. It also assumes that the input matrix is square.
We use the fact that Y[N,N] = X[N,N] + X[N-1, N-1] + ... + X[1,1], plus similar relations for the off-diagonal elements. Note that each off-diagonal is the main diagonal of a particular sub-matrix.
# Example input
X <- matrix(1:16, ncol=4)
Y <- matrix(0, ncol=ncol(X), nrow=nrow(X))
diag(Y) <- cumsum(diag(X))
Y[1,ncol(X)] <- X[1,ncol(X)]
Y[nrow(X),1] <- X[nrow(X),1]
for (i in 1:(nrow(X)-2)) {
ind <- seq(i)
diag(Y[-ind,]) <- cumsum(diag(X[-ind,])) # lower triangle
diag(Y[,-ind]) <- cumsum(diag(X[,-ind])) # upper triangle
}
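The same idea can be written more compactly by grouping the cells by diagonal: all cells with the same value of row(X) - col(X) lie on one diagonal, and ave() applies cumsum() within each group. This is a sketch under the same boundary assumption (first row and first column copied from X); it does not need a square matrix. The names Y_ave and Y_loop are made up.

```r
X <- matrix(1:16, ncol = 4)

# Cumulative sum along every diagonal, running from top-left to bottom-right
Y_ave <- ave(X, row(X) - col(X), FUN = cumsum)

# Reference: the recurrence Y[i, j] = X[i, j] + Y[i - 1, j - 1]
Y_loop <- X                      # first row and first column stay equal to X
for (i in 2:nrow(X)) {
  for (j in 2:ncol(X)) {
    Y_loop[i, j] <- X[i, j] + Y_loop[i - 1, j - 1]
  }
}
all(Y_ave == Y_loop)
```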
Well, you can always use a for loop:
Y <- matrix(0, ncol=3, nrow=3)
#boundary values:
Y[1,] <- 1
Y[,1] <- 2
X <- matrix(1:9, ncol=3)
for (i in 2:nrow(Y)) {
for (j in 2:ncol(Y)) {
Y[i, j] <- X[i, j] + Y[i-1, j-1]
}
}
If that is too slow you can translate it to C++ (using Rcpp) easily.
