Interleaving results from many objects in Rcpp

I need to write to a file, row by row, the matrices and sparse matrices that appear in a list, and I am doing something like this:
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>

// [[Rcpp::export]]
bool write_rows(Rcpp::List data, Rcpp::CharacterVector clss, int n) {
  int len = data.length();
  for (int i = 0; i < n; i++) {
    for (int j = 0; j < len; j++) {
      if (clss[j] == "matrix") {
        Rcpp::NumericMatrix x = data[j];
        auto row = x.row(i);
        // do something with row i
      } else if (clss[j] == "dgCMatrix") {
        arma::sp_mat x = data[j];
        auto row = x.row(i);
        // do something different with row i
      }
    }
  }
  return true;
}
This function can be called in R with:
data <- list(
  x = Matrix::rsparsematrix(nrow = 1000, ncol = 1000, density = 0.3),
  y = matrix(1:10000, nrow = 1000, ncol = 10)
)
clss <- c("dgCMatrix", "matrix")
write_rows(data, clss, 1000)
The function receives a list of matrices and sparse matrices, all with the same number of rows, and writes those matrices row by row, i.e., it first writes the first rows of all elements in data, then the second rows, and so on.
My problem is that the line arma::sp_mat x = data[j]; seems to have a huge impact on performance, since I am implicitly casting the list element data[j] to an Armadillo sparse matrix n times.
My question is: is there any way I could avoid this? Is there a more efficient solution? I tried to find a solution by looking into readr's source code, since they also write list elements row by row, but they also do a cast for each row (in this line, for example). Maybe this doesn't impact the performance because they deal with SEXPs?

With the clarification, it seems that the result should interleave the rows from each matrix. You can still do this while avoiding multiple conversions.
This is the original code, modified to generate some actual output:
// [[Rcpp::export]]
arma::mat write_rows(Rcpp::List data, Rcpp::CharacterVector clss, int nrows, int ncols) {
  int len = data.length();
  arma::mat result(nrows*len, ncols);

  for (int i = 0, k = 0; i < nrows; i++) {
    for (int j = 0; j < len; j++) {
      arma::rowvec r;
      if (clss[j] == "matrix") {
        Rcpp::NumericMatrix x = data[j];
        r = x.row(i);
      }
      else {
        arma::sp_mat x = data[j];
        r = x.row(i);
      }
      result.row(k++) = r;
    }
  }
  return result;
}
The following code creates a vector of converted objects, and then extracts the rows from each object as required. The conversion is only done once per matrix. I use a struct containing a dense and a sparse mat because it's a lot simpler than dealing with unions, and I don't want to drag in boost::variant or require C++17. Since there are only two classes we want to deal with, the overhead is minimal.
struct Matrix_types {
  arma::mat m;
  arma::sp_mat M;
};

// [[Rcpp::export]]
arma::mat write_rows2(Rcpp::List data, Rcpp::CharacterVector clss, int nrows, int ncols) {
  const int len = data.length();
  std::vector<Matrix_types> matr(len);
  std::vector<bool> is_dense(len);
  arma::mat result(nrows*len, ncols);

  // populate the structs
  for (int j = 0; j < len; j++) {
    is_dense[j] = (clss[j] == "matrix");
    if (is_dense[j]) {
      matr[j].m = Rcpp::as<arma::mat>(data[j]);
    }
    else {
      matr[j].M = Rcpp::as<arma::sp_mat>(data[j]);
    }
  }

  // populate the result
  for (int i = 0, k = 0; i < nrows; i++) {
    for (int j = 0; j < len; j++, k++) {
      if (is_dense[j]) {
        result.row(k) = matr[j].m.row(i);
      }
      else {
        arma::rowvec r(matr[j].M.row(i));
        result.row(k) = r;
      }
    }
  }
  return result;
}
Running on some test data:
library(Matrix)
data <- list(
  a = Matrix(1.0, 1000, 1000, sparse = TRUE),
  b = matrix(2.0, 1000, 1000),
  c = Matrix(3.0, 1000, 1000, sparse = TRUE),
  d = matrix(4.0, 1000, 1000)
)
system.time(z <- write_rows(data, sapply(data, class), 1000, 1000))
# user system elapsed
# 185.75 35.04 221.38
system.time(z2 <- write_rows2(data, sapply(data, class), 1000, 1000))
# user system elapsed
# 4.21 0.05 4.25
identical(z, z2)
# [1] TRUE

Related

How to make R code with an array more efficient?

I have the following R code, which is not efficient. I would like to make it efficient using Rcpp. In particular, I am not used to dealing with arrays in Rcpp. Any help would be appreciated.
myfunc <- function(n = 1600,
                   m = 400,
                   p = 3,
                   time = runif(n, min = 0.05, max = 4),
                   qi21 = rnorm(n),
                   s0c = rnorm(n),
                   zc_min_ecox_multi = array(rnorm(n*n*p), dim = c(n, n, p)),
                   qi = matrix(0, n, n),
                   qi11 = rnorm(p),
                   iIc_mat = matrix(rnorm(p*p), p, p)) {
  for (j in 1:n) {
    u <- time[j]
    ind <- 1 * (u <= time)
    locu <- which(time == u)
    qi2 <- sum(qi21 * ind) / s0c[locu]
    for (i in 1:n) {
      qi1 <- qi11 %*% iIc_mat %*% matrix(zc_min_ecox_multi[i, j, ], p, 1)
      qi[i, j] <- -(qi1 + qi2) / m
    }
  }
  qi
}
Computing time is about 7.35 secs. I need to call this function over and over again, maybe 20 times.
system.time(myfunc())
user system elapsed
7.34 0.00 7.35
First thing to do would be to profile your code: profvis::profvis({myfunc()}).
What you can do is precompute qi11 %*% iIc_mat once.
You get (with minor improvements):
precomp <- qi11 %*% iIc_mat

for (j in 1:n) {
  u <- time[j]
  qi2 <- sum(qi21[u <= time]) / s0c[time == u]
  for (i in 1:n) {
    qi1 <- precomp %*% zc_min_ecox_multi[i, j, ]
    qi[i, j] <- -(qi1 + qi2) / m
  }
}
that is twice as fast (8 sec -> 4 sec).
Vectorizing the i loop then seems straightforward:
q1_all_i <- tcrossprod(precomp, zc_min_ecox_multi[, j, ])
qi[, j] <- -(q1_all_i + qi2) / m
(12 times as fast now)
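For reference, putting both changes together gives something like the following sketch (untested; myfunc_vec is a name I'm introducing here, and like the which(time == u) lookup above it assumes the time values are unique):
myfunc_vec <- function(n = 1600,
                       m = 400,
                       p = 3,
                       time = runif(n, min = 0.05, max = 4),
                       qi21 = rnorm(n),
                       s0c = rnorm(n),
                       zc_min_ecox_multi = array(rnorm(n*n*p), dim = c(n, n, p)),
                       qi = matrix(0, n, n),
                       qi11 = rnorm(p),
                       iIc_mat = matrix(rnorm(p*p), p, p)) {
  precomp <- qi11 %*% iIc_mat                                  # 1 x p, computed once
  for (j in 1:n) {
    u <- time[j]
    qi2 <- sum(qi21[u <= time]) / s0c[time == u]
    q1_all_i <- tcrossprod(precomp, zc_min_ecox_multi[, j, ])  # 1 x n: all i at once
    qi[, j] <- -(q1_all_i + qi2) / m
  }
  qi
}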
And if you want to try it in Rcpp, you will first need a function to multiply the matrices...
#include <Rcpp.h>
#include <numeric>
// [[Rcpp::plugins("cpp11")]]

Rcpp::NumericMatrix mult(const Rcpp::NumericMatrix& lhs,
                         const Rcpp::NumericMatrix& rhs)
{
  if (lhs.ncol() != rhs.nrow())
    Rcpp::stop("Incompatible matrices");

  Rcpp::NumericMatrix out(lhs.nrow(), rhs.ncol());
  Rcpp::NumericVector rowvec, colvec;

  for (int i = 0; i < lhs.nrow(); ++i)
  {
    rowvec = lhs(i, Rcpp::_);
    for (int j = 0; j < rhs.ncol(); ++j)
    {
      colvec = rhs(Rcpp::_, j);
      out(i, j) = std::inner_product(rowvec.begin(), rowvec.end(),
                                     colvec.begin(), 0.);
    }
  }
  return out;
}
Then port your function...
// [[Rcpp::export]]
Rcpp::NumericMatrix myfunc_rcpp(int n, int m, int p,
                                const Rcpp::NumericVector& time,
                                const Rcpp::NumericVector& qi21,
                                const Rcpp::NumericVector& s0c,
                                const Rcpp::NumericVector& zc_min_ecox_multi,
                                const Rcpp::NumericMatrix& qi11,
                                const Rcpp::NumericMatrix& iIc_mat)
{
  Rcpp::NumericMatrix qi(n, n);
  Rcpp::NumericMatrix outermat = mult(qi11, iIc_mat);

  for (int j = 0; j < n; ++j)
  {
    double qi2 = 0;
    for (int k = 0; k < n; ++k)
    {
      if (time[j] <= time[k]) qi2 += qi21[k];
    }
    qi2 /= s0c[j];

    for (int i = 0; i < n; ++i)
    {
      Rcpp::NumericMatrix tmpmat(p, 1);
      for (int z = 0; z < p; ++z)
      {
        tmpmat(z, 0) = zc_min_ecox_multi[i + n*j + z*n*n];
      }
      Rcpp::NumericMatrix qi1 = mult(outermat, tmpmat);
      qi(i, j) -= (qi1(0, 0) + qi2) / m;
    }
  }
  return qi;
}
Then in R:
my_rcpp_func <- function(n = 1600,
                         m = 400,
                         p = 3,
                         time = runif(n, min = 0.05, max = 4),
                         qi21 = rnorm(n),
                         s0c = rnorm(n),
                         zc_min_ecox_multi = array(rnorm(n*n*p), dim = c(n, n, p)),
                         qi11 = rnorm(p),
                         iIc_mat = matrix(rnorm(p*p), p, p))
{
  myfunc_rcpp(n, m, p, time, qi21, s0c, as.vector(zc_min_ecox_multi),
              matrix(qi11, 1, p), iIc_mat)
}
This is certainly faster, and gives the same results as your own function, but it's no quicker than the in-R optimizations suggested by F Privé. Maybe optimizing the C++ code could get things even faster, but ultimately you are multiplying 2 reasonably large matrices together over 2.5 million times, so it's never going to be all that fast. R is optimized pretty well for this kind of calculation after all...
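One possible direction, not part of the original answer, would be a hypothetical RcppArmadillo port that replaces the hand-rolled mult() with Armadillo's BLAS-backed products and does all i for a given j in one matrix-vector product. An untested sketch (myfunc_arma and its cube argument are my own naming):
// Untested sketch: an RcppArmadillo variant of myfunc_rcpp that lets
// Armadillo/BLAS do the matrix products instead of the hand-rolled mult().
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>

// [[Rcpp::export]]
arma::mat myfunc_arma(int n, int m, int p,
                      const arma::vec& time,
                      const arma::vec& qi21,
                      const arma::vec& s0c,
                      const arma::cube& zc_min_ecox_multi,  // the n x n x p array
                      const arma::rowvec& qi11,
                      const arma::mat& iIc_mat) {
  arma::mat qi(n, n);
  arma::rowvec precomp = qi11 * iIc_mat;   // 1 x p, computed once
  arma::mat zj(n, p);                      // holds zc_min_ecox_multi[, j, ]
  for (int j = 0; j < n; ++j) {
    double qi2 = 0.0;
    for (int k = 0; k < n; ++k)
      if (time[j] <= time[k]) qi2 += qi21[k];
    qi2 /= s0c[j];
    for (int z = 0; z < p; ++z)
      zj.col(z) = zc_min_ecox_multi.slice(z).col(j);
    arma::vec qi1 = zj * precomp.t();      // n x 1: all i at once
    qi.col(j) = -(qi1 + qi2) / m;
  }
  return qi;
}
From R this would be called with the 3-d array passed directly (no as.vector() needed), e.g. myfunc_arma(n, m, p, time, qi21, s0c, zc_min_ecox_multi, qi11, iIc_mat); whether it actually beats the vectorized R version would need benchmarking.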

Allocate Rcpp List of n NumericMatrix

Is there a way to allocate an Rcpp List of length n, where each element of the List will be filled with a NumericMatrix, but the size of each NumericMatrix can change?
I have an idea for doing this using std::list and push_back(), but the size of the list may be quite large and I want to avoid the overhead of creating an extra copy of the list when I return from the function.
The below R code gives an idea of what I hope to do:
myvec = function(n) {
  x = vector("list", n)
  for (i in seq_len(n)) {
    nc = sample(1:3, 1)
    nr = sample(1:3, 1)
    x[[i]] = matrix(rbinom(nc * nr, size = 1, prob = 0.5),
                    nrow = nr, ncol = nc)
  }
  x
}
This could result in something like:
> myvec(2)
[[1]]
     [,1]
[1,]    0
[2,]    1

[[2]]
     [,1] [,2] [,3]
[1,]    0    1    0
[2,]    0    1    1
Update: based on the comments of #Dirk and #Ralf, I created functions based on Rcpp::List and std::list with a wrap at the end. Speed comparisons don't seem to favor one version over the other, but perhaps there's an inefficiency I'm not aware of.
src = '
#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::List myvec(int n) {
  Rcpp::RNGScope rngScope;
  Rcpp::List x(n);
  // Rcpp::IntegerVector choices = {1, 2, 3};
  Rcpp::IntegerVector choices = Rcpp::seq_len(50);
  for (int i = 0; i < n; ++i) {
    int nc = Rcpp::sample(choices, 1).at(0);
    int nr = Rcpp::sample(choices, 1).at(0);
    Rcpp::NumericVector entries = Rcpp::rbinom(nc * nr, 1, 0.5);
    x(i) = Rcpp::NumericMatrix(nc, nr, entries.begin());
  }
  return x;
}

// [[Rcpp::export]]
Rcpp::List myvec2(int n) {
  Rcpp::RNGScope scope;
  std::list< Rcpp::NumericMatrix > x;
  // Rcpp::IntegerVector choices = {1, 2, 3};
  Rcpp::IntegerVector choices = Rcpp::seq_len(50);
  for (int i = 0; i < n; ++i) {
    int nc = Rcpp::sample(choices, 1).at(0);
    int nr = Rcpp::sample(choices, 1).at(0);
    Rcpp::NumericVector entries = Rcpp::rbinom(nc * nr, 1, 0.5);
    x.push_back(Rcpp::NumericMatrix(nc, nr, entries.begin()));
  }
  return Rcpp::wrap(x);
}
'
sourceCpp(code = src)
Resulting benchmarks on my computer are:
> library(microbenchmark)
> rcpp_list = function() {
+   set.seed(10); myvec(105)
+ }
> std_list = function() {
+   set.seed(10); myvec2(105)
+ }
> microbenchmark(rcpp_list(), std_list(), times = 1000)
Unit: milliseconds
        expr    min      lq     mean  median      uq    max neval cld
 rcpp_list() 1.8901 1.92535 2.205286 1.96640 2.22380 7.1569  1000   a
  std_list() 1.9164 1.95570 2.224941 2.00555 2.32315 7.1194  1000   a
The fundamental issue is that Rcpp objects are R objects governed by R's memory management, where resizing is expensive: it means full copies.
So when I have tasks similar to yours where sizes may change, or are unknown, I often work with different data structures -- the STL gives us plenty -- and only convert to R(cpp) at the return step at the end.
The devil is in the details here (as always). Profile, experiment, ...
Edit: And in the narrower sense of "can we return a List of NumericMatrix objects with varying sizes" the answer is of course we can because that is what List objects do. You can also insert other types.
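To make that point concrete, a minimal sketch (my own illustration, not from the original answer) of a List holding matrices of different sizes alongside another type:
#include <Rcpp.h>

// [[Rcpp::export]]
Rcpp::List mixed_list() {
  Rcpp::List out(3);
  out[0] = Rcpp::NumericMatrix(2, 1);               // a 2 x 1 matrix
  out[1] = Rcpp::NumericMatrix(3, 4);               // a 3 x 4 matrix
  out[2] = Rcpp::CharacterVector::create("other");  // a non-matrix element
  return out;
}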
As Dirk said, it is of course possible to create a list with matrices of different sizes. To make it a bit more concrete, here is a translation of your R function:
#include <Rcpp.h>
// [[Rcpp::plugins(cpp11)]]

// [[Rcpp::export]]
Rcpp::List myvec(int n) {
  Rcpp::List x(n);
  Rcpp::IntegerVector choices = {1, 2, 3};
  for (int i = 0; i < n; ++i) {
    int nc = Rcpp::sample(choices, 1).at(0);
    int nr = Rcpp::sample(choices, 1).at(0);
    Rcpp::NumericVector entries = Rcpp::rbinom(nc * nr, 1, 0.5);
    x(i) = Rcpp::NumericMatrix(nc, nr, entries.begin());
  }
  return x;
}

/***R
myvec(2)
*/
The main difference from the R code is the explicitly named vectors choices and entries, which are only implicit in the R code.

C code with OpenMP called from R gives inconsistent results

Below is a piece of C code, run from R, that compares each row of a matrix to a vector. The number of identical values is stored in the first column of a two-column output matrix.
I know it can easily be done in R (as done to check the results), but this is a first step for a more complex use case.
When OpenMP is not used, it works OK. When OpenMP is used, it gives correlated (0.99) but inconsistent results.
Question 1: What am I doing wrong?
Question 2: I use a double for loop to fill the output matrix (ret) with zeros. What would be a better solution?
Also, inconsistencies were observed when the code was used in a package. I tried to make the code reproducible using inline, but it does not recognize the OpenMP statements (I tried including 'omp.h' in the parameters of cfunction, ...).
Question 3: How can we make this code work with inline?
I'm (too?) far outside my comfort zone on this topic.
library(inline)

compare <- cfunction(c(x = "integer", vec = "integer"), "
  const int I = nrows(x), J = ncols(x);
  SEXP ret;
  PROTECT(ret = allocMatrix(INTSXP, I, 2));
  int *ptx = INTEGER(x), *ptvec = INTEGER(vec), *ptret = INTEGER(ret);

  for (int i = 0; i < I; i++)
    for (int j = 0; j < 2; j++)
      ptret[j * I + i] = 0;

  int i, j;
  #pragma omp parallel for default(none) shared(ptx, ptvec, ptret) private(i, j)
  for (j = 0; j < J; j++)
    for (i = 0; i < I; i++)
      if (ptx[i + I * j] == ptvec[j]) { ++ptret[i]; }

  UNPROTECT(1);
  return ret;
")

N = 3e3
M = 1e4
m = matrix(sample(c(-1:1), N*M, replace = TRUE), nc = M)
v = sample(-1:1, M, replace = TRUE)

cc = compare(m, v)
cr = rowSums(t(t(m) == v))
all.equal(cc[,1], cr)
Thanks to the comments above, I reconsidered the data race issue.
IIUC, my loop was parallelized on j (the columns). Each thread then had its own value of i (the rows), but possibly identical values across threads, which were then trying to increment ptret[i] at the same time.
To avoid this, I now loop on i first, so that only a single thread will increment each row.
Then, I realized that I could move the zero-initialization of ptret within the first loop.
It seems to work. I get identical results, increased CPU usage, and 3-4x speedup on my laptop.
I guess that solves questions 1 and 2. I will have a closer look at the inline/openmp problem.
Code below, fwiw.
#include <omp.h>
#include <R.h>
#include <Rinternals.h>
#include <stdio.h>

SEXP c_compare(SEXP x, SEXP vec)
{
    const int I = nrows(x), J = ncols(x);
    SEXP ret;
    PROTECT(ret = allocMatrix(INTSXP, I, 2));
    int *ptx = INTEGER(x), *ptvec = INTEGER(vec), *ptret = INTEGER(ret);
    int i, j;

    #pragma omp parallel for default(none) shared(ptx, ptvec, ptret) private(i, j)
    for (i = 0; i < I; i++) {
        // init ptret to zero
        ptret[i] = 0;
        ptret[I + i] = 0;
        for (j = 0; j < J; j++)
            if (ptx[i + I * j] == ptvec[j]) {
                ++ptret[i];
            }
    }

    UNPROTECT(1);
    return ret;
}
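As for question 3, still open above: one untested possibility (my own suggestion, not verified) is that inline compiles via R CMD SHLIB, which honours the PKG_CFLAGS/PKG_CXXFLAGS/PKG_LIBS environment variables, so the OpenMP flags could be injected there and omp.h pulled in through cfunction's includes argument. A hypothetical sketch:
# Untested sketch for question 3: pass the OpenMP flags through the environment
# variables that R CMD SHLIB (used by inline) picks up, and add omp.h via the
# 'includes' argument of cfunction(). Both CFLAGS and CXXFLAGS are set since
# cfunction() compiles as C++ by default.
library(inline)
Sys.setenv(PKG_CFLAGS = "-fopenmp", PKG_CXXFLAGS = "-fopenmp", PKG_LIBS = "-fopenmp")

compare_omp <- cfunction(c(x = "integer", vec = "integer"), "
  const int I = nrows(x), J = ncols(x);
  SEXP ret;
  PROTECT(ret = allocMatrix(INTSXP, I, 2));
  int *ptx = INTEGER(x), *ptvec = INTEGER(vec), *ptret = INTEGER(ret);
  int i, j;
  #pragma omp parallel for default(none) shared(ptx, ptvec, ptret) private(i, j)
  for (i = 0; i < I; i++) {
    ptret[i] = 0;
    ptret[I + i] = 0;
    for (j = 0; j < J; j++)
      if (ptx[i + I * j] == ptvec[j]) ++ptret[i];
  }
  UNPROTECT(1);
  return ret;
", includes = "#include <omp.h>")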

Problems with scale() and the Multidimensional Lp-Norm

Today I was trying to debug my code and stumbled across something that renders my solutions useless. What I am generally trying to calculate is the multidimensional L2-norm for the following two matrices. As long as I am not using scale(), everything works fine. However, as soon as I scale the matrices, the three approaches I use no longer give the same solutions. What am I missing here?
set.seed(655)
df.a <- data.frame(A = sample(100:124, 24), B = sample(1:24, 24), C = sample(1:24, 24), D = rep(0, times=24))
df.b <- data.frame(A = sample(125:148, 24), B = sample(25:48, 24), C = sample(1:24, 24), D = sample(1:100, 24))
To compute it, I have three different approaches:
sapply-function and sqrt of rowSums
sse <- function(x1, x2) sum((x1 - x2) ^ 2)

distanceChangeByTech <- function(x) {
  sse(df.a[, x], df.b[, x])
}

help1 <- t(data.frame(sapply(colnames(df.a), distanceChangeByTech)))
dist_sap <- sqrt(rowSums(help1))
multidimensional Euclidean distance using Rcpp:
multiEucl <- cxxfunction(signature(x="matrix", y="matrix"), plugin="Rcpp",
  body='
    Rcpp::NumericMatrix dx(x);
    Rcpp::NumericMatrix dy(y);
    const int N = dx.nrow();
    const int M = dx.ncol();
    double sum = 0;
    for (int i = 0; i < N; i++) {
      for (int j = 0; j < M; j++) {
        sum = sum + pow(dx(i,j) - dy(i,j), 2);
      }
    }
    return wrap(sqrt(sum));
  ')
multidimensional Lp-norm using Rcpp:
multiPNorm <- cxxfunction(signature(x="matrix", y="matrix", p="numeric"), plugin="Rcpp",
  body='
    Rcpp::NumericMatrix dx(x);
    Rcpp::NumericMatrix dy(y);
    double dp = Rcpp::as<double>(p);
    const int N = dx.nrow();
    const int M = dx.ncol();
    double sum = 0;
    double rsum = 0;
    for (int i = 0; i < N; i++) {
      for (int j = 0; j < M; j++) {
        sum = sum + pow(abs(dx(i,j) - dy(i,j)), dp);
      }
    }
    rsum = pow(sum, 1/dp);
    return wrap(rsum);
  ')
When I tried this at first all worked well.
> multiEucl(as.matrix(df.a), as.matrix(df.b))
[1] 366.1543
> multiPNorm(as.matrix(df.a), as.matrix(df.b), 2)
[1] 366.1543
> sqrt(rowSums(help1))
sapply.colnames.df.a...distanceChangeByTech. 
                                    366.1543 
But as soon as I scale the matrices, which I want to do because I will do a clustering based on these distance measures, there is a problem: the solutions are no longer the same. What is causing this? I am using these commands to scale:
df.a <- as.data.frame(scale(df.a))
df.a[is.na(df.a)] <- 0
df.b <- as.data.frame(scale(df.b))
df.b[is.na(df.b)] <- 0
> multiEucl(as.matrix(df.a), as.matrix(df.b))
[1] 12.51781
> multiPNorm(as.matrix(df.a), as.matrix(df.b), 2)
[1] 8.944272
> sqrt(rowSums(help1))
sapply.colnames.df.a...distanceChangeByTech.
12.51781
You used abs(), which is documented e.g. here, but you meant to use fabs(), which is documented here.
The <cmath> header provides overloaded versions of abs() as well, but you probably didn't include that, so the integer abs() is picked up instead and the (fractional) scaled differences are truncated to integers before pow() is applied -- which is why the results diverge only after scaling.
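A minimal version of that fix, for reference (a sketch of my own, not from the original answers), keeps multiPNorm unchanged except for the abs() call:
multiPNorm <- cxxfunction(signature(x="matrix", y="matrix", p="numeric"), plugin="Rcpp",
  body='
    Rcpp::NumericMatrix dx(x);
    Rcpp::NumericMatrix dy(y);
    double dp = Rcpp::as<double>(p);
    const int N = dx.nrow();
    const int M = dx.ncol();
    double sum = 0;
    for (int i = 0; i < N; i++) {
      for (int j = 0; j < M; j++) {
        // fabs() keeps the difference as a double; the integer abs() truncated it
        sum += pow(fabs(dx(i,j) - dy(i,j)), dp);
      }
    }
    return wrap(pow(sum, 1/dp));
  ')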
It seems that abs() was not doing the right thing here. Instead I changed the code of multiPNorm, and the changes seem to work.
multiPNorm <- cxxfunction(signature(x="matrix", y="matrix", p="numeric"), plugin="Rcpp",
  body='
    Rcpp::NumericMatrix dx(x);
    Rcpp::NumericMatrix dy(y);
    double dp = Rcpp::as<double>(p);
    const int N = dx.nrow();
    const int M = dx.ncol();
    double sum = 0;
    double rsum = 0;
    double help = 0;
    for (int i = 0; i < N; i++) {
      for (int j = 0; j < M; j++) {
        help = dx(i,j) - dy(i,j);
        if (help < 0) {
          help = -help;
        }
        sum = sum + pow(help, dp);
      }
    }
    rsum = pow(sum, 1/dp);
    return wrap(rsum);
  ')
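As a quick cross-check (my own addition, not from the original post), the base-R equivalent of the L2 case should now agree with both Rcpp versions on the scaled data:
# base-R multidimensional L2 norm over all entries of the scaled data frames
sqrt(sum((as.matrix(df.a) - as.matrix(df.b))^2))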

R: How to compute correlation between rows of a matrix without having to transpose it?

I have a big matrix and am interested in computing the correlation between the rows of the matrix. Since the cor method computes correlation between the columns of a matrix, I am transposing the matrix before calling cor. But since the matrix is big, transposing it is expensive and is slowing down my program. Is there a way to compute the correlations among the rows without having to take transpose?
EDIT: Thanks for the responses. I thought I'd share some findings. My input matrix is 16 rows by 239766 columns and comes from a .mat file. I wrote C# code to do the same thing using the csmatio library. It looks like this:
foreach (var file in Directory.GetFiles(path, interictal_pattern))
{
    var reader = new MatFileReader(file);
    var mla = reader.Data[0] as MLStructure;
    convert(mla.AllFields[0] as MLNumericArray<double>, data);

    double sum = 0;
    for (var i = 0; i < 16; i++)
    {
        for (var j = i + 1; j < 16; j++)
        {
            sum += cor(data, i, j);
        }
    }
    var avg = sum / 120;

    if (++count == 10)
    {
        var t2 = DateTime.Now;
        var t = t2 - t1;
        Console.WriteLine(t.TotalSeconds);
        break;
    }
}

static double[][] createArray(int rows, int cols)
{
    var ans = new double[rows][];
    for (var row = 0; row < rows; row++)
    {
        ans[row] = new double[cols];
    }
    return ans;
}

static void convert(MLNumericArray<double> mla, double[][] M)
{
    var rows = M.Length;
    var cols = M[0].Length;
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            M[i][j] = mla.Get(i, j);
}

static double cor(double[][] M, int i, int j)
{
    var count = M[0].Length;
    double sum1 = 0, sum2 = 0;
    for (int ctr = 0; ctr < count; ctr++)
    {
        sum1 += M[i][ctr];
        sum2 += M[j][ctr];
    }
    var mu1 = sum1 / count;
    var mu2 = sum2 / count;

    double numerator = 0, sumOfSquares1 = 0, sumOfSquares2 = 0;
    for (int ctr = 0; ctr < count; ctr++)
    {
        var x = M[i][ctr] - mu1;
        var y = M[j][ctr] - mu2;
        numerator += x * y;
        sumOfSquares1 += x * x;
        sumOfSquares2 += y * y;
    }
    return numerator / Math.Sqrt(sumOfSquares1 * sumOfSquares2);
}
This gave a throughput of 22.22 s for 10 files, or 2.22 s/file.
Then I profiled my R code:
ptm = proc.time()
for (file in files)
{
  i = i + 1
  mat = readMat(paste(path, file, sep = ""))
  a = t(mat[[1]][[1]])
  C = cor(a)
  correlations[i] = mean(C[lower.tri(C)])
}
print(proc.time() - ptm)
To my surprise, it runs faster than the C# and gives a throughput of 5.7 s per 10 files, or 0.6 s/file (an improvement of almost 4x!). The bottleneck in C# is the methods inside the csmatio library that parse double values from the input stream.
And if I do not convert the csmatio classes into a double[][], the C# code runs extremely slowly (an order of magnitude slower, ~20-30 s/file).
Seeing that this problem arises from a data input issue whose details are not stated (and only hinted at in a comment), I will assume this is a comma-delimited file of unquoted numbers with the number of columns equal to Ncol. This does the transposition on input:
in.mat <- matrix(scan("path/to/the_file/fil.txt", what = numeric(0), sep = ","),
                 ncol = Ncol, byrow = TRUE)
cor(in.mat)
One dirty workaround would be to apply the cor function row-wise and produce the correlation matrix from the results. You could try whether this is any more efficient (which I doubt, though you could fine-tune it by not double-computing everything or the redundant diagonal cases):
# Apply 2-fold nested row-wise functions
set.seed(1)
dat <- matrix(rnorm(1000), nrow=10)
cormat <- apply(dat, MARGIN=1, FUN=function(z) apply(dat, MARGIN=1, FUN=function(y) cor(z, y)))
cormat[1:3,1:3] # Show few first
# [,1] [,2] [,3]
#[1,] 1.000000000 0.002175792 0.1559263
#[2,] 0.002175792 1.000000000 -0.1870054
#[3,] 0.155926259 -0.187005418 1.0000000
Though generally I would expect the transpose to have a really efficient implementation, so it's hard to imagine when that would be the bottleneck. But you could also dig through the implementation of the cor function and call the correlation C function itself, after first making sure your rows are suitable. Type cor at the R prompt to see the implementation, which is mostly a wrapper that makes the input suitable for the C function:
# Row with C-call from the implementation of 'cor':
# if (method == "pearson")
# .Call(C_cor, x, y, na.method, FALSE)
You can use outer:
outer(seq(nrow(mat)), seq(nrow(mat)),
      Vectorize(function(x, y) cor(mat[x, ], mat[y, ])))
where mat is the name of your matrix.
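For example, on the small test matrix from the earlier answer, this reproduces cor of the transpose (a quick check of my own, not from the original answer):
# the outer() approach matches cor() applied to the transpose
set.seed(1)
mat <- matrix(rnorm(1000), nrow = 10)
cormat <- outer(seq(nrow(mat)), seq(nrow(mat)),
                Vectorize(function(x, y) cor(mat[x, ], mat[y, ])))
all.equal(cormat, cor(t(mat)))  # should be TRUE (up to floating-point tolerance)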
