Rcpp and R: pass by reference

Working with Rcpp and R, I observed some behaviour that I do not understand. Consider the following simple function written with Rcpp:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericMatrix hadamard_product(NumericMatrix & X, NumericMatrix & Y){
    unsigned int ncol = X.ncol();
    unsigned int nrow = X.nrow();
    int counter = 0;
    for (unsigned int j=0; j<ncol; j++) {
        for (unsigned int i=0; i<nrow; i++) {
            X[counter++] *= Y(i, j);
        }
    }
    return X;
}
This simply returns the component-wise product of two matrices. Now I know that the arguments to this function are passed by reference, i.e., calling
M <- matrix(rnorm(4), ncol = 2)
N <- matrix(rnorm(4), ncol = 2)
M_copy <- M
hadamard_product(M, N)
will overwrite the original M. However, it also overwrites M_copy, which I do not understand. I thought that M_copy <- M makes a copy of the object M and stores it elsewhere in memory, rather than merely pointing M_copy at M. That is what seems to happen when executing
x <- 1
y <- x
x <- 2
for example. This does not change y but only x.
So why does the behaviour above occur?

No, R does not make a copy immediately, only if it is necessary, i.e., copy-on-modify:
x <- 1
tracemem(x)
#[1] "<0000000009A57D78>"
y <- x
tracemem(x)
#[1] "<0000000009A57D78>"
x <- 2
tracemem(x)
#[1] "<00000000099E9900>"
Since you modify M by reference outside R, R can't know that a copy is necessary. If you want to ensure a copy is made, you can use data.table::copy. Or avoid the side effect in your C++ code, e.g., make a deep copy there (by using clone).
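The aliasing described above can be modelled outside R. The following is only a rough standard-C++ analogy, not R's or Rcpp's actual implementation: a std::shared_ptr stands in for two R names bound to the same object, and the names Buffer, hadamard_in_place and hadamard_copy are made up for illustration. Writing through the shared buffer corresponds to what the function in the question does; copying first corresponds to the effect of clone.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for an R object: several names may share one buffer.
using Buffer = std::shared_ptr<std::vector<double>>;

// Analogue of the in-place version: writes through the shared buffer,
// so every name bound to that buffer sees the change.
void hadamard_in_place(const Buffer& x, const Buffer& y) {
    assert(x->size() == y->size());
    for (std::size_t k = 0; k < x->size(); ++k) {
        (*x)[k] *= (*y)[k];
    }
}

// Analogue of cloning first: the product is stored in a fresh buffer
// and the inputs are left untouched.
Buffer hadamard_copy(const Buffer& x, const Buffer& y) {
    assert(x->size() == y->size());
    Buffer out = std::make_shared<std::vector<double>>(*x);  // deep copy of x
    for (std::size_t k = 0; k < out->size(); ++k) {
        (*out)[k] *= (*y)[k];
    }
    return out;
}
```

If m and m_copy are two handles to one buffer, hadamard_in_place(m, n) changes what both of them see, whereas hadamard_copy(m, n) leaves both unchanged.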

Related

Negative subscripts in matrix indexing

In Rcpp/RcppArmadillo I want to do the following: from an n x n matrix A, extract the submatrix A[-j, -j], where j is a vector of indices. In R this looks like
A = matrix(1:16, 4, 4)
j = c(2, 3)
A[-j, -j]
It seems that this functionality is not available in Rcpp or RcppArmadillo - sorry if I have overlooked something. One approach in R is
pos = setdiff(1:nrow(A), j)
A[pos, pos]
That will carry over to RcppArmadillo, but it seems awkward to have to create the vector pos as the complement of j - and I am not sure how to do it efficiently.
Does anyone have an idea for an efficient implementation / or a piece of code to share?
The Armadillo docs describe the .shed_rows() and .shed_cols() member functions, which take an argument that "... contains the indices of rows/columns/slices to remove". From my reading, removing both rows and columns takes two calls, one to each.
Using your example
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
// [[Rcpp::export]]
arma::mat fun(arma::mat X, arma::uvec row, arma::uvec col) {
    X.shed_cols(col); // remove columns
    X.shed_rows(row); // remove rows
    return(X);
}
/***R
A = matrix(1:16, 4, 4)
j = c(2, 3)
A[-j, -j]
# minus one for zero indexing
fun(A, j-1, j-1)
*/
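If the complement vector pos is preferred over shedding, it can be built in one pass with a boolean mask instead of a set difference. A minimal sketch in plain C++ follows; complement_indices is a hypothetical helper, the indices are assumed to be zero-based, and converting the result to an arma::uvec for a non-contiguous submatrix view is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the zero-based indices in [0, n) that are NOT listed in `drop`,
// in increasing order. Runs in O(n + drop.size()) time.
std::vector<std::size_t> complement_indices(std::size_t n,
                                            const std::vector<std::size_t>& drop) {
    std::vector<bool> removed(n, false);
    for (std::size_t d : drop) {
        assert(d < n);          // every index to drop must be in range
        removed[d] = true;      // duplicates in `drop` are harmless
    }
    std::vector<std::size_t> keep;
    keep.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        if (!removed[i]) {
            keep.push_back(i);
        }
    }
    return keep;
}
```

For the 4 x 4 example with j = c(2, 3) in R, the zero-based call complement_indices(4, {1, 2}) yields the indices 0 and 3.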

How to implement a system in deSolve, R, with N equations and N+m parameters?

I am coding a SIR metapopulation model in R and want to integrate the system using deSolve with compiled C code. I have done this before with only a few parameters, but now I would have N x m parameters, where N is the dimension of the system. So, starting from declarations like
/* file age3classp.c */
#include <R.h>
static double parms[3];
static double forc[1];
#define N parms[0]
#define N1 parms[1]
#define gam3 parms[2]
I would like these parms to be a vector or an N x N matrix.
Is that possible?
My model is of the following form (written here in R):
# SIR metapopulation model:
SIR <- function(t, state, parameters) {
  with(as.list(c(state, parameters)),{
    dS = c()
    dI = c()
    dR = c()
    for(i in c(1:dim)){
      dS[i] <- delta_N[i]*(S[i]+I[i]+R[i])
      dI[i] <- 10
      dR[i] <- 10
    }
    list(c(dS, dI, dR))
  })
}
population <- c(S <- matrix(100,ncol=N,nrow =1 ), I <- matrix(10,ncol=N,nrow =1 ),
                R <- matrix(0,ncol=N,nrow =1 ))
z <- ode(population, times, SIR, parameters)
Written this way, S[i] and the others are not recognised as state variables, only as initial condition values.
How can I get them recognised as variables?
Yes, this is possible, and there are different ways to do it, depending on your C programming skills. The easiest is to put the states and the parameters in two long vectors and then split them at the C level using numbered parameter and variable indices. The equations are then formulated as for-loops.
To improve readability, it is also possible to use
preprocessor constants for the indices,
unions and structs (see below), or
a vector for the states and a list for the parameters.
The states (y) are always treated as vector, both on the R and the C level, but parameters (p) can also be passed down as lists and then split up at the C level. This can be tricky and requires some understanding of R's data structures.
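The first approach, one long state vector and one long parameter vector split by index, might look like the sketch below. It is written in plain C++ rather than against the deSolve C interface, and the model itself (uncoupled patches with hypothetical per-patch rates beta and gamma) is an assumption chosen only to show the indexing, not the model from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed layout (hypothetical, for illustration only):
//   y     = [S_0..S_{n-1}, I_0..I_{n-1}, R_0..R_{n-1}]
//   parms = [beta_0..beta_{n-1}, gamma_0..gamma_{n-1}]
std::vector<double> sir_derivs(std::size_t n,
                               const std::vector<double>& y,
                               const std::vector<double>& parms) {
    assert(y.size() == 3 * n && parms.size() == 2 * n);
    // Split the long vectors into named blocks with pointer offsets.
    const double* S = y.data();
    const double* I = S + n;
    const double* R = I + n;
    const double* beta = parms.data();
    const double* gamma = beta + n;

    std::vector<double> ydot(3 * n);
    for (std::size_t i = 0; i < n; ++i) {
        const double pop = S[i] + I[i] + R[i];
        assert(pop > 0.0);                       // avoid division by zero
        const double infection = beta[i] * S[i] * I[i] / pop;
        const double recovery = gamma[i] * I[i];
        ydot[i] = -infection;                    // dS_i/dt
        ydot[n + i] = infection - recovery;      // dI_i/dt
        ydot[2 * n + i] = recovery;              // dR_i/dt
    }
    return ydot;
}
```

The same offsets work on the raw double pointers that a compiled deSolve model receives; only the function signature differs.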
However, I recommend starting with vectorization at the R level. R is quite fast with vectorized models, so the speedup may not compensate for the C programming effort. An example of how to implement a vectorized predator-prey model can be found here.
Another idea is to use a code generator. You may have a look at the CRAN package rodeo, which creates fast Fortran code from equations formulated in tabular form (e.g. LibreOffice or Excel spreadsheets). Using it does not require knowledge of Fortran.
More about rodeo can be found in a paper (https://doi.org/10.1016/j.envsoft.2017.06.036) and the package documentation at https://dkneis.github.io/
If one really wants to program it in C, here is a small implementation of a Lotka-Volterra competition model (see Wikipedia) with 3 states. The parameters are handed over as a parameter vector p at the C level, while a union is used to improve readability:
/* file model.c */
#include <R.h>
/* the same 12 parameters, viewed either by name or as one flat vector */
union parvec {
    struct {
        double r[3], a[9];   /* growth rates and 3 x 3 interaction matrix */
    };
    double value[12];
} p;
/* initializer */
void initmod(void (* odeparms)(int *, double *))
{
    int N = 12; /* total number of parameters */
    odeparms(&N, p.value);
}
/* Derivatives */
void derivs (int *neq, double *t, double *y, double *ydot,
             double *yout, int *ip) {
    double y_sum = 0;
    for (int i = 0; i < *neq; i++) {
        y_sum = 0;
        /* a is stored column-major: a[i + neq * j] is element (i, j) */
        for (int j = 0; j < *neq; j++) y_sum += p.a[i + *neq * j] * y[j];
        ydot[i] = p.r[i] * y[i] * (1 - y_sum);
    }
}
And here the calling R code:
# file call_model.R
library(deSolve)
system("R CMD SHLIB model.c")
dll <- paste0("model", .Platform$dynlib.ext)  # model.dll on Windows, model.so elsewhere
dyn.load(dll)
# A must hold all 9 entries of the 3 x 3 matrix (column-major) that derivs reads;
# the last three values are illustrative assumptions added to complete it
p <- c(r = c(0.1, 0.3, 0.04), A = c(0.2, 0.3, 0.3, 0.5, 0.4, 0.2, 0.3, 0.2, 0.4))
y <- c(X = c(2, 2, 2))
times <- seq(0, 200, by = 0.1)
out <- ode(y, times, func = "derivs", parms = p,
           dllname = "model", initfunc = "initmod")
matplot.0D(out)
dyn.unload(dll)
More elaborate solutions are possible, of course.

Compute product of large 3-D arrays in R

I am working on an optimization problem, and to supply the analytic gradient to the routine, I need to compute the gradient of large 3-D arrays with respect to parameters. The largest of these arrays, s, has dimensions [L,N,J], where L, J ~ 2000 and N = 15. L and N index nodes over which the arrays are then aggregated with fixed weights w into vectors of length J. Computing the gradient naively generates an [L,N,J,J] array x whose elements are x(l,n,j,k) = -s(l,n,j) s(l,n,k) if j ≠ k and x(l,n,j,j) = s(l,n,j) (1 - s(l,n,j)).
Several functions in the procedure would use x as input, but as of right now I cannot keep x in memory due to its size. My approach so far has been to compute and directly aggregate up x over L and N to only ever store JxJ matrices, but the downside is that I cannot reuse x in other functions. This is what the following code does:
arma::mat agg_dsnode_ddelta_v3(arma::cube s_lnj,
                               arma::mat w_ln,
                               arma::vec w_l){
    // Normal Matrix dimensions
    unsigned int L = s_lnj.n_rows;
    unsigned int N = s_lnj.n_cols;
    unsigned int J = s_lnj.n_slices;
    //resulting matrix
    arma::mat ds_ddelta_jj = arma::mat(J,J, arma::fill::zeros);
    for (unsigned int l = 0; l < L; l++) {
        for (unsigned int n = 0; n < N; n++) {
            arma::vec s_j = s_lnj.subcube(arma::span(l), arma::span(n), arma::span());
            ds_ddelta_jj += - arma::kron(w_l(l) * w_ln(l,n) * s_j, s_j.as_row()) + arma::diagmat(w_l(l) * w_ln(l,n) * s_j);
        }
    }
    return ds_ddelta_jj;
}
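For comparison, the same aggregation can be written without kron or diagmat temporaries by accumulating each (l, n) contribution directly and filling only one triangle. This is a sketch in plain C++ under assumed conventions: s is a flat row-major array with one row of length J per (l, n) pair, w holds the combined weight w_l * w_ln for that row, and aggregate_gradient is a made-up name. Whether it is faster than the Armadillo version has not been measured here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the J x J matrix (row-major) with entries
//   out(j, k) = sum_r w[r] * ( [j == k] * s(r, j) - s(r, j) * s(r, k) ),
// where r runs over all (l, n) pairs.
std::vector<double> aggregate_gradient(const std::vector<double>& s,
                                       const std::vector<double>& w,
                                       std::size_t J) {
    assert(J > 0 && s.size() == w.size() * J);
    std::vector<double> out(J * J, 0.0);
    for (std::size_t r = 0; r < w.size(); ++r) {
        const double* row = &s[r * J];
        for (std::size_t j = 0; j < J; ++j) {
            const double wj = w[r] * row[j];
            out[j * J + j] += wj;                    // diagonal term w * s_j
            for (std::size_t k = 0; k <= j; ++k) {   // lower triangle only
                out[j * J + k] -= wj * row[k];       // outer-product term
            }
        }
    }
    // The result is symmetric, so mirror the lower triangle into the upper one.
    for (std::size_t j = 0; j < J; ++j) {
        for (std::size_t k = 0; k < j; ++k) {
            out[k * J + j] = out[j * J + k];
        }
    }
    return out;
}
```

Only the J x J result is kept in memory, as in the Armadillo version, so this does not address reusing the full array x in other functions.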
Alternatively, the 4-D array x could, for instance, be computed with sparseMatrix, but this approach does not scale when L and J increase:
library(Matrix)
L = 2
N = 3
J = 4
s_lnj <- array(rnorm(L*N*J), dim=c(L,N,J))
## create sparse matrix with s(l,n,:) vertically on the diagonal
As_lnj = A = sparseMatrix(i=c(1:(L*N*J)),j=rep(1:(L*N), each=J),x= as.vector(aperm(s_lnj, c(3, 1, 2))))
## create sparse matrix with s(l,n,:) horizontally on the diagonal
Bs_lnj = sparseMatrix(i=rep(1:(L*N), each=J),j=c(1:(L*N*J)),x= as.vector(aperm(s_lnj, c(3, 1, 2))))
## create sparse matrix with s(l,n,:) diagonally
Cs_lnj = sparseMatrix(i=c(1:(L*N*J)),j=c(1:(L*N*J)),x= as.vector(aperm(s_lnj, c(3, 1, 2))))
## compute 4-D array with sparseMatrix product
x = -(As_lnj %*% Bs_lnj) + Cs_lnj
I was wondering if you know of a faster way to implement the first code, or alternatively of an approach that would make the second one scalable.
Thank you in advance

How can I speed up my Rcpp code, which only carries out simple operations?

I'm trying to write a function that takes in a matrix and computes a value for every pair of columns. The matrix always has 2000 rows, but can potentially have a very large number of columns (up to 100,000 or so). The R code I started with is as follows:
x_dist <- data.frame(array(0,dim=c(ncol(x),ncol(x))))
cs <- colSums(x)
for (i in 1:ncol(x)) {
  p_i <- x[,i]
  for (j in 1:ncol(x)) {
    p_j <- x[,j]
    s <- p_i+p_j
    fac <- cs[i]/(cs[i]+cs[j])
    N1 <- fac*s
    N2 <- (1-fac)*s
    d1 <- (p_i+1)/(N1+1)
    d2 <- (p_j+1)/(N2+1)
    x_dist[i,j] <- sum(N1+N2-N1*d1-N2*d2+p_i*log(d1)+p_j*log(d2))
  }
}
This function is quite slow. When there are only 400 columns in the matrix x, it takes about 32 seconds, and obviously grows quadratically in the number of columns.
Since I've heard Rcpp is good for speeding up for loops and matrix operations, I decided to give that a try. I am completely new to it, but ended up putting together the following function:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericMatrix wdist(NumericMatrix x) {
    int nrow = x.nrow(),ncol=x.ncol();
    NumericMatrix m = no_init_matrix(ncol,ncol);
    NumericVector v1 = no_init_vector(nrow);
    NumericVector v2 = no_init_vector(nrow);
    NumericVector s = no_init_vector(nrow);
    NumericVector N1 = no_init_vector(nrow);
    NumericVector N2 = no_init_vector(nrow);
    NumericVector d1 = no_init_vector(nrow);
    NumericVector d2 = no_init_vector(nrow);
    for(int i=0; i<ncol; ++i){
        v1 = x(_,i);
        for(int j=0; j<i; ++j){
            v2 = x(_,j);
            s = v1+v2;
            N1 = sum(v1)*s/(sum(v1)+sum(v2));
            N2 = s-N1;
            d1 = (v1+1)/(N1+1);
            d2 = (v2+1)/(N2+1);
            m(i,j) = sum(N1+N2-N1*d1-N2*d2+v1*log(d1)+v2*log(d2));
        }
    }
    return m;
}
This certainly makes a big difference. Now with 400 columns, this takes about 8 seconds. I am pleased by the improvement, but this is still intractably slow for my current test case of interest, which is 32,000 columns. I feel like I am doing some relatively simple operations, so it's confusing to me why my code is still this slow. I've tried to do some reading on writing efficient Rcpp code, but haven't found anything that helps address my issue. Please let me know if there is anything I'm doing wrong or any improvements I can look into to make my code faster (or even the R code itself, if that can be made faster than the Rcpp code!)
Some example data could be:
set.seed(121220)
x <- array(rpois(2000*400,3),dim=c(2000,400))
I refactored your base R code and hope it speeds things up somewhat:
f <- function(...) {
  p <- x[, t(...)]
  N <- matrix(rowSums(p), ncol = 1) %*% colSums(p) / sum(p)
  d <- (p + 1) / (N + 1)
  sum(N - N * d + p * log(d))
}
x_dist <- diag(0, ncol(x))
x_dist[lower.tri(x_dist)] <- combn(ncol(x), 2, FUN = f)
x_dist <- pmax(x_dist, t(x_dist))
To speed up your Rcpp code, you can try the following nested for loops after initializing your matrix m as an all-zero matrix:
for(int i=0; i<ncol-1; ++i){
    v1 = x(_,i);
    for(int j=i+1; j<ncol; ++j){
        v2 = x(_,j);
        s = v1+v2;
        N1 = sum(v1)*s/sum(s);
        N2 = s-N1;
        d1 = (v1+1)/(N1+1);
        d2 = (v2+1)/(N2+1);
        // compute each pair once and store it in both triangles
        double val = sum(N1+N2-N1*d1-N2*d2+v1*log(d1)+v2*log(d2));
        m(i,j) = val;
        m(j,i) = val;
    }
}
which uses the fact that the matrix is symmetric and thus cuts the computational cost roughly in half.
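Beyond exploiting symmetry, the column sums can be computed once up front and the per-pair work done in a single pass over the rows, without allocating temporary vectors. The sketch below shows this in plain C++ on a column-major array, without Rcpp; pairwise_wdist is a hypothetical name, and any speed-up over the Rcpp sugar version is an expectation, not a measurement.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// x is column-major: element (r, c) is x[c * nrow + r].
// Returns the ncol x ncol symmetric result with a zero diagonal.
std::vector<double> pairwise_wdist(const std::vector<double>& x,
                                   std::size_t nrow, std::size_t ncol) {
    assert(x.size() == nrow * ncol);
    // Column sums are needed for every pair, so compute them once.
    std::vector<double> cs(ncol, 0.0);
    for (std::size_t c = 0; c < ncol; ++c) {
        for (std::size_t r = 0; r < nrow; ++r) cs[c] += x[c * nrow + r];
    }
    std::vector<double> m(ncol * ncol, 0.0);
    for (std::size_t i = 0; i < ncol; ++i) {
        for (std::size_t j = i + 1; j < ncol; ++j) {
            const double fac = cs[i] / (cs[i] + cs[j]);
            double total = 0.0;
            for (std::size_t r = 0; r < nrow; ++r) {
                const double pi = x[i * nrow + r];
                const double pj = x[j * nrow + r];
                const double s = pi + pj;
                const double n1 = fac * s;
                const double n2 = s - n1;
                const double d1 = (pi + 1.0) / (n1 + 1.0);
                const double d2 = (pj + 1.0) / (n2 + 1.0);
                total += n1 + n2 - n1 * d1 - n2 * d2
                       + pi * std::log(d1) + pj * std::log(d2);
            }
            m[i * ncol + j] = total;   // upper triangle
            m[j * ncol + i] = total;   // mirrored entry
        }
    }
    return m;
}
```

The inner loop reads two contiguous columns, which keeps memory access sequential; the independent (i, j) pairs would also lend themselves to parallelisation, which is not shown here.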

How can I get Z*Z^T using GSL, where Z is column vector?

I am looking through the GSL functions for a way to calculate Z*Z^T, where Z is an n*1 column vector, but I could not find a suitable function. Any help is much appreciated.
GSL supports BLAS (basic linear algebra subprograms),
see http://www.gnu.org/software/gsl/manual/html_node/GSL-BLAS-Interface.html.
The functions are classified by the complexity of the operation:
level 1: vector-vector operations
level 2: matrix-vector operations
level 3: matrix-matrix operations
Most functions come in different versions for float, double and complex numbers. Your operation is basically an outer product of the vector Z with itself.
You can initialize the vector as a column vector (here double precision numbers):
gsl_matrix * Z = gsl_matrix_calloc (n,1);
and then use the BLAS function gsl_blas_dgemm to compute
Z * Z^T. The first two arguments of this function determine whether or not the input matrices should be transposed before the matrix multiplication:
gsl_blas_dgemm (CblasNoTrans, CblasTrans, 1.0, Z, Z, 0.0, C);
Here's a working test program (you may need to link it against gsl and blas):
#include <stdio.h>
#include <gsl/gsl_matrix.h>
#include <gsl/gsl_blas.h>
int main(int argc, char ** argv)
{
    size_t n = 4;
    /* Z is an n x 1 column vector, C receives the n x n result */
    gsl_matrix * Z = gsl_matrix_calloc (n,1);
    gsl_matrix * C = gsl_matrix_calloc (n,n);
    gsl_matrix_set(Z,0,0,1);
    gsl_matrix_set(Z,1,0,2);
    gsl_matrix_set(Z,2,0,0);
    gsl_matrix_set(Z,3,0,1);
    /* C = 1.0 * Z * Z^T + 0.0 * C */
    gsl_blas_dgemm (CblasNoTrans,
                    CblasTrans, 1.0, Z, Z, 0.0, C);
    size_t i, j;
    for (i = 0; i < n; i++)
    {
        for (j = 0; j < n; j++)
        {
            printf ("%g\t", gsl_matrix_get (C, i, j));
        }
        printf("\n");
    }
    gsl_matrix_free(Z);
    gsl_matrix_free(C);
    return 0;
}
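For reference, the entries that the dgemm call produces are simply C(i, j) = Z_i * Z_j. A plain C++ sketch of the same outer product, without GSL, is given below; outer_product is a made-up helper, the row-major layout is an assumption, and the output matrix is expected to be preallocated by the caller, much as C is allocated before the BLAS call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Writes the outer product z * z^T into c, stored row-major:
// c[i * n + j] = z[i] * z[j], with n = z.size().
void outer_product(const std::vector<double>& z, std::vector<double>& c) {
    const std::size_t n = z.size();
    assert(c.size() == n * n);   // caller must preallocate the n x n result
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            c[i * n + j] = z[i] * z[j];
        }
    }
}
```

With the vector (1, 2, 0, 1) from the test program, the entry in row 1, column 1 (zero-based) is 2 * 2 = 4, and the whole of row 2 is zero.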
