I had implemented a function in R that took a long time to run. I have managed to improve it in R, but now I would like to speed it up further using the Rcpp package.
I wrote the following Rcpp code. Unfortunately, it takes approximately the same time to run as the R code, so I would like to improve it. Does anyone have an idea how to speed up this piece of code?
Thanks a lot!
#include <math.h>
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
double kernelcpp(NumericVector a, NumericVector b, int N){
int i;
double sum=0.0;
for (i=0;i<N;i++){
if (a[i] > b[i])
sum+= a[i] - b[i];
else
sum+= b[i] - a[i];
}
return(exp( - sum));
}
// [[Rcpp::export]]
NumericVector testFromontcpp(NumericMatrix z1, NumericMatrix z2, int Nbootstrap){
// first element of TKeps = TK
int i,j,k,t;
int dim1 = z1.nrow();
int dim2 = z2.nrow();
double n1 = (double) dim1;
double n2 = (double) dim2;
int dimension = z1.ncol();
int N = dim1 + dim2;
NumericVector TKeps(Nbootstrap+1);
Rcpp::NumericMatrix bb(N,N);
double cc = 1 / (n1*n2*(n1+n2-2));
double a = sqrt(1/(n1*n1-n1)-cc);
double b = - sqrt(1/(n2*n2-n2)-cc);
for (i=0 ; i<N ; i++){
for (j=0 ; j<N ; j++){
if (i != j){
if (i < dim1) {
if (j < dim1){
bb(i,j) = kernelcpp(z1(i,_),z1(j,_),dimension);
} else {
bb(i,j) = kernelcpp(z1(i,_),z2(j-dim1,_),dimension);
}
}
else{
if (j < dim1){
bb(i,j) = kernelcpp(z2(i-dim1,_),z1(j,_),dimension);
} else {
bb(i,j) = kernelcpp(z2(i-dim1,_),z2(j-dim1,_),dimension);
}
}
}
}
}
TKeps(0)=0.0;
for (i=0 ; i<N ; i++){
for (j=0 ; j<N ; j++){
if (i != j){
if (i < dim1) {
if (j < dim1){
TKeps(0) += bb(i,j)* (a*a + cc);
} else {
TKeps(0) += bb(i,j) * (a*b + cc);
}
}
else{
if (j < dim1){
TKeps(0) += bb(i,j) * (a*b + cc);
} else {
TKeps(0) += bb(i,j) * (b*b + cc);
}
}
}
}
}
for (k=1 ; k<=Nbootstrap ; k++){
TKeps(k)=0;
int R[N];
for (i = 0 ; i < N ; i++)
R[i] = i;
for (i = 0; i < N - 1 ; i++) {
int j = i + rand() / (RAND_MAX / (N - i) + 1);
t = R[j];
R[j] = R[i];
R[i] = t;
}
for (i=0 ; i<N ; i++){
for (j=0 ; j<N ; j++){
if (i != j){
if (R[i] < n1) {
if (R[j] < n1){
TKeps(k) += bb(i,j) * (a*a + cc);
} else {
TKeps(k) += bb(i,j) * (a*b + cc);
}
} else{
if (R[j] < n1){
TKeps(k) += bb(i,j) * (b*a + cc);
} else {
TKeps(k) += bb(i,j) * (b*b + cc);
}
}
}
}
}
}
return(TKeps);
}
Since I do not know exactly what your code does, I can only point out two things at first glance:
The function you call from your R environment is testFromontcpp(...). I suggest giving this function SEXP parameters. These S-expressions are pointers to R's memory. If you don't use SEXP, both matrices may be copied (for instance when Rcpp has to convert the type, such as integer to double):
Consider a 1000x1000 matrix: that is 1 million entries stored in R, which would then be copied to C++. To avoid this, write:
NumericVector testFromontcpp(SEXP x, SEXP y, SEXP z) {
NumericMatrix z1(x), z2(y);
// z must be an integer vector on the R side (e.g. 100L)
int *Nbootstrap = INTEGER(z);
...
}
Be careful: in the for-loop you can no longer write k<=Nbootstrap. You have to dereference the pointer and write k<=*Nbootstrap!
Secondly, and more importantly: R stores matrices in column-major order (all elements of a column are contiguous in memory), whereas C stores two-dimensional arrays in row-major order. Jumping back and forth in memory instead of following the memory layout is expensive. My suggestion: swap the for-loops so that the inner loop runs over the rows of a fixed column, not the other way round.
On that last point: in a university assignment I had the same problem with iterating over matrices. In my case it was far cheaper to transpose the matrix first and then do the calculations.
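To illustrate the loop-order point, here is a minimal sketch in plain C++ (no Rcpp). It assumes the matrix is stored as a flat vector in column-major order, as R does; the function name sum_column_major is made up for this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: sums a matrix stored in column-major order,
// i.e. element (i, j) lives at data[i + nrow * j], as in R.
double sum_column_major(const std::vector<double>& data,
                        std::size_t nrow, std::size_t ncol) {
    double total = 0.0;
    for (std::size_t j = 0; j < ncol; ++j) {       // outer loop: columns
        for (std::size_t i = 0; i < nrow; ++i) {   // inner loop: rows of column j
            total += data[i + nrow * j];           // consecutive memory addresses
        }
    }
    return total;
}
```

With the loops the other way round, each step of the inner loop would jump nrow elements ahead in memory, which tends to be less cache-friendly for large matrices.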
I hope this helps.
Best,
Michael
PS: Regarding point 1: I just benchmarked your implementation against a version using SEXP. With SEXP it is slightly quicker for a 100x100 matrix of random numbers between 1 and 10.
I understand that any iterative function can be written recursively, but I still don't quite understand how to do it for this one.
This function receives an array (elements) and an int (number of elements), and sorts the elements from lowest to highest, but I don't see how to make it recursive. Any help would be appreciated!
void ShellSort(int *arr, int num){
for (int i = num / 2; i > 0; i = i / 2)
{
for (int j = i; j < num; j++)
{
for(int k = j - i; k >= 0; k = k - i)
{
if (arr[k+i] >= arr[k])
{
break;
}
else
{
arr[k]=arr[k] + arr[k+i];
arr[k+i]=arr[k] - arr[k+i];
arr[k]=arr[k] - arr[k+i];
}
}
}
}
return ;
}
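One possible way to make it recursive, offered as a sketch rather than the definitive answer: turn the outermost loop over the gap into a recursive call and keep the two inner loops as they are. The names shell_sort_gap and shell_sort_recursive are made up here, and std::swap replaces the arithmetic swap, which can overflow for large values.

```cpp
#include <utility>

// Hypothetical helper: performs the gapped insertion pass for one gap,
// then recurses with the gap halved. The recursion ends when the gap hits 0.
void shell_sort_gap(int* arr, int num, int gap) {
    if (gap <= 0) {
        return;  // base case: no gaps left to process
    }
    for (int j = gap; j < num; ++j) {
        // Move arr[j] backwards in steps of `gap` until it is in order.
        for (int k = j - gap; k >= 0 && arr[k + gap] < arr[k]; k -= gap) {
            std::swap(arr[k], arr[k + gap]);
        }
    }
    shell_sort_gap(arr, num, gap / 2);  // replaces `i = i / 2` in the loop
}

// Entry point with the same signature as the iterative ShellSort.
void shell_sort_recursive(int* arr, int num) {
    shell_sort_gap(arr, num, num / 2);
}
```

The loop variable of the outer loop simply becomes a parameter: its initial value goes into the first call, its update goes into the recursive call, and its termination condition becomes the base case.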
I have the following R code, which is not efficient. I would like to make it efficient using Rcpp. In particular, I am not used to dealing with arrays in Rcpp. Any help would be appreciated.
myfunc <- function(n=1600,
m=400,
p = 3,
time = runif(n,min=0.05,max=4),
qi21 = rnorm(n),
s0c = rnorm(n),
zc_min_ecox_multi = array(rnorm(n*n*p),dim=c(n,n,p)),
qi=matrix(0,n,n),
qi11 = rnorm(p),
iIc_mat = matrix(rnorm(p*p),p,p)){
for (j in 1:n){
u<-time[j]
ind<-1*(u<=time)
locu<-which(time==u)
qi2<- sum(qi21*ind) /s0c[locu]
for (i in 1:n){
qi1<- qi11%*%iIc_mat%*%matrix(zc_min_ecox_multi[i,j,],p,1)
qi[i,j]<- -(qi1+qi2)/m
}
}
}
Computing time is about 7.35 seconds. I need to call this function repeatedly, maybe 20 times.
system.time(myfunc())
user system elapsed
7.34 0.00 7.35
First thing to do would be to profile your code: profvis::profvis({myfunc()}).
What you can do is precompute qi11 %*% iIc_mat once.
You get (with minor improvements):
precomp <- qi11 %*% iIc_mat
for (j in 1:n) {
u <- time[j]
qi2 <- sum(qi21[u <= time]) / s0c[time == u]
for (i in 1:n) {
qi1 <- precomp %*% zc_min_ecox_multi[i, j, ]
qi[i, j] <- -(qi1 + qi2) / m
}
}
This is twice as fast (8 sec -> 4 sec).
Vectorizing the i loop then seems straightforward:
q1_all_i <- tcrossprod(precomp, zc_min_ecox_multi[, j, ])
qi[, j] <- -(q1_all_i + qi2) / m
(12 times as fast now)
And if you want to try it in Rcpp, you will first need a function to multiply the matrices...
#include<Rcpp.h>
#include<numeric>
// [[Rcpp::plugins("cpp11")]]
Rcpp::NumericMatrix mult(const Rcpp::NumericMatrix& lhs,
const Rcpp::NumericMatrix& rhs)
{
if (lhs.ncol() != rhs.nrow())
Rcpp::stop ("Incompatible matrices");
Rcpp::NumericMatrix out(lhs.nrow(),rhs.ncol());
Rcpp::NumericVector rowvec, colvec;
for (int i = 0; i < lhs.nrow(); ++i)
{
rowvec = lhs(i,Rcpp::_);
for (int j = 0; j < rhs.ncol(); ++j)
{
colvec = rhs(Rcpp::_,j);
out(i, j) = std::inner_product(rowvec.begin(), rowvec.end(),
colvec.begin(), 0.);
}
}
return out;
}
Then port your function...
// [[Rcpp::export]]
Rcpp::NumericMatrix myfunc_rcpp( int n, int m, int p,
const Rcpp::NumericVector& time,
const Rcpp::NumericVector& qi21,
const Rcpp::NumericVector& s0c,
const Rcpp::NumericVector& zc_min_ecox_multi,
const Rcpp::NumericMatrix& qi11,
const Rcpp::NumericMatrix& iIc_mat)
{
Rcpp::NumericMatrix qi(n, n);
Rcpp::NumericMatrix outermat = mult(qi11, iIc_mat);
for (int j = 0; j < n; ++j)
{
double qi2 = 0;
for(int k = 0; k < n; ++k)
{
if(time[j] <= time[k]) qi2 += qi21[k];
}
qi2 /= s0c[j];
for (int i = 0; i < n; ++i)
{
Rcpp::NumericMatrix tmpmat(p, 1);
for(int z = 0; z < p; ++z)
{
tmpmat(z, 0) = zc_min_ecox_multi[i + n*j + z*n*n];
}
Rcpp::NumericMatrix qi1 = mult(outermat, tmpmat);
qi(i,j) -= (qi1(0,0) + qi2)/m;
}
}
return qi;
}
Then in R:
my_rcpp_func <- function(n=1600,
m=400,
p = 3,
time = runif(n,min=0.05,max=4),
qi21 = rnorm(n),
s0c = rnorm(n),
zc_min_ecox_multi = array(rnorm(n*n*p),dim=c(n,n,p)),
qi11 = rnorm(p),
iIc_mat = matrix(rnorm(p*p),p,p))
{
myfunc_rcpp(n, m, p, time, qi21, s0c, as.vector(zc_min_ecox_multi),
matrix(qi11,1,p), iIc_mat)
}
This is certainly faster, and gives the same results as your own function, but it's no quicker than the in-R optimizations suggested by F Privé. Maybe optimizing the C++ code could get things even faster, but ultimately you are performing a small matrix multiplication over 2.5 million times (once per element of the 1600 x 1600 result), so it's never going to be all that fast. R is optimized pretty well for this kind of calculation after all...
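As a rough idea of what "optimizing the C++ code" could look like, here is a sketch in plain C++ (std::vector instead of Rcpp types) that replaces the per-element temporary matrices with an inline dot product. The function name qi_flat and the parameter pre (standing for the precomputed row vector qi11 %*% iIc_mat) are made up for this example, and I have not benchmarked it against the versions above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical flat version of the computation.
// zc is the n x n x p array flattened in R's column-major order, so
// element (i, j, z) is at zc[i + n*j + z*n*n]; pre has length p.
// The result is the n x n matrix qi, also flattened in column-major order.
std::vector<double> qi_flat(std::size_t n, double m,
                            const std::vector<double>& time,
                            const std::vector<double>& qi21,
                            const std::vector<double>& s0c,
                            const std::vector<double>& zc,
                            const std::vector<double>& pre) {
    const std::size_t p = pre.size();
    std::vector<double> qi(n * n, 0.0);
    for (std::size_t j = 0; j < n; ++j) {
        // qi2 depends only on j, so it is computed once per column.
        double qi2 = 0.0;
        for (std::size_t k = 0; k < n; ++k) {
            if (time[j] <= time[k]) qi2 += qi21[k];
        }
        qi2 /= s0c[j];
        for (std::size_t i = 0; i < n; ++i) {
            // Inline dot product: no temporary matrix is allocated.
            double qi1 = 0.0;
            for (std::size_t z = 0; z < p; ++z) {
                qi1 += pre[z] * zc[i + n * j + z * n * n];
            }
            qi[i + n * j] = -(qi1 + qi2) / m;
        }
    }
    return qi;
}
```

The main design choice is avoiding an allocation inside the innermost loop; the arithmetic is the same as in the ported function.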
I am trying to get some results by using Rcpp and this is the code.
#include <Rcpp.h>
#include <math.h>
using namespace Rcpp;
// [[Rcpp::export]]
double compssr(NumericMatrix dist, NumericVector x, int n, int p) {
double ssr = 0; double del_sq = 0; double del_ij = 0;
int i, j, ip;
for (i = 0; i < n; i++) {
for (j = 0; j < i; j++) {
for (ip = 0; ip < p; ip++) {
del_sq = del_sq + (x(i, ip) - x(j, ip))*(x(i, ip) - x(j, ip));
if (i == j) del_sq = 0;
}
del_ij = sqrt(del_sq);
ssr = ssr + (dist(i, j) - del_ij)*(dist(i, j) - del_ij);
}}
return ssr;
}
NumericMatrix Scaling_X(NumericVector xbar, NumericMatrix x, double n, double p) {
NumericMatrix Sig_x(p, p);
int i, ii, ip, ip2;
for (ii = 0; ii < n; ii++) {
for (i = 0; i < p; i++) {
x(ii, i) = x(ii, i) - xbar(i);
}}
for (i = 0; i < n; i++) {
for (ip = 0; ip < p; ip++) {
for (ip2 = 0; ip2 < p; ip2++) {
Sig_x(ip, ip2) = Sig_x(ip, ip2) + x(i, ip)*x(i, ip2);
}}}
for (i = 0; i < Sig_x.ncol(); i++) {
for (ii = 0; ii < Sig_x.nrow(); ii++) {
Sig_x(i, ii) = Sig_x(i, ii) / n;
}}
return Sig_x;
}
There are in fact some more functions; the file containing this code is named "test.cpp".
I loaded this file in R using
sourceCpp("test.cpp")
There was no error, and I could use the first function, "compssr" (return type: double).
But I couldn't call the function Scaling_X.
Is there an error in my code?
I wrote other functions as well: I could use those with return type double, but not the others (NumericMatrix, NumericVector, List).
You are missing the
// [[Rcpp::export]]
in front of the function Scaling_X, so the compileAttributes() function does as it has been told: compile both functions, make just one available. Every function you want to call from R needs its own export attribute directly above it.
I am trying to code in R a (centered) weighted moving average function that returns a vector of the same size as the input vector.
The following code almost gives me what I want, but it does not work for the first and last values of my vector:
set.seed(0)
len=10
x=floor(len*runif(len))
weights=c(1,3,0,3,1)
weights=weights/sum(weights)
rollapply(x,width=length(weights), function(x) sum(x*weights),align="center")
na.omit(filter(x,sides=2,weights))
Setting partial=TRUE in the rollapply function is sort of what I want to do. However, it does not work since my function does not support an x of changing size.
I could use the latter and manually add the edge computations with a loop. It would work, but I would like to find a nicer (computationally faster) way to do it.
For a more rigorous description of my needs, here is a mathematical version, where r is the vector my function would return and x and the weights w are the inputs:
With Rcpp, you can do:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector roll_mean(const NumericVector& x,
const NumericVector& w) {
int n = x.size();
int w_size = w.size();
int size = (w_size - 1) / 2;
NumericVector res(n);
int i, ind_x, ind_w;
double w_sum = Rcpp::sum(w), tmp_wsum, tmp_xwsum, tmp_w;
// beginning
for (i = 0; i < size; i++) {
tmp_xwsum = tmp_wsum = 0;
for (ind_x = i + size, ind_w = w_size - 1; ind_x >= 0; ind_x--, ind_w--) {
tmp_w = w[ind_w];
tmp_wsum += tmp_w;
tmp_xwsum += x[ind_x] * tmp_w;
}
res[i] = tmp_xwsum / tmp_wsum;
}
// middle
int lim2 = n - size;
for (; i < lim2; i++) {
tmp_xwsum = 0;
for (ind_x = i - size, ind_w = 0; ind_w < w_size; ind_x++, ind_w++) {
tmp_xwsum += x[ind_x] * w[ind_w];
}
res[i] = tmp_xwsum / w_sum;
}
// end
for (; i < n; i++) {
tmp_xwsum = tmp_wsum = 0;
for (ind_x = i - size, ind_w = 0; ind_x < n; ind_x++, ind_w++) {
tmp_w = w[ind_w];
tmp_wsum += tmp_w;
tmp_xwsum += x[ind_x] * tmp_w;
}
res[i] = tmp_xwsum / tmp_wsum;
}
return res;
}
I use this function in one of my packages.
Just put that in a .cpp file and source it with Rcpp::sourceCpp.
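For comparison, the same edge handling can be written with a single loop that clamps the window to the valid index range. This is only a sketch in plain C++ (std::vector instead of NumericVector), assuming w has odd length; the name roll_mean_clamped is made up for this example.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical single-loop variant: for each position i, only the part of
// the window that overlaps x is used, and the result is divided by the sum
// of the weights that were actually applied.
std::vector<double> roll_mean_clamped(const std::vector<double>& x,
                                      const std::vector<double>& w) {
    const int n = static_cast<int>(x.size());
    const int half = (static_cast<int>(w.size()) - 1) / 2;
    std::vector<double> res(n);
    for (int i = 0; i < n; ++i) {
        const int lo = std::max(0, i - half);      // first valid index of x
        const int hi = std::min(n - 1, i + half);  // last valid index of x
        double wsum = 0.0, xwsum = 0.0;
        for (int k = lo; k <= hi; ++k) {
            const double wk = w[k - i + half];     // weight aligned with x[k]
            wsum += wk;
            xwsum += x[k] * wk;
        }
        res[i] = xwsum / wsum;
    }
    return res;
}
```

The three-loop version above avoids recomputing the weight sum in the middle section, which is probably why it is split up; the clamped version trades that for shorter code.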
I am a novice in R. I would like to know how to write the for loop below in an efficient way. The code below gives the correct answer for a small dataset.
data <- data.frame(x1=c(rep('a',12)),
x2=c(rep('b',12)),
x3=c(rep(as.Date('2017-03-09'),4),rep(as.Date('2017-03-10'),4),rep(as.Date('2017-03-11'),4)),
value1= seq(201,212),
x4=c(as.Date('2017-03-09'),as.Date('2017-03-10'),as.Date('2017-03-11'),as.Date('2017-03-12')
,as.Date('2017-03-10'),as.Date('2017-03-11'),as.Date('2017-03-12'),as.Date('2017-03-13')
,as.Date('2017-03-11'),as.Date('2017-03-12'),as.Date('2017-03-13'),as.Date('2017-03-14')),
value2= seq(101,112), stringsAsFactors = FALSE)
The for loop script is below:
for (i in 1:length(data$x3)){
print(i)
if (!is.na(data$x4[i])){
if(data$x4[i] == data$x3[i] && data$x2[i]==data$x2[i] && data$x1[i]==data$x1[i]){
data$diff[i] <- data$value1[i] - data$value2[i]
}
else{
print("I am in else")
for (j in 1:length(data$x3)){
print(c(i,j))
# print(a$y[i])
if(data$x4[i]==data$x3[j] && data$x1[i]==data$x1[j] && data$x2[i]==data$x2[j]){
# print(a$x[j])
data$diff[i] <- data$value1[j] - data$value2[i]
break
}
}
}
}
}
If you want performance, the answer is often Rcpp.
Translating your R code in Rcpp:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector f_Rcpp(List data) {
StringVector x1 = data["x1"];
StringVector x2 = data["x2"];
NumericVector x3 = data["x3"];
NumericVector x4 = data["x4"];
NumericVector value1 = data["value1"];
NumericVector value2 = data["value2"];
int n = value1.size();
NumericVector diff(n, NA_REAL);
int i, j;
for (i = 0; i < n; i++) {
Rprintf("%d\n", i);
// NA_REAL is a NaN, so `x4[i] != NA_REAL` would always be true
if (!NumericVector::is_na(x4[i])) {
if (x4[i] == x3[i]) {
diff[i] = value1[i] - value2[i];
} else {
Rprintf("I am in else\n");
for (j = 0; j < n; j++) {
Rprintf("%d %d\n", i, j);
if (x4[i] == x3[j] && x1[i] == x1[j] && x2[i] == x2[j]) {
diff[i] = value1[j] - value2[i];
break;
}
}
}
}
}
return diff;
}
/*** R
f_Rcpp(data)
*/
Put that in a .cpp file and source it.
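A note on missing values when translating R to C++: R's numeric NA is a special NaN, and any comparison with a NaN using == or != ignores the other operand (== is always false, != is always true). The sketch below shows this in plain C++; the function names are made up, and in Rcpp code the usual checks are NumericVector::is_na() or R_IsNA().

```cpp
#include <cmath>
#include <limits>

// Naive check: compares against NaN directly.
// `x != nan` is true for every x, including NaN itself.
bool naive_not_missing(double x) {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    return x != nan;
}

// Working check: std::isnan detects NaN values.
bool not_missing(double x) {
    return !std::isnan(x);
}
```

So a test written as `value != NA_REAL` never filters anything out, whereas a dedicated NA/NaN predicate does.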
You can do this:
data$diff <- sapply(seq_along(data$x3), function(i) {
if (!is.na(data$x4[i])){
ind <- which(data$x4[i] == data$x3 & data$x1[i] == data$x1 & data$x2[i] == data$x2)
j <- `if`(i %in% ind, i, min(ind))
data$value1[j] - data$value2[i]
} else {
NA
}
})
Beware: in your code, if the column $diff doesn't exist yet, doing data$diff[1] <- 100 will set all the values of the column to 100.