I am trying to write a function that computes a series of weighted products of the form X^T W X,
where W is a diagonal matrix. There are many W matrices but only a single X matrix.
To be efficient I can represent W as an array (w) containing the diagonal part. Then in R this would be
crossprod(X, w*X)
or just
crossprod(X * sqrt(w))
I could loop over the series of W's with a for loop, but that seems inefficient. Entry (i, j) of the product can be thought of as the sum over rows k of w_k * X_ki * X_kj. Only w changes, so the elementwise products X_i * X_j for columns i and j can be recycled. The function I'd like to produce looks like this
Rcpp::List Crossprod_sparse(Eigen::MappedSparseMatrix<double> X, Eigen::Map<Eigen::MatrixXd> W) {
int K = W.cols();
int p = X.cols();
Rcpp::List crossprods(W.cols());
for (int k = 0; k < K; k++) {
Eigen::SparseMatrix<double> matprod(p, p);
for (int i = 0; i < p; i++) {
Eigen::SparseVector<double> prod = X.col(i).cwiseProduct(W.col(k));
for (int j = i; j < p; j++) {
double out = prod.dot(X.col(j));
matprod.coeffRef(i,j) = out;
matprod.coeffRef(j,i) = out;
}
}
matprod.makeCompressed();
crossprods[k] = matprod;
}
return crossprods;
}
which returns the correct products, and should be efficient because it operates on the intermediate prod variable. However, a for loop in R using crossprod still seems to be much faster, despite not taking advantage of recycling. How can I optimize this function further?
You may try calculating the Cholesky decomposition of your weight matrix, multiply your matrix by this decomposition, and then calculate the crossproduct as listed in the RcppEigen documentation. Some example code using RcppEigen could be
#include <RcppEigen.h>
using Eigen::MatrixXd;
using Eigen::VectorXd;
//[[Rcpp::depends(RcppEigen)]]
// [[Rcpp::export]]
MatrixXd weightedCovariance(MatrixXd & X, MatrixXd & W) {
int p = X.cols(); //assuming each row is a unique observation
MatrixXd L = W.llt().matrixL();
MatrixXd XtWX = MatrixXd(p, p).setZero().selfadjointView<Eigen::Lower>().rankUpdate(X.transpose() * L);
return(XtWX);
}
// [[Rcpp::export]]
MatrixXd diag_weightedCovariance(MatrixXd & X, VectorXd & W) {
int p = X.cols(); //assuming each row is a unique observation
VectorXd w = W.cwiseSqrt();
MatrixXd XtWX = MatrixXd(p, p).setZero().selfadjointView<Eigen::Lower>().rankUpdate(X.transpose() * w.asDiagonal());
return(XtWX);
}
Eigen does a lot of optimization under the hood, so telling it that the result is symmetric should speed things up. Checking timings in R with microbenchmark:
set.seed(23847) #for reproducibility
require(microbenchmark)
#Create R version of Cpp function
Rcpp::sourceCpp('weighted_covar.cpp')
#generate data
p <- 100
n <- 1000
X <- matrix(rnorm(p*n), nrow=n, ncol=p)
W <- diag(1, n, n)
w <- diag(W)
R_res <- crossprod(chol(W) %*% X ) #general weighted covariance
R_res_diag <- crossprod(sqrt(w) * X ) #utilizing your optimization, if we know it's diagonal
Cpp_res <- weightedCovariance(X, W)
Cpp_res_diag <- diag_weightedCovariance(X, w)
#make sure all equal
all.equal(R_res, Cpp_res)
#[1] TRUE
all.equal(R_res, R_res_diag)
#[1] TRUE
all.equal(Cpp_res_diag, R_res_diag)
#[1] TRUE
#check timings
microbenchmark(crossprod(chol(W) %*% X ))
# Unit: milliseconds
# expr min lq mean median uq max neval
# crossprod(chol(W) %*% X) 251.6066 262.739 275.1719 268.615 276.4994 479.9318 100
microbenchmark(crossprod(sqrt(w) * X ))
# Unit: milliseconds
# expr min lq mean median uq max neval
# crossprod(sqrt(w) * X) 5.264319 5.394289 5.499552 5.430885 5.496387 6.42099 100
microbenchmark(weightedCovariance(X, W))
# Unit: milliseconds
# expr min lq mean median uq max neval
# weightedCovariance(X, W) 26.64534 27.84632 31.99341 29.44447 34.59631 51.39726 100
microbenchmark(diag_weightedCovariance(X, w), unit = "ms")
# Unit: milliseconds
# expr min lq mean median uq max neval
# diag_weightedCovariance(X, w) 0.67571 0.702567 0.7469946 0.713579 0.7405515 1.321888 100
I also haven't used your sparse structure in this implementation so you may get more speed after accounting for that.
Generally, if you have a diagonal matrix in a product, you should pass just the diagonal coefficients w and use them as w.asDiagonal():
Eigen::MatrixXd foo(Eigen::SparseMatrix<double> const & X, Eigen::VectorXd const & w)
{
return X.transpose() * w.asDiagonal() * X;
}
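To make explicit what the w.asDiagonal() product computes, here is a minimal dense sketch in plain C++ (standard library only, no Eigen). It is an illustration under assumptions, not what Eigen does internally: the name weighted_crossprod and the row-major std::vector layout are made up for this example. It accumulates w_k * x_k * x_k^T over the rows x_k of X, fills only the upper triangle, and mirrors it at the end.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: computes X^T * diag(w) * X for a dense X.
// X is stored row-major with n = w.size() rows and p columns,
// so X[k * p + i] is the entry in row k, column i.
// The p x p result is returned row-major as well.
std::vector<double> weighted_crossprod(const std::vector<double>& X,
                                       const std::vector<double>& w,
                                       std::size_t p) {
    const std::size_t n = w.size();
    std::vector<double> out(p * p, 0.0);
    for (std::size_t k = 0; k < n; ++k) {
        const double* row = X.data() + k * p;
        for (std::size_t i = 0; i < p; ++i) {
            const double wi = w[k] * row[i];
            if (wi == 0.0) continue;  // zero entries contribute nothing
            // Only the upper triangle (j >= i) is accumulated.
            for (std::size_t j = i; j < p; ++j) {
                out[i * p + j] += wi * row[j];
            }
        }
    }
    // Mirror the upper triangle into the lower triangle.
    for (std::size_t i = 0; i < p; ++i) {
        for (std::size_t j = i + 1; j < p; ++j) {
            out[j * p + i] = out[i * p + j];
        }
    }
    return out;
}
```

For several weight vectors, the function would be called once per column of W while X stays untouched.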
If you want to pre-compute everything except the multiplication with w, you can try storing the outer products of each row of X and accumulate them on demand:
class ProductHelper
{
std::vector<Eigen::SparseMatrix<double> > matrices;
public:
ProductHelper(Eigen::SparseMatrix<double> const& X_)
{
// The loop below is much more efficient with row-major X
Eigen::SparseMatrix<double, Eigen::RowMajor> const &X = X_;
matrices.reserve(X.rows());
for(int i=0; i<X.rows(); ++i)
{
matrices.push_back(X.row(i).transpose()*X.row(i));
}
}
Eigen::MatrixXd multiply(Eigen::VectorXd const& w) const
{
assert(w.size()==matrices.size());
assert(w.size()>0);
Eigen::MatrixXd A = w[0]*matrices[0];
for(int i=1; i<w.size(); ++i)
{
A+=w[i]*matrices[i];
}
return A;
}
};
I have an R function which I need to evaluate approximately one million times for vectors of length ~5000. Is there any possibility to speed it up by implementing it in Rcpp? I have hardly worked with Rcpp before, and the code below does not work:
set.seed(1)
a <- rt(5e3, df = 2)
b <- rt(5e3, df = 2.5)
c <- rt(5e3, df = 3)
d <- rt(5e3, df = 3.5)
sum((1 - outer(a, b, pmax)) * (1 - outer(c, d, pmax)))
#[1] -69677.99
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
double f_outer(NumericVector u, NumericVector v, NumericVector x, NumericVector y) {
double result = sum((1 - Rcpp::outer(u, v, Rcpp::pmax)) * (1 - Rcpp::outer(x, y, Rcpp::pmax)));
return(result);
}
Thank you very much!
F. Privé is right -- we'll want to go with loops here; I've got the following C++ code in a file so-answer.cpp:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
double f_outer(NumericVector u, NumericVector v, NumericVector x, NumericVector y) {
// We'll use the size of the first and second vectors for our for loops
int n = u.size();
int m = v.size();
// Make sure the vectors are appropriately sized for what we're doing
if ( (n != x.size() ) || ( m != y.size() ) ) {
Rcpp::stop("Vectors not of compatible sizes.");
}
// Initialize a result variable
double result = 0.0;
// And use loops instead of outer
for ( int i = 0; i < n; ++i ) {
for ( int j = 0; j < m; ++j ) {
result += (1 - std::max(u[i], v[j])) * (1 - std::max(x[i], y[j]));
}
}
// Then return the result
return result;
}
Then we see in R that the C++ code gives the same answer as your R code, but runs much faster:
library(Rcpp) # for sourceCpp()
library(microbenchmark) # for microbenchmark() (for benchmarking)
sourceCpp("so-answer.cpp") # compile our C++ code and make it available in R
set.seed(1) # for reproducibility
a <- rt(5e3, df = 2)
b <- rt(5e3, df = 2.5)
c <- rt(5e3, df = 3)
d <- rt(5e3, df = 3.5)
sum((1 - outer(a, b, pmax)) * (1 - outer(c, d, pmax)))
#> [1] -69677.99
f_outer(a, b, c, d)
#> [1] -69677.99
# Same answer, so looking good. Which one's faster?
microbenchmark(base = sum((1 - outer(a, b, pmax)) * (1 - outer(c, d, pmax))),
rcpp = f_outer(a, b, c, d))
#> Unit: milliseconds
#> expr min lq mean median uq max neval
#> base 3978.9201 4119.6757 4197.9292 4131.3300 4144.4524 10121.5558 100
#> rcpp 118.8963 119.1531 129.4071 119.4767 122.5218 909.2744 100
#> cld
#> b
#> a
Created on 2018-12-13 by the reprex package (v0.2.1)
I have this loop that assigns to each element branch.from[j] the index of the closest preceding element of branch whose value is less than or equal to branch[j].
for (j in 2:length(branch)) {
branch.from[j]<-max(which(branch[1:(j-1)]<=branch[j]))
}
branch has just over 8 million elements, so this takes too long for my liking. Is there a faster way?
For example,
branch[1:20]<-c(1,54,25,54,22,54,36,54,43,54,40,54,27,54,34,54,26,54,32,54)
The above code gives,
branch.from[1:20]<-c(1,1,1,3,1,5,5,7,7,9,7,11,5,13,13,15,5,17,17,19)
An Rcpp function
I'm not sure if there is any easy way to vectorize the code, so Rcpp may be the best option:
library(Rcpp); library(inline)
fun2 <- cppFunction(
'std::vector<int> branchFrom(NumericVector branch)
{
std::vector<int> branch_from;
for(int j = branch.size() - 1; j > 0; j--) {
int val = -1;
for(int k = j - 1; k > -1; k--){
if(branch[j] >= branch[k]){
val = k;
break;
}
}
branch_from.push_back(val + 1);
}
branch_from.push_back(1);
std::reverse(branch_from.begin(), branch_from.end());
return branch_from;
}')
Note that the second for-loop won't necessarily iterate through all of k, since it stops as soon as a single value branch[k] <= branch[j] is found.
Analysis
Using microbenchmark() from the microbenchmark package where the original implementation is encapsulated in base, I get the following:
Unit: microseconds
expr min lq mean median uq max neval
base 124.232 130.3555 152.7990 133.941 141.176 1048.724 100
fun2 5.105 5.8145 8.0211 7.137 7.766 79.508 100
This indicates a significant speed-up from the original implementation.
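If the inner loop becomes the bottleneck (it can scan far back when a small value follows a long run of larger ones), one alternative is a monotonic stack. This is a different technique from the nested loops above: each index is pushed and popped at most once, so the work is linear in the length of branch. A hedged sketch in plain C++ (standard library only; the name previous_leq_index is hypothetical). It keeps the same conventions as branchFrom: results are 1-based, the first element is set to 1, and 0 means no earlier element qualifies.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: for each j, the 1-based index of the closest earlier
// element with branch[i] <= branch[j]; 0 if there is none.
// The first element is set to 1 to match the convention used above.
std::vector<int> previous_leq_index(const std::vector<double>& branch) {
    std::vector<int> from(branch.size(), 0);
    std::vector<std::size_t> stack;  // indices whose values are non-decreasing
    for (std::size_t j = 0; j < branch.size(); ++j) {
        // Earlier elements larger than branch[j] can never be the answer for
        // a later element, because j is closer and has a smaller value.
        while (!stack.empty() && branch[stack.back()] > branch[j]) {
            stack.pop_back();
        }
        if (!stack.empty()) {
            from[j] = static_cast<int>(stack.back()) + 1;
        }
        stack.push_back(j);
    }
    if (!from.empty()) {
        from[0] = 1;
    }
    return from;
}
```

On the 20-element example from the question this reproduces the branch.from values listed there.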
I got a document term matrix of ~1600 documents x ~120 words. I would like to compute the cosine similarity between all these vectors, but we are speaking about ~1,300,000 comparisons [n * (n - 1) / 2].
I used parallel::mclapply with 8 cores, but it still takes forever.
Which other solution do you suggest?
Thanks
Here's my take on it.
If I define cosine similarity as
coss <- function(x) {crossprod(x)/(sqrt(tcrossprod(colSums(x^2))))}
(I think that is about as fast as I can make it with base R functions and the often overlooked crossprod, which is a little gem.) Compare it with an Rcpp function using RcppArmadillo (slightly updated as suggested by @f-privé):
NumericMatrix cosine_similarity(NumericMatrix x) {
arma::mat X(x.begin(), x.nrow(), x.ncol(), false);
// Compute the crossprod
arma::mat res = X.t() * X;
int n = x.ncol();
arma::vec diag(n);
int i, j;
for (i=0; i<n; i++) {
diag(i) = sqrt(res(i,i));
}
for (i = 0; i < n; i++)
for (j = 0; j < n; j++)
res(i, j) /= diag(i)*diag(j);
return(wrap(res));
}
(this might possibly be optimised with some of the specialized functions in the armadillo library - just wanted to get some timing measurements).
Comparing those yields
> XX <- matrix(rnorm(120*1600), ncol=1600)
> microbenchmark::microbenchmark(cosine_similarity(XX), coss(XX), coss2(XX), times=50)
> microbenchmark::microbenchmark(coss(x), coss2(x), cosine_similarity(x), cosine_similarity2(x), coss3(x), times=50)
Unit: milliseconds
expr min lq mean median uq max
coss(x) 173.0975 183.0606 192.8333 187.6082 193.2885 331.9206
coss2(x) 162.4193 171.3178 183.7533 178.8296 184.9762 319.7934
cosine_similarity2(x) 169.6075 175.5601 191.4402 181.3405 186.4769 319.8792
neval cld
50 a
50 b
50 a
which is really not that bad. The gain from computing the cosine similarity in C++ is very small (with @f-privé's solution being the fastest), so I'm guessing your timing issues come from converting the text from words to numbers rather than from calculating the cosine similarity. Without knowing more about your specific code it is hard for us to help you.
I fully agree with @ekstroem on the use of crossprod, but I think there are unnecessary computations in his implementation. Incidentally, I think coss is giving a wrong result.
Comparing his answer with mine you can use this cpp file:
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericMatrix cosine_similarity(NumericMatrix x) {
arma::mat X(x.begin(), x.nrow(), x.ncol(), false);
arma::mat rowSums = sum(X % X, 0);
arma::mat res;
res = X.t() * X / sqrt(rowSums.t() * rowSums);
return(wrap(res));
}
// [[Rcpp::export]]
NumericMatrix& toCosine(NumericMatrix& mat,
const NumericVector& diag) {
int n = mat.nrow();
int i, j;
for (j = 0; j < n; j++)
for (i = 0; i < n; i++)
mat(i, j) /= diag(i) * diag(j);
return mat;
}
/*** R
coss <- function(x) {
crossprod(x)/(sqrt(crossprod(x^2)))
}
coss2 <- function(x) {
cross <- crossprod(x)
toCosine(cross, sqrt(diag(cross)))
}
XX <- matrix(rnorm(120*1600), ncol=1600)
microbenchmark::microbenchmark(
cosine_similarity(XX),
coss(XX),
coss2(XX),
times = 20
)
*/
Unit: milliseconds
expr min lq mean median uq max neval
cosine_similarity(XX) 172.1943 176.4804 181.6294 181.6345 185.7542 199.0042 20
coss(XX) 262.6167 270.9357 278.8999 274.4312 276.1176 337.0531 20
coss2(XX) 134.6742 137.6013 147.3153 140.4783 146.5806 204.2115 20
So I would definitely compute the crossprod in base R and then do the scaling in Rcpp.
PS: If you have a very sparse matrix, you could use the Matrix package to convert it to a sparse matrix. That matrix class also has a crossprod method, so you could use coss2 as well.
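For reference, the scaling step can be written without any Rcpp types. The following is a minimal sketch in plain C++ (standard library only); the name scale_to_cosine and the flat column-major layout are assumptions for this example, chosen to mirror how R stores a matrix. It divides each entry of the cross-product matrix by the product of the two column norms, which are read off the diagonal.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: g holds an n x n cross-product matrix X^T X in
// column-major order, so g[i + j * n] is entry (i, j). It is overwritten
// with the cosine similarity matrix.
// A column of X that is all zeros has norm 0 and produces NaN entries.
void scale_to_cosine(std::vector<double>& g, std::size_t n) {
    std::vector<double> norm(n);
    for (std::size_t i = 0; i < n; ++i) {
        norm[i] = std::sqrt(g[i + i * n]);  // diagonal holds squared norms
    }
    // Column-by-column traversal matches the memory layout.
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t i = 0; i < n; ++i) {
            g[i + j * n] /= norm[i] * norm[j];
        }
    }
}
```

The norms are taken before any entry is modified, so the in-place division does not disturb the diagonal values that later entries depend on.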
The coop package's coop::cosine function is probably the best way to do this now. It is implemented in Rcpp, but also has a different approach than lsa::cosine, and also has lower memory overhead. Its use is exactly the same as lsa::cosine, just switch out the package names.
For further speedups, you may want to change your BLAS library. The coop manual has a few basic details and suggestions.
A simple problem.
I want to check whether the difference between two points (i, j) is greater than a threshold (diff).
If the difference between the points exceeds the threshold, the index should be returned, and the next distance is then measured from the new data point. It is a simple cutoff filter that removes all data points closer than a predefined threshold. The only trick is that the distance is always measured from the "last" point (the one that was "far enough away" from the point before it).
I first wrote it as two nested loops like:
x <- sample(1:100)
for(i in 1:(length(x)-1)){
for(j in (i+1):length(x)){
if(abs(x[i] - x[j]) >= cutoff) {
print(j)
i <- j # set the index to the current datapoint
break }
}}
This solution is fairly intuitive, but it does not work properly. I think the assignment of j to i has no effect: the outer loop ignores the jump and still runs through all data points.
Well, I did not want to waste time with debugging and just thought I can do the same with a recursive function.
So I wrote it like:
checkCutOff.f <- function(x,cutoff,i = 1) {
options(expressions=500000)
# Loops through the data and compares the temporarily fixed point 'i' with the looping points 'j'
for(j in (i+1):length(x)){
if( abs(x[i] - x[j]) >= cutoff ){
break
}
}
# Recursive function to update the new 'i - stops at the end of the dataset
if( j<length(x) ) return(c(j,checkCutOff.f(x,cutoff,j)))
else return(j)
}
x<-sample(1:100000)
checkCutOff.f(x,1)
This code works, but I get a stack overflow with big datasets. That makes me wonder whether this code is efficient.
To me, having to increase limits etc. is always a hint of inefficient code...
So my question is:
What kind of solution is really efficient?
Thanks!
You should avoid growing your return value with c. That's inefficient. Allocate to the maximum size and subset to the needed size in the end.
Note that your function always includes length(x) in your result, which is wrong:
set.seed(42)
x<-sample(1:10)
checkCutOff.f(x, 100)
#[1] 10
Here is an R solution with a loop:
checkCutOff.f1 <- function(x,cutoff) {
i <- 1
j <- 1
k <- 1
result <- integer(length(x))
while(j < length(x)) {
j <- j + 1
if (abs(x[i] - x[j]) >= cutoff) {
result[k] <- j
k <- k + 1
i <- j
}
}
result[seq_len(k - 1)]
}
all.equal(checkCutOff.f(x, 4), checkCutOff.f1(x, 4))
#[1] TRUE
#the correct solution includes length(x) here (by chance)
It's easy to translate to Rcpp:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
IntegerVector checkCutOff_f1cpp(NumericVector x, double cutoff) {
int i = 0;
int j = 1;
int k = 0;
IntegerVector result(x.size());
while(j < x.size()) {
if (std::abs(x[i] - x[j]) >= cutoff) {
result[k] = j + 1;
k++;
i = j;
}
j++;
}
result = result[seq_len(k)-1];
return result;
}
Then in R:
all.equal(checkCutOff.f(x, 4), checkCutOff_f1cpp(x, 4))
#[1] TRUE
Benchmarks:
library(microbenchmark)
y <- sample(1:1000)
microbenchmark(
checkCutOff.f(y, 4),
checkCutOff.f1(y, 4),
checkCutOff_f1cpp(y, 4)
)
#Unit: microseconds
# expr min lq mean median uq max neval cld
# checkCutOff.f(y, 4) 3665.105 4681.6005 7798.41776 5323.068 6635.9205 41028.930 100 c
# checkCutOff.f1(y, 4) 1384.524 1507.2635 1831.43236 1769.031 2070.7225 3012.279 100 b
# checkCutOff_f1cpp(y, 4) 8.765 10.7035 26.40709 14.240 18.0005 587.958 100 a
I'm sure this can be improved further and more testing should be done.
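The "allocate once, trim at the end" advice has a direct counterpart in plain C++: reserve capacity up front and append with push_back. The sketch below uses only the standard library; the name cutoff_indices is hypothetical and it is meant as an illustration of the same greedy filter, not as a replacement for the Rcpp version above. It returns 1-based indices so the output lines up with the R results.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the 1-based indices of all points that are at
// least `cutoff` away from the most recently accepted point. The first point
// is the initial reference and is not reported.
std::vector<int> cutoff_indices(const std::vector<double>& x, double cutoff) {
    std::vector<int> result;
    result.reserve(x.size());  // single allocation, no regrowth while filling
    std::size_t i = 0;         // index of the current reference point
    for (std::size_t j = 1; j < x.size(); ++j) {
        if (std::abs(x[i] - x[j]) >= cutoff) {
            result.push_back(static_cast<int>(j) + 1);
            i = j;             // measure the next distance from this point
        }
    }
    return result;
}
```

Because the vector only ever holds accepted indices, no final subsetting step is needed, and an input where nothing exceeds the cutoff yields an empty result.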
This post is about speeding up R code with the Rcpp package to avoid recursive loops.
My input is defined by the following example (length 7), which is part of the data.frame (length 51673) that I use:
S=c(906.65,906.65,906.65,906.65,906.65,906.65,906.65)
T=c(0.1371253,0.1457896,0.1248953,0.1261278,0.1156931,0.0985253,0.1332596)
r=c(0.013975,0.013975,0.013975,0.013975,0.013975,0.013975,0.013975)
h=c(0.001332596,0.001248470,0.001251458,0.001242143,0.001257921,0.001235755,0.001238440)
P=c(3,1,5,2,1,4,2)
A= data.frame(S=S,T=T,r=r,h=h,P=P)
S T r h Per
1 906.65 0.1971253 0.013975 0.001332596 3
2 906.65 0.1971253 0.013975 0.001248470 1
3 906.65 0.1971253 0.013975 0.001251458 5
4 906.65 0.1971253 0.013975 0.001242143 2
5 906.65 0.1971253 0.013975 0.001257921 1
6 906.65 0.1971253 0.013975 0.001235755 4
7 906.65 0.1971253 0.013975 0.001238440 2
The parameters are :
w=0.001; b=0.2; a=0.0154; c=0.0000052; neta=-0.70
I have the following code of the function that I want to use :
F<-function(x,w,b,a,c,neta,S,T,r,P){
u=1i*x
nu=(1/(neta^2))*(((1-2*neta)^(1/2))-1)
# Recursion back to time t
# Terminal condition for the A and B
A_Q=0
B_Q=0
steps<-round(T*250,0)
for (j in 1:steps){
A_Q= A_Q+ r*u + w*B_Q-(1/2)*log(1-2*a*(neta^4)*B_Q)
B_Q= b*B_Q+u*nu+ (1/neta^2)*(1-sqrt((1-2*a*(neta^4)*B_Q)*( 1- 2*c*B_Q - 2*u*neta)))
}
F= exp(log(S)*u + A_Q + B_Q*h[P])
return(F)
}
S = A$S ; r= A$r ; T= A$T ; P=A$P; h= A$h
Then I want to apply the previous function to my data set for a vector of length N = 100000:
Z=length(S); N=100000 ; alpha=2 ; delta= 0.25
lambda=(2*pi)/(N*delta)
res = matrix(nrow=N, ncol=Z)
for (i in 1:N){
for (j in 1:Z){
res[i,j]= Re(F(((delta*(i-1))-(alpha+1)*1i),w,b,a,c,neta,S[j],T[j],r[j],P[j]))
}
}
But this takes a lot of time: it takes 20 seconds to execute this code for N = 100, and I want to run it for N = 100000, so the overall run time could be hours. How can I tune the above code using Rcpp to reduce the execution time and obtain an efficient program?
Is it possible to reduce the execution time? If so, please suggest a solution, even one without Rcpp.
Thanks.
Your function F can be converted to C++ pretty easily by taking advantage of the vec and cx_vec classes in the Armadillo library (accessed through the RcppArmadillo package) - which has great support for vectorized calculations.
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::cx_vec Fcpp(const arma::cx_vec& x, double w, double b, double a, double c,
double neta, const arma::vec& S, const arma::vec& T,
const arma::vec& r, Rcpp::IntegerVector P, Rcpp::NumericVector h) {
arma::cx_vec u = x * arma::cx_double(0.0,1.0);
double nu = (1.0/std::pow(neta,2.0)) * (std::sqrt(1.0-2.0*neta)-1.0);
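// Terminal condition: A_Q and B_Q start at zero, so initialize explicitly
arma::cx_vec A_Q(r.size(), arma::fill::zeros);
arma::cx_vec B_Q(r.size(), arma::fill::zeros);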
arma::vec steps = arma::round(T*250.0);
for (size_t j = 0; j < steps.size(); j++) {
for (size_t k = 0; k < steps[j]; k++) {
A_Q = A_Q + r*u + w*B_Q -
0.5*arma::log(1.0 - 2.0*a*std::pow(neta,4.0)*B_Q);
B_Q = b*B_Q + u*nu + (1.0/std::pow(neta,2.0)) *
(1.0 - arma::sqrt((1.0 - 2.0*a*std::pow(neta,4.0)*B_Q) *
(1.0 - 2.0*c*B_Q - 2.0*u*neta)));
}
}
arma::vec hP = Rcpp::as<arma::vec>(h[P-1]);
arma::cx_vec F = arma::exp(arma::log(S)*u + A_Q + B_Q*hP);
return F;
}
Just a couple of minor changes to note:
I'm using arma:: functions for vectorized calculations, such as arma::log, arma::exp, arma::round, arma::sqrt, and various overloaded operators (*, +, -); but using std::pow and std::sqrt for scalar calculations. In R, this is abstracted away from us, but here we have to distinguish between the two situations.
Your function F has one loop - for (j in 1:steps) - but the C++ version has two, just due to the differences in loop semantics between the two languages.
Most of the input vectors are arma:: classes (as opposed to using Rcpp::NumericVector and Rcpp::ComplexVector), the exception being P and h, since Rcpp vectors offer R-like element access - e.g. h[P-1]. Also notice that P needs to be offset by 1 (0-based indexing in C++), and then converted to an Armadillo vector (hP) using Rcpp::as<arma::vec>, since your compiler will complain if you try to multiply a cx_vec with a NumericVector (B_Q*hP).
I added a function parameter h - it's not a good idea to rely on the existence of a global variable h, which you were doing in F. If you need to use it in the function body, you should pass it into the function.
I changed the name of your function to Fr and, to make benchmarking a little easier, wrapped your double loop that populates the matrix res into the functions loop_Fr and loop_Fcpp:
loop_Fr <- function(mat = res) {
for (i in 1:N) {
for (j in 1:Z) {
mat[i,j]= Re(Fr(((delta*(i-1))-(alpha+1)*1i),w,b,a,c,neta,S[j],T[j],r[j],P[j],h))
}
}
return(mat)
}
loop_Fcpp <- function(mat = res) {
for (i in 1:N) {
for (j in 1:Z) {
mat[i,j]= Re(Fcpp(((delta*(i-1))-(alpha+1)*1i),w,b,a,c,neta,S[j],T[j],r[j],P[j],h))
}
}
return(mat)
}
##
R> all.equal(loop_Fr(),loop_Fcpp())
[1] TRUE
I compared the two functions for N = 100, N = 1000, and N = 100000 (which took forever) - adjusting lambda and res accordingly, but keeping everything else the same. Generally speaking, Fcpp is about 10x faster than Fr on my computer:
N <- 100
lambda <- (2*pi)/(N*delta)
res <- matrix(nrow=N, ncol=Z)
##
R> microbenchmark::microbenchmark(loop_Fr(), loop_Fcpp(),times=50L)
Unit: milliseconds
expr min lq median uq max neval
loop_Fr() 142.44694 146.62848 148.97571 151.86318 186.67296 50
loop_Fcpp() 14.72357 15.26384 15.58604 15.85076 20.19576 50
N <- 1000
lambda <- (2*pi)/(N*delta)
res <- matrix(nrow=N, ncol=Z)
##
R> microbenchmark::microbenchmark(loop_Fr(), loop_Fcpp(),times=50L)
Unit: milliseconds
expr min lq median uq max neval
loop_Fr() 1440.8277 1472.4429 1491.5577 1512.5636 1565.6914 50
loop_Fcpp() 150.6538 153.2687 155.4156 158.0857 181.8452 50
N <- 100000
lambda <- (2*pi)/(N*delta)
res <- matrix(nrow=N, ncol=Z)
##
R> microbenchmark::microbenchmark(loop_Fr(), loop_Fcpp(),times=2L)
Unit: seconds
expr min lq median uq max neval
loop_Fr() 150.14978 150.14978 150.33752 150.52526 150.52526 2
loop_Fcpp() 15.49946 15.49946 15.75321 16.00696 16.00696 2
Other variables, as presented in your question:
S <- c(906.65,906.65,906.65,906.65,906.65,906.65,906.65)
T <- c(0.1371253,0.1457896,0.1248953,0.1261278,0.1156931,0.0985253,0.1332596)
r <- c(0.013975,0.013975,0.013975,0.013975,0.013975,0.013975,0.013975)
h <- c(0.001332596,0.001248470,0.001251458,0.001242143,0.001257921,0.001235755,0.001238440)
P <- c(3,1,5,2,1,4,2)
w <- 0.001; b <- 0.2; a <- 0.0154; c <- 0.0000052; neta <- (-0.70)
Z <- length(S)
alpha <- 2; delta <- 0.25
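Since the double loop calls the function with one scalar at a time, the recursion can also be written with std::complex alone. The following is a minimal sketch in plain C++ (standard library only), not the Armadillo version above: the name F_scalar is hypothetical, and hP stands for the already looked-up value h[P]. It assumes T*250 rounds to a non-negative step count; for exact halves std::lround may round differently from R's round().

```cpp
#include <cmath>
#include <complex>

using cd = std::complex<double>;

// Hypothetical scalar version of the recursion in F.
// hP is the value h[P] for the observation being evaluated.
cd F_scalar(cd x, double w, double b, double a, double c, double neta,
            double S, double T, double r, double hP) {
    const cd u = cd(0.0, 1.0) * x;
    const double neta2 = neta * neta;
    const double neta4 = neta2 * neta2;
    const double nu = (std::sqrt(1.0 - 2.0 * neta) - 1.0) / neta2;

    // Terminal condition for A and B
    cd A_Q(0.0, 0.0);
    cd B_Q(0.0, 0.0);

    const long steps = std::lround(T * 250.0);
    for (long k = 0; k < steps; ++k) {
        // A_Q is updated first, using the previous B_Q, as in the R code.
        A_Q = A_Q + r * u + w * B_Q
              - 0.5 * std::log(1.0 - 2.0 * a * neta4 * B_Q);
        B_Q = b * B_Q + u * nu
              + (1.0 / neta2)
                * (1.0 - std::sqrt((1.0 - 2.0 * a * neta4 * B_Q)
                                   * (1.0 - 2.0 * c * B_Q - 2.0 * u * neta)));
    }
    return std::exp(std::log(S) * u + A_Q + B_Q * hP);
}
```

Unlike for (j in 1:steps) in R, the C++ loop runs zero times when steps is 0, so very small T values behave differently from the original R function.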