sorting columns of Rcpp NumericMatrix for median calculations - r

I've been testing Rcpp and RcppArmadillo for calculating summary stats on big matrices. This was a lot faster (5 or 10 times faster) than the base R colMeans or the the Armadillo on ~4million rows, 45 columns.
colMeansRcpp <- cxxfunction(signature(X_="integer"),
plugin='Rcpp',
body='
Rcpp::IntegerMatrix X = X_;
int ncol = X.ncol(); int nrow = X.nrow();
Rcpp::NumericVector out(ncol);
for(int col = 0; col < ncol; col++){
out[col]=Rcpp::sum(X(_, col));
}
return wrap(out/nrow);
')
I really want to calculate the median and maybe other quantiles for plotting - and because it requires a sort its even more needy of C++ outsourcing. The armadillo seems a bit slow so I wanted to do an in place sort on code similar to above but I just cant get the syntax right... here is what I'm trying..
# OK I'm aware this floor(nrow/2) is not **absolutely** correct
# I'm simplifying here
colMedianRcpp <- cxxfunction(signature(X_="integer"),
plugin='Rcpp',
body='
Rcpp::IntegerMatrix X = clone(X_);
int ncol = X.ncol(); int nrow = X.nrow();
Rcpp::NumericVector out(ncol);
for(int col = 0; col < ncol; col++){
X(_,col)= std::sort((X_,col).begin, (X_,col).end));
out[col]=X(floor(nrow/2), col));
}
return wrap(out);
')
Basically it's the line
X(_,col)= std::sort((X_,col).begin, (X_,col).end));
I don't know how to express "sort a column in place" with this mixture of Rcpp sugar and std C++. Sorry I can see what I'm doing is wrong but a hint on the right syntax would be lovely.
ps Am I right I need to do this clone() so I don't change the R object?
EDIT
I add the RcppArmadillo code and a benchmark comparison to address the answer/comment below. the benchmark was only on 50k rows for a quick reply but I recall it was similar with many more. I realise you are the Rcpp author.. so many thanks for your time!
The thought occurs that perhaps I'm doing something daft with the RcppArmadillo code to make it run so much slower than the base colMeans or Rcpp version?
colMeansRcppArmadillo <- cxxfunction(signature(X_="integer"),
plugin="RcppArmadillo",
body='
arma::mat X = Rcpp::as<arma::mat > (X_);
arma::rowvec MD= arma::mean(X, 0);
return wrap(MD);
')
And the benchmark is ...
(mb = microbenchmark(
+ colMeans(fqSmallMatrix),
+ colMeansRcpp(fqSmallMatrix),
+ colMeansRcppArmadillo(fqSmallMatrix),
+ times=50))
Unit: milliseconds
expr min lq median uq max neval
colMeans(fqSmallMatrix) 10.620919 10.63289 10.640819 10.648882 10.907145 50
colMeansRcpp(fqSmallMatrix) 2.649038 2.66832 2.676709 2.700839 2.841012 50
colMeansRcppArmadillo(fqSmallMatrix) 25.687067 26.23488 33.168589 33.792489 113.832495 50

You can copy the column into a new vector with
NumericVector y = x(_,j);
Complete example:
library(Rcpp)
cppFunction('
NumericVector colMedianRcpp(NumericMatrix x) {
int nrow = x.nrow();
int ncol = x.ncol();
int position = nrow / 2; // Euclidian division
NumericVector out(ncol);
for (int j = 0; j < ncol; j++) {
NumericVector y = x(_,j); // Copy the column -- the original will not be modified
std::nth_element(y.begin(), y.begin() + position, y.end());
out[j] = y[position];
}
return out;
}
')
x <- matrix( sample(1:12), 3, 4 )
x
colMedianRcpp(x)
x # Unchanged

You are not actually showing RcppArmadillo code -- I have been quite happy with the performance of RcppArmadillo code where I needed row/col column subsetting.
You can instantiate Armadillo matrices via Rcpp just about as efficiently (no copy, re-using R object memory) so I would try that.
And you: you want clone() for a distinct copy, and I think you'd get that for free if you use the default RcppArmadillo ctor (rather than the more efficient two-step).
Edit a few hours later
You had left an open question about why your Armadillo was slow. In the meantime, Vincent solved the issue for you but here is a revisited, cleaner solution using your code as well as Vincent's.
Now how it instantiates the Armadillo matrix without copy -- so it is faster. And it also avoids mixing integer and numeric matrices. The code first:
#include <RcppArmadillo.h>
using namespace Rcpp;
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
NumericVector colMedianRcpp(NumericMatrix x) {
int nrow = x.nrow();
int ncol = x.ncol();
int position = nrow / 2; // Euclidian division
NumericVector out(ncol);
for (int j = 0; j < ncol; j++) {
NumericVector y = x(_,j); // Copy column -- original will not be mod
std::nth_element(y.begin(), y.begin() + position, y.end());
out[j] = y[position];
}
return out;
}
// [[Rcpp::export]]
arma::rowvec colMeansRcppArmadillo(NumericMatrix x){
arma::mat X = arma::mat(x.begin(), x.nrow(), x.ncol(), false);
return arma::mean(X, 0);
}
// [[Rcpp::export]]
NumericVector colMeansRcpp(NumericMatrix X) {
int ncol = X.ncol();
int nrow = X.nrow();
Rcpp::NumericVector out(ncol);
for (int col = 0; col < ncol; col++){
out[col]=Rcpp::sum(X(_, col));
}
return wrap(out/nrow);
}
/*** R
set.seed(42)
X <- matrix(rnorm(100*10), 100, 10)
library(microbenchmark)
mb <- microbenchmark(colMeans(X), colMeansRcpp(X), colMeansRcppArmadillo(X),
colMedianRcpp(X), times=50)
print(mb)
*/
And here is the result on my machine, with the concise Armadillo version about as fast as yours, and median a little slower as it has to do more work:
R> sourceCpp("/tmp/stephen.cpp")
R> set.seed(42)
R> X <- matrix(rnorm(100*10), 100, 10)
R> library(microbenchmark)
R> mb <- microbenchmark(colMeans(X), colMeansRcpp(X), colMeansRcppArmadillo(X),
+ colMedianRcpp(X), times=50)
R> print(mb)
Unit: microseconds
expr min lq median uq max neval
colMeans(X) 9.469 10.422 11.5810 12.421 30.597 50
colMeansRcpp(X) 3.922 4.281 4.5245 5.306 18.020 50
colMeansRcppArmadillo(X) 4.196 4.549 4.9295 5.927 11.159 50
colMedianRcpp(X) 15.615 16.291 16.7290 17.971 27.026 50
R>

Related

Translate outer() from base R to RcppArmadillo

Is there any way to efficiently translate the outer() function for multiplication of two vectors from R base to RcppArmadillo? I attempted to do so but it is not efficient at all.
Take the following example:
library(Rcpp)
library(RcppArmadillo)
library(microbenchmark)
#Outer attempt
cppFunction(depends = "RcppArmadillo",
'
arma::mat outer_rcpp(arma::vec x, arma::vec y) {
int x_length = x.n_elem;
int y_length = y.n_elem;
arma::mat final(x_length, y_length);
// And use loops instead of outer
for(int i = 0; i < x_length; i++) {
final.col(i) = x[i] * y;
}
return(final);
}
'
)
#Test for equal results
a <- rnorm(5)
base <- base::outer(a, a)
rcpp <- outer_rcpp(a, a)
all.equal(base, rcpp)
#Test for speed
b <- rnorm(5000)
microbenchmark(base = base::outer(b, b),
rcpp = outer_rcpp(b, b), times = 10)
The results are 2 times slower using R base. I am sure that this can be done though matrix multiplication, any idea how?
As #thelatemail pointed out in the comments, the outer routine is already using a heavily optimized C routine.
src/library/base/R/outer.R: tcrossprod usage.
src/main/array.c: underlying C routine powering the tcrossprod computation.
Armadillo itself has its own optimization for addressing matrix multiplication using the dgemm and dgemv routines from LAPACK:
armadillo_bits/mul_gemm.hpp: C := alpha*op( A )op( B ) + betaC,
armadillo_bits/mul_gemv.hpp: y := alphaAx + betay, or y := alphaA**Tx + betay,
Playing around with the outerproduct calculations leads to a few optimizations. Mainly, we're opting to move the outer product into armadillo actions instead of loops:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::mat outer_rcpp(const arma::vec& x, const arma::vec& y) {
int x_length = x.n_elem;
int y_length = y.n_elem;
arma::mat final(x_length, y_length);
// And use loops instead of outer
for(int i = 0; i < x_length; i++) {
final.col(i) = x[i] * y;
}
return final;
}
// [[Rcpp::export]]
arma::mat outer_with_armadillo(const arma::vec& x, const arma::vec& y) {
arma::mat final = x*y.t();
return final;
}
// [[Rcpp::export]]
arma::mat outer_with_armadillo_transposed(const arma::vec& x, const arma::rowvec& y) {
arma::mat final = x*y;
return final;
}
Revisiting the benchmarking code, we have:
b = rnorm(5000)
b_tranposed = t(b)
bench_results = microbenchmark::microbenchmark(base = base::outer(b, b),
outer_armadillo_loop = outer_rcpp(b, b),
outer_armadillo_optimized = outer_with_armadillo(b, b),
outer_armadillo_optimized_transposed = outer_with_armadillo_transposed(b, b_tranposed), times = 10)
bench_results
expr
min
lq
mean
median
uq
max
neval
base
132.8601
141.3532
156.9979
146.7993
154.8954
234.2619
10
outer_armadillo_loop
278.4115
279.9204
317.7907
288.4212
329.0769
451.6872
10
outer_armadillo_optimized
272.4348
283.3380
347.7913
304.1181
339.3282
728.2264
10
outer_armadillo_optimized_transposed
269.7855
270.7108
297.9580
279.8099
312.3488
386.4270
10
From the results, the lowest I could achieve is having a pre-transposed b vector from column vector form into row-vector form: (n x 1) * (1 x m)

Allocate Rcpp List of n NumericMatrix

Is there a way to allocate an Rcpp List of length n, where each element of the List will be filled with a NumericMatrix, but the size of each NumericMatrix can change?
I have an idea for doing this using std::list and push_back(), but the size of the list may be quite large and I want to avoid the overhead of creating an extra copy of the list when I return from the function.
The below R code gives an idea of what I hope to do:
myvec = function(n) {
x = vector("list", n)
for (i in seq_len(n)) {
nc = sample(1:3, 1)
nr = sample(1:3, 1)
x[[i]] = matrix(rbinom(nc * nr, size = 1, prob = 0.5),
nrow = nr, ncol = nc)
}
x
}
This could result in something like:
> myvec(2)
[[1]]
[,1]
[1,] 0
[2,] 1
[[2]]
[,1] [,2] [,3]
[1,] 0 1 0
[2,] 0 1 1
Update: based on the comments of #Dirk and #Ralf, I created functions based on Rcpp::List and std::list with a wrap at the end. Speed comparisons don't seem to favor one version over the other, but perhaps there's an inefficiency I'm not aware of.
src = '
#include <Rcpp.h>
// [[Rcpp::export]]
Rcpp::List myvec(int n) {
Rcpp::RNGScope rngScope;
Rcpp::List x(n);
// Rcpp::IntegerVector choices = {1, 2 ,3};
Rcpp::IntegerVector choices = Rcpp::seq_len(50);
for (int i = 0; i < n; ++i) {
int nc = Rcpp::sample(choices, 1).at(0);
int nr = Rcpp::sample(choices, 1).at(0);
Rcpp::NumericVector entries = Rcpp::rbinom(nc * nr, 1, 0.5);
x(i) = Rcpp::NumericMatrix(nc, nr, entries.begin());
}
return x;
}
// [[Rcpp::export]]
Rcpp::List myvec2(int n) {
Rcpp::RNGScope scope;
std::list< Rcpp::NumericMatrix > x;
// Rcpp::IntegerVector choices = {1, 2 ,3};
Rcpp::IntegerVector choices = Rcpp::seq_len(50);
for (int i = 0; i < n; ++i) {
int nc = Rcpp::sample(choices, 1).at(0);
int nr = Rcpp::sample(choices, 1).at(0);
Rcpp::NumericVector entries = Rcpp::rbinom(nc * nr, 1, 0.5);
x.push_back( Rcpp::NumericMatrix(nc, nr, entries.begin()));
}
return Rcpp::wrap(x);
}
'
sourceCpp(code = src)
Resulting benchmarks on my computer are:
> library(microbenchmark)
> rcpp_list = function() {
+ set.seed(10);myvec(105)
+ }
> std_list = function() {
+ set.seed(10);myvec2(105)
+ }
> microbenchmark(rcpp_list(), std_list(), times = 1000)
Unit: milliseconds
expr min lq mean median uq
rcpp_list() 1.8901 1.92535 2.205286 1.96640 2.22380
std_list() 1.9164 1.95570 2.224941 2.00555 2.32315
max neval cld
7.1569 1000 a
7.1194 1000 a
The fundamental issue that Rcpp objects are R objects governed my R's memory management where resizing is expensive: full copies.
So when I have tasks similar to yours where sizes may change, or are unknown, I often work with different data structures -- the STL gives us plenty -- and only convert to R(cpp) at the return step at the end.
The devil in the detail here (as always). Profile, experiment, ...
Edit: And in the narrower sense of "can we return a List of NumericMatrix objects with varying sizes" the answer is of course we can because that is what List objects do. You can also insert other types.
As Dirk said, it is of course possible to create a list with matrices of different size. To make it a bit more concrete, here a translation of your R function:
#include <Rcpp.h>
// [[Rcpp::plugins(cpp11)]]
// [[Rcpp::export]]
Rcpp::List myvec(int n) {
Rcpp::List x(n);
Rcpp::IntegerVector choices = {1, 2 ,3};
for (int i = 0; i < n; ++i) {
int nc = Rcpp::sample(choices, 1).at(0);
int nr = Rcpp::sample(choices, 1).at(0);
Rcpp::NumericVector entries = Rcpp::rbinom(nc * nr, 1, 0.5);
x(i) = Rcpp::NumericMatrix(nc, nr, entries.begin());
}
return x;
}
/***R
myvec(2)
*/
The main difference to the R code are the explicitly named vectors choices and entries, which are only implicit in the R code.

Rcpp max of vector except one element

In Rcpp I want to find the maximum of a vector, but I want to omit one element.
I have working code, but I'm sure my approach is quite bad as it involves the full copy of the vector. Is there a much better way to accomplish what I want?
In R:
vec <- 1:10
ele <- 3
max(vec[-ele])
My (terrible) version in Rcpp:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
double my_fun(NumericVector vec, int ele) {
NumericVector vec_no_ele = clone(vec);
vec_no_ele.erase(ele);
return max(vec_no_ele);
}
Under the hood, max is implemented as a humble for loop. You shouldn't shy away from for loops in c++ since there is much less overhead compared to R. In this case, a for loop does significantly better than using the built in:
// Coatless answer:
// [[Rcpp::export]]
double max_no_copy(NumericVector vec, int ele) {
double temp = vec[ele-1];
vec[ele-1] = vec[0];
double result_max = max(vec);
vec[ele-1] = temp;
return result_max;
}
// humble for loop
// [[Rcpp::export]]
double max_except_for(NumericVector vec, int ele) {
int vs = vec.size();
double res = 0;
for(int i=0; i<vs; i++) {
if( i == ele-1 ) continue;
if(vec[i] > res) res = vec[i];
}
return res;
}
R side:
x <- rnorm(1e8)
x[1000] <- 1e9
microbenchmark(max_except_for(x, 1000), max_no_copy(x, 1000), times=5)
Unit: milliseconds
expr min lq mean median uq max neval cld
max_except_for(x, 1000) 87.58906 93.56962 92.5092 93.59754 93.6262 94.16361 5 a
max_no_copy(x, 1000) 284.46662 292.57627 296.3772 296.78390 300.5345 307.52455 5 b
identical(max_except_for(x, 1000), max_no_copy(x, 1000)) # TRUE
#Spacemen is suggesting the following approach in comments:
Save the value you want to omit in a temp variable. Set that element to zero or some small value or the same as another value in the vector. Compute the max. Reset the element from the temp variable.
This would be implemented like so:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
double max_no_copy(NumericVector vec, int ele) {
double temp = vec[ele];
// Use a value already in vector
vec[ele] = vec[0];
// Find max value
double result_max = max(vec);
// Remove NA value
vec[ele] = temp;
return result_max;
}
Test:
vec <- 1:10
ele <- 2 # C++ indices start at 0 not 1. So, subtract.
max_no_copy(vec, ele)
# [1] 10
Benchmark to be added later...
The answer from #coatless is great and builds on the other generous commenters above. However, it can be further generalized. #coatless uses the value in vec[0] as the placeholder for the value to be omitted, but this fails when the value to be omitted is element 0!
Here's a slightly generalized solution, where I use the adjacent element to ele as the placeholder, and I check that ele is in the index of vec and that vec.length() is greater than 1:
// calculate the max of a vector after omitting one element
double max_except(NumericVector vec, int ele) {
if (vec.length() == 1) stop("vec too short");
if (ele < 0 | ele > vec.length()-1) stop("ele out of range");
double temp = vec[ele];
int idx = (ele > 0) ? ele-1 : ele+1;
vec[ele] = vec[idx];
double res = max(vec);
vec[ele] = temp;
return res;
}

Fibonacci number in R vs Rcpp

I was just trying to check the execution speed of Fiboncci number generation in R vs Rcpp. To my surprise, my R function was faster(also, linearly growing) than my Rcpp function. What is wrong here.
The R code:
fibo = function (n){
x = rep(0, n)
x[1] = 1
x[2] = 2
for(i in 3:n){
x[i] = x[i-2] + x[i-1]
}
return(x)
}
The Rcpp code:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
IntegerVector fibo_sam(int n){
IntegerVector x;
x.push_back(1);
x.push_back(2);
for(int i =2; i < n; i++){
x.push_back(x[i - 2] + x[i-1]);
}
return(x);
}
The problem with your Rcpp code is that you are growing the vector instead of allocating the size at the beginning. Try with:
// [[Rcpp::export]]
IntegerVector fibo_sam2(int n) {
IntegerVector x(n);
x[0] = 1;
x[1] = 2;
for (int i = 2; i < n; i++){
x[i] = x[i-2] + x[i-1];
}
return(x);
}
Benchmark:
Unit: microseconds
expr min lq mean median uq max neval cld
fibo(1000) 99.989 102.6375 157.42543 103.962 106.9415 4806.395 100 a
fibo_sam(1000) 493.320 511.8615 801.39046 534.044 590.4945 2825.168 100 b
fibo_sam2(1000) 2.980 3.3110 10.18763 3.642 4.3040 573.443 100 a
PS1: check your first values
PS2: beware large numbers (see this)

Rcpp function for adding elements of a vector

I have a very long vector of parameters (approximately 4^10 elements) and a vector of indices. My aim is to add together all of the values of the parameters that are indexed in the indices vector.
For instance, if I had paras = [1,2,3,4,5,5,5] and indices = [3,3,1,6] then I would want to find the cumulative sum of the third value (3) twice, the first value (1) and the sixth (5), to get 12. There is additionally the option of warping the parameter values according to their location.
I am trying to speed up an R implementation, as I am calling it millions of times.
My current code always returns NA, and I can't see where it is going wrong
Here's the Rcpp function:
double dot_prod_c(NumericVector indices, NumericVector paras,
NumericVector warp = NA_REAL) {
int len = indices.size();
LogicalVector indices_ok;
for (int i = 0; i < len; i++){
indices_ok.push_back(R_IsNA(indices[i]));
}
if(is_true(any(indices_ok))){
return NA_REAL;
}
double counter = 0;
if(NumericVector::is_na(warp[1])){
for (int i = 0; i < len; i++){
counter += paras[indices[i]];
}
} else {
for (int i = 0; i < len; i++){
counter += paras[indices[i]] * warp[i];
}
}
return counter;
}
And here is the working R version:
dot_prod <- function(indices, paras, warp = NA){
if(is.na(warp[1])){
return(sum(sapply(indices, function(ind) paras[ind + 1])))
} else {
return(sum(sapply(1:length(indices), function(i){
ind <- indices[i]
paras[ind + 1] * warp[i]
})))
}
}
Here is some code for testing, and benchmarking using the microbenchmark package:
# testing
library(Rcpp)
library(microbenchmark)
parameters <- list()
indices <- list()
indices_trad <- list()
set.seed(2)
for (i in 4:12){
size <- 4^i
window_size <- 100
parameters[[i-3]] <- runif(size)
indices[[i-3]] <- floor(runif(window_size)*size)
temp <- rep(0, size)
for (j in 1:window_size){
temp[indices[[i-3]][j] + 1] <- temp[indices[[i-3]][j] + 1] + 1
}
indices_trad[[i-3]] <- temp
}
microbenchmark(
x <- sapply(1:9, function(i) dot_prod(indices[[i]], parameters[[i]])),
x_c <- sapply(1:9, function(i) dot_prod_c(indices[[i]], parameters[[i]])),
x_base <- sapply(1:9, function(i) indices_trad[[i]] %*% parameters[[i]])
)
all.equal(x, x_base) # is true, does work
all.equal(x_c, x_base) # not true - C++ version returns only NAs
I was having a little trouble trying to interpret your overall goal through your code, so I'm just going to go with this explanation
For instance, if I had paras = [1,2,3,4,5,5,5] and indices = [3,3,1,6]
then I would want to find the cumulative sum of the third value (3)
twice, the first value (1) and the sixth (5), to get 12. There is
additionally the option of warping the parameter values according to
their location.
since it was most clear to me.
There are some issues with your C++ code. To start, instead of doing this - NumericVector warp = NA_REAL - use the Rcpp::Nullable<> template (shown below). This will solve a few problems:
It's more readable. If you're not familiar with the Nullable class, it's pretty much exactly what it sounds like - an object that may or may not be null.
You won't have to make any awkward initializations, such as NumericVector warp = NA_REAL. Frankly I was surprised that the compiler accepted this.
You won't have to worry about accidentally forgetting that C++ uses zero-based indexing, unlike R, as in this line: if(NumericVector::is_na(warp[1])){. That has undefined behavior written all over it.
Here's a revised version, going off of your quoted description of the problem above:
#include <Rcpp.h>
typedef Rcpp::Nullable<Rcpp::NumericVector> nullable_t;
// [[Rcpp::export]]
double DotProd(Rcpp::NumericVector indices, Rcpp::NumericVector params, nullable_t warp_ = R_NilValue) {
R_xlen_t i = 0, n = indices.size();
double result = 0.0;
if (warp_.isNull()) {
for ( ; i < n; i++) {
result += params[indices[i]];
}
} else {
Rcpp::NumericVector warp(warp_);
for ( ; i < n; i++) {
result += params[indices[i]] * warp[i];
}
}
return result;
}
You had some elaborate code to generate sample data. I didn't take the time to go through this because it wasn't necessary, nor was the benchmarking. You stated yourself that the C++ version wasn't producing the correct results. Your first priority should be to get your code working on simple data. Then feed it some more complex data. Then benchmark. The revised version above works on simple data:
args <- list(
indices = c(3, 3, 1, 6),
params = c(1, 2, 3, 4, 5, 5, 5),
warp = c(.25, .75, 1.25, 1.75)
)
all.equal(
DotProd(args[[1]], args[[2]]),
dot_prod(args[[1]], args[[2]]))
#[1] TRUE
all.equal(
DotProd(args[[1]], args[[2]], args[[3]]),
dot_prod(args[[1]], args[[2]], args[[3]]))
#[1] TRUE
It's also faster than the R version on this sample data. I have no reason to believe it wouldn't be for larger, more complex data either - there's nothing magical or particularly efficient about the *apply functions; they are just more idiomatic / readable R.
microbenchmark::microbenchmark(
"Rcpp" = DotProd(args[[1]], args[[2]]),
"R" = dot_prod(args[[1]], args[[2]]))
#Unit: microseconds
#expr min lq mean median uq max neval
#Rcpp 2.463 2.8815 3.52907 3.3265 3.8445 18.823 100
#R 18.869 20.0285 21.60490 20.4400 21.0745 66.531 100
#
microbenchmark::microbenchmark(
"Rcpp" = DotProd(args[[1]], args[[2]], args[[3]]),
"R" = dot_prod(args[[1]], args[[2]], args[[3]]))
#Unit: microseconds
#expr min lq mean median uq max neval
#Rcpp 2.680 3.0430 3.84796 3.701 4.1360 12.304 100
#R 21.587 22.6855 23.79194 23.342 23.8565 68.473 100
I omitted the NA checks from the example above, but that too can be revised into something more idiomatic by using a little Rcpp sugar. Previously, you were doing this:
LogicalVector indices_ok;
for (int i = 0; i < len; i++){
indices_ok.push_back(R_IsNA(indices[i]));
}
if(is_true(any(indices_ok))){
return NA_REAL;
}
It's a little aggressive - you are testing a whole vector of values (with R_IsNA), and then applying is_true(any(indices_ok)) - when you could just break prematurely and return NA_REAL on the first instance of R_IsNA(indices[i]) resulting in true. Also, the use of push_back will slow down your function quite a bit - you would have been better off initializing indices_ok to the known size and filling it by index access in your loop. Nevertheless, here's one way to condense the operation:
if (Rcpp::na_omit(indices).size() != indices.size()) return NA_REAL;
For completeness, here's a fully sugar-ized version which allows you to avoid loops entirely:
#include <Rcpp.h>
typedef Rcpp::Nullable<Rcpp::NumericVector> nullable_t;
// [[Rcpp::export]]
double DotProd3(Rcpp::NumericVector indices, Rcpp::NumericVector params, nullable_t warp_ = R_NilValue) {
if (Rcpp::na_omit(indices).size() != indices.size()) return NA_REAL;
if (warp_.isNull()) {
Rcpp::NumericVector tmp = params[indices];
return Rcpp::sum(tmp);
} else {
Rcpp::NumericVector warp(warp_), tmp = params[indices];
return Rcpp::sum(tmp * warp);
}
}
/*** R
all.equal(
DotProd3(args[[1]], args[[2]]),
dot_prod(args[[1]], args[[2]]))
#[1] TRUE
all.equal(
DotProd3(args[[1]], args[[2]], args[[3]]),
dot_prod(args[[1]], args[[2]], args[[3]]))
#[1] TRUE
*/

Resources