How to use OpenMP in Rcpp when calling functions from headers? - r

I have created many functions that I want to called in my current .cpp script. As usually, I do it by calling the headers (eg: my_function.h). The issue is that when I want to use OpenMP, my R session blows-up.
#include <Rcpp.h>
#include "my_function.h"
using namespace Rcpp;
#ifdef _OPENMP
#include <omp.h>
#endif
NumericMatrix m_shortwave_extra(NumericMatrix latitude,
NumericMatrix longitude,
double time_zone,
double year,
double month,
double day,
double time,
NumericMatrix slope,
NumericMatrix orientation,
double S = 1364.0,
int threads = 1){
int n_it = latitude.ncol();
int n_row = latitude.nrow();
NumericMatrix shortwave_out(n_row, n_it);
#ifdef _OPENMP
if ( threads > 0 )
omp_set_num_threads( threads );
#endif
#pragma omp parallel for
for(int i = 0; i < n_it; i++){
shortwave_out(_, i) =
my_function (latitude(_, i),
longitude(_, i),
time_zone,
year,
month,
day,
time,
slope(_, i),
orientation(_, i),
S);
}
return shortwave_out;
}
I would like to know what I am doing wrong when using OpenMP (the routine works fine when I am not parallelizing).

Here is how I do in one of my packages. First, here are the contents of the Makevars file:
PKG_LIBS = $(SHLIB_OPENMP_CXXFLAGS)
CXX_STD = CXX11
PKG_CXXFLAGS = $(SHLIB_OPENMP_CXXFLAGS)
To set the number of threads (unsigned int nthreads in my code), I do:
#ifdef _OPENMP
#pragma omp parallel for num_threads(nthreads)
#endif
this directive is then followed by a for loop.
This is all I have related to openmp in my code, and this works.
I think you don't need to do a check for threads > 0. Just use threads = 1.

Related

how to create a Rcpp NumericVector from Eigen::Tensor without copying underlying data

If I create a large Tensor in Eigen, and I like to return the Tensor back to R as multi-dimension array. I know how to do it with data copy like below. Question: is it possible to do it without the data-copy step?
#include <Rcpp.h>
#include <RcppEigen.h>
#include <unsupported/Eigen/CXX11/Tensor>
// [[Rcpp::depends(RcppEigen)]]
using namespace Rcpp;
template <typename T>
NumericVector copyFromTensor(const T& x)
{
int n = x.size();
NumericVector ans(n);
IntegerVector dim(x.NumDimensions);
for (int i = 0; i < x.NumDimensions; ++i) {
dim[i] = x.dimension(i);
}
memcpy((double*)ans.begin(), (double*)x.data(), n * sizeof(double));
ans.attr("dim") = dim;
return ans;
}
// [[Rcpp::export]]
NumericVector getTensor() {
Eigen::Tensor<double, 3> x(4, 3, 1);
x.setRandom();
return copyFromTensor(x);
}
/*** R
getTensor()
*/
As a general rule you can zero-copy one the way into your C++ code with data coming from R and already managed by R.
On the way out of your C++ code with data returning to R anything that is not created used the R allocator has to be copied.
Here your object x is a stack-allocated so you need a copy. See Writing R Extensions about the R allocator; Eigen may let you use it when you create a new Tensor object. Not a trivial step. I think I would just live with the copy.

RcppArmadillo's sample() is ambiguous after updating R

I commonly work with a short Rcpp function that takes as input a matrix where each row contains K probabilities that sum to 1. The function then randomly samples for each row an integer between 1 and K corresponding to the provided probabilities. This is the function:
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadilloExtensions/sample.h>
using namespace Rcpp;
// [[Rcpp::export]]
IntegerVector sample_matrix(NumericMatrix x, IntegerVector choice_set) {
int n = x.nrow();
IntegerVector result(n);
for ( int i = 0; i < n; ++i ) {
result[i] = RcppArmadillo::sample(choice_set, 1, false, x(i, _))[0];
}
return result;
}
I recently updated R and all packages. Now I cannot compile this function anymore. The reason is not clear to me. Running
library(Rcpp)
library(RcppArmadillo)
Rcpp::sourceCpp("sample_matrix.cpp")
throws the following error:
error: call of overloaded 'sample(Rcpp::IntegerVector&, int, bool, Rcpp::Matrix<14>::Row)' is ambiguous
This basically tells me that my call to RcppArmadillo::sample() is ambiguous. Can anyone enlighten me as to why this is the case?
There are two things happening here, and two parts to your problem and hence the answer.
The first is "meta": why now? Well we had a bug let in the sample() code / setup which Christian kindly fixed for the most recent RcppArmadillo release (and it is all documented there). In short, the interface for the very probability argument giving you trouble here was changed as it was not safe for re-use / repeated use. It is now.
Second, the error message. You didn't say what compiler or version you use but mine (currently g++-9.3) is actually pretty helpful with the error. It is still C++ so some interpretative dance is needed but in essence it clearly stating you called with Rcpp::Matrix<14>::Row and no interface is provided for that type. Which is correct. sample() offers a few interface, but none for a Row object. So the fix is, once again, simple. Add a line to aid the compiler by making the row a NumericVector and all is good.
Fixed code
#include <RcppArmadillo.h>
#include <RcppArmadilloExtensions/sample.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
// [[Rcpp::export]]
IntegerVector sample_matrix(NumericMatrix x, IntegerVector choice_set) {
int n = x.nrow();
IntegerVector result(n);
for ( int i = 0; i < n; ++i ) {
Rcpp::NumericVector z(x(i, _));
result[i] = RcppArmadillo::sample(choice_set, 1, false, z)[0];
}
return result;
}
Example
R> Rcpp::sourceCpp("answer.cpp") # no need for library(Rcpp)
R>

using global SEXP to store XPtr gives garbage values

I am building a package which solves Ordinary Differential Equations (ODEs) using the CVODE C routine (part of the SUNDIALS C library).
The package works if the user supplies a function which calculates derivatives and has the following form
#include <Rcpp.h>
using namespace Rcpp;
#include <cvode/cvode.h> /* prototypes for CVODE fcts., consts. */
#include <nvector/nvector_serial.h> /* serial N_Vector types, fcts., macros */
int test (realtype t, N_Vector y, N_Vector ydot, void *user_data){
// test function
NV_Ith_S(ydot,0) = 1*NV_Ith_S(y,0);
NV_Ith_S(ydot,1) = 2*NV_Ith_S(y,1);
NV_Ith_S(ydot,2) = 3*NV_Ith_S(y,2);
return(0);
}
typedef int (*funcPtr)(realtype t, N_Vector y, N_Vector ydot, void *user_data);
// [[Rcpp::export]]
XPtr<funcPtr> putFunPtrInXPtr() {
// return(XPtr<funcPtr> (new funcPtr(&test)));
XPtr<funcPtr> testptr(new funcPtr(&test), false);
return testptr;
A function pointer, i.e. my_fun <- putFunPtrInXPtr() is formed in R and my_fun is provided to the cvode function from the package (cvode inputs my_fun as an SEXP, see code here). This works, i.e. gives the right results (see detailed instructions here). However, this requires that user has SUNDIALS installed on their system (to access cvode.h and nvector_serial.h).
I am trying to make the package so the user does not need to have SUNDIALS installed. So, the function to get derivatives (and generate function pointer) will be as below
#include <Rcpp.h>
using namespace Rcpp;
//---------------------------------------------------------------------------------
typedef NumericVector (*funcPtr1) (double t, NumericVector y, NumericVector ydot);
//---------------------------------------------------------------------------------
// [[Rcpp::export]]
NumericVector test1 (double t, NumericVector y, NumericVector ydot){
ydot[0] = 1 * y[0];
ydot[1] = 2 * y[1];
ydot[2] = 3 * y[2];
return ydot;
}
// [[Rcpp::export]]
XPtr<funcPtr1> putFunPtrInXPtr1() {
XPtr<funcPtr1> testptr1(new funcPtr1(&test1), false);
return testptr1;
}
On the R side, my_fun1 <- putFunPtrInXPtr1() will be run and my_fun1 will be provided to cvode_test (a test function defined in package to be able to handle derivative functions defined using NumericVector only.
In my package to convert function pointer type from XPtr<funcPtr1>to XPtr<funcPtr>, I do the following
1) A global SEXP (sexp_g) is defined outside any function
2) In cvode_test the input SEXP is assigned to sexp_g
3) Finally, a function fun_test1 is defined as follows, which converts N_Vector to NumericVector, uses the function in sexp_g to get derivatives and then puts them again in an N_Vector, i.e.
typedef int (*funcPtr_test)(double time, NumericVector y, NumericVector ydot);
SEXP sexp_g; // declare a global SEXP
int fun_test1(realtype t, N_Vector y, N_Vector ydot, void* user_data){
// convert y to NumericVector y1
int y_len = NV_LENGTH_S(y);
NumericVector y1(y_len); // filled with zeros
for (int i = 0; i < y_len; i++){
y1[i] = NV_Ith_S(y,i);
}
// use function pointer to get the derivatives
XPtr<funcPtr_test> xpfun(sexp_g);
funcPtr_test fun_test = *xpfun;
NumericVector ydot1(y1.length());
ydot1 = fun_test(t, y1, ydot1);
// convert ydot1 to N_Vector ydot
// N_Vector ydot; ydot = NULL;
ydot = N_VNew_Serial(ydot1.length());
for (int i = 0; i<ydot1.length(); i++){
NV_Ith_S(ydot, i) = ydot1[i];
}
return (0);
}
Finally, in cvode_test, this fun_test1 is used as follows
flag = CVodeInit(cvode_mem, fun_test1, T0, y0);
This also compiles but the issue is when I supply my_fun1 (i.e., a pointer to test1) to cvode_test for integration, I get garbage values back. I am not sure what is going wrong here, I have read a few articles about protecting SEXP (i.e., the one here) but I don't know how to implement it here.
Any help regarding what is wrong here, and if there is a better approach than a global SEXP variable will be helpful. Since the rhs function for CVOdeInit has to have the following signature
int RHS_function (N_Vector, N_Vector, void*);
I am finding it impossible to insert information in a function defined as above from a function defined outside the package using a different signature (using an SEXP), other than using a global variable and have been struggling with this problem for quite some time. Any help would be much appreciated!
A different approach that I have not tried is 1) declaring XPtr<funPtr_test> outside any function, 2) unwrapping SEXP to XPtr<funcPtr_test> inside the cvode_test and assigning it to the pointer declared outside the function. But I am struggling with the syntax to declare a function pointer of the type XPtr<funcPtr_test>.
The full code for cvode and cvode_test can be found here
Thanks

Cannot convert Rcpp::sugar::Plus_Vector_Primitive to Rcpp::traits::storage_type

I am basically trying to translate some R code into cpp using Rcpp. I got the following errors in the code below:
error: cannot convert ‘Rcpp::sugar::Plus_Vector_Primitive<14, true, Rcpp::stats::D2<14, true, Rcpp::Vector<14, Rcpp::PreserveStorage> > >’ to ‘Rcpp::traits::storage_type<14>::type {aka double}’ in assignment
Here is the code
#include <RcppArmadillo.h>
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export(".loop_exp")]]
void mm_mult(const arma::vec& helpa, const arma::mat& helpb, const arma::vec& helpc,
const Rcpp::NumericVector& t1, const arma::vec& t2, int J, Rcpp::NumericVector& prob)
{
int j;
for (j = 1; J <= J; j++)
{
arma::mat t = (helpb.row(j)).t() * (t2);
double tt = t[0,0];
prob[j] = (helpa[j] + dnorm(t1, tt, helpc[j])); <---- here is the error
}
return;
}
I guess this is a type casting error, but basically I cannot find a good reference.. Could anyone give me some help on this issue? Thanks a lot!
The reason is that the dnorm "syntaxic sugar" has a signature NumericVector dnorm( NumericVector, double, double ).
As it returns a NumericVector, you have to convert it to a double value yourself.
A quick and easy (and not very robust) way is to subset the returned vector to get only its first element. In your example:
prob[j] = (helpa[j] + dnorm(t1, tt, helpc[j])[0]); // Note the "[0]"
Otherwise, there are a couple of other potential problems in your code: you shouldn't #include <Rcpp.h> as it is already done with #include <RcppArmadillo.h> --- also, your loop ending condition, J <= J, looks suspicious to me...
Hope this helps :)

RcppArmadillo::sample function causes RStudio and R to crash

While using RcppArmadillo::sample function I discovered that using a large input vector causes RStudio to crash. I provide the entire code below:
#include<iostream>
#include <armadillo>
#include <RcppArmadilloExtensions/sample.h>
//[[Rcpp::depends(RcppArmadillo)]]
using namespace std;
using namespace Rcpp;
using namespace arma;
//[[Rcpp::export]]
IntegerVector test_func(int N) {
IntegerVector frame = Range(1, N);
NumericVector wts = runif(N, 0, 1);
NumericVector Wts = wts / sum(wts);
IntegerVector y = RcppArmadillo::sample(frame, N,TRUE, Wts );
return y;
}
Calling test_func(N=100) produces the right results. But N greater that 200, for instance test_func(N=210), crashes RStudio as well as RConsole. Is there a mistake I am making?
I cannot replicate this. On either a straight R ression or inside RStudio it just works for me.
I made small corrections to your code:
// the following header also include Rcpp and Armadillo headers
#include <RcppArmadilloExtensions/sample.h>
//[[Rcpp::depends(RcppArmadillo)]]
//[[Rcpp::export]]
Rcpp::IntegerVector test_func(int N) {
Rcpp::IntegerVector frame = Rcpp::Range(1, N);
Rcpp::NumericVector wts = Rcpp::runif(N, 0.0, 1.0);
return Rcpp::RcppArmadillo::sample(frame, N, true, wts / Rcpp::sum(wts));
}
but none of these should be material.
Note how the code of the sample() function throws an excpetion if N gets big:
if (walker_test < 200) {
ProbSampleReplace<IntegerVector>(index, nOrig, size, prob);
} else {
throw std::range_error("Walker Alias method not implemented. [...]");
}
so I think you may be seeing a garden variety error of a mismatch between R, Rcpp, RcppArmadillo. What platform are you on? For it is Linux where packages are recompiled.

Resources